Data Science Workshop (DSW) provides a cloud-based integrated development environment (IDE) for AI development. It includes multiple built-in development environments. If you are familiar with Notebook or VSCode, you can quickly start developing models. This topic describes the parameters that you can configure when creating an instance and provides solutions to common issues that might occur when you start or release an instance.
Prerequisites
You must activate Platform for AI (PAI) and create a workspace. To do this, log on to the PAI console with your Alibaba Cloud account. In the upper-left corner, select a region, and then authorize and activate PAI. For more information, see Activate PAI and create a workspace.
Create an instance using the console
Important When you create an instance that uses public resources, you are charged for the instance's running time. Billing stops after you stop or delete the instance. For more information about billing, see Billing of Data Science Workshop (DSW).
Go to the DSW page.
Log on to the PAI console.
On the Overview page, select a region.
In the navigation pane on the left, click Workspaces. On the Workspaces page, click the name of the workspace that you want to use.
In the navigation pane on the left of the workspace, choose .
Click Create Instance.
On the Configure Instance page, configure the following key parameters.
Basic information
Parameter | Description |
Instance Name | Configure the name of the DSW instance based on the on-screen prompts. |
Tag | Add tags to the instance as needed. Tags help you find and manage resources and split bills across different dimensions. |
Resource information
Parameter | Description |
Resource Type | |
Environment context
Parameter | Description |
Image | In addition to Official Image, the following image types are supported:
Custom Image: You can use a custom image that is added to PAI. The image repository must be set to public pull, or the image must be stored in Container Registry (ACR). For more information, see Custom images.
Image URL: You can configure the URL of a custom or official image that can be accessed over the internet. If it is a private image URL, click Enter Username And Password and configure the username and password for the image repository.
To accelerate image pulling, see Image acceleration. |
System Disk | Used to store files during development. If you set Resource Type to Public Resources, or if you set Resource Quota to subscription general computing resources (with ≥ 2 CPU cores and ≥ 4 GB of memory, or with a GPU configured), each instance receives a free system disk quota of 100 GiB. You can scale out the disk. The price for scaling out is displayed on the console page.
Warning If you use only the free system disk quota and the instance is stopped for more than 15 days, the content on the disk is deleted. A disk cannot be scaled in after it has been scaled out, so scale out only as needed. After you scale out the system disk beyond the free quota, its content is no longer deleted when the instance stays stopped for more than 15 days, but the disk continues to incur charges. When an instance is deleted, its system disk is released at the same time. Before you delete the instance, make sure to back up all necessary data.
If you need persistent storage, you can configure Mount Dataset or Mount Storage Path. |
Mount Dataset | You can use this to store datasets that need to be read or to persistently store files from the development process. The following two dataset types are supported: Custom Dataset: You can create a custom dataset to store data files required for training. You can set whether the dataset is Read-only and select a dataset version from the Version List. Public Dataset: PAI provides pre-configured public datasets that only support read-only mounting.
Mount Path: The path where the dataset is mounted to DSW, such as /mnt/data . You can retrieve the dataset from this path in your code.
Note The mount paths for multiple datasets cannot be the same. If you configure a CPFS dataset, you must configure the network settings and select the same virtual private cloud (VPC) as the CPFS file system. Otherwise, the DSW instance may fail to be created. When you select a dedicated resource group, the first dataset must be a NAS dataset. It will be mounted to both your specified path and the default DSW working directory /mnt/workspace/.
For more information about mounting, see Mount a dataset, OSS bucket, NAS file system, or CPFS file system. |
Mount Storage Path | You can also mount a storage path directly to store datasets that need to be read or to persistently store files from the development process. Supported types: OSS, General-purpose NAS file system, Extreme NAS file system, CPFS, and AI-Computing CPFS. Mount Path: The path where the storage is mounted to DSW, such as /mnt/data. You can access the stored files from this path in your code.
For more information about mounting, see Mount a dataset, OSS bucket, NAS file system, or CPFS file system. |
Working Directory | The startup path for Notebook and WebIDE, mounted to /mnt/workspace . |
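The Mount Path description above says that mounted data can be read from the path in your code. The following is a minimal sketch of what that might look like, assuming the example path /mnt/data from the table. The function name list_dataset_files is hypothetical and is not part of any PAI SDK.

```python
from itertools import islice
from pathlib import Path


def list_dataset_files(mount_path="/mnt/data", limit=10):
    """Return up to `limit` regular files found under the mount path.

    `mount_path` defaults to the example path from the table above;
    replace it with the Mount Path configured for your instance.
    """
    root = Path(mount_path)
    if not root.is_dir():
        # The path is missing if nothing was mounted there.
        raise FileNotFoundError(f"Mount path not found: {root}")
    # Walk lazily so that very large datasets are not fully enumerated.
    files = (p for p in root.rglob("*") if p.is_file())
    return list(islice(files, limit))
```

Calling list_dataset_files() in a Notebook cell is a quick way to confirm that the mount succeeded before you start training. The order of the returned files is not guaranteed.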
Show More Configurations
Parameter | Description |
Custom Startup Script | Used to customize the environment or perform initialization tasks during instance startup. The custom script runs after the image and resources are ready, but before development applications such as JupyterLab and Code Server start.
Note Running a custom script increases the instance startup time. The timeout period for the custom script is 3 minutes. Do not run long-running tasks, such as downloading images, in the custom script. |
Environment Variable | Used for main container startup, system processes, and user processes. You can add custom environment variables or overwrite default system variables as needed.
Note Do not modify the following environment variables.
Modifications to this variable do not take effect because it is overwritten by service logic:
USER_NAME
Do not modify these system variables, because doing so may affect normal use:
JUPYTER_NAME: Constructed from instance information by default. Can be used to modify the JupyterLab URL access path.
JUPYTER_COMMAND: Jupyter startup instruction, set to 'lab' by default to start JupyterLab.
JUPYTER_SERVER_ADDR: JupyterLab service listener address, defaults to 0.0.0.0.
JUPYTER_SERVER_PORT: JupyterLab service listener port, defaults to 8088.
JUPYTER_SERVER_AUTH: JupyterLab access password, empty by default.
JUPYTER_SERVER_ROOT: Jupyter working directory, has lower priority than WORKSPACE_DIR.
CODE_SERVER_ADDR: code-server service listener address, defaults to 0.0.0.0.
CODE_SERVER_PORT: code-server service listener port, defaults to 8082.
CODE_SERVER_AUTH: code-server access password, empty by default.
WORKSPACE_DIR: The system sets this environment variable based on the working directory parameter set when the instance was created. It can change the startup directory for Jupyter and code-server. An error may occur if the path does not exist. |
Advanced Configuration | Allows users to adjust certain secure kernel parameters required by their services. Currently, only instances in Lingjun resource groups support this setting. For parameter details, see the table below. |
Advanced Configuration Parameter | Default Value | Description | Notes |
VmMaxMapCount | 65530 | Sets the maximum number of memory map areas a process can have. For example, you can set it to 1024000. | Values less than 65530 will not take effect. Excessively high values may lead to wasted memory resources. |
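The environment variable descriptions above state that JUPYTER_SERVER_ROOT has lower priority than WORKSPACE_DIR. The following sketch illustrates that precedence from inside an instance. The function name resolve_start_dir and the fallback to /mnt/workspace are assumptions made for illustration. This is not how DSW itself resolves the directory.

```python
import os


def resolve_start_dir(env=None, default="/mnt/workspace"):
    """Pick the startup directory using the documented precedence.

    WORKSPACE_DIR wins over JUPYTER_SERVER_ROOT. The `default` fallback
    is an assumption based on the Working Directory parameter.
    """
    env = os.environ if env is None else env
    for name in ("WORKSPACE_DIR", "JUPYTER_SERVER_ROOT"):
        value = env.get(name)
        if value:  # skip unset or empty variables
            return value
    return default
```

Because an error may occur if the path does not exist, it can be useful to check os.path.isdir(resolve_start_dir()) in a terminal before relying on a custom working directory.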
Network information
Parameter | Description |
VPC Configuration | This parameter is supported only when Resource Type is set to Public Resources. If you want to use a DSW instance within a VPC, create a VPC in the same region as the DSW instance and configure this parameter. You also need to configure a vSwitch and a Security Group. For more information about configuration policies for different scenarios, see Network configurations. |
Public Access Gateway | The following configuration methods are supported:
Public Gateway: The network bandwidth is limited. The network speed may not meet your needs during high concurrency or when downloading large files.
Private Gateway: To address the bandwidth limitations of the public gateway, you can create an Internet NAT gateway in the DSW instance's VPC, attach an Elastic IP Address (EIP), and configure an SNAT entry. For more information, see Improve public network access speed using a private gateway.
The following parameters can be configured only when a CPFS dataset is selected for Mount Configuration:
Note If you select a CPFS dataset for the mount configuration, you must configure a VPC, and the selected VPC must be the same as the one used by the CPFS file system. |
Access configuration
Parameter | Description |
Enable SSH | Used to remotely connect to the instance. You can configure this after selecting a VPC. If you have configured a custom image, make sure that sshd is installed in the custom image. |
SSH Public Key | You can configure this parameter after you turn on the Enable SSH switch.
Note To support logon from both within a VPC and the internet, you must add the public key of each client. Add one public key per line. You can add up to 10 public keys. |
SSH Access Method | You can configure this parameter after you turn on the Enable SSH switch.
Access Within VPC: This access method is supported by default. You can remotely connect to the DSW instance using Secure Shell (SSH) from another terminal within the VPC, such as an ECS instance.
Public Access: Select this option to add public access. You can then remotely connect to the instance using SSH from a local command line or another terminal. |
Custom Service | Used to access services running in DSW from the internet. For more information, see Access services in an instance over the internet. |
Create VPC Internal Access Domain Name | Creates a built-in authoritative domain name (Private Zone). You can use this domain name within the VPC to access the SSH service or other custom services of the current instance, avoiding the inconvenience of using a changing instance IP address. Note that creating a built-in authoritative domain name will incur charges. For more information, see Billing of Alibaba Cloud DNS. |
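The SSH Public Key note above says that keys are added one per line, up to 10. The following sketch shows a local pre-check you could run on the text before pasting it into the console. The function name check_public_keys is hypothetical, and the list of key-type prefixes is an assumption that covers common OpenSSH types only. The key types that DSW accepts are not stated in this topic.

```python
MAX_KEYS = 10  # limit stated in the SSH Public Key note
# Assumption: common OpenSSH key-type prefixes, not an exhaustive list.
KNOWN_PREFIXES = ("ssh-rsa", "ssh-ed25519", "ecdsa-sha2-")


def check_public_keys(text):
    """Split newline-separated public keys and run basic sanity checks."""
    keys = [line.strip() for line in text.splitlines() if line.strip()]
    if len(keys) > MAX_KEYS:
        raise ValueError(f"{len(keys)} keys given, at most {MAX_KEYS} allowed")
    for key in keys:
        parts = key.split()
        # A public key line has at least "<type> <base64-data>".
        if len(parts) < 2 or not parts[0].startswith(KNOWN_PREFIXES):
            raise ValueError(f"Not a recognized public key line: {key[:30]}")
    return keys
```

The check is intentionally shallow. It catches a pasted private key or a wrapped line, but it does not verify that the key data is valid.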
Roles and permissions
Parameter | Description |
Visibility | You can select Visible Only To Instance Owner or Publicly Visible Within Workspace. |
Instance Owner | Only workspace administrators can change the instance owner. |
Show More Configurations
Parameter | Description |
Instance RAM Role | When accessing other cloud resources from a DSW instance, you can associate a Resource Access Management (RAM) role with the instance. This method uses Security Token Service (STS) temporary credentials to access other cloud resources, eliminating the need to configure a long-term AccessKey and effectively reducing the risk of key leakage. You can configure the instance RAM role as follows:
PAI Default Role: Has permissions to access internal PAI products, MaxCompute, and Object Storage Service (OSS). Temporary access credentials issued based on the PAI default role have the same permissions as the DSW instance owner when accessing internal PAI products and MaxCompute tables. When accessing OSS, the role can access only the default storage path bucket configured for the current workspace.
Custom Role: If you want customized or more fine-grained permission management, you can configure a custom role.
Do Not Associate Role: If you want to access other cloud products directly using an AccessKey, you can choose not to associate a role.
For more information about configuring an instance RAM role, see Configure an instance RAM role for a DSW instance. |
After you confirm that the configurations are correct, click OK.
FAQ about instance startup and release
Instance startup
1. DSW instance fails to start
Troubleshooting: Click the DSW instance name and view the error message on the Events tab.

Common errors that cause an instance to fail to start include the following:
Your requested resource type [ecs.******] is not enough currently, please try other regions or other resource types
Cause: The selected resource type is in high demand in the current region, which prevents instance creation.
Solution: Try again later, or switch to a different resource type or region.
Your resource usage has exceeded the default limitation. Please contact us via ticket system to raise the limitation.
Cause: Each Alibaba Cloud account can create instances with a maximum of two GPUs per region. The creation fails if the selected specification exceeds this limit.
Solution: To increase the quota, submit a ticket.
Sales of this resource are temporarily suspended in the specified zone. We recommend that you use the multi-zone creation function to avoid the risk of insufficient resource.
Cause: Sales of this resource are temporarily suspended in the specified zone.
Solution: You can try the following operations to mitigate the risk of insufficient resources:
Switch to another region.
Adjust the resource specification of the instance.
Try to start the instance during off-peak hours.
CommodityInstanceNotAvailableError: Commodity instance has been released due to prolonged arrears at past. Please create a new instance for use
The charge of current ECI instance has been stopped, but the related resources are still being cleaned.
Cause: Trial resources are public resources. If you start a DSW instance during peak hours, it may take more than 30 minutes to start. If the resources cannot be pulled within one hour, the system prompts that the selected specification is unavailable in the current region.
Solution: Try one of the following operations:
Switch the region.
Change the resource specification of the instance. You cannot modify the specification of a pending instance. You must manually stop the instance before you can change the specification.
Start the instance during off-peak hours, such as outside of business hours.
If none of these methods resolve the issue, contact your account manager for assistance.
The cluster resources are fully utilized. Please try later or other regions.
Create ECI failed because the specified instance is out of stock. It is recommended to use the multi-zone creation function to avoid the risk of stockout.
Cause: The specified computing resource is out of stock.
Solution: Try one of the following operations:
Switch the region.
Change the resource specification of the instance. You cannot modify the specification of a pending instance. You must manually stop the instance before you can change the specification.
Start the instance during off-peak hours, such as outside of business hours.
If none of these methods resolve the issue, contact your account manager for assistance.
back-off 10s restarting failed container=dsw-notebook pod
Cause: The system disk is full. You need to expand the system disk.
To check system disk usage:


Solution: Expand the system disk by clicking Change Configuration:

Important After the system disk is expanded, it is billed continuously, regardless of whether the instance is running. To stop all billing for the DSW instance, you must delete it. Before you delete the instance, make sure to back up all necessary data.
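If the instance is still reachable from a terminal or Notebook, you can also check how full a disk is from code before it causes a restart loop. The following is a minimal sketch that uses the standard library. The function name disk_usage_percent is hypothetical, and treating "/" as the system disk is an assumption. Pass a different path if your files live elsewhere.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the used share of the file system containing `path`, in percent."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total
```

For example, print(f"{disk_usage_percent():.1f}% used") gives a quick reading. The value can differ slightly from the percentage that df reports because df accounts for reserved blocks.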
the available zone with vSwitch is out of stock
Cause: A VPC was configured when the DSW instance was created. Because the vSwitch in the VPC has a zone attribute, configuring the vSwitch limits the search for computing resources to that zone, which can lead to resource shortages.
Solution: Change the configuration of the DSW instance to leave the VPC field empty.

Note To use a VPC, we recommend switching to another zone and creating a new vSwitch and DSW instance. This expands the range of available resources and helps avoid stock shortages.
Startup failed with the message "Workspace member not found"
This error indicates that the account you are using is not a member of the target workspace. Contact the workspace administrator to add your account as a member.
Other reasons for startup failure:
Instance creation fails due to overdue payments
If your account has an overdue payment, you cannot create a DSW instance. Vouchers cannot be used to offset the overdue amount. You can log on to the Expenses and Costs console to check for overdue payments.
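The error messages and solutions listed above can be condensed into a small lookup for use in personal notes or tooling. The following sketch matches on fragments of the messages quoted in this topic. The names HINTS and suggest_fix are hypothetical, and matching on substrings is an assumption, because the exact wording shown on the Events tab may vary.

```python
# (message fragment, suggested action) pairs taken from the list above.
# Order matters: more specific fragments come first.
HINTS = [
    ("vswitch is out of stock",
     "Leave the VPC field empty, or use a vSwitch in another zone."),
    ("out of stock",
     "Switch the region, change the specification, or start off-peak."),
    ("is not enough currently",
     "Try again later, or switch to a different resource type or region."),
    ("exceeded the default limitation",
     "Submit a ticket to increase the quota."),
    ("restarting failed container",
     "The system disk is full. Expand the system disk."),
    ("workspace member not found",
     "Ask the workspace administrator to add your account as a member."),
]


def suggest_fix(message):
    """Return the first matching suggestion, or None if nothing matches."""
    lowered = message.lower()
    for fragment, action in HINTS:
        if fragment in lowered:
            return action
    return None
```

A return value of None means that the message is not covered here. In that case, check the Events tab for details.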
2. Can I execute a Python file when a DSW instance starts?
No. DSW does not currently support executing Python files on startup.
3. Cannot find the DSW instance?
If you cannot find your instance, try switching to different regions and workspaces.

4. What should I do if the DSW page is abnormal or cannot be operated?
If you encounter a blank page, a Notebook that is stuck loading, or a Terminal that does not accept commands, the issue is usually related to your local environment. Try the following troubleshooting steps:
Clear your browser cache and try again.
Use your browser's incognito or private mode to access the page.
Change your network environment. For example, switch from your company's internal network to a mobile hotspot to rule out firewall restrictions.
Try using a different browser, such as Chrome or Firefox.
5. Will data be lost if I stop, restart, change the specification, or change the image of a DSW instance?
Stopping or restarting an instance: No data is lost. After an instance is stopped or restarted, all packages installed using pip, code files, and other data stored on the instance disk are retained.
Changing the instance specification: No data is lost. Adjusting the instance specification, such as the CPU, memory, or GPU, does not affect the disk data.
Changing the instance image: Some data might be lost. Changing the image does not affect mounted datasets or data in OSS, but the content on the system disk might be reset. Therefore, when you change the instance image, you must save your instance data. For example, you can copy or move the data to a dataset or to OSS. For more information, see Mount a dataset, OSS bucket, NAS file system, or CPFS file system.
Instance stop/delete/release
1. How do I release a DSW instance?
On the DSW instance list page, click Stop or Delete for the target instance.

Note: If the system disk was expanded when the DSW instance was created, the system disk is billed continuously, regardless of whether the instance is running. To stop all billing for the DSW instance, you must delete the instance.
2. Why can't I find my DSW instance?
If you cannot find your instance, try switching to different regions and workspaces.

3. How do I release a free trial resource plan?
You do not need to stop or delete free trial resource plans.
4. How do I completely stop billing for a DSW instance? What is the difference between the "Stop" and "Delete" operations?
Stop instance: This operation releases the instance's computing resources (CPU and GPU) and pauses billing for them. Note: The expanded system disk continues to be billed.
Delete instance: This operation permanently deletes the instance and all its resources, including the system disk. All related billing stops completely.
When to choose which operation:
Stop: If you are not using the instance temporarily but want to keep the data and environment for future restarts.
Delete: If you no longer need the instance and want to stop all billing. Back up your data before you perform this operation.
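The difference between the two operations can be summarized as a small function. The following sketch models only the rules stated in this answer. The function name billed_after and the item label are hypothetical, and charges for other attached resources, such as storage that you mounted, are out of scope.

```python
def billed_after(action, disk_expanded):
    """List what keeps incurring charges after stopping or deleting an instance."""
    if action == "delete":
        # Deleting releases everything, including the system disk.
        return []
    if action == "stop":
        # Stopping pauses compute billing; an expanded disk is still billed.
        return ["expanded system disk"] if disk_expanded else []
    raise ValueError(f"Unknown action: {action!r}")
```

For example, billed_after("stop", disk_expanded=True) returns a one-item list, which is why deleting is the only way to stop all charges for an instance with an expanded disk.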
5. Why is my DSW instance stuck in the "Stopping" or "Deleting" state and the operation cannot be completed?
Stopping or deleting an instance takes time because the system needs to safely terminate tasks, save the instance state, and reclaim resources. If an instance is unresponsive for a long time, the common reasons include the following:
If you encounter this situation, wait for a few minutes and then refresh the page. The instance should then show a normal stopped status.
6. Will my data and code be lost after stopping or deleting a DSW instance?
Whether data is retained depends on the operation and the resource group type of the instance.
Stop instance:
Data retention policies vary by resource group type.
Public resource group instance: Data is retained on the mounted disk. Note: If the instance is stopped for more than 15 consecutive days, the disk and its data are deleted.
Dedicated resource group instance: Data is stored on the instance's system disk. Stopping the instance deletes the data, and it cannot be recovered.
Delete instance:
All data on the system disk is permanently erased and cannot be recovered. Therefore, you must back up all important data before you delete the instance.
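The retention rules above can be expressed as a single check. The following sketch combines them with the system disk warning in the instance creation section, which states that an expanded disk is exempt from the 15-day rule. The function name system_disk_data_kept and its parameters are hypothetical, and the sketch covers only data on the system disk, not mounted datasets or storage paths.

```python
def system_disk_data_kept(action, resource_group, days_stopped=0,
                          disk_expanded=False):
    """Return True if system disk data is expected to survive the operation."""
    if action == "delete":
        return False  # deletion always erases the system disk
    if action != "stop":
        raise ValueError(f"Unknown action: {action!r}")
    if resource_group == "dedicated":
        return False  # stopping a dedicated-group instance deletes the data
    if resource_group == "public":
        # A free-quota disk is cleared after more than 15 days stopped;
        # an expanded disk is not subject to that limit.
        return disk_expanded or days_stopped <= 15
    raise ValueError(f"Unknown resource group: {resource_group!r}")
```

Treat the result as a reminder of the documented policy, not as a guarantee. Back up important data to a dataset or OSS in any case.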
7. Why does my running DSW instance stop automatically?
The instance is configured with an idle auto-shutdown policy. This policy is designed to save resources and is enabled by default for free trial instances.
8. I have stopped or deleted all my DSW instances, so why does it still show "Running" or why do I receive billing notifications?
This may be due to one of the following common reasons:
You may be confusing resource plans with instances. The 'Running' status that you see might refer to a resource plan, such as '250 billable hours per month', not an instance. A resource plan is always active during its validity period, and its status is independent of any instance.
The expanded system disk is still being billed. Stopping an instance only pauses billing for computing resources. An expanded system disk continues to incur storage fees.
There is a delay in billing. Billing is not in real-time, and a bill might be generated several hours after you use a resource. For example, charges incurred in the morning might not appear on the bill until the afternoon.