In EAS, you can define and deploy online services using a JSON configuration file. After you prepare the JSON configuration file, you can deploy the service using the EAS console, the EASCMD client, or an SDK.
Prepare a JSON configuration file
To deploy a service, create a JSON file that contains all the required configurations. If you are a first-time user, you can specify the basic configurations on the service deployment page in the console. The system automatically generates the corresponding JSON content, which you can then modify and extend.
The following code shows an example of a service.json file. For more information about the parameters and their descriptions, see Appendix: JSON parameter description.
{
"cloud": {
"computing": {
"instances": [
{
"type": "ecs.c7a.large"
}
]
}
},
"containers": [
{
"image": "****-registry.cn-beijing.cr.aliyuncs.com/***/***:latest",
"port": 8000,
"script": "python app.py"
}
],
"metadata": {
"cpu": 2,
"instance": 1,
"memory": 4000,
"name": "demo"
}
}
Deploy a service using a JSON file
Console
Log on to the PAI console. Select a region at the top of the page. Then, select the desired workspace and click Enter Elastic Algorithm Service (EAS).
On the Inference Service tab, click Deploy Service. On the Deploy Service page, select the JSON-based deployment method, paste the JSON configuration, and click Deploy. Wait for the service status to change to Running, which indicates that the service is deployed.
EASCMD
The EASCMD client tool is used to manage model services on your server. You can use it to create, view, delete, and update services; the sketch below lists the corresponding subcommands. The following procedure shows how to deploy a service using the EASCMD client on a 64-bit Linux system.
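As a quick reference for these operations, the following sketch shows the commonly documented subcommands. The service and file names are placeholders taken from the examples in this topic, the client must be authenticated first (see the steps below), and the Command reference remains the authoritative source for syntax.
eascmd64 create service.json          # deploy a service from a JSON file
eascmd64 desc demo                    # view the status and configuration of the service named demo
eascmd64 modify demo -s service.json  # update the service with a modified JSON file
eascmd64 delete demo                  # delete the service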
Download and authenticate the client
If you use a Data Science Workshop (DSW) development environment and an official image, the EASCMD client is pre-installed at /etc/dsw/eascmd64. Otherwise, you must download and authenticate the client.
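A minimal sketch of this step on a 64-bit Linux system follows. The download URL and the endpoint are assumptions based on commonly documented values; replace them with the current download link and the EAS endpoint of your region.
# Download the 64-bit Linux client and make it executable
# (the URL is an assumption; see the official download page for the current link).
wget https://eas-data.oss-cn-shanghai.aliyuncs.com/tools/eascmd64
chmod +x eascmd64
# Authenticate with your AccessKey pair and the EAS endpoint of your region
# (the endpoint below is an example for China (Shanghai)).
./eascmd64 config -i <AccessKeyId> -k <AccessKeySecret> -e pai-eas.cn-shanghai.aliyuncs.com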
Run the deployment command
In the directory that contains the JSON file, run the following command to deploy the service. For more information about the available operations, see Command reference.
eascmd64 create <service.json>
Replace <service.json> with the name of your JSON file.
Note: If you use a DSW development environment and need to upload the JSON configuration file, see Upload and download files.
The system returns a result similar to the following code.
[RequestId]: 1651567F-8F8D-4A2B-933D-F8D3E2DD****
+-------------------+----------------------------------------------------------------------------------+
| Intranet Endpoint | http://166233998075****.cn-shanghai.pai-eas.aliyuncs.com/api/predict/test_eascmd |
| Token             | YjhjOWQ2ZjNkYzdiYjEzMDZjOGEyNGY5MDIxMzczZWUzNGEyMzhi****                         |
+-------------------+----------------------------------------------------------------------------------+
[OK] Creating api gateway
[OK] Building image [registry-vpc.cn-shanghai.aliyuncs.com/eas/test_eascmd_cn-shanghai:v0.0.1-20221122114614]
[OK] Pushing image [registry-vpc.cn-shanghai.aliyuncs.com/eas/test_eascmd_cn-shanghai:v0.0.1-20221122114614]
[OK] Waiting [Total: 1, Pending: 1, Running: 0]
[OK] Waiting [Total: 1, Pending: 1, Running: 0]
[OK] Service is running
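After the service status changes to Running, you can send a test request to the returned endpoint, passing the token in the Authorization header. The following minimal sketch reuses the endpoint and token from the output above; the request payload depends on the processor or image that serves your model.
curl http://166233998075****.cn-shanghai.pai-eas.aliyuncs.com/api/predict/test_eascmd \
  -H 'Authorization: YjhjOWQ2ZjNkYzdiYjEzMDZjOGEyNGY5MDIxMzczZWUzNGEyMzhi****' \
  -d '<request payload expected by your model>'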
Appendix: JSON parameter description
Parameter | Required | Description |
name | Yes | The service name. The name must be unique within a region. |
token | No | The token string for access authentication. If you do not specify this parameter, the system automatically generates a token. |
model_path | Yes | This parameter is required when you deploy a service using a processor. model_path and processor_path specify the source data addresses of the model and the processor, respectively. Supported address formats include an OSS path, such as oss://examplebucket/exampledir/, and a publicly accessible HTTP URL of a compressed package, such as a .tar.gz or .zip file. |
oss_endpoint | No | The OSS endpoint. Example: oss-cn-beijing.aliyuncs.com. For other values, see OSS regions and endpoints. Note: By default, you do not need to specify this parameter. The system uses the internal OSS endpoint of the current region to download model files or processor files. You must specify this parameter for cross-region access to OSS. For example, if you deploy a service in the China (Hangzhou) region and specify an OSS address in the China (Beijing) region for model_path, you must use this parameter to specify the public endpoint of OSS in the China (Beijing) region, as in the sketch after this table. |
model_entry | No | The entry file of the model, which can be any file in the model package. If you do not specify this parameter, the file name in model_path is used. The path of the main file is passed to the initialize() function in the processor. |
model_config | No | The model configuration. Any text is supported. The parameter value is passed to the second parameter of the initialize() function in the processor. |
processor | No | The processor used to run the model service. If you use a built-in processor, set this parameter to the processor code, such as tensorflow_cpu_1.12 in the appendix example. If you use a custom processor, specify processor_path, processor_entry, and processor_type instead. |
processor_path | No | The path of the file package related to the processor. For more information, see the description of the model_path parameter. |
processor_entry | No | The main file of the processor. Examples: libprocessor.so or app.py. The file contains the implementation of the processor. This parameter is required when processor_type is set to cpp or python. |
processor_mainclass | No | The main class of the processor in the JAR package. Example: com.aliyun.TestProcessor. This parameter is required when processor_type is set to java. |
processor_type | No | The language in which the processor is implemented. Valid values: cpp, java, and python. |
warm_up_data_path | No | The path of the request file used for model prefetch. For more information about the model prefetch feature, see Prefetch a model service. |
runtime.enable_crash_block | No | Specifies whether a service instance automatically restarts after it crashes due to an exception in the processor code. Valid values: true: the instance does not automatically restart after a crash, which facilitates debugging. false (default): the instance automatically restarts after a crash. |
cloud | No | For more information, see Table 1. cloud parameter description. |
autoscaler | No | The configuration for automatic scaling of the model service. For more information about parameter configurations, see Auto scaling. |
containers | No | For more information, see Table 2. containers parameter description. |
dockerAuth | No | If the image is from a private repository, you must configure dockerAuth. The value is the Base64-encoded string of username:password for the image repository. |
storage | No | The information about service storage mounts. For detailed configuration instructions, see Storage configuration. |
metadata | Yes | The metadata of the service. For more information about parameter configurations, see Table 3. metadata parameter description. |
features | No | The configuration of special features for the service. For more information about parameter configurations, see Table 4. features parameter description. |
networking | No | The call configuration of the service. For more information about parameter configurations, see Table 5. networking parameter description. |
labels | No | The tags for the EAS service. The format is key-value pairs. Example: "labels": {"key1": "value1"}. |
unit.size | No | The number of machines deployed for a single instance in a distributed inference configuration. The default value is 2. |
sinker | No | You can persist all requests and responses of a service to MaxCompute or Simple Log Service (SLS). For a MaxCompute configuration example, see the sinker parameter in the appendix JSON configuration example. For more information about parameter configurations, see Table 6. sinker parameter description. |
confidential | No | By configuring the system trust management service, you can ensure that information such as data, models, and code is securely encrypted during service deployment and invocation. This implements a secure and verifiable inference service. The configuration contains the trustee_endpoint and decryption_key parameters, as shown in the appendix JSON configuration example. Note: The secure encryption environment mainly applies to your mounted storage files. Mount the storage files before you enable this feature. |
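To make the cross-region oss_endpoint scenario in the table concrete, the following minimal sketch pulls the model from OSS in the China (Beijing) region while the service itself runs in another region. The service name is a hypothetical placeholder; the model path, endpoint, and processor values are reused from the examples in this topic.
{
  "name": "demo_cross_region",
  "processor": "tensorflow_cpu_1.12",
  "model_path": "oss://examplebucket/exampledir/",
  "oss_endpoint": "oss-cn-beijing.aliyuncs.com",
  "metadata": {
    "instance": 1,
    "cpu": 2,
    "memory": 4000
  }
}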
Appendix: JSON configuration example
The following code provides an example of a JSON configuration file that uses the parameters described in this topic:
{
"name": "test_eascmd",
"token": "****M5Mjk0NDZhM2EwYzUzOGE0OGMx****",
"processor": "tensorflow_cpu_1.12",
"model_path": "oss://examplebucket/exampledir/",
"oss_endpoint": "oss-cn-beijing.aliyuncs.com",
"model_entry": "",
"model_config": "",
"processor_path": "",
"processor_entry": "",
"processor_mainclass": "",
"processor_type": "",
"warm_up_data_path": "",
"runtime": {
"enable_crash_block": false
},
"unit": {
"size": 2
},
"sinker": {
"type": "maxcompute",
"config": {
"maxcompute": {
"project": "cl****",
"table": "te****"
}
}
},
"cloud": {
"computing": {
"instances": [
{
"capacity": 800,
"type": "dedicated_resource"
},
{
"capacity": 200,
"type": "ecs.c7.4xlarge",
"spot_price_limit": 3.6
}
],
"disable_spot_protection_period": true
},
"networking": {
"vpc_id": "vpc-bp1oll7xawovg9t8****",
"vswitch_id": "vsw-bp1jjgkw51nsca1e****",
"security_group_id": "sg-bp1ej061cnyfn0b****"
}
},
"autoscaler": {
"min": 2,
"max": 5,
"strategies": {
"qps": 10
}
},
"storage": [
{
"mount_path": "/data_oss",
"oss": {
"endpoint": "oss-cn-shanghai-internal.aliyuncs.com",
"path": "oss://bucket/path/"
}
}
],
"confidential": {
"trustee_endpoint": "xx",
"decryption_key": "xx"
},
"metadata": {
"resource": "eas-r-9lkbl2jvdm0puv****",
"instance": 1,
"workspace_id": 1405**,
"gpu": 0,
"cpu": 1,
"memory": 2000,
"gpu_memory": 10,
"gpu_core_percentage": 10,
"qos": "",
"cuda": "11.2",
"enable_grpc": false,
"enable_webservice": false,
"rdma": 1,
"rpc": {
"batching": false,
"keepalive": 5000,
"io_threads": 4,
"max_batch_size": 16,
"max_batch_timeout": 50,
"max_queue_size": 64,
"worker_threads": 5,
"rate_limit": 0,
"enable_sigterm": false
},
"rolling_strategy": {
"max_surge": 1,
"max_unavailable": 1
},
"eas.termination_grace_period": 30,
"scheduling": {
"spread": {
"policy": "host"
}
},
"resource_rebalancing": false,
"workload_type": "elasticjob",
"shm_size": 100
},
"features": {
"eas.aliyun.com/extra-ephemeral-storage": "100Gi",
"eas.aliyun.com/gpu-driver-version": "tesla=550.127.08"
},
"networking": {
"gateway": "gw-m2vkzbpixm7mo****"
},
"containers": [
{
"image": "registry-vpc.cn-shanghai.aliyuncs.com/xxx/yyy:zzz",
"prepare": {
"pythonRequirements": [
"numpy==1.16.4",
"absl-py==0.11.0"
]
},
"command": "python app.py",
"port": 8000
}
],
"dockerAuth": "dGVzdGNhbzoxM*******"
}