This API obtains the versions of a specified training job based on the job ID.
GET /v1/{project_id}/training-jobs/{job_id}/versions

Table 1 URI parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| project_id | Yes | String | Project ID. For details about how to obtain a project ID, see Obtaining a Project ID and Name. |
| job_id | Yes | Long | ID of a training job |

Table 2 Query parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| per_page | No | Integer | Number of job version records displayed on each page. The value range is [1, 1000]. Default value: 10 |
| page | No | Integer | Index of the page to be queried |
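
The URI and query parameters above can be combined into a request URL as in the following minimal sketch. The helper name `build_versions_url` and the default `page=1` are assumptions made for illustration (the table gives no default for `page`); only the `[1, 1000]` range check on `per_page` comes from the table.

```python
from urllib.parse import urlencode


def build_versions_url(endpoint, project_id, job_id, per_page=10, page=1):
    """Build the URL that lists the versions of one training job.

    build_versions_url is a hypothetical helper; page=1 is an assumed default.
    """
    # The documented value range for per_page is [1, 1000].
    if not 1 <= per_page <= 1000:
        raise ValueError("per_page must be in the range [1, 1000]")
    query = urlencode({"per_page": per_page, "page": page})
    return (
        f"https://{endpoint}/v1/{project_id}"
        f"/training-jobs/{job_id}/versions?{query}"
    )
```

Calling `build_versions_url("endpoint", "{project_id}", 10, per_page=5)` would percent-encode the braces, so pass a real project ID rather than the placeholder.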

Request body: None

Table 3 Response parameters

| Parameter | Type | Description |
|---|---|---|
| is_success | Boolean | Whether the request is successful |
| error_message | String | Error message of a failed API call. This parameter is not included when the API call succeeds. |
| error_code | String | Error code of a failed API call. For details, see Error Codes. This parameter is not included when the API call succeeds. |
| job_id | Long | ID of a training job |
| job_name | String | Name of a training job |
| job_desc | String | Description of a training job |
| version_count | Long | Number of versions of a training job |
| versions | JSON Array | Version parameters of a training job. For details, see the sample response. For details about the attributes, see Table 4. |
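
Because the response carries both `version_count` and one page of `versions`, a client can page through the results until every version has been collected. The following is a minimal sketch under two assumptions: page numbering starts at 1 (as in the sample request), and `version_count` is the total across all pages. `fetch_page` is a hypothetical callable that sends the GET request and returns the decoded response body.

```python
def iter_versions(fetch_page, per_page=10):
    """Yield every version of a training job, one page at a time.

    fetch_page(page=..., per_page=...) is a hypothetical callable that
    returns the decoded JSON response body as a dict.
    """
    page = 1  # assumption: the first page has index 1
    seen = 0
    while True:
        body = fetch_page(page=page, per_page=per_page)
        versions = body.get("versions", [])
        yield from versions
        seen += len(versions)
        # Stop on an empty page or once version_count records were seen.
        if not versions or seen >= body.get("version_count", 0):
            break
        page += 1
```

Injecting `fetch_page` keeps the loop independent of any particular HTTP client or authentication scheme.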

Table 4 versions parameters

| Parameter | Type | Description |
|---|---|---|
| version_id | Long | Version ID of a training job |
| version_name | String | Version name of a training job |
| pre_version_id | Long | ID of the previous version of a training job |
| engine_type | Long | Engine type of a training job |
| engine_name | String | Name of the engine selected for a training job |
| engine_id | Long | ID of the engine selected for a training job |
| engine_version | String | Version of the engine selected for a training job |
| status | Integer | Status of a training job |
| app_url | String | Code directory of a training job |
| boot_file_url | String | Boot file of a training job |
| create_time | Long | Time when a training job is created |
| parameter | JSON Array | Running parameters of a training job. This parameter is a container environment variable when a training job uses a custom image. For details, see Table 5. |
| duration | Long | Training job running duration, in milliseconds |
| spec_id | Long | ID of the resource specifications selected for a training job |
| core | String | Number of cores of the resource specifications |
| cpu | String | CPU memory of the resource specifications |
| gpu | Boolean | Whether to use GPUs |
| gpu_num | Integer | Number of GPUs of the resource specifications |
| gpu_type | String | GPU type of the resource specifications |
| worker_server_num | Integer | Number of workers in a training job |
| data_url | String | Dataset path of a training job |
| train_url | String | OBS path of the training job output file |
| log_url | String | OBS URL of the logs of a training job. By default, this parameter is left blank. Example value: /usr/log/ |
| dataset_version_id | String | Dataset version ID of a training job |
| dataset_id | String | Dataset ID of a training job |
| data_source | JSON Array | Dataset of a training job. For details, see Table 6. |
| model_id | Long | Model ID of a training job |
| model_metric_list | String | Model metrics of a training job, returned as a JSON string. For details, see Table 7. |
| system_metric_list | String | System monitoring metrics of a training job, returned as a JSON string. For details, see Table 8. |
| user_image_url | String | SWR URL of a custom image used by a training job |
| user_command | String | Boot command used to start the container of a custom image of a training job |
| resource_id | String | Charged resource ID of a training job |
| dataset_name | String | Dataset name of a training job |
| start_time | Long | Training start time |
| volumes | JSON Array | Storage volume that can be used by a training job. For details, see Table 13. |
| dataset_version_name | String | Dataset version name of a training job |
| pool_name | String | Name of a resource pool |
| pool_id | String | ID of a resource pool |
| nas_mount_path | String | Local mount path of SFS Turbo (NAS). Example value: /home/work/nas |
| nas_share_addr | String | Shared path of SFS Turbo (NAS). Example value: 192.168.8.150:/ |
| nas_type | String | Only NFS is supported. Example value: nfs |
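
The time fields of a version entry are plain integers, so a client usually converts them before display. The sketch below assumes that `create_time` is a Unix timestamp in milliseconds (the table does not state the unit; the sample value is consistent with this) and uses the documented millisecond unit for `duration`. `describe_version_timing` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def describe_version_timing(version):
    """Return (creation time in UTC, running duration) for one version entry."""
    # Assumption: create_time is milliseconds since the Unix epoch.
    created = EPOCH + timedelta(milliseconds=version["create_time"])
    # duration is documented as milliseconds.
    duration = timedelta(milliseconds=version["duration"])
    return created, duration
```

Adding a `timedelta` to the epoch keeps the arithmetic in integers and avoids floating-point rounding of the millisecond part.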

Table 5 parameter parameters

| Parameter | Type | Description |
|---|---|---|
| label | String | Parameter name |
| value | String | Parameter value |

Table 6 data_source parameters

| Parameter | Type | Description |
|---|---|---|
| dataset_id | String | Dataset ID of a training job |
| dataset_version | String | Dataset version ID of a training job |
| type | String | Dataset type |
| data_url | String | OBS bucket path |

Table 7 model_metric_list parameters

| Parameter | Type | Description |
|---|---|---|
| metric | JSON Array | Validation metrics of a classification of a training job. |
| total_metric | JSON | Overall validation parameters of a training job. For details, see Table 11. |

Table 8 system_metric_list parameters

| Parameter | Type | Description |
|---|---|---|
| cpuUsage | Array | CPU usage of a training job |
| memUsage | Array | Memory usage of a training job |
| gpuUtil | Array | GPU usage of a training job |
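
In the sample response, `model_metric_list` and `system_metric_list` arrive as strings that themselves contain JSON, and the usage samples inside `system_metric_list` are strings such as "3.10". A client therefore needs a second decoding step. The following is a minimal sketch; `decode_metric_lists` is a hypothetical helper, and treating a missing or empty field as an empty object is an assumption.

```python
import json


def decode_metric_lists(version):
    """Decode the JSON strings held in one version entry.

    Returns (model_metrics, usage), where usage maps each documented
    system metric name to a list of floats.
    """
    # Assumption: a missing or empty field is treated as an empty object.
    model = json.loads(version.get("model_metric_list") or "{}")
    system = json.loads(version.get("system_metric_list") or "{}")
    # Samples are strings in the sample response; convert them to floats.
    usage = {
        key: [float(sample) for sample in system.get(key, [])]
        for key in ("cpuUsage", "memUsage", "gpuUtil")
    }
    return model, usage
```

With the decoded lists, a peak value is simply `max(usage["cpuUsage"])`, provided the list is not empty.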

Table 9 metric parameters

| Parameter | Type | Description |
|---|---|---|
| metric_values | JSON | Validation metrics of a classification of a training job. For details, see Table 10. |
| reserved_data | JSON | Reserved parameter |
| metric_meta | JSON | Classification of a training job, including the classification ID and name |

Table 10 metric_values parameters

| Parameter | Type | Description |
|---|---|---|
| recall | Float | Recall of a classification of a training job |
| precision | Float | Precision of a classification of a training job |
| accuracy | Float | Accuracy of a classification of a training job |

Table 11 total_metric parameters

| Parameter | Type | Description |
|---|---|---|
| total_metric_meta | JSON Array | Reserved parameter |
| total_reserved_data | JSON Array | Reserved parameter |
| total_metric_values | JSON Array | Overall validation metrics of a training job. For details, see Table 12. |

Table 12 total_metric_values parameters

| Parameter | Type | Description |
|---|---|---|
| f1_score | Float | F1 score of a training job. This parameter is used only by some preset algorithms and is automatically generated. It is for reference only. |
| recall | Float | Total recall of a training job |
| precision | Float | Total precision of a training job |
| accuracy | Float | Total accuracy of a training job |
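
For reference, the F1 score is conventionally the harmonic mean of precision and recall. The sketch below computes it from the `precision` and `recall` fields. Whether the service derives `f1_score` in exactly this way is an assumption, and `f1_from` is a hypothetical helper name.

```python
def f1_from(precision, recall):
    """Return the harmonic mean of precision and recall.

    Assumption: this is the conventional F1 definition, not necessarily
    the formula used by the preset algorithms.
    """
    total = precision + recall
    # Define F1 as 0.0 when both inputs are zero to avoid dividing by zero.
    if total == 0:
        return 0.0
    return 2 * precision * recall / total
```

This is useful mainly when a response omits `f1_score`, which the table says is generated only by some preset algorithms.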

Table 13 volumes parameters

| Parameter | Type | Description |
|---|---|---|
| nfs | object | Storage volume of the shared file system type. Only the training jobs running in a resource pool with the shared file system network connected support such storage volumes. For details, see Table 14. |
| host_path | object | Storage volume of the host file system type. Only training jobs running in a dedicated resource pool support such storage volumes. For details, see Table 15. |

Table 14 nfs parameters

| Parameter | Type | Description |
|---|---|---|
| id | String | ID of an SFS Turbo file system |
| src_path | String | Address of an SFS Turbo file system |
| dest_path | String | Local path to a training job |
| read_only | Boolean | Whether dest_path is read-only. The default value is false. |
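
Each entry of `volumes` holds either an `nfs` object or a `host_path` object, so a client has to check which key is present before reading the paths. The following is a minimal sketch; `list_mounts` is a hypothetical helper, and applying the `read_only` default of false to `host_path` entries as well is an assumption.

```python
def list_mounts(volumes):
    """Flatten the volumes array into (kind, src_path, dest_path, read_only)."""
    mounts = []
    for volume in volumes:
        for kind in ("nfs", "host_path"):
            spec = volume.get(kind)
            if spec is None:
                continue
            mounts.append(
                (
                    kind,
                    spec.get("src_path"),
                    spec.get("dest_path"),
                    # read_only defaults to false when it is absent.
                    spec.get("read_only", False),
                )
            )
    return mounts
```

For the `volumes` array in the sample response, this yields one `nfs` mount at /home/work/nas and one `host_path` mount at /home/mind.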
The following shows how to obtain the job version details on the first page when job_id is set to 10 and five records are displayed on each page.
GET https://endpoint/v1/{project_id}/training-jobs/10/versions?per_page=5&page=1
Sample response for a successful request:
{
  "is_success": true,
  "job_id": 10,
  "job_name": "testModelArtsJob",
  "job_desc": "testModelArtsJob desc",
  "version_count": 2,
  "versions": [
    {
      "version_id": 10,
      "version_name": "V0004",
      "pre_version_id": 5,
      "engine_type": 1,
      "engine_name": "TensorFlow",
      "engine_id": 1,
      "engine_version": "TF-1.4.0-python2.7",
      "status": 10,
      "app_url": "/usr/app/",
      "boot_file_url": "/usr/app/boot.py",
      "create_time": 1524189990635,
      "parameter": [
        {
          "label": "learning_rate",
          "value": 0.01
        }
      ],
      "duration": 532003,
      "spec_id": 1,
      "core": 2,
      "cpu": 8,
      "gpu": true,
      "gpu_num": 2,
      "gpu_type": "P100",
      "worker_server_num": 1,
      "data_url": "/usr/data/",
      "train_url": "/usr/train/",
      "log_url": "/usr/log/",
      "dataset_version_id": "2ff0d6ba-c480-45ae-be41-09a8369bfc90",
      "dataset_id": "38277e62-9e59-48f4-8d89-c8cf41622c24",
      "data_source": [
        {
          "type": "obs",
          "data_url": "/qianjiajun-test/minst/data/"
        }
      ],
      "user_image_url": "100.125.5.235:20202/jobmng/custom-cpu-base:1.0",
      "user_command": "bash -x /home/work/run_train.sh python /home/work/user-job-dir/app/mnist/mnist_softmax.py --data_url /home/work/user-job-dir/app/mnist_data",
      "model_id": 1,
      "model_metric_list": "{\"metric\":[{\"metric_values\":{\"recall\":0.005833,\"precision\":0.000178,\"accuracy\":0.000937},\"reserved_data\":{},\"metric_meta\":{\"class_name\":0,\"class_id\":0}}],\"total_metric\":{\"total_metric_meta\":{},\"total_reserved_data\":{},\"total_metric_values\":{\"recall\":0.005833,\"id\":0,\"precision\":0.000178,\"accuracy\":0.000937}}}",
      "system_metric_list": "{\"cpuUsage\":[\"0\",\"3.10\",\"5.76\",\"0\",\"0\",\"0\",\"0\"],\"memUsage\":[\"0\",\"0.77\",\"2.09\",\"0\",\"0\",\"0\",\"0\"],\"gpuUtil\":[\"0\",\"0.25\",\"0.88\",\"0\",\"0\",\"0\",\"0\"],\"gpuMemUsage\":[\"0\",\"0.65\",\"6.01\",\"0\",\"0\",\"0\",\"0\"],\"diskReadRate\":[\"0\",\"91811.07\",\"38846.63\",\"0\",\"0\",\"0\",\"0\"],\"diskWriteRate\":[\"0\",\"2.23\",\"0.94\",\"0\",\"0\",\"0\",\"0\"],\"recvBytesRate\":[\"0\",\"5770405.50\",\"2980077.75\",\"0\",\"0\",\"0\",\"0\"],\"sendBytesRate\":[\"0\",\"12607.17\",\"10487410.00\",\"0\",\"0\",\"0\",\"0\"],\"interval\":1}",
      "dataset_name": "dataset-test",
      "dataset_version_name": "dataset-version-test",
      "start_time": 1563172362000,
      "volumes": [
        {
          "nfs": {
            "id": "43b37236-9afa-4855-8174-32254b9562e7",
            "src_path": "192.168.8.150:/",
            "dest_path": "/home/work/nas",
            "read_only": false
          }
        },
        {
          "host_path": {
            "src_path": "/root/work",
            "dest_path": "/home/mind",
            "read_only": false
          }
        }
      ],
      "pool_id": "pool9928813f",
      "pool_name": "p100",
      "nas_mount_path": "/home/work/nas",
      "nas_share_addr": "192.168.8.150:/",
      "nas_type": "nfs"
    }
  ]
}
Sample response for a failed request:
{
  "is_success": false,
  "error_message": "Error string",
  "error_code": "ModelArts.0105"
}
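
Since `error_code` and `error_message` are present only when a call fails, a client can branch on `is_success` before reading `versions`. The following is a minimal sketch; `TrainingJobApiError` and `extract_versions` are hypothetical names, and treating a missing `is_success` field as a failure is an assumption.

```python
class TrainingJobApiError(Exception):
    """Raised when the response body reports a failed API call."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def extract_versions(body):
    """Return the versions list, or raise if the call did not succeed."""
    # Assumption: a body without is_success is treated as a failure.
    if not body.get("is_success"):
        raise TrainingJobApiError(
            body.get("error_code"), body.get("error_message")
        )
    return body.get("versions", [])
```

Keeping the error code on the exception lets callers look it up in Error Codes without parsing the message text.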
For details about the status code, see Status Code.