doc-exports/docs/modelarts/umn/modelarts_trouble_0141.html
Lai, Weijian 6aa966a79a ModelArts UMN 24.3.0 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: Lai, Weijian <laiweijian4@huawei.com>
Co-committed-by: Lai, Weijian <laiweijian4@huawei.com>
2024-11-02 09:04:52 +00:00

System Container Exits Unexpectedly

Symptom

After a training job is created, the system container exits unexpectedly.

Possible Causes

The possible causes are as follows:

  1. An error occurred in OBS.
    1. Unavailable file: The specified key does not exist.
    2. Insufficient OBS permissions
    3. OBS traffic limiting
    4. Others
  2. The disk space is insufficient.
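
To check whether the second cause applies, free space can be inspected from Python. The following is a minimal sketch that uses only the standard library. The function name `free_space_report`, the default path, and the 1 GiB threshold are illustrative assumptions, not ModelArts settings; point it at whichever directory the training job writes to.

```python
import shutil


def free_space_report(path=".", min_free_gib=1.0):
    """Return (free_gib, is_low) for the filesystem that contains `path`.

    `min_free_gib` is a hypothetical threshold chosen for illustration.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    free_gib = usage.free / 1024 ** 3
    return free_gib, free_gib < min_free_gib


if __name__ == "__main__":
    free_gib, is_low = free_space_report(".")
    print(f"free: {free_gib:.2f} GiB, low: {is_low}")
```

Running the check before large copies or checkpoint writes shows whether space is running out before the container exits.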

Solution

  1. For an OBS error:
    1. Unavailable file: The specified key does not exist.

      For details, see Error Message "errorMessage:The specified key does not exist" Displayed in Logs.

    2. Insufficient OBS permissions

      For details, see What Should I Do If Error "stat:403 reason:Forbidden" Is Displayed in Logs When a Training Job Accesses OBS.

    3. OBS traffic limiting

      For details, see Error Message "BrokenPipeError: Broken pipe" Displayed When OBS Data Is Copied.

    4. Others

      Collect the request ID and contact OBS customer service.

  2. For insufficient disk space:

    For details, see Common Issues Related to Insufficient Disk Space and Solutions.
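
The OBS causes above can be told apart by the message that appears in the training log. The following is a minimal sketch, assuming the log is available as a list of text lines. The three signatures are the messages quoted in the related topics above; the function name `classify_log`, the cause labels, and the sample lines are hypothetical, and real log formatting may differ. Insufficient disk space is not covered because this section does not quote a log message for it.

```python
# Log signatures quoted in the related topics, mapped to the cause they indicate.
SIGNATURES = [
    ("The specified key does not exist", "OBS: unavailable file"),
    ("stat:403 reason:Forbidden", "OBS: insufficient permissions"),
    ("BrokenPipeError: Broken pipe", "OBS: traffic limiting"),
]


def classify_log(lines):
    """Return the matching causes in order of first appearance, without duplicates."""
    found = []
    for line in lines:
        for needle, cause in SIGNATURES:
            if needle in line and cause not in found:
                found.append(cause)
    return found


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = [
        "INFO training started",
        "ERROR errorMessage:The specified key does not exist",
        "ERROR stat:403 reason:Forbidden",
    ]
    for cause in classify_log(sample):
        print(cause)
```

If no signature matches and the failure still points to OBS, the "Others" case applies.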