The built-in algorithms provided by ModelArts can be used for image classification, object detection, and image semantic segmentation. Dataset requirements differ by task, so prepare the dataset according to the requirements of the chosen algorithm before using it to create a training job.
For image classification, the training dataset must be stored in an OBS bucket. The following shows the OBS path structure of the dataset:
|-- data_url
    |--a.jpg
    |--a.txt
    |--b.jpg
    |--b.txt
    ...
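Each .txt label file contains the category of the image with the same name, for example:
cat
The image and label files can also be placed in subfolders: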
|-- data_url
    |--cat
        |--a.jpg
        |--a.txt
    |--dog
        |--b.jpg
        |--b.txt
    ...
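As a quick check before uploading to OBS, the image and label pairing shown above can be verified on a local copy of the dataset. The following is a minimal sketch, not part of ModelArts: the function name `collect_samples` is hypothetical, only `.jpg` images are considered, and each `.txt` file is assumed to hold a single label.

```python
from pathlib import Path


def collect_samples(data_dir):
    """Return (relative_image_path, label) pairs found under data_dir.

    Assumption: every image x.jpg is labeled by a sibling file x.txt
    whose whole content is one label. Images without a label file are skipped.
    """
    root = Path(data_dir)
    samples = []
    # rglob covers both the flat layout and the subfolder layout.
    for image in sorted(root.rglob("*.jpg")):
        label_file = image.with_suffix(".txt")
        if not label_file.is_file():
            continue  # unlabeled image, ignore it
        label = label_file.read_text(encoding="utf-8").strip()
        samples.append((image.relative_to(root).as_posix(), label))
    return samples
```

Calling `collect_samples("data_url")` on a local folder lists the pairs a training job would be expected to see, which makes missing label files easy to spot.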
For object detection, the training dataset must be stored in an OBS bucket, with each image accompanied by an .xml annotation file of the same name. The following shows the OBS path structure of the dataset:
|-- data_url
    |--a.jpg
    |--a.xml
    |--b.jpg
    |--b.xml
    ...
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<annotation>
    <folder>Images</folder>
    <filename>IMG_20180919_120022.jpg</filename>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>800</width>
        <height>600</height>
        <depth>1</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>yunbao</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>216.00</xmin>
            <ymin>108.00</ymin>
            <xmax>705.00</xmax>
            <ymax>488.00</ymax>
        </bndbox>
    </object>
</annotation>
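To show how the fields of the annotation file above fit together, the following sketch reads one such file with Python's standard `xml.etree.ElementTree` module. The function name `read_annotation` is hypothetical, and the code assumes every `object` element carries a `name` and a complete `bndbox`, as in the sample.

```python
import xml.etree.ElementTree as ET


def read_annotation(xml_path):
    """Return (filename, (width, height), boxes) from one annotation file.

    boxes is a list of (name, xmin, ymin, xmax, ymax) tuples.
    Assumption: the file follows the layout of the sample annotation.
    """
    root = ET.parse(xml_path).getroot()
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))

    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        # Coordinates are written as decimals (e.g. 216.00), so parse as float.
        xmin, ymin, xmax, ymax = (
            float(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"), xmin, ymin, xmax, ymax))
    return root.findtext("filename"), (width, height), boxes
```

A simple sanity check on the result is that each box lies inside the image, that is, `0 <= xmin < xmax <= width` and `0 <= ymin < ymax <= height`.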
For image semantic segmentation, the training dataset must be stored in an OBS bucket. The following shows the OBS path structure of the dataset:
|-- data_url
    |--Image
        |--a.jpg
        |--b.jpg
        ...
    |--Label
        |--a.jpg
        |--b.jpg
        ...
    |--train.txt
    |--val.txt
Description:
In each list file (train.txt and val.txt), every line describes one sample: the relative path of the image and the relative path of its label, separated by a space. Samples are separated by newline characters. The following gives an example:
Image/a.jpg Label/a.jpg
Image/b.jpg Label/b.jpg
...
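The list file format described above can be read with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `parse_list_file` is hypothetical, and paths are assumed to contain no spaces, since whitespace is the field separator.

```python
def parse_list_file(text):
    """Parse the content of a list file into (image_path, label_path) pairs.

    Assumption: each non-empty line holds exactly two whitespace-separated
    relative paths, the image first and its label second.
    """
    pairs = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # tolerate blank lines
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 paths, got {len(fields)}"
            )
        pairs.append((fields[0], fields[1]))
    return pairs
```

For instance, passing the content of train.txt returns one pair per sample, which can then be checked against the files actually present under Image and Label.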