The built-in algorithms provided by ModelArts support image classification, object detection, and image semantic segmentation. Dataset requirements differ by algorithm type. Before using a built-in algorithm to create a training job, prepare a dataset that meets the requirements of that algorithm.
For image classification, the training dataset must be stored in an OBS bucket. The OBS path structure of the dataset is as follows:
|-- data_url
    |--a.jpg
    |--a.txt
    |--b.jpg
    |--b.txt
    ...
Each .txt file is the label file of the image with the same name. For example, a.txt contains:
cat
|-- data_url
    |--cat
        |--a.jpg
        |--a.txt
    |--dog
        |--b.jpg
        |--b.txt
    ...
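The image and label pairing in the layouts above can be checked before the data is uploaded. The following is a minimal sketch, not a ModelArts API: it assumes the dataset sits in a local directory that mirrors data_url, that images use the .jpg extension, and that each .txt file is the label for the image with the same name. The function name find_unlabeled_images is hypothetical.

```python
from pathlib import Path

# Assumption: only the .jpg extension shown in the layout is checked.
IMAGE_SUFFIXES = {".jpg"}


def find_unlabeled_images(data_dir):
    """Return relative paths of images that have no same-named .txt file."""
    missing = []
    # rglob walks subdirectories too, so flat and per-class layouts both work.
    for path in sorted(Path(data_dir).rglob("*")):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if not path.with_suffix(".txt").is_file():
            missing.append(path.relative_to(data_dir).as_posix())
    return missing
```

An empty result means every image found has a label file next to it. The check says nothing about whether the label text itself is valid.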
For object detection, the training dataset must be stored in an OBS bucket. The OBS path structure of the dataset is as follows:
|-- data_url
    |--a.jpg
    |--a.xml
    |--b.jpg
    |--b.xml
    ...
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<annotation>
    <folder>Images</folder>
    <filename>IMG_20180919_120022.jpg</filename>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>800</width>
        <height>600</height>
        <depth>1</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>yunbao</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>216.00</xmin>
            <ymin>108.00</ymin>
            <xmax>705.00</xmax>
            <ymax>488.00</ymax>
        </bndbox>
    </object>
</annotation>
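An annotation file of this shape can be read with the Python standard library to confirm that labels and bounding boxes parse as expected. The following is a minimal sketch that assumes every object element contains a name and a bndbox with the four coordinate tags shown above. The function name read_boxes is hypothetical and not part of ModelArts.

```python
import xml.etree.ElementTree as ET


def read_boxes(xml_data):
    """Return (label, xmin, ymin, xmax, ymax) tuples from one annotation.

    xml_data is the content of a single .xml file, as bytes or str.
    """
    root = ET.fromstring(xml_data)
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # Coordinates are written as decimals (for example 216.00), so parse as float.
        coords = [float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return boxes
```

For the example annotation above, the function would return a single tuple with the label yunbao and its four coordinates.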
For image semantic segmentation, the training dataset must be stored in an OBS bucket. The OBS path structure of the dataset is as follows:
|-- data_url
    |--Image
        |--a.jpg
        |--b.jpg
        ...
    |--Label
        |--a.jpg
        |--b.jpg
        ...
    |--train.txt
    |--val.txt
Description:
In each list file (train.txt and val.txt), every line gives the relative path of an image and the relative path of its label, separated by spaces. Each image and label pair is on its own line. The following is an example:
Image/a.jpg Label/a.jpg
Image/b.jpg Label/b.jpg
...
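The list file format described above can be parsed in a few lines, which is useful for checking a file before a training job is created. The following is a minimal sketch that assumes the paths themselves contain no spaces, since a space is the separator. The function name parse_list_file is hypothetical and not part of ModelArts.

```python
def parse_list_file(text):
    """Split list-file text into (image_path, label_path) pairs."""
    pairs = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines, such as a trailing newline
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 paths, got {len(parts)}"
            )
        pairs.append((parts[0], parts[1]))
    return pairs
```

A line with a missing label path, or with a path that contains a space, raises ValueError with the offending line number.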