doc-exports/docs/obs/umn/obs_03_0083.html
zhangyue 92e44874e7 OBS UMN DOC
Reviewed-by: Sabelnikov, Dmitriy <dmitriy.sabelnikov@t-systems.com>
Co-authored-by: zhangyue <zhangyue164@huawei.com>
Co-committed-by: zhangyue <zhangyue164@huawei.com>
2024-11-06 14:33:05 +00:00


<a name="obs_03_0083"></a><a name="obs_03_0083"></a>
<h1 class="topictitle1">Bucket Inventory Overview</h1>
<div id="body1547800638356"><p id="obs_03_0083__p1448944813228">The bucket inventory function periodically generates a list of the objects in a bucket together with their metadata. Inventories help you track the status of objects in the bucket.</p>
<p id="obs_03_0083__p153921010111">An inventory is a CSV file. Inventory files are automatically uploaded to the specified bucket.</p>
<p id="obs_03_0083__p209174330204">You can generate an inventory for all objects in a bucket or only for objects whose names share a prefix. You can also set the inventory generation interval and choose whether to list all object versions in the inventory file. The object metadata that you can include in an inventory covers the object size, last modification time, storage class, ETag, multipart upload status, encryption status, and replication status.</p>
<div class="section" id="obs_03_0083__section158213151418"><h4 class="sectiontitle">Constraints</h4><ul id="obs_03_0083__ul56961234171412"><li id="obs_03_0083__li10796191411720">A bucket can have a maximum of 10 inventory rules.</li><li id="obs_03_0083__li546946171514">The source bucket (for which the inventory is configured) and the destination bucket (that stores the generated inventory files) must belong to the same account.</li><li id="obs_03_0083__li1855663111518">The source and destination buckets must be in the same region.</li><li id="obs_03_0083__li249410358152">Inventory files must be in the CSV format.</li><li id="obs_03_0083__li10344114519156">OBS can generate inventory files for all objects in a bucket or a group of objects whose names begin with the same prefix.</li><li id="obs_03_0083__li158061615195219">If a bucket has multiple inventory rules, overlaps between the inventory rules are not allowed.<ul id="obs_03_0083__ul122461318115219"><li id="obs_03_0083__li1883713488313">If a bucket already has an inventory rule for the entire bucket, new inventory rules that filter objects by prefixes cannot be created. If you need an inventory rule that covers only a subset of objects in the bucket, delete the inventory rule configured for the entire bucket.</li><li id="obs_03_0083__li96451644175211">If an inventory rule that filters objects by a specified prefix already exists, you cannot create an inventory rule for the entire bucket. To create an inventory rule for the entire bucket, make sure that the bucket has no other inventory rules that filter objects by specified prefixes.</li><li id="obs_03_0083__li207632319715">If a bucket already has an inventory rule that filters objects by the object name prefix <strong id="obs_03_0083__b41999527153">ab</strong>, the filter of a new inventory rule cannot start with <strong id="obs_03_0083__b1520010525159">a</strong> or <strong id="obs_03_0083__b1020075281516">abc</strong>. 
To create such a rule, you need to first delete the existing inventory rule that conflicts with the rule you will create.</li></ul>
</li><li id="obs_03_0083__li16109064164">Bucket inventory files can be encrypted only in the SSE-KMS mode.</li><li id="obs_03_0083__li12600182018472">The destination bucket cannot have <span id="obs_03_0083__ph1865519323262">server-side encryption</span> enabled.</li></ul>
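<p>The overlap rule above can be sketched as a small check. This is only one reading of the constraint, assuming that two rules overlap when one prefix is a prefix of the other and that an empty prefix stands for a rule covering the entire bucket. The function names <strong>prefixes_overlap</strong> and <strong>can_add_rule</strong> are hypothetical and are not part of any OBS SDK.</p>

```python
def prefixes_overlap(existing: str, candidate: str) -> bool:
    """Return True if one prefix is a prefix of the other.

    Assumption: an empty string stands for a rule that covers the
    entire bucket, so it overlaps with every other prefix.
    """
    return existing.startswith(candidate) or candidate.startswith(existing)


def can_add_rule(existing_prefixes: list[str], candidate: str) -> bool:
    """Check a candidate prefix against the prefixes of all existing rules."""
    return not any(prefixes_overlap(p, candidate) for p in existing_prefixes)


print(can_add_rule(["ab"], "a"))    # False: "ab" starts with "a"
print(can_add_rule(["ab"], "abc"))  # False: "abc" starts with "ab"
print(can_add_rule(["ab"], "cd"))   # True: neither prefix contains the other
print(can_add_rule([""], "logs/"))  # False: a whole-bucket rule already exists
```

<p>A check like this can be run before you submit a new inventory rule, so that a conflicting rule is detected locally. OBS itself remains the authority on which rules are accepted.</p>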
</div>
<div class="section" id="obs_03_0083__section05556294353"><h4 class="sectiontitle">Content in an Inventory File</h4><p id="obs_03_0083__p552919224342"><a href="#obs_03_0083__table2291537413">Table 1</a> lists all possible metadata fields that an inventory file can contain.</p>
<div class="tablenoborder"><a name="obs_03_0083__table2291537413"></a><a name="table2291537413"></a><table cellpadding="4" cellspacing="0" summary="" id="obs_03_0083__table2291537413" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Object metadata fields allowed in an inventory file</caption><thead align="left"><tr id="obs_03_0083__row2029119304113"><th align="left" class="cellrowborder" valign="top" width="25.96%" id="mcps1.3.5.3.2.3.1.1"><p id="obs_03_0083__p52920314113">Metadata</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="74.03999999999999%" id="mcps1.3.5.3.2.3.1.2"><p id="obs_03_0083__p13292173184114">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="obs_03_0083__row1129215317419"><td class="cellrowborder" valign="top" width="25.96%" headers="mcps1.3.5.3.2.3.1.1 "><p id="obs_03_0083__p132929354117">Bucket</p>
</td>
<td class="cellrowborder" valign="top" width="74.03999999999999%" headers="mcps1.3.5.3.2.3.1.2 "><p id="obs_03_0083__p22921833418">Name of the source bucket</p>
</td>
</tr>
<tr id="obs_03_0083__row129215315412"><td class="cellrowborder" valign="top" width="25.96%" headers="mcps1.3.5.3.2.3.1.1 "><p id="obs_03_0083__p6292173154118">Key</p>
</td>
<td class="cellrowborder" valign="top" width="74.03999999999999%" headers="mcps1.3.5.3.2.3.1.2 "><p id="obs_03_0083__p149689407412">Name of an object. Each object in a bucket has a unique key. Object names in the inventory file are URL-encoded using UTF-8 and must be decoded before you can use them.</p>
</td>
</tr>
<tr id="obs_03_0083__row629220318416"><td class="cellrowborder" valign="top" width="25.96%" headers="mcps1.3.5.3.2.3.1.1 "><p id="obs_03_0083__p62921634416">VersionId</p>
</td>
<td class="cellrowborder" valign="top" width="74.03999999999999%" headers="mcps1.3.5.3.2.3.1.2 "><p id="obs_03_0083__p5292538411">Object version ID. This field is not included in the inventory file if <strong id="obs_03_0083__b103241351596">ObjectVersions</strong> in the inventory configuration is set to <strong id="obs_03_0083__b182471745418">Current version only</strong>.</p>
</td>
</tr>
<tr id="obs_03_0083__row8292431415"><td class="cellrowborder" valign="top" width="25.96%" headers="mcps1.3.5.3.2.3.1.1 "><p id="obs_03_0083__p142921139411">IsLatest</p>
</td>
<td class="cellrowborder" valign="top" width="74.03999999999999%" headers="mcps1.3.5.3.2.3.1.2 "><p id="obs_03_0083__p07938119421">This field is set to <strong id="obs_03_0083__b48107102010">True</strong> if the object version is the latest. This field is not included in the inventory file if <strong id="obs_03_0083__b1952119512311">ObjectVersions</strong> in the inventory configuration is set to <strong id="obs_03_0083__b352217517313">Current version only</strong>.</p>
</td>
</tr>
<tr id="obs_03_0083__row829273144115"><td class="cellrowborder" valign="top" width="25.96%" headers="mcps1.3.5.3.2.3.1.1 "><p id="obs_03_0083__p1629253144110">IsDeleteMarker</p>
</td>
<td class="cellrowborder" valign="top" width="74.03999999999999%" headers="mcps1.3.5.3.2.3.1.2 "><p id="obs_03_0083__p19292193144117">When versioning is enabled for the source bucket, deleting an object will create a new piece of object metadata and set <strong id="obs_03_0083__b326628162811">IsDeleteMarker</strong> of the metadata to <strong id="obs_03_0083__b1626628132819">true</strong>. This field is not included in the inventory file if <strong id="obs_03_0083__b1326142822819">ObjectVersions</strong> in the inventory configuration is set to <strong id="obs_03_0083__b826122819281">Current version only</strong>.</p>
</td>
</tr>
<tr id="obs_03_0083__row1529215312417"><td class="cellrowborder" valign="top" width="25.96%" headers="mcps1.3.5.3.2.3.1.1 "><p id="obs_03_0083__p162923354110">Size</p>
</td>
<td class="cellrowborder" valign="top" width="74.03999999999999%" headers="mcps1.3.5.3.2.3.1.2 "><p id="obs_03_0083__p1129223194117">Object size, in bytes</p>
</td>
</tr>
<tr id="obs_03_0083__row22921533418"><td class="cellrowborder" valign="top" width="25.96%" headers="mcps1.3.5.3.2.3.1.1 "><p id="obs_03_0083__p112920315417">LastModifiedDate</p>
</td>
<td class="cellrowborder" valign="top" width="74.03999999999999%" headers="mcps1.3.5.3.2.3.1.2 "><p id="obs_03_0083__p62928354117">Date when the object was created or last modified</p>
</td>
</tr>
<tr id="obs_03_0083__row1951034294315"><td class="cellrowborder" valign="top" width="25.96%" headers="mcps1.3.5.3.2.3.1.1 "><p id="obs_03_0083__p6511142114315">ETag</p>
</td>
<td class="cellrowborder" valign="top" width="74.03999999999999%" headers="mcps1.3.5.3.2.3.1.2 "><p id="obs_03_0083__p1551114284311">Hexadecimal MD5 digest of the object. The ETag uniquely identifies the object content and can be used to detect whether the content has changed. For example, if the ETag value is <strong id="obs_03_0083__b217119916532">A</strong> when an object is uploaded but <strong id="obs_03_0083__b1117113935310">B</strong> when the object is downloaded, the object content has changed.</p>
</td>
</tr>
<tr id="obs_03_0083__row195501402444"><td class="cellrowborder" valign="top" width="25.96%" headers="mcps1.3.5.3.2.3.1.1 "><p id="obs_03_0083__p16550107446">StorageClass</p>
</td>
<td class="cellrowborder" valign="top" width="74.03999999999999%" headers="mcps1.3.5.3.2.3.1.2 "><p id="obs_03_0083__p1755020134414">Storage class of an object</p>
</td>
</tr>
<tr id="obs_03_0083__row1117118584436"><td class="cellrowborder" valign="top" width="25.96%" headers="mcps1.3.5.3.2.3.1.1 "><p id="obs_03_0083__p1417145813430">IsMultipartUploaded</p>
</td>
<td class="cellrowborder" valign="top" width="74.03999999999999%" headers="mcps1.3.5.3.2.3.1.2 "><p id="obs_03_0083__p21712589430">Whether an object is uploaded using multipart upload</p>
</td>
</tr>
<tr id="obs_03_0083__row675205616431"><td class="cellrowborder" valign="top" width="25.96%" headers="mcps1.3.5.3.2.3.1.1 "><p id="obs_03_0083__p187515694311">ReplicationStatus</p>
</td>
<td class="cellrowborder" valign="top" width="74.03999999999999%" headers="mcps1.3.5.3.2.3.1.2 "><p id="obs_03_0083__p19436123854416">Cross-region replication status of an object</p>
</td>
</tr>
<tr id="obs_03_0083__row57801523439"><td class="cellrowborder" valign="top" width="25.96%" headers="mcps1.3.5.3.2.3.1.1 "><p id="obs_03_0083__p19780852174320">EncryptionStatus</p>
</td>
<td class="cellrowborder" valign="top" width="74.03999999999999%" headers="mcps1.3.5.3.2.3.1.2 "><p id="obs_03_0083__p16780145234315">Encryption status of an object</p>
</td>
</tr>
</tbody>
</table>
</div>
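<p>Because object names in the <strong>Key</strong> column are URL-encoded, they need to be decoded when an inventory file is read. The following is a minimal sketch, assuming that the CSV file has no header row and that its columns follow the order given in <strong>COLUMNS</strong>. The column subset, the sample rows, and the date format in them are made up for illustration.</p>

```python
import csv
import io
from urllib.parse import unquote

# Hypothetical column order; use the order reported for your own inventory.
COLUMNS = ["Bucket", "Key", "Size", "LastModifiedDate"]

# Made-up sample rows, for illustration only.
SAMPLE = (
    '"user001","photos%2F2019%2Fcaf%C3%A9.jpg","1024","2019-01-03T12:00:00.000Z"\n'
    '"user001","docs%2Freport%20final.txt","2048","2019-01-02T08:30:00.000Z"\n'
)


def read_inventory(text: str) -> list[dict]:
    """Parse headerless inventory CSV text and URL-decode each object key."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        row = dict(zip(COLUMNS, record))
        # Keys are URL-encoded using UTF-8 and must be decoded before use.
        row["Key"] = unquote(row["Key"], encoding="utf-8")
        row["Size"] = int(row["Size"])
        rows.append(row)
    return rows


for row in read_inventory(SAMPLE):
    print(row["Key"], row["Size"])
```

<p>Decoding is done with the standard <strong>urllib.parse.unquote</strong> function, which restores characters such as slashes, spaces, and non-ASCII letters in object names.</p>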
</div>
<div class="section" id="obs_03_0083__section14172191411423"><a name="obs_03_0083__section14172191411423"></a><a name="section14172191411423"></a><h4 class="sectiontitle">Inventory File Name</h4><p id="obs_03_0083__p13914676286">The name of an inventory file is in the following format:</p>
<pre class="screen" id="obs_03_0083__screen295812410458">destinationPrefix/sourceBucketName/inventoryId/yyyy-MM-dd'T'HH-mm'Z'/files/UUID_index.csv</pre>
<ul id="obs_03_0083__ul1947285194914"><li id="obs_03_0083__li14266125514491"><em id="obs_03_0083__i316019571683">destinationPrefix</em> indicates the prefix specified in the inventory configuration, which can be used to group inventory files. If no prefix is specified, the default prefix is <strong id="obs_03_0083__b129161247645">BucketInventory</strong>.</li><li id="obs_03_0083__li430385764915"><em id="obs_03_0083__i837963416113">sourceBucketName</em> indicates the name of the source bucket for which the inventory is configured. This field prevents conflicts when inventory files of different source buckets are saved to the same destination bucket.</li><li id="obs_03_0083__li12702105884919"><em id="obs_03_0083__i45955416161">inventoryId</em> prevents conflicts when multiple inventory files of the same source bucket are sent to the same destination bucket.</li><li id="obs_03_0083__li20472115134916"><em id="obs_03_0083__i9633613171819">yyyy-MM-dd'T'HH-mm'Z'</em> indicates the date and time when OBS starts scanning the bucket to generate the inventory. Objects uploaded to the source bucket after this time may not be listed in the inventory file.</li><li id="obs_03_0083__li134901541825"><strong id="obs_03_0083__b18969133216513">UUID_index.csv</strong> indicates one of the inventory files.</li></ul>
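<p>The naming format above can be taken apart programmatically. The following sketch splits an inventory file name into its components, assuming that the source bucket name and the inventory ID contain no slashes and that the trailing <strong>Z</strong> in the timestamp denotes UTC. The function name <strong>parse_inventory_key</strong> is hypothetical.</p>

```python
from datetime import datetime, timezone


def parse_inventory_key(key: str) -> dict:
    """Split an inventory file name into the components of the naming format.

    The destination prefix may itself contain slashes, so the fixed
    trailing components are taken from the right.
    """
    parts = key.split("/")
    if len(parts) < 6 or parts[-2] != "files":
        raise ValueError(f"not an inventory file name: {key}")
    # Timestamp pattern yyyy-MM-dd'T'HH-mm'Z'; treating Z as UTC is an assumption.
    scan_start = datetime.strptime(parts[-3], "%Y-%m-%dT%H-%MZ")
    return {
        "destination_prefix": "/".join(parts[:-5]),
        "source_bucket": parts[-5],
        "inventory_id": parts[-4],
        "scan_start": scan_start.replace(tzinfo=timezone.utc),
        "file": parts[-1],
    }


info = parse_inventory_key(
    "inventory/user001/test_id/2019-01-03T12-28Z/files/"
    "0000016813AF58E66806C1E2D7F15155_1.csv"
)
print(info["source_bucket"], info["inventory_id"], info["scan_start"].isoformat())
```

<p>Reading the components from the right keeps the sketch working when the destination prefix spans several path levels.</p>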
</div>
<div class="section" id="obs_03_0083__section265932074213"><h4 class="sectiontitle">The manifest.json File</h4><p id="obs_03_0083__p112396417419">If a bucket contains a large number of objects, multiple inventory files may be generated for a single inventory configuration. Generating these files takes some time. For example, for a bucket with 200,000 objects, generating all inventory files takes about 1.5 minutes. One to two hours after all inventory files are generated, a <strong id="obs_03_0083__b1497103219496">manifest.json</strong> file is generated. The <strong id="obs_03_0083__b79773214912">manifest.json</strong> file describes all inventory files generated in that run and contains the following fields:</p>
<ul id="obs_03_0083__ul21392031092"><li id="obs_03_0083__li1213912317920"><strong id="obs_03_0083__b2184153865015">sourceBucket</strong>: name of the source bucket</li><li id="obs_03_0083__li193621538491"><strong id="obs_03_0083__b1495315195112">destinationBucket</strong>: name of the destination bucket</li><li id="obs_03_0083__li821725219918"><strong id="obs_03_0083__b3397941511">version</strong>: inventory version</li><li id="obs_03_0083__li975211316101"><strong id="obs_03_0083__b377317635118">fileFormat</strong>: inventory file format</li><li id="obs_03_0083__li9399815131019"><strong id="obs_03_0083__b141878918518">fileSchema</strong>: object metadata fields contained in the inventory files</li><li id="obs_03_0083__li11918161219117"><strong id="obs_03_0083__b4390811195119">files</strong>: list of all inventory files</li><li id="obs_03_0083__li11605174216112"><strong id="obs_03_0083__b8473171335117">key</strong>: name of an inventory file listed in <strong>files</strong></li><li id="obs_03_0083__li1911135618117"><strong id="obs_03_0083__b18667151620513">size</strong>: size of that inventory file, in bytes</li><li id="obs_03_0083__li918803061214"><strong id="obs_03_0083__b328492275110">inventoriedRecord</strong>: number of inventory records in that inventory file</li></ul>
<div class="p" id="obs_03_0083__p5316855141217">The following is an example of a <strong id="obs_03_0083__b1148522445120">manifest.json</strong> file.<pre class="screen" id="obs_03_0083__screen68562022068">{
"sourceBucket":"user001",
"destinationBucket":"bucket001",
"version":"2019-01-03",
"fileFormat":"CSV",
"fileSchema":"Bucket,Key,Size,LastModifiedDate,ETag,StorageClass,IsMultipartUploaded,ReplicationStatus,EncryptionStatus",
"files":[
{
"key":"inventory%2Fuser001%2Ftest_id%2F2019-01-03T12-28Z%2Ffiles%2F0000016813AF58E66806C1E2D7F15155_1.csv",
"size":6705647390,
"inventoriedRecord":70585762
}
]
}</pre>
</div>
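<p>A manifest like the one in the example can be read with a standard JSON parser. The following sketch, with the hypothetical function name <strong>summarize_manifest</strong>, lists the inventory files and totals their sizes and record counts. It assumes that <strong>key</strong> values are URL-encoded, as they appear to be in the example.</p>

```python
import json
from urllib.parse import unquote

# Same values as in the manifest.json example, written as strict JSON.
MANIFEST = """
{
  "sourceBucket": "user001",
  "destinationBucket": "bucket001",
  "version": "2019-01-03",
  "fileFormat": "CSV",
  "fileSchema": "Bucket,Key,Size,LastModifiedDate,ETag,StorageClass,IsMultipartUploaded,ReplicationStatus,EncryptionStatus",
  "files": [
    {
      "key": "inventory%2Fuser001%2Ftest_id%2F2019-01-03T12-28Z%2Ffiles%2F0000016813AF58E66806C1E2D7F15155_1.csv",
      "size": 6705647390,
      "inventoriedRecord": 70585762
    }
  ]
}
"""


def summarize_manifest(text: str) -> dict:
    """Extract the column list, decoded file names, and totals from a manifest."""
    manifest = json.loads(text)
    files = manifest["files"]
    return {
        # fileSchema is a comma-separated list of column names.
        "columns": [name.strip() for name in manifest["fileSchema"].split(",")],
        # Assumption: file keys are URL-encoded, as in the example.
        "files": [unquote(entry["key"]) for entry in files],
        "total_bytes": sum(entry["size"] for entry in files),
        "total_records": sum(entry["inventoriedRecord"] for entry in files),
    }


summary = summarize_manifest(MANIFEST)
print(summary["files"][0])
print(summary["total_bytes"], summary["total_records"])
```

<p>The <strong>columns</strong> list obtained from <strong>fileSchema</strong> gives the column order to use when reading the CSV inventory files themselves.</p>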
<p id="obs_03_0083__p1940417175216">The name of the <strong id="obs_03_0083__b3622154919553">manifest.json</strong> file is as follows (for details about each field, see <a href="#obs_03_0083__section14172191411423">Inventory File Name</a>):</p>
<pre class="screen" id="obs_03_0083__screen228016318523">destinationPrefix/sourceBucketName/inventoryId/yyyy-MM-dd'T'HH-mm'Z'/manifest.json</pre>
</div>
<div class="section" id="obs_03_0083__section1497633735818"><h4 class="sectiontitle">The symlink.txt File</h4><p id="obs_03_0083__p3585155511582">The <strong id="obs_03_0083__b934317263217">symlink.txt</strong> file records the paths of the inventory files, which helps you quickly locate all inventory files in big data scenarios. Apache Hive is compatible with the <strong id="obs_03_0083__b346614171081">symlink.txt</strong> file: Hive can automatically find the <strong id="obs_03_0083__b1462242885">symlink.txt</strong> file and the inventory files recorded in it.</p>
<p id="obs_03_0083__p77111930132917">The name of the <strong id="obs_03_0083__b1549120378515">symlink.txt</strong> file is as follows (for details about each field, see <a href="#obs_03_0083__section14172191411423">Inventory File Name</a>):</p>
<pre class="screen" id="obs_03_0083__screen47112030162910">destinationPrefix/sourceBucketName/inventoryId/hive/dt=YYYY-MM-DD-00-00/symlink.txt</pre>
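<p>The name of the <strong>symlink.txt</strong> file for a given date can be assembled from the format above. This is a small sketch with the hypothetical function name <strong>symlink_key</strong>. It assumes that the <strong>-00-00</strong> suffix of the <strong>dt</strong> partition is fixed, as shown in the format.</p>

```python
from datetime import date


def symlink_key(destination_prefix: str, source_bucket: str,
                inventory_id: str, day: date) -> str:
    """Build the symlink.txt name for the inventory of one date."""
    # Assumption: the partition always ends in -00-00, as in the format above.
    partition = f"dt={day:%Y-%m-%d}-00-00"
    return "/".join([destination_prefix, source_bucket, inventory_id,
                     "hive", partition, "symlink.txt"])


print(symlink_key("inventory", "user001", "test_id", date(2019, 1, 3)))
# inventory/user001/test_id/hive/dt=2019-01-03-00-00/symlink.txt
```

<p>The resulting name can be used to fetch the <strong>symlink.txt</strong> file of a specific day from the destination bucket.</p>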
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="obs_03_0082.html">Bucket Inventories</a></div>
</div>
</div>