forked from docs/doc-exports
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
51 lines
12 KiB
HTML
<a name="mrs_01_24112"></a><a name="mrs_01_24112"></a>
<h1 class="topictitle1">Configuring HBase Data Compression and Encoding</h1>
<div id="body0000001159461809"><div class="section" id="mrs_01_24112__section856117255431"><h4 class="sectiontitle">Scenario</h4><p id="mrs_01_24112__p8867193317424">HBase can encode data blocks in HFiles to reduce duplicated key content across KeyValues, which saves storage space. The following data block encoding modes are supported: NONE, PREFIX, DIFF, FAST_DIFF, and ROW_INDEX_V1. NONE indicates that data blocks are not encoded. HBase can also compress HFiles. The following compression algorithms are supported by default: NONE, GZ, SNAPPY, and ZSTD. NONE indicates that HFiles are not compressed.</p>
<p id="mrs_01_24112__p513713147219">Both encoding and compression are configured at the HBase column family level. They can be used together or separately.</p>
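<p>To make the idea of removing duplicated key content concrete, the following Java sketch stores each sorted key as the length of the prefix it shares with the previous key, followed by the remaining suffix. It is an illustration only and does not reproduce the actual PREFIX on-disk format used by HBase. The class name PrefixEncodingSketch, its methods, and the sample keys are hypothetical.</p>

```java
/**
 * Illustrative prefix encoding of sorted keys.
 * Assumption: this is a teaching sketch, NOT the HBase HFile format.
 */
final class PrefixEncodingSketch {

    /** Returns the number of leading characters shared by a and b. */
    static int commonPrefixLength(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        int i = 0;
        while (i != limit) {
            if (a.charAt(i) != b.charAt(i)) {
                break;
            }
            i++;
        }
        return i;
    }

    /**
     * Encodes each key as "sharedPrefixLength|suffix", where the shared
     * prefix length is measured against the previous key in the array.
     */
    static String[] encode(String[] sortedKeys) {
        String[] encoded = new String[sortedKeys.length];
        String previous = "";
        for (int i = 0; i != sortedKeys.length; i++) {
            String current = sortedKeys[i];
            int shared = commonPrefixLength(previous, current);
            encoded[i] = shared + "|" + current.substring(shared);
            previous = current;
        }
        return encoded;
    }

    public static void main(String[] args) {
        // Hypothetical keys in "row:family:qualifier" form, already sorted.
        String[] keys = {"row0001:f1:name", "row0001:f1:phone", "row0002:f1:name"};
        for (String line : encode(keys)) {
            System.out.println(line);
        }
    }
}
```

<p>With the three sample keys, the second key is stored as <strong>11|phone</strong> because its first 11 characters repeat the previous key. Compression can then be applied on top of the encoded data.</p>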
</div>
<div class="section" id="mrs_01_24112__section1955213122560"><h4 class="sectiontitle">Prerequisites</h4><ul id="mrs_01_24112__ul14649132212816"><li id="mrs_01_24112__li16649122202810">You have installed an HBase client. For example, the client is installed in <strong id="mrs_01_24112__b1030812712394">/opt/client</strong>.</li><li id="mrs_01_24112__li14428928142917">If authentication has been enabled for HBase, you must have the corresponding operation permissions. For example, you must have the creation (C) or administration (A) permission on the corresponding namespace or higher-level items to create a table, and the creation (C) or administration (A) permission on the created table or higher-level items to modify a table. For details about how to grant permissions, see <a href="mrs_01_1608.html">Creating HBase Roles</a>.</li></ul>
</div>
<div class="section" id="mrs_01_24112__section190602818563"><h4 class="sectiontitle">Procedure</h4><p id="mrs_01_24112__p9644134318481"><strong id="mrs_01_24112__b14791661761">Setting data block encoding and compression algorithms during creation</strong></p>
<ul id="mrs_01_24112__ul183151238122214"><li id="mrs_01_24112__li8316638122217"><strong id="mrs_01_24112__b13146500619">Method 1: Using hbase shell</strong><ol id="mrs_01_24112__ol1892965082215"><li id="mrs_01_24112__li189291850162213"><span id="mrs_01_24112__ph104395241162">Log in to the node where the client is installed as the client installation user.</span></li><li id="mrs_01_24112__li1192912503223">Run the following command to go to the client directory:<p id="mrs_01_24112__p15265165712228"><a name="mrs_01_24112__li1192912503223"></a><a name="li1192912503223"></a><strong id="mrs_01_24112__b315455003318">cd /opt/client</strong></p>
</li><li id="mrs_01_24112__li1929125022219">Run the following command to configure environment variables:<p id="mrs_01_24112__p11823155812211"><a name="mrs_01_24112__li1929125022219"></a><a name="li1929125022219"></a><strong id="mrs_01_24112__b915445016330">source bigdata_env</strong></p>
</li><li id="mrs_01_24112__li1392975018225">If Kerberos authentication is enabled for the current cluster, run the following command to authenticate the user. If Kerberos authentication is disabled for the current cluster, skip this step:<p id="mrs_01_24112__p1092116222310"><a name="mrs_01_24112__li1392975018225"></a><a name="li1392975018225"></a><strong id="mrs_01_24112__b159910713018">kinit</strong> <em id="mrs_01_24112__i13993715300">Component service user</em></p>
<p id="mrs_01_24112__p1692142152314">For example, <strong id="mrs_01_24112__b357603313211">kinit hbaseuser</strong>.</p>
</li><li id="mrs_01_24112__li293014506225">Run the following HBase client command:<p id="mrs_01_24112__p1146212582316"><a name="mrs_01_24112__li293014506225"></a><a name="li293014506225"></a><strong id="mrs_01_24112__b11551509336">hbase shell</strong></p>
</li><li id="mrs_01_24112__li119301503224">Create a table.<div class="p" id="mrs_01_24112__p18582752319"><a name="mrs_01_24112__li119301503224"></a><a name="li119301503224"></a><strong id="mrs_01_24112__b11930852117">create '</strong><em id="mrs_01_24112__i121401541018">t1</em><strong id="mrs_01_24112__b14991010211">', {NAME => '</strong><em id="mrs_01_24112__i19411211728">f1</em><strong id="mrs_01_24112__b149121513329">', COMPRESSION => '</strong><em id="mrs_01_24112__i167626142211">SNAPPY</em><strong id="mrs_01_24112__b56398252028">', DATA_BLOCK_ENCODING => '</strong><em id="mrs_01_24112__i6214326929">FAST_DIFF</em><strong id="mrs_01_24112__b564012515214">'}</strong><div class="note" id="mrs_01_24112__note14461142814234"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="mrs_01_24112__ul1050934116236"><li id="mrs_01_24112__li1350944152311"><em id="mrs_01_24112__i164851658152115">t1</em>: indicates the table name.</li><li id="mrs_01_24112__li1350912415237"><em id="mrs_01_24112__i1340084414308">f1</em>: indicates the column family name.</li><li id="mrs_01_24112__li17509164116234"><em id="mrs_01_24112__i164041528313">SNAPPY</em>: indicates the column family uses the SNAPPY compression algorithm.</li><li id="mrs_01_24112__li20509174112237"><em id="mrs_01_24112__i466195813210">FAST_DIFF</em>: indicates FAST_DIFF is used for encoding.</li><li id="mrs_01_24112__li145091041192311">The parameter in the braces specifies the column family. You can specify multiple column families using multiple braces and separate them by commas (,). For details about table creation statements, run the <strong id="mrs_01_24112__b19988173512314">help 'create'</strong> statement in the HBase shell.</li></ul>
</div></div>
</div>
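<p>As a rough illustration of the multi-family syntax described in the note, the following Java sketch assembles a <strong>create</strong> statement with one pair of braces per column family. The class and helper names and the second column family <em>f2</em> are hypothetical. The sketch only builds the statement text, which still has to be run in the HBase shell.</p>

```java
/**
 * Builds the text of an HBase shell "create" statement for several
 * column families. Assumption: helper names are illustrative only.
 */
final class CreateStatementSketch {

    /** Formats one column family definition enclosed in braces. */
    static String family(String name, String compression, String encoding) {
        return "{NAME => '" + name + "', COMPRESSION => '" + compression
                + "', DATA_BLOCK_ENCODING => '" + encoding + "'}";
    }

    /** Joins the table name and the column family definitions with commas. */
    static String createStatement(String table, String... families) {
        StringBuilder statement = new StringBuilder("create '" + table + "'");
        for (String definition : families) {
            statement.append(", ").append(definition);
        }
        return statement.toString();
    }

    public static void main(String[] args) {
        // f1 matches the example above; f2 is a hypothetical second family.
        System.out.println(createStatement("t1",
                family("f1", "SNAPPY", "FAST_DIFF"),
                family("f2", "GZ", "PREFIX")));
    }
}
```

<p>Each column family can use its own combination of compression algorithm and encoding mode.</p>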
</li></ol>
</li><li id="mrs_01_24112__li831653852210"><strong id="mrs_01_24112__b1961414314333">Method 2: Using Java APIs</strong><div class="p" id="mrs_01_24112__p93351536103919">The following code snippet shows only how to set the encoding mode and compression algorithm of a column family when creating a table. For the complete table creation code and how to use it, see "HBase Development Guide" > "Modifying a Table" in <em id="mrs_01_24112__i148338331517"><span id="mrs_01_24112__text195101951108"></span></em>.<pre class="screen" id="mrs_01_24112__screen137911513329">TableDescriptorBuilder htd = TableDescriptorBuilder.newBuilder(TableName.valueOf("t1")); // Create a descriptor builder for table <strong id="mrs_01_24112__b199911308462">t1</strong>.
ColumnFamilyDescriptorBuilder hcd = ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("f1")); // Create a builder for column family <strong id="mrs_01_24112__b1656793731317">f1</strong>.
hcd.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF); // Set the encoding mode of column family <strong id="mrs_01_24112__b1632747199">f1</strong> to <strong id="mrs_01_24112__b381413595567">FAST_DIFF</strong>.
hcd.setCompressionType(Compression.Algorithm.SNAPPY); // Set the compression algorithm of column family <strong id="mrs_01_24112__b68388162199">f1</strong> to <strong id="mrs_01_24112__b102759316570">SNAPPY</strong>.
htd.setColumnFamily(hcd.build()); // Add column family <strong id="mrs_01_24112__b4451191005713">f1</strong> to the descriptor of table <strong id="mrs_01_24112__b9451101011577">t1</strong>.</pre>
</div>
</li></ul>
<p id="mrs_01_24112__p671955972114"><strong id="mrs_01_24112__b102211347134418">Setting or modifying the data block encoding mode and compression algorithm for an existing table</strong></p>
<ul id="mrs_01_24112__ul72911249153114"><li id="mrs_01_24112__li13291124915318"><strong id="mrs_01_24112__b581671143614">Method 1: Using hbase shell</strong><ol id="mrs_01_24112__ol1027135415311"><li id="mrs_01_24112__li52713544319"><span id="mrs_01_24112__ph85785267482">Log in to the node where the client is installed as the client installation user.</span></li><li id="mrs_01_24112__li1027115545316">Run the following command to go to the client directory:<p id="mrs_01_24112__p15908572318"><a name="mrs_01_24112__li1027115545316"></a><a name="li1027115545316"></a><strong id="mrs_01_24112__b1762164232614">cd /opt/client</strong></p>
</li><li id="mrs_01_24112__li427115545310">Run the following command to configure environment variables:<p id="mrs_01_24112__p37521559133118"><a name="mrs_01_24112__li427115545310"></a><a name="li427115545310"></a><strong id="mrs_01_24112__b1962154219269">source bigdata_env</strong></p>
</li><li id="mrs_01_24112__li1327110542319">If Kerberos authentication is enabled for the current cluster, run the following command to authenticate the user. If Kerberos authentication is disabled for the current cluster, skip this step:<p id="mrs_01_24112__p72961825323"><a name="mrs_01_24112__li1327110542319"></a><a name="li1327110542319"></a><strong id="mrs_01_24112__b1064850561">kinit</strong> <em id="mrs_01_24112__i1624705967">Component service user</em></p>
<p id="mrs_01_24112__p929617213210">For example, <strong id="mrs_01_24112__b2676162054611">kinit hbaseuser</strong>.</p>
</li><li id="mrs_01_24112__li152711954113114">Run the following HBase client command:<p id="mrs_01_24112__p15303443211"><a name="mrs_01_24112__li152711954113114"></a><a name="li152711954113114"></a><strong id="mrs_01_24112__b152771656162611">hbase shell</strong></p>
</li><li id="mrs_01_24112__li1727115544315">Run the following command to modify the table:<p id="mrs_01_24112__p1398325103212"><a name="mrs_01_24112__li1727115544315"></a><a name="li1727115544315"></a><strong id="mrs_01_24112__b1411032012715">alter '</strong><em id="mrs_01_24112__i480572062710">t1</em><strong id="mrs_01_24112__b11706529142716">', {NAME => '</strong><em id="mrs_01_24112__i728553019273">f1</em><strong id="mrs_01_24112__b16561334152715">', COMPRESSION => '</strong><em id="mrs_01_24112__i431443519276">SNAPPY</em><strong id="mrs_01_24112__b12714123918274">', DATA_BLOCK_ENCODING => '</strong><em id="mrs_01_24112__i1306174072717">FAST_DIFF</em><strong id="mrs_01_24112__b3715123902719">'}</strong></p>
</li></ol>
</li><li id="mrs_01_24112__li12291164915311"><strong id="mrs_01_24112__b99802515468">Method 2: Using Java APIs</strong><p id="mrs_01_24112__p1355975003919">The following code snippet shows only how to modify the encoding and compression modes of a column family in an existing table. For complete code for modifying a table and how to use the code to modify a table, see "HBase Development Guide".</p>
<pre class="screen" id="mrs_01_24112__screen9649349131512">TableDescriptor htd = admin.getDescriptor(TableName.valueOf("t1")); // Obtain the descriptor of table <strong id="mrs_01_24112__b1860154525214">t1</strong>.
ColumnFamilyDescriptor originCF = htd.getColumnFamily(Bytes.toBytes("f1")); // Obtain the descriptor of column family <strong id="mrs_01_24112__b187615581307">f1</strong>.
ColumnFamilyDescriptorBuilder hcd = ColumnFamilyDescriptorBuilder.newBuilder(originCF); // Create a builder based on the existing column family attributes.
hcd.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF); // Change the encoding mode of the column family to <strong id="mrs_01_24112__b1358764519578">FAST_DIFF</strong>.
hcd.setCompressionType(Compression.Algorithm.SNAPPY); // Change the compression algorithm of the column family to <strong id="mrs_01_24112__b151731747155713">SNAPPY</strong>.
admin.modifyColumnFamily(TableName.valueOf("t1"), hcd.build()); // Submit to the server to modify the attributes of column family <strong id="mrs_01_24112__b11786334651">f1</strong>.</pre>
<p id="mrs_01_24112__p1733254712215">After the modification, the new encoding mode and compression algorithm take effect on existing HFiles only after the next compaction.</p>
</li></ul>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_0500.html">Using HBase</a></div>
</div>
</div>