Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="mrs_01_1018"></a><a name="mrs_01_1018"></a>
<h1 class="topictitle1">Improving Real-time Data Read Performance</h1>
<div id="body1590128863845"><div class="section" id="mrs_01_1018__s367c39a0527d4d67bf75b6a9062afb20"><h4 class="sectiontitle">Scenario</h4><p id="mrs_01_1018__a77c1949d11bd403a972e2594d95880af">HBase data needs to be read.</p>
</div>
<div class="section" id="mrs_01_1018__sa4eeb3920d2b40a889c6427b40103d81"><h4 class="sectiontitle">Prerequisites</h4><p id="mrs_01_1018__a11961fda22524570a39d33999185a219">Data is read from HBase in real time through the HBase get or scan interface.</p>
</div>
<div class="section" id="mrs_01_1018__sc385e92c5fda4d3d973d95916edb3edb"><h4 class="sectiontitle">Procedure</h4><ul id="mrs_01_1018__ue9bc36a9a24e4e118baae48820dca9b3"><li id="mrs_01_1018__la9dad44cde8d41099ad3d13e5dd331a6"><strong id="mrs_01_1018__b1048115531915">Data reading server tuning</strong><p id="mrs_01_1018__a40ba5dda32864c5cb8f4b302c92cf6df">Navigation path for setting parameters:</p>
<p id="mrs_01_1018__p21172338305">Go to the <strong id="mrs_01_1018__b205261581781">All Configurations</strong> page of the HBase service. For details, see <a href="mrs_01_2125.html">Modifying Cluster Service Configuration Parameters</a>.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1018__t7b5134c28c3645d2a89902715aae2e28" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Configuration items that affect real-time data reading</caption><thead align="left"><tr id="mrs_01_1018__rc029c6b1949b4a089fbdff644f704b08"><th align="left" class="cellrowborder" valign="top" width="31.630000000000003%" id="mcps1.3.3.2.1.4.2.4.1.1"><p id="mrs_01_1018__en-us_topic_0116526926_p285027721170"><strong id="mrs_01_1018__b4328143415248">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="45.92%" id="mcps1.3.3.2.1.4.2.4.1.2"><p id="mrs_01_1018__en-us_topic_0116526926_p418822571170"><strong id="mrs_01_1018__en-us_topic_0116526926_b413959981170">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="22.45%" id="mcps1.3.3.2.1.4.2.4.1.3"><p id="mrs_01_1018__en-us_topic_0116526926_p647415761170"><strong id="mrs_01_1018__en-us_topic_0116526926_b458032791170">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1018__r2d300a1fc1404a999cdb604ea8e09a26"><td class="cellrowborder" valign="top" width="31.630000000000003%" headers="mcps1.3.3.2.1.4.2.4.1.1 "><p id="mrs_01_1018__en-us_topic_0116526926_p279116021170">GC_OPTS</p>
</td>
<td class="cellrowborder" valign="top" width="45.92%" headers="mcps1.3.3.2.1.4.2.4.1.2 "><p id="mrs_01_1018__a5a564f50cdc045a189d1916392799498">You can increase HBase memory to improve HBase performance because read and write operations are performed in HBase memory.</p>
<p id="mrs_01_1018__af586da16692e4897ba56a7812d744899"><strong id="mrs_01_1018__b149401824259">HeapSize</strong> and <strong id="mrs_01_1018__b1494182132515">NewSize</strong> need to be adjusted. When you adjust <strong id="mrs_01_1018__b169263672514">HeapSize</strong>, set <strong id="mrs_01_1018__b6932569253">Xms</strong> and <strong id="mrs_01_1018__b139338622512">Xmx</strong> to the same value to avoid performance problems when the JVM dynamically adjusts <strong id="mrs_01_1018__b353391712514">HeapSize</strong>. Set <strong id="mrs_01_1018__b66382256254">NewSize</strong> to 1/8 of <strong id="mrs_01_1018__b18959129152512">HeapSize</strong>.</p>
<ul id="mrs_01_1018__uc30f93a56cc64e559e45a4557c4dde03"><li id="mrs_01_1018__lf41deb99eef04b6ab62a67b52772357d"><strong id="mrs_01_1018__b2542183542513">HMaster</strong>: If the HBase cluster is scaled out and the number of regions grows, increase the heap size set in the <strong id="mrs_01_1018__b1654873519258">GC_OPTS</strong> parameter of the HMaster accordingly.</li><li id="mrs_01_1018__lb5ce3162405d4b61a4a667a3a8c61043"><strong id="mrs_01_1018__b14245104262513">RegionServer</strong>: A RegionServer needs more memory than an HMaster. If sufficient memory is available, increase the <strong id="mrs_01_1018__b798917456257">HeapSize</strong> value.</li></ul>
<div class="note" id="mrs_01_1018__n20d462fa8d2b4cc7abdeae4d30ece3f5"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="mrs_01_1018__en-us_topic_0116526934_p301917211735">When the value of <strong id="mrs_01_1018__b390284992517">HeapSize</strong> for the active HMaster is 4 GB, the HBase cluster can support 100,000 regions. Empirically, each time 35,000 regions are added to the cluster, the value of <strong id="mrs_01_1018__b41591454132516">HeapSize</strong> must be increased by 2 GB. It is recommended that the value of <strong id="mrs_01_1018__b12166145442515">HeapSize</strong> for the active HMaster not exceed 32 GB.</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" width="22.45%" headers="mcps1.3.3.2.1.4.2.4.1.3 "><p id="mrs_01_1018__p44574123716">For versions earlier than MRS 3.x:</p>
<ul id="mrs_01_1018__u1908f8f9966a4cf0a960c85425b2e391"><li id="mrs_01_1018__l67ab481439e94cda815a994a83a9d075">HMaster:<p id="mrs_01_1018__a91bfd11c9649416f9117bf9fd128dcc5"><a name="mrs_01_1018__l67ab481439e94cda815a994a83a9d075"></a><a name="l67ab481439e94cda815a994a83a9d075"></a>-server -Xms2G -Xmx2G -XX:NewSize=256M -XX:MaxNewSize=256M -XX:MetaspaceSize=128M -XX:MaxMetaspaceSize=512M -XX:MaxDirectMemorySize=512M -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=65 -XX:+PrintGCDetails -Dsun.rmi.dgc.client.gcInterval=0x7FFFFFFFFFFFFFE -Dsun.rmi.dgc.server.gcInterval=0x7FFFFFFFFFFFFFE -XX:-OmitStackTraceInFastThrow -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1M</p>
</li><li id="mrs_01_1018__l914342c6b1184719a7110111d7495b29">RegionServer:<p id="mrs_01_1018__aeee0d4588d734939a41a1e0c7f8b0aab"><a name="mrs_01_1018__l914342c6b1184719a7110111d7495b29"></a><a name="l914342c6b1184719a7110111d7495b29"></a>-server -Xms4G -Xmx4G -XX:NewSize=512M -XX:MaxNewSize=512M -XX:MetaspaceSize=128M -XX:MaxMetaspaceSize=512M -XX:MaxDirectMemorySize=512M -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=65 -XX:+PrintGCDetails -Dsun.rmi.dgc.client.gcInterval=0x7FFFFFFFFFFFFFE -Dsun.rmi.dgc.server.gcInterval=0x7FFFFFFFFFFFFFE -XX:-OmitStackTraceInFastThrow -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1M</p>
</li></ul>
<p id="mrs_01_1018__p0619250164517">For MRS 3.<em id="mrs_01_1018__i107815281880">x</em> or later:</p>
<ul id="mrs_01_1018__ul15626105011451"><li id="mrs_01_1018__li12626195018459">HMaster:<p id="mrs_01_1018__p1962605054513"><a name="mrs_01_1018__li12626195018459"></a><a name="li12626195018459"></a>-server -Xms4G -Xmx4G -XX:NewSize=512M -XX:MaxNewSize=512M -XX:MetaspaceSize=128M -XX:MaxMetaspaceSize=512M -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=65 -XX:+PrintGCDetails -Dsun.rmi.dgc.client.gcInterval=0x7FFFFFFFFFFFFFE -Dsun.rmi.dgc.server.gcInterval=0x7FFFFFFFFFFFFFE -XX:-OmitStackTraceInFastThrow -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1M</p>
</li><li id="mrs_01_1018__li14626145014512">RegionServer:<p id="mrs_01_1018__p116265506459"><a name="mrs_01_1018__li14626145014512"></a><a name="li14626145014512"></a>-server -Xms6G -Xmx6G -XX:NewSize=1024M -XX:MaxNewSize=1024M -XX:MetaspaceSize=128M -XX:MaxMetaspaceSize=512M -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=65 -XX:+PrintGCDetails -Dsun.rmi.dgc.client.gcInterval=0x7FFFFFFFFFFFFFE -Dsun.rmi.dgc.server.gcInterval=0x7FFFFFFFFFFFFFE -XX:-OmitStackTraceInFastThrow -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1M</p>
</li></ul>
</td>
</tr>
<tr id="mrs_01_1018__r2a90471e506e4be0b2953cabe933970d"><td class="cellrowborder" valign="top" width="31.630000000000003%" headers="mcps1.3.3.2.1.4.2.4.1.1 "><p id="mrs_01_1018__en-us_topic_0116526926_p85737481170">hbase.regionserver.handler.count</p>
</td>
<td class="cellrowborder" valign="top" width="45.92%" headers="mcps1.3.3.2.1.4.2.4.1.2 "><p id="mrs_01_1018__en-us_topic_0116526926_p233849511170">Number of requests that a RegionServer can process concurrently. If the value is too large, threads contend heavily for resources. If the value is too small, requests wait a long time in the RegionServer, reducing its processing capability. Increase the number of threads based on the available resources.</p>
<p id="mrs_01_1018__en-us_topic_0116526926_p91379751170">It is recommended that the value be set to 100 to 300 based on the CPU usage.</p>
</td>
<td class="cellrowborder" valign="top" width="22.45%" headers="mcps1.3.3.2.1.4.2.4.1.3 "><p id="mrs_01_1018__en-us_topic_0116526926_p19785191170">200</p>
</td>
</tr>
<tr id="mrs_01_1018__r4b1b5a656a674dcc95857960f4b7a8f9"><td class="cellrowborder" valign="top" width="31.630000000000003%" headers="mcps1.3.3.2.1.4.2.4.1.1 "><p id="mrs_01_1018__en-us_topic_0116526926_p330543901170">hfile.block.cache.size</p>
</td>
<td class="cellrowborder" valign="top" width="45.92%" headers="mcps1.3.3.2.1.4.2.4.1.2 "><p id="mrs_01_1018__en-us_topic_0116526926_p601599481170">The HBase cache size affects query efficiency. Set the cache size based on the query mode and the distribution of the queried records. If queries are random and the cache hit ratio is therefore low, you can reduce the cache size.</p>
</td>
<td class="cellrowborder" valign="top" width="22.45%" headers="mcps1.3.3.2.1.4.2.4.1.3 "><p id="mrs_01_1018__en-us_topic_0116526926_p411176261170">When <strong id="mrs_01_1018__b157231838388">offheap</strong> is disabled, the default value is <strong id="mrs_01_1018__b034018431189">0.25</strong>. When <strong id="mrs_01_1018__b190817471389">offheap</strong> is enabled, the default value is <strong id="mrs_01_1018__b1075118504815">0.1</strong>.</p>
</td>
</tr>
</tbody>
</table>
</div>
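<p>The sizing guidance given for <strong>GC_OPTS</strong> in Table 1 can be worked through numerically. The following Python sketch is illustrative only: the function names are hypothetical, rounding a partial block of 35,000 regions up to a full 2 GB step is an assumption, and only the heap-related flags are generated, not the complete <strong>GC_OPTS</strong> value.</p>

```python
BASE_HEAP_GB = 4        # heap that supports the first 100,000 regions
BASE_REGIONS = 100_000
STEP_HEAP_GB = 2        # extra heap for each additional block of regions
STEP_REGIONS = 35_000
MAX_HEAP_GB = 32        # recommended upper limit for the active HMaster


def hmaster_heap_gb(regions: int) -> int:
    """Estimate the active HMaster HeapSize in GB for a given region count."""
    if regions < 0:
        raise ValueError("regions must not be negative")
    extra_regions = max(0, regions - BASE_REGIONS)
    # Ceiling division: a partial block of regions still adds a full step
    # (assumption, the source text does not say how to round).
    steps = -(-extra_regions // STEP_REGIONS)
    return min(MAX_HEAP_GB, BASE_HEAP_GB + steps * STEP_HEAP_GB)


def heap_flags(heap_gb: int) -> str:
    """Build heap flags with Xms equal to Xmx and NewSize at 1/8 of the heap."""
    new_size_mb = heap_gb * 1024 // 8
    return (f"-Xms{heap_gb}G -Xmx{heap_gb}G "
            f"-XX:NewSize={new_size_mb}M -XX:MaxNewSize={new_size_mb}M")


if __name__ == "__main__":
    for regions in (100_000, 135_000, 200_000):
        heap = hmaster_heap_gb(regions)
        print(regions, heap, heap_flags(heap))
```

<p>Treat the result as a starting point and verify it against the memory actually available on the node before changing the configuration.</p>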
<div class="note" id="mrs_01_1018__nb161224c92a14851a8aac6cf7f806aac"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="mrs_01_1018__a0eb32f05aa284e60b9e0004dcd7237d5">When read and write operations run at the same time, they affect each other's performance. Frequent flush and compaction operations caused by data writes consume a large amount of disk I/O, which degrades read performance. If write operations cause a large number of compactions to be blocked, many HFiles accumulate in the region, which also degrades read performance. Therefore, if the read performance is unsatisfactory, check whether the write configurations are appropriate.</p>
</div></div>
</li><li id="mrs_01_1018__l59d5b7ac986e4824a4ce4d7256f0f7bc"><strong id="mrs_01_1018__b1193717331210">Data reading client tuning</strong><p id="mrs_01_1018__a2df91f76c6fe43e2aabc5a40a8411771">When scanning data, set <strong id="mrs_01_1018__b179116982814">caching</strong>, which is the number of records read from the server at a time. The default value is <strong id="mrs_01_1018__b20239191615285">1</strong>, and leaving it unchanged makes read performance extremely low.</p>
<p id="mrs_01_1018__ad9a43281e9e14ac4b9af664103586f63">If you do not need to read all columns of a piece of data, specify the columns to be read to reduce network I/O.</p>
<p id="mrs_01_1018__ad7e58c842cca456abab0706ee561f29c">If you only need to read the row key, add a filter (FirstKeyOnlyFilter or KeyOnlyFilter) that only reads the row key.</p>
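<p>To illustrate why the <strong>caching</strong> value matters so much, the following Python sketch models the number of client-to-server requests a scan needs. It is a simplified model, not HBase client code: it assumes each request returns at most <strong>caching</strong> records and ignores any size-based limits, and the function name is hypothetical.</p>

```python
def scan_round_trips(total_rows: int, caching: int) -> int:
    """Approximate the number of requests needed to scan total_rows records.

    Simplified model: each request returns at most `caching` records.
    """
    if total_rows < 0:
        raise ValueError("total_rows must not be negative")
    if caching < 1:
        raise ValueError("caching must be at least 1")
    # Ceiling division: the last request may return fewer records.
    return -(-total_rows // caching)


if __name__ == "__main__":
    rows = 100_000
    for caching in (1, 100, 500):
        print(f"caching={caching}: {scan_round_trips(rows, caching)} requests")
```

<p>Under this model, raising <strong>caching</strong> from 1 to 500 cuts the requests for 100,000 records from 100,000 to 200. A larger value also increases the memory held per request on both the client and the server, so choose it with the record size in mind.</p>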
</li><li id="mrs_01_1018__l033351bd4f2d4a1c8b56bbe8293fa371"><strong id="mrs_01_1018__b83731910102617">Data table reading design optimization</strong>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1018__t748e7b27a2ad436da61e1e8523d5548d" frame="border" border="1" rules="all"><caption><b>Table 2 </b>Parameters affecting real-time data reading</caption><thead align="left"><tr id="mrs_01_1018__r1d93070f0a0348739c4973d837d2dc78"><th align="left" class="cellrowborder" valign="top" width="24%" id="mcps1.3.3.2.3.2.2.4.1.1"><p id="mrs_01_1018__a642bb99abd55490f8e72496f6e223b86"><strong id="mrs_01_1018__b1772265183016">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="54%" id="mcps1.3.3.2.3.2.2.4.1.2"><p id="mrs_01_1018__af36bde3dd71e4364b6285585413dedde"><strong id="mrs_01_1018__aa55c360ec91d46a2896ae5342ac68701">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="22%" id="mcps1.3.3.2.3.2.2.4.1.3"><p id="mrs_01_1018__aa52981af9c0047ed94abbd69e8758783"><strong id="mrs_01_1018__af125bd3f9dd54802bf807827867f16eb">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1018__r971ddf0288bf4f1fa4894f5e7c487294"><td class="cellrowborder" valign="top" width="24%" headers="mcps1.3.3.2.3.2.2.4.1.1 "><p id="mrs_01_1018__a4dee128175134b9ba24d0a834385bf56">COMPRESSION</p>
</td>
<td class="cellrowborder" valign="top" width="54%" headers="mcps1.3.3.2.3.2.2.4.1.2 "><p id="mrs_01_1018__a90a22b2257224c7c94ef7c8fbabb1327">The compression algorithm compresses blocks in HFiles. For compressible data, configure the compression algorithm to efficiently reduce disk I/Os and improve performance. </p>
<div class="note" id="mrs_01_1018__n14e03159675e4eb283a8d7d57e1687bf"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="mrs_01_1018__aca6a8aceb07b474da9d388f37fbd5425">Some data cannot be compressed efficiently. For example, an already compressed image can hardly be compressed further. SNAPPY is a commonly used compression algorithm because it offers fast encoding and decoding with an acceptable compression ratio.</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" width="22%" headers="mcps1.3.3.2.3.2.2.4.1.3 "><p id="mrs_01_1018__a07e44e5d37d746c58ebce7ce9698118f">NONE</p>
</td>
</tr>
<tr id="mrs_01_1018__rc495098a9bad46739316f7d807f62752"><td class="cellrowborder" valign="top" width="24%" headers="mcps1.3.3.2.3.2.2.4.1.1 "><p id="mrs_01_1018__acb6808d5e7ae42a0a1a6ccba9fbf99d0">BLOCKSIZE</p>
</td>
<td class="cellrowborder" valign="top" width="54%" headers="mcps1.3.3.2.3.2.2.4.1.2 "><p id="mrs_01_1018__a7e319dc59f5047c4a90b640e5eb02638">The block size affects HBase data read and write performance. You can configure the size of blocks in an HFile. Larger blocks yield a higher compression ratio but perform poorly for random reads, because HBase reads data in units of blocks.</p>
<p id="mrs_01_1018__a828e1bacebee4926b5e40cbdd647a67c">Set the parameter to 128 KB or 256 KB to improve data write efficiency without greatly affecting random read performance. The value is specified in bytes, for example, 131072 for 128 KB.</p>
</td>
<td class="cellrowborder" valign="top" width="22%" headers="mcps1.3.3.2.3.2.2.4.1.3 "><p id="mrs_01_1018__a5528829dce524a6fbc259d8d27fc20b9">65536</p>
</td>
</tr>
<tr id="mrs_01_1018__r981da489972741ed98ea887870fcaaf7"><td class="cellrowborder" valign="top" width="24%" headers="mcps1.3.3.2.3.2.2.4.1.1 "><p id="mrs_01_1018__acf078e37b3254991b5274b91fe0c9390">DATA_BLOCK_ENCODING</p>
</td>
<td class="cellrowborder" valign="top" width="54%" headers="mcps1.3.3.2.3.2.2.4.1.2 "><p id="mrs_01_1018__ace1bf1bd3c754092a794bdc4bbd763cc">Encoding method for blocks in an HFile. If a row contains multiple columns, set this parameter to <strong id="mrs_01_1018__b204741115526">FAST_DIFF</strong> to save data storage space and improve performance.</p>
</td>
<td class="cellrowborder" valign="top" width="22%" headers="mcps1.3.3.2.3.2.2.4.1.3 "><p id="mrs_01_1018__a8d34ab06bf0942a4b10ffda915e23179">NONE</p>
</td>
</tr>
</tbody>
</table>
</div>
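<p>The three attributes in Table 2 are set per column family. As a rough sketch, the Python snippet below converts a block size in KB to the byte value that <strong>BLOCKSIZE</strong> expects and assembles a column family description in HBase shell syntax. The table name <strong>demo_table</strong>, the column family name <strong>cf</strong>, and the helper functions are hypothetical and used for illustration only.</p>

```python
def kb_to_bytes(kb: int) -> int:
    """Convert a block size in KB to bytes, the unit BLOCKSIZE uses."""
    return kb * 1024


def column_family_spec(family: str,
                       compression: str = "SNAPPY",
                       blocksize_kb: int = 128,
                       encoding: str = "FAST_DIFF") -> str:
    """Assemble a column family description in HBase shell syntax."""
    attrs = [
        f"NAME => '{family}'",
        f"COMPRESSION => '{compression}'",
        f"BLOCKSIZE => '{kb_to_bytes(blocksize_kb)}'",
        f"DATA_BLOCK_ENCODING => '{encoding}'",
    ]
    return "{" + ", ".join(attrs) + "}"


if __name__ == "__main__":
    # 'demo_table' and 'cf' are placeholder names.
    print(f"alter 'demo_table', {column_family_spec('cf')}")
```

<p>The generated statement is only text. Review it and run it in the HBase shell yourself, and confirm beforehand that the chosen compression algorithm is available in the cluster.</p>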
</li></ul>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1013.html">HBase Performance Tuning</a></div>
</div>
</div>