doc-exports/docs/mrs/umn/ALM-19009.html
Yang, Tong 3b1f73dece MRS UMN 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-13 12:03:34 +00:00


<a name="ALM-19009"></a>
<h1 class="topictitle1">ALM-19009 Direct Memory Usage of the HBase Process Exceeds the Threshold</h1>
<div id="body39746667"><div class="section" id="ALM-19009__s3435379dbdf748c08421e7ac1c26b5e7"><h4 class="sectiontitle">Description</h4><p id="ALM-19009__en-us_topic_0070543523_p2147306">The system checks the HBase service status every 30 seconds. The alarm is generated when the direct memory usage of an HBase service exceeds the threshold (90% of the maximum direct memory).</p>
<p id="ALM-19009__en-us_topic_0070543523_p19325759">The alarm is cleared when the direct memory usage is less than the threshold.</p>
<div class="note" id="ALM-19009__note14544102852418"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="ALM-19009__en-us_topic_0070543520_p32794215">If the multi-instance function is enabled in the cluster and multiple HBase service instances are installed, determine the HBase service instance for which the alarm is generated based on the value of <strong id="ALM-19009__en-us_topic_0070543520_b26712487">ServiceName</strong> in <strong id="ALM-19009__en-us_topic_0070543520_b39085796">Location</strong>. For example, if the alarm is generated for the HBase1 service, <strong id="ALM-19009__en-us_topic_0070543520_b11832897">ServiceName=HBase1</strong> is displayed in <strong id="ALM-19009__en-us_topic_0070543520_b39387211">Location</strong>, and the operation object in the procedure needs to be changed from HBase to HBase1.</p>
</div></div>
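<p>The generate/clear rule described above can be sketched as a simple ratio check. This is a minimal illustration only: the function name, its parameters, and the treatment of usage exactly equal to the threshold are assumptions, not the actual FusionInsight Manager implementation.</p>

```python
# Hypothetical helper for illustration; not part of MRS or FusionInsight Manager.
ALARM_THRESHOLD = 0.90  # 90% of the maximum direct memory, as stated in the description


def alarm_state(used_bytes: int, max_bytes: int, threshold: float = ALARM_THRESHOLD) -> str:
    """Return "raised" when usage exceeds the threshold, otherwise "cleared".

    Assumption: usage exactly equal to the threshold does not raise the alarm.
    """
    if max_bytes > 0:
        return "raised" if used_bytes / max_bytes > threshold else "cleared"
    raise ValueError("max_bytes must be positive")


print(alarm_state(950, 1000))  # prints "raised"
print(alarm_state(500, 1000))  # prints "cleared"
```

<p>In the real system this evaluation is repeated at each 30-second check, so the alarm is cleared automatically once usage drops back under the threshold.</p>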
</div>
<div class="section" id="ALM-19009__sf3d1a2ffedbd4aee8ba8579b3d3aaf5a"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-19009__en-us_topic_0070543523_table21882681" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-19009__en-us_topic_0070543523_row59271898"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-19009__en-us_topic_0070543523_p36294438">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-19009__en-us_topic_0070543523_p54168362">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-19009__en-us_topic_0070543523_p25561231">Automatically Cleared</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-19009__en-us_topic_0070543523_row57193853"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-19009__en-us_topic_0070543523_p2190491">19009</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-19009__en-us_topic_0070543523_p43212049">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-19009__en-us_topic_0070543523_p10515080">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-19009__sc79d584ee66b4e159ebce7daa7df7aaf"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-19009__en-us_topic_0070543523_table46415143" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-19009__en-us_topic_0070543523_row41820417"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-19009__en-us_topic_0070543523_p32010577">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-19009__en-us_topic_0070543523_p42719917">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-19009__row73008232103"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-19009__p192431315431">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-19009__p692551319435">Specifies the cluster for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-19009__en-us_topic_0070543523_row37761278"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-19009__en-us_topic_0070543523_p38764716">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-19009__en-us_topic_0070543523_p52934302">Specifies the service name for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-19009__en-us_topic_0070543523_row6646677"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-19009__en-us_topic_0070543523_p1509928">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-19009__en-us_topic_0070543523_p55195381">Specifies the role name for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-19009__en-us_topic_0070543523_row26996384"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-19009__en-us_topic_0070543523_p39223476">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-19009__en-us_topic_0070543523_p22984981">Specifies the object (host ID) for which the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-19009__s6b4264f46b514e1a82ce8cb781cf9141"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-19009__en-us_topic_0070543523_p49844138">If the available HBase direct memory is insufficient, a memory overflow may occur and the service may break down.</p>
</div>
<div class="section" id="ALM-19009__s6336b0fea6144e86932beeacc300ed62"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-19009__en-us_topic_0070543523_p10843355">The direct memory of the HBase service is overused or inappropriately allocated.</p>
</div>
<div class="section" id="ALM-19009__sa3cddf5eca904b619cf9c1f3d8ffaaa1"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-19009__en-us_topic_0070543523_p5896594"><strong id="ALM-19009__b8352718195430">Check direct memory usage.</strong></p>
<ol id="ALM-19009__ol64515859195437"><li id="ALM-19009__li36906153195425"><span>On the FusionInsight Manager portal, click <span class="menucascade" id="ALM-19009__menucascade9622183132519"><b><span class="uicontrol" id="ALM-19009__uicontrol862223115259">O&amp;M</span></b> &gt; <b><span class="uicontrol" id="ALM-19009__uicontrol106221731182512">Alarm</span></b> &gt; <b><span class="uicontrol" id="ALM-19009__uicontrol9622153122518">Alarms</span></b></span> and select the alarm whose <strong id="ALM-19009__b18853005195425">ID</strong> is <strong id="ALM-19009__b35459323195425">19009</strong>. Check the <strong id="ALM-19009__b1955573445015">RoleName</strong> in <strong id="ALM-19009__b052583712505">Location</strong> and confirm the IP address of <strong id="ALM-19009__b1241513413507">HostName</strong>.</span><p><ul class="subitemlist" id="ALM-19009__ul33926845195425"><li id="ALM-19009__li12934386195425">If the role for which the alarm is generated is HMaster, go to <a href="#ALM-19009__li51947016195425">2</a>.</li><li id="ALM-19009__li41052351195425">If the role for which the alarm is generated is RegionServer, go to <a href="#ALM-19009__li17000576195425">3</a>.</li></ul>
</p></li><li id="ALM-19009__li51947016195425"><a name="ALM-19009__li51947016195425"></a><a name="li51947016195425"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-19009__b93655375152">Cluster</strong> &gt; <em id="ALM-19009__i175981233194118">Name of the desired cluster</em> &gt;<strong id="ALM-19009__b1236543715159"> Services</strong> &gt; <strong id="ALM-19009__b36608407195425">HBase</strong> &gt; <strong id="ALM-19009__b61040211195425">Instance</strong> and click the HMaster for which the alarm is generated to go to the<strong id="ALM-19009__b14303164441516"> Dashboard </strong>page. Click the drop-down menu in the <strong id="ALM-19009__b653511582416">Chart </strong>area and choose<strong id="ALM-19009__b69658114246"> Customize</strong> &gt; <strong id="ALM-19009__b1785919537166">CPU and Memory</strong> &gt; <strong id="ALM-19009__b5137569195425">HMaster Heap Memory Usage and Direct Memory Usage Statistics</strong> and click <strong id="ALM-19009__b46238124195425">OK</strong> to check whether the used direct memory of the HBase service reaches 90% of the maximum direct memory specified for HBase.</span><p><ul class="subitemlist" id="ALM-19009__ul57967673195425"><li id="ALM-19009__li54300533195425">If yes, go to <a href="#ALM-19009__li30000576195425">4</a>.</li><li id="ALM-19009__li36267014195425">If no, go to <a href="#ALM-19009__li62317418195425">8</a>.</li></ul>
</p></li><li id="ALM-19009__li17000576195425"><a name="ALM-19009__li17000576195425"></a><a name="li17000576195425"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-19009__b390175141719">Cluster </strong>&gt; <em id="ALM-19009__i96641338163616">Name of the desired cluster</em> &gt;<strong id="ALM-19009__b1090213518178"> Services</strong> &gt; <strong id="ALM-19009__b46958754195425">HBase</strong> &gt; <strong id="ALM-19009__b19975608195425">Instance</strong> and click the RegionServer for which the alarm is generated to go to the<strong id="ALM-19009__b16371135214195"> Dashboard </strong>page. Click the drop-down menu in the <strong id="ALM-19009__b384164314233">Chart </strong>area and choose<strong id="ALM-19009__b18695040142313"> Customize</strong> &gt; <strong id="ALM-19009__b17229103817176">CPU and Memory </strong>&gt;<strong id="ALM-19009__b6230123881716"> RegionServer Heap Memory Usage and Direct Memory Usage Statistics</strong> and click <strong id="ALM-19009__b63465673195425">OK</strong> to check whether the used direct memory of the HBase service reaches 90% of the maximum direct memory specified for HBase.</span><p><ul class="subitemlist" id="ALM-19009__ul24258574195425"><li id="ALM-19009__li40445884195425">If yes, go to <a href="#ALM-19009__li30000576195425">4</a>.</li><li id="ALM-19009__li54891180195425">If no, go to <a href="#ALM-19009__li62317418195425">8</a>.</li></ul>
</p></li><li id="ALM-19009__li30000576195425"><a name="ALM-19009__li30000576195425"></a><a name="li30000576195425"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-19009__b15857144814172">Cluster </strong>&gt; <em id="ALM-19009__i158151043103617">Name of the desired cluster</em> &gt;<strong id="ALM-19009__b20858248131711"> Services</strong> &gt; <strong id="ALM-19009__b34869409195425">HBase</strong> &gt; <strong id="ALM-19009__b45389225195425">Configurations</strong>, and click <strong id="ALM-19009__b52648615195425">All Configurations</strong>. Choose <strong id="ALM-19009__b4075495195425">HMaster/RegionServer</strong> &gt; <strong id="ALM-19009__b36679462195425">System</strong> and check whether <strong id="ALM-19009__b68231112710">-XX:MaxDirectMemorySize</strong> exists in <strong id="ALM-19009__b482313121517">GC_OPTS</strong>.</span><p><ul id="ALM-19009__ul148181324713"><li id="ALM-19009__li1779510368120">If yes, go to <a href="#ALM-19009__li131714294313">5</a>.</li><li id="ALM-19009__li171078461213">If no, go to <a href="#ALM-19009__li336333834">6</a>.</li></ul>
</p></li><li id="ALM-19009__li131714294313"><a name="ALM-19009__li131714294313"></a><a name="li131714294313"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-19009__b76331130936">Cluster </strong>&gt; <em id="ALM-19009__i1963318302310">Name of the desired cluster</em> &gt;<strong id="ALM-19009__b156334301637"> Services</strong> &gt; <strong id="ALM-19009__b1863315303312">HBase</strong> &gt; <strong id="ALM-19009__b66331304316">Configurations</strong>, and click <strong id="ALM-19009__b1063315301335">All Configurations</strong>. Choose <strong id="ALM-19009__b206331730736">HMaster/RegionServer</strong> &gt; <strong id="ALM-19009__b1663316301238">System</strong> and delete <strong id="ALM-19009__b136338304320">-XX:MaxDirectMemorySize</strong> from <strong id="ALM-19009__b146331307310">GC_OPTS</strong>.</span></li><li id="ALM-19009__li336333834"><a name="ALM-19009__li336333834"></a><a name="li336333834"></a><span>Check whether the <strong id="ALM-19009__b15780101619320">ALM-19008 Heap Memory Usage of the HBase Process Exceeds the Threshold</strong> alarm is generated.</span><p><p id="ALM-19009__p178201451238">If yes, handle the alarm by referring to <strong id="ALM-19009__b20936134113215">ALM-19008 Heap Memory Usage of the HBase Process Exceeds the Threshold</strong>.</p>
<p id="ALM-19009__p147673247417">If no, go to <a href="#ALM-19009__li62317418195425">8</a>.</p>
</p></li><li id="ALM-19009__li13893891195425"><span>Check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-19009__ul3485547195425"><li id="ALM-19009__li1569734195425">If yes, no further action is required.</li><li id="ALM-19009__li60039606195425">If no, go to <a href="#ALM-19009__li62317418195425">8</a>.</li></ul>
</p></li></ol>
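<p>The check and removal of the JVM option <strong>-XX:MaxDirectMemorySize</strong> in steps 4 and 5 can be sketched offline against a copied <strong>GC_OPTS</strong> value. The helper names and the sample value below are hypothetical, and the sketch does not call any FusionInsight Manager interface; the actual change is still made on the configuration page.</p>

```python
# Hypothetical helpers for illustration; they only process a copied GC_OPTS string.
FLAG_PREFIX = "-XX:MaxDirectMemorySize="


def find_direct_memory_flag(gc_opts: str):
    """Return the -XX:MaxDirectMemorySize option found in gc_opts, or None."""
    for option in gc_opts.split():
        if option.startswith(FLAG_PREFIX):
            return option
    return None


def remove_direct_memory_flag(gc_opts: str) -> str:
    """Return gc_opts with every -XX:MaxDirectMemorySize option removed."""
    kept = [option for option in gc_opts.split() if not option.startswith(FLAG_PREFIX)]
    return " ".join(kept)


# Sample value made up for illustration; real GC_OPTS values differ per cluster.
sample = "-Xms2G -Xmx2G -XX:MaxDirectMemorySize=512M -XX:+UseG1GC"
print(find_direct_memory_flag(sample))    # prints -XX:MaxDirectMemorySize=512M
print(remove_direct_memory_flag(sample))  # prints -Xms2G -Xmx2G -XX:+UseG1GC
```

<p>If the first function returns <strong>None</strong>, the option is absent and the procedure continues with the heap memory alarm check in step 6.</p>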
<p class="tableheading" id="ALM-19009__p31369927195425"><strong id="ALM-19009__b22491650195455">Collect fault information.</strong></p>
<ol start="8" id="ALM-19009__ol37986748195458"><li id="ALM-19009__li62317418195425"><a name="ALM-19009__li62317418195425"></a><a name="li62317418195425"></a><span>On the FusionInsight Manager interface of the active and standby clusters, choose <strong id="ALM-19009__b18341057121810">O&amp;M</strong> &gt; <strong id="ALM-19009__b11471205591811">Log </strong>&gt;<strong id="ALM-19009__b18471115541813"> Download</strong>.</span></li><li id="ALM-19009__li37385456195425"><span>In the <strong id="ALM-19009__b23985857195425">Service</strong> drop-down list box of the required cluster, select <strong id="ALM-19009__b14546125195425">HBase</strong>.</span></li><li id="ALM-19009__li1145664103113"><span>Click <span><img id="ALM-19009__image1945644173117" src="en-us_image_0269417423.png"></span> in the upper right corner, and set <strong id="ALM-19009__b6456941173117">Start Date</strong> and <strong id="ALM-19009__b11456154113318">End Date</strong> for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click <strong id="ALM-19009__b13456164113319">Download</strong>.</span></li><li id="ALM-19009__li48139755195425"><span>Contact the <span id="ALM-19009__text4614151421417">O&amp;M personnel</span> and send the collected fault logs.</span></li></ol>
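<p>The log collection window in step 10 can be worked out as follows. This is a small sketch: the function name and the alarm time are made up for illustration, and the resulting values are entered manually as <strong>Start Date</strong> and <strong>End Date</strong>.</p>

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # margin before and after the alarm generation time


def log_window(alarm_time: datetime):
    """Return (start, end): 10 minutes before and 10 minutes after alarm_time."""
    return alarm_time - WINDOW, alarm_time + WINDOW


# Made-up alarm time; note that the end of the window rolls over to the next day.
start, end = log_window(datetime(2023, 1, 5, 23, 55, 0))
print(start.isoformat(sep=" "), "to", end.isoformat(sep=" "))
# prints 2023-01-05 23:45:00 to 2023-01-06 00:05:00
```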
</div>
<div class="section" id="ALM-19009__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-19009__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
<div class="section" id="ALM-19009__s6541e68162fd43eca31c9a37236f1419"><h4 class="sectiontitle">Related Information</h4><p id="ALM-19009__en-us_topic_0070543523_p27619288">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>