doc-exports/docs/mrs/umn/ALM-19008.html
Yang, Tong 2195db241c MRS UMN 20231220 version update
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Reviewed-by: Rechenburg, Matthias <matthias.rechenburg@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2024-05-16 09:40:21 +00:00


<a name="ALM-19008"></a>
<h1 class="topictitle1">ALM-19008 Heap Memory Usage of the HBase Process Exceeds the Threshold</h1>
<div id="body11029825"><div class="section" id="ALM-19008__sdcd3a633fbaf494d887e760724f8fa96"><h4 class="sectiontitle">Description</h4><p id="ALM-19008__en-us_topic_0070543522_p31353106">The system checks the HBase service status every 30 seconds. This alarm is generated when the heap memory usage of an HBase service process exceeds the threshold (90% of the maximum heap memory).</p>
</div>
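<p>The threshold check described above can be sketched as follows. This is only an illustration, not the monitoring code used by MRS Manager: the function names are hypothetical, and only the 90% threshold is taken from this page.</p>

```python
# Illustrative sketch only. Function and constant names are hypothetical;
# the 90% threshold is the one stated in the Description section.
HEAP_USAGE_THRESHOLD = 0.90


def heap_usage_ratio(used_bytes: int, max_bytes: int) -> float:
    """Return used heap as a fraction of the maximum heap (-Xmx)."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    return used_bytes / max_bytes


def exceeds_threshold(used_bytes: int, max_bytes: int,
                      threshold: float = HEAP_USAGE_THRESHOLD) -> bool:
    """Return True when heap usage is strictly above the threshold."""
    return heap_usage_ratio(used_bytes, max_bytes) > threshold


if __name__ == "__main__":
    gib = 2 ** 30
    # 15 GiB used out of a 16 GiB heap is 93.75%, which is above 90%.
    print(exceeds_threshold(15 * gib, 16 * gib))
```

<p>The sketch assumes that "exceeds" means strictly greater than the threshold. The exact comparison that the system applies is not stated on this page.</p>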
<div class="section" id="ALM-19008__s17c729e9f14a42b2b71d8b47bfa0a813"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-19008__en-us_topic_0070543522_table56573654" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-19008__en-us_topic_0070543522_row50443655"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-19008__en-us_topic_0070543522_p59404221">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-19008__en-us_topic_0070543522_p47012596">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-19008__en-us_topic_0070543522_p49923923">Automatically Cleared</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-19008__en-us_topic_0070543522_row17305987"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-19008__en-us_topic_0070543522_p59607724">19008</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-19008__en-us_topic_0070543522_p63496309">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-19008__en-us_topic_0070543522_p42927366">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-19008__s728ce682fb9f4ed2b6aa1a1b14b105e0"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-19008__en-us_topic_0070543522_table54564616" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-19008__en-us_topic_0070543522_row22360170"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-19008__en-us_topic_0070543522_p66343369">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-19008__en-us_topic_0070543522_p5103804">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-19008__row3210133011108"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-19008__p192431315431">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-19008__p692551319435">Specifies the cluster for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-19008__en-us_topic_0070543522_row10754988"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-19008__en-us_topic_0070543522_p65847680">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-19008__en-us_topic_0070543522_p32061857">Specifies the service name for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-19008__en-us_topic_0070543522_row20121257"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-19008__en-us_topic_0070543522_p19209127">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-19008__en-us_topic_0070543522_p12435494">Specifies the role name for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-19008__en-us_topic_0070543522_row44810590"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-19008__en-us_topic_0070543522_p5779136">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-19008__en-us_topic_0070543522_p65456887">Specifies the object (host ID) for which the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-19008__s1e06f7c5b3ed477da744c2cbfffa7063"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-19008__en-us_topic_0070543522_p407659">If the available HBase heap memory is insufficient, an out-of-memory error can occur and the service can break down.</p>
</div>
<div class="section" id="ALM-19008__s0561753f16b140dca3e12e07f1f3452a"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-19008__en-us_topic_0070543522_p33020419">The heap memory of the HBase service is overused or the heap memory is inappropriately allocated.</p>
</div>
<div class="section" id="ALM-19008__s28d4d569b23a4c61ab21f5d72e56aad1"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-19008__en-us_topic_0070543522_p57408296"><strong id="ALM-19008__b17926675195116">Check heap memory usage.</strong></p>
<ol id="ALM-19008__ol66169051195140"><li id="ALM-19008__li25829638195110"><span>On the <span id="ALM-19008__text34789336432">MRS</span> Manager portal, click <span class="menucascade" id="ALM-19008__menucascade9622183132519"><b><span class="uicontrol" id="ALM-19008__uicontrol862223115259">O&amp;M</span></b> &gt; <b><span class="uicontrol" id="ALM-19008__uicontrol106221731182512">Alarm</span></b> &gt; <b><span class="uicontrol" id="ALM-19008__uicontrol9622153122518">Alarms</span></b></span> and select the alarm whose <strong id="ALM-19008__b48537881195110">ID</strong> is <strong id="ALM-19008__b34187752195110">19008</strong>. Then check the role name in <strong id="ALM-19008__b14790172183618">Location</strong> and confirm the IP address of the instance.</span><p><ul class="subitemlist" id="ALM-19008__ul55065742195110"><li id="ALM-19008__li25483007195110">If the role for which the alarm is generated is HMaster, go to <a href="#ALM-19008__li11316139195110">2</a>.</li><li id="ALM-19008__li50857658195110">If the role for which the alarm is generated is RegionServer, go to <a href="#ALM-19008__li44755158195110">3</a>.</li></ul>
</p></li><li id="ALM-19008__li11316139195110"><a name="ALM-19008__li11316139195110"></a><a name="li11316139195110"></a><span>On the <span id="ALM-19008__text121625118568">MRS</span> Manager portal, choose <strong id="ALM-19008__b1828016461471">Cluster</strong> &gt; <em id="ALM-19008__i175981233194118">Name of the desired cluster</em> &gt; <strong id="ALM-19008__b31140154195110">Services</strong> &gt; <strong id="ALM-19008__b11825930195110">HBase</strong> &gt; <strong id="ALM-19008__b39324513195110">Instance</strong> and click the HMaster for which the alarm is generated to go to the<strong id="ALM-19008__b14303164441516"> Dashboard </strong>page. Click the drop-down menu in the <strong id="ALM-19008__b1937114710229">Chart </strong>area, choose<strong id="ALM-19008__b1115264402213"> Customize</strong> &gt; <strong id="ALM-19008__b1875719437109">CPU and Memory </strong>&gt;<strong id="ALM-19008__b187571743161016"> HMaster Heap Memory Usage and Direct Memory Usage Statistics</strong>, and click <strong id="ALM-19008__b41661510195110">OK</strong>. Then check whether the used heap memory of the HBase service reaches 90% of the maximum heap memory specified for HBase.</span><p><ul class="subitemlist" id="ALM-19008__ul60909672195110"><li id="ALM-19008__li19139155195110">If yes, go to <a href="#ALM-19008__li27009410195110">4</a>.</li><li id="ALM-19008__li6767741195110">If no, go to <a href="#ALM-19008__li56360562195110">6</a>.</li></ul>
</p></li><li id="ALM-19008__li44755158195110"><a name="ALM-19008__li44755158195110"></a><a name="li44755158195110"></a><span>On the <span id="ALM-19008__text51281524562">MRS</span> Manager portal, choose <strong id="ALM-19008__b1639614811118">Cluster </strong>&gt; <em id="ALM-19008__i1224712206367">Name of the desired cluster</em> &gt;<strong id="ALM-19008__b13246192019361"> Services</strong> &gt; <strong id="ALM-19008__b44192070195110">HBase</strong> &gt; <strong id="ALM-19008__b62184311195110">Instance</strong> and click the RegionServer for which the alarm is generated to go to the<strong id="ALM-19008__b1997562651919"> Dashboard </strong>page. Click the drop-down menu in the <strong id="ALM-19008__b386751982316">Chart </strong>area, choose<strong id="ALM-19008__b971441612232"> Customize</strong> &gt; <strong id="ALM-19008__b19841163301117">CPU and Memory </strong>&gt;<strong id="ALM-19008__b584393351110"> RegionServer Heap Memory Usage and Direct Memory Usage Statistics</strong>, and click <strong id="ALM-19008__b36484841195110">OK</strong>. Then check whether the used heap memory of the HBase service reaches 90% of the maximum heap memory specified for HBase.</span><p><ul class="subitemlist" id="ALM-19008__ul64625118195110"><li id="ALM-19008__li2482106195110">If yes, go to <a href="#ALM-19008__li27009410195110">4</a>.</li><li id="ALM-19008__li66832892195110">If no, go to <a href="#ALM-19008__li56360562195110">6</a>.</li></ul>
</p></li><li id="ALM-19008__li27009410195110"><a name="ALM-19008__li27009410195110"></a><a name="li27009410195110"></a><span>On the <span id="ALM-19008__text3298053165619">MRS</span> Manager portal, choose <strong id="ALM-19008__b63881453111114">Cluster </strong>&gt; <em id="ALM-19008__i25071625183615">Name of the desired cluster</em> &gt;<strong id="ALM-19008__b6389125301116"> Services</strong> &gt; <strong id="ALM-19008__b1289201195110">HBase</strong> &gt; <strong id="ALM-19008__b11602814195110">Configurations</strong>, and click <strong id="ALM-19008__b303901195110">All Config</strong><strong id="ALM-19008__b837616138137">urations</strong>. Choose <strong id="ALM-19008__b2735112195110">HMaster/RegionServer</strong> &gt; <strong id="ALM-19008__b24616016195110">System</strong>. Increase the value of <strong id="ALM-19008__b20217557195110">-Xmx</strong> in <strong id="ALM-19008__b47740288195110">GC_OPTS</strong> by referring to the Note.</span><p><div class="note" id="ALM-19008__note13595132742917"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ol type="a" id="ALM-19008__ol98931136413"><li id="ALM-19008__li12894436414">Suggestions on GC parameter configurations for HMaster<ul id="ALM-19008__ul1575412555118"><li id="ALM-19008__li207541155214">Set <strong id="ALM-19008__b157677941118">-Xms</strong> and <strong id="ALM-19008__b1776710918113">-Xmx</strong> to the same value to prevent JVM from dynamically adjusting the heap memory size and affecting performance.</li><li id="ALM-19008__li8754175518118">Set <strong id="ALM-19008__b1706142619239">-XX:NewSize</strong> to the value of <strong id="ALM-19008__b47061026172320">-XX:MaxNewSize</strong>, which is one eighth of <strong id="ALM-19008__b20706626122313">-Xmx</strong>.</li><li id="ALM-19008__li875445512117">For large-scale HBase clusters with a large number of regions, increase values of <strong id="ALM-19008__b398019572614">GC_OPTS</strong> parameters for 
HMaster. Specifically, set <strong id="ALM-19008__b17980859260">-Xmx</strong> to 4 GB if the number of regions is less than 100,000. If the number of regions is more than 100,000, set -Xmx to 6 GB or more. For every additional 35,000 regions, increase the value of <strong id="ALM-19008__b79808582618">-Xmx</strong> by 2 GB. The maximum value of <strong id="ALM-19008__b109808518268">-Xmx</strong> is 32 GB.</li></ul>
</li><li id="ALM-19008__li1995063410311">Suggestions on GC parameter configurations for RegionServer<ul id="ALM-19008__ul8262185416213"><li id="ALM-19008__li11863072314">Set <strong id="ALM-19008__b19627407123">-Xms</strong> and <strong id="ALM-19008__b1262124061218">-Xmx</strong> to the same value to prevent JVM from dynamically adjusting the heap memory size and affecting performance.</li><li id="ALM-19008__li162621954329">Set <strong id="ALM-19008__b250503141217">-XX:NewSize</strong> to one eighth of <strong id="ALM-19008__b17505203111126">-Xmx</strong>.</li><li id="ALM-19008__li889514285313">Set the memory for RegionServer to be greater than that for HMaster. If sufficient memory is available, increase the heap memory.</li><li id="ALM-19008__li168901016311">Set <strong id="ALM-19008__b11220193171320">-Xmx</strong> based on the machine memory size. Specifically, set <strong id="ALM-19008__b7220173114135">-Xmx</strong> to 32 GB if the machine memory is greater than 200 GB, to 16 GB if the machine memory is greater than 128 GB and less than 200 GB, and to 8 GB if the machine memory is less than 128 GB. When <strong id="ALM-19008__b8631164614134">-Xmx</strong> is set to 32 GB, a RegionServer node supports 2000 regions and 200 hotspot regions.</li></ul>
</li></ol>
</div></div>
</p></li><li id="ALM-19008__li60189915195110"><span>Check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-19008__ul41339805195110"><li id="ALM-19008__li41758100195110">If yes, no further action is required.</li><li id="ALM-19008__li26962933195110">If no, go to <a href="#ALM-19008__li56360562195110">6</a>.</li></ul>
</p></li></ol>
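<p>The sizing suggestions in the Note of step 4 can be read as simple rules. The following sketch shows one possible reading and is not an official sizing tool. All function names are hypothetical. The handling of the boundary values (exactly 100,000 regions, exactly 128 GB or 200 GB of machine memory) is an assumption, because the Note does not define them. The Note mentions <strong>-XX:MaxNewSize</strong> only for HMaster, so it is optional here.</p>

```python
# Illustrative sketch only. Names are hypothetical, and boundary handling
# is an assumption where the Note in step 4 leaves it undefined.
HMASTER_MAX_XMX_GB = 32  # maximum -Xmx stated in the Note


def hmaster_xmx_gb(region_count: int) -> int:
    """Suggest -Xmx (GB) for HMaster from the number of regions."""
    if region_count < 0:
        raise ValueError("region_count must not be negative")
    if region_count < 100_000:
        return 4
    # Assumption: start at 6 GB from 100,000 regions and add 2 GB for
    # every additional full block of 35,000 regions, capped at 32 GB.
    extra_blocks = (region_count - 100_000) // 35_000
    return min(6 + 2 * extra_blocks, HMASTER_MAX_XMX_GB)


def regionserver_xmx_gb(machine_memory_gb: float) -> int:
    """Suggest -Xmx (GB) for RegionServer from the machine memory."""
    # Assumption: exactly 200 GB maps to 16 GB, exactly 128 GB to 8 GB.
    if machine_memory_gb > 200:
        return 32
    if machine_memory_gb > 128:
        return 16
    return 8


def gc_opts_fragment(xmx_gb: int, set_max_new_size: bool = True) -> str:
    """Build the heap-related part of GC_OPTS for a given -Xmx."""
    new_size_mb = xmx_gb * 1024 // 8  # one eighth of -Xmx, in MB
    parts = [
        f"-Xms{xmx_gb}G",  # same value as -Xmx, as the Note suggests
        f"-Xmx{xmx_gb}G",
        f"-XX:NewSize={new_size_mb}M",
    ]
    if set_max_new_size:
        parts.append(f"-XX:MaxNewSize={new_size_mb}M")
    return " ".join(parts)


if __name__ == "__main__":
    print(gc_opts_fragment(hmaster_xmx_gb(135_000)))
    print(gc_opts_fragment(regionserver_xmx_gb(160), set_max_new_size=False))
```

<p>The returned string covers only the heap-related options. Any other options already present in <strong>GC_OPTS</strong> must be kept when the value is edited on the configuration page.</p>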
<p class="tableheading" id="ALM-19008__p36513930195110"><strong id="ALM-19008__b4084949195156">Collect fault information.</strong></p>
<ol start="6" id="ALM-19008__ol4449508119520"><li id="ALM-19008__li56360562195110"><a name="ALM-19008__li56360562195110"></a><a name="li56360562195110"></a><span>On the <span id="ALM-19008__text3654175414565">MRS</span> Manager portal, choose <strong id="ALM-19008__b837556111412">O&amp;M</strong> &gt; <strong id="ALM-19008__b651481171416">Log </strong>&gt;<strong id="ALM-19008__b1951415113145"> Download</strong>.</span></li><li id="ALM-19008__li11806859195110"><span>Select <strong id="ALM-19008__b37483010195110">HBase</strong> in the required cluster from the <strong id="ALM-19008__b1802772195110">Service</strong> drop-down list.</span></li><li id="ALM-19008__li1145664103113"><span>Click <span><img id="ALM-19008__image1945644173117" src="en-us_image_0000001532927606.png"></span> in the upper right corner, and set <strong id="ALM-19008__b6456941173117">Start Date</strong> and <strong id="ALM-19008__b11456154113318">End Date</strong> for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click <strong id="ALM-19008__b13456164113319">Download</strong>.</span></li><li id="ALM-19008__li37728124195110"><span>Contact the <span id="ALM-19008__text4614151421417">O&amp;M personnel</span> and send the collected fault logs.</span></li></ol>
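<p>The log collection window from step 8 can be computed as shown below. This is a minimal sketch with a hypothetical function name. It only derives the two timestamps to enter as <strong>Start Date</strong> and <strong>End Date</strong> and does not call any MRS Manager interface.</p>

```python
# Illustrative sketch only. The function name is hypothetical; the
# 10-minute margin is the one given in the procedure.
from datetime import datetime, timedelta


def log_collection_window(alarm_time: datetime,
                          margin: timedelta = timedelta(minutes=10)):
    """Return (start, end) around the alarm generation time."""
    return alarm_time - margin, alarm_time + margin


if __name__ == "__main__":
    # Example alarm time chosen for illustration.
    start, end = log_collection_window(datetime(2024, 1, 1, 12, 0, 0))
    print(start.isoformat(), end.isoformat())
```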
</div>
<div class="section" id="ALM-19008__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-19008__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
<div class="section" id="ALM-19008__sc19d8ee71cbe4b62b7717cb470510a07"><h4 class="sectiontitle">Related Information</h4><p id="ALM-19008__en-us_topic_0070543522_p34745524">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>