Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="alm_18009"></a>
<h1 class="topictitle1">ALM-18009 Heap Memory Usage of MapReduce JobHistoryServer Exceeds the Threshold</h1>
<div id="body8662426"><div class="section" id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_section46467513"><h4 class="sectiontitle">Description</h4><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p24102752">The system checks the heap memory usage of MapReduce JobHistoryServer every 30 seconds and compares the actual usage with the threshold. The alarm is generated when the heap memory usage of MapReduce JobHistoryServer exceeds the threshold (80% of the maximum memory by default).</p>
<p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p15598181">To change the threshold, choose <strong id="alm_18009__b184103515954632">System</strong> > <strong id="alm_18009__b166625321954632">Threshold Configuration</strong> > <strong id="alm_18009__b137035916354632">Service</strong> > <strong id="alm_18009__b190782700154632">MapReduce</strong>. The alarm is cleared when the heap memory usage is less than or equal to the threshold.</p>
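<p>The raise and clear conditions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names heap_usage_percent and alarm_active are hypothetical and do not describe how MRS Manager implements the check internally. The 80% default comes from the description above.</p>

```python
def heap_usage_percent(used_bytes: int, max_bytes: int) -> float:
    """Return heap usage as a percentage of the maximum heap size."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    return 100.0 * used_bytes / max_bytes


def alarm_active(used_bytes: int, max_bytes: int,
                 threshold_percent: float = 80.0) -> bool:
    """Hypothetical helper mirroring the documented rule.

    The alarm is raised when usage exceeds the threshold and is
    cleared when usage is less than or equal to the threshold.
    """
    return heap_usage_percent(used_bytes, max_bytes) > threshold_percent
```

<p>In this sketch, a usage of exactly 80% does not raise the alarm, which matches the clearing rule (usage less than or equal to the threshold).</p>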
</div>
<div class="section" id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_section15554440"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_table29676093" frame="border" border="1" rules="all"><thead align="left"><tr id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_row40212317"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p35972254">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p28071463">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p59196065">Automatically Cleared</p>
</th>
</tr>
</thead>
<tbody><tr id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_row30151988"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p26391943">18009</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p57372610">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p16669838">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_section5772232"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_table8079634" frame="border" border="1" rules="all"><thead align="left"><tr id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_row17444750"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p3738651">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p34395333">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_row34558579"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p47781518">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p45097725">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_row3226344"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p60007281">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p28751554">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_row57437397"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p21917606">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p30495700">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
<tr id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_row6025849"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p18331773">Trigger Condition</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p8478608">Specifies the threshold for triggering the alarm.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_section51950091"><h4 class="sectiontitle">Impact on the System</h4><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p15678685">When the heap memory usage of MapReduce JobHistoryServer is too high, the performance of MapReduce log archiving deteriorates. In severe cases, a memory overflow may occur, making the Yarn service unavailable.</p>
</div>
<div class="section" id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_section64897643"><h4 class="sectiontitle">Possible Causes</h4><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p62013962">The heap memory of the MapReduce JobHistoryServer instance on the node is overused, or the heap memory is improperly allocated. As a result, the usage exceeds the threshold.</p>
</div>
<div class="section" id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_section47207880"><h4 class="sectiontitle">Procedure</h4><ol id="alm_18009__en-us_topic_0191813867_ol20218781175629"><li id="alm_18009__en-us_topic_0191813867_li47751305175629"><span>Check the heap memory usage.</span><p><ol type="a" id="alm_18009__en-us_topic_0191813867_ol17980506181612"><li id="alm_18009__en-us_topic_0191813867_li1487713813414">Go to the cluster details page and choose <strong id="alm_18009__b113058493754632">Alarms</strong>.</li><li id="alm_18009__en-us_topic_0191813867_li58727490181612">Select the alarm whose <strong id="alm_18009__b92174826354632">Alarm ID</strong> is <strong id="alm_18009__b134387497954632">18009</strong> and view the IP address and role name of the instance in <strong id="alm_18009__b26186582654632">Location</strong>.</li><li id="alm_18009__en-us_topic_0191813867_li37461388181615">Choose <strong id="alm_18009__b155845197554632">Components</strong> > <strong id="alm_18009__b3311430954632">MapReduce</strong> > <strong id="alm_18009__b33972069554632">Instance</strong> > <strong id="alm_18009__b128522323354632">JobHistoryServer</strong> (IP address of the instance for which the alarm is generated) > <strong id="alm_18009__b80914687454632">Customize</strong> > <strong id="alm_18009__b207711445654632">JobHistoryServer Heap Memory Usage Statistics</strong>. Check the heap memory usage.</li><li id="alm_18009__en-us_topic_0191813867_li5803814181617">Check whether the heap memory usage of JobHistoryServer has reached the threshold (80% of the maximum heap memory).<ul id="alm_18009__en-us_topic_0191813867_ul2889331181625"><li id="alm_18009__en-us_topic_0191813867_li5345176181624">If yes, go to <a href="#alm_18009__en-us_topic_0191813867_li1011493181634">1.e</a>.</li><li id="alm_18009__en-us_topic_0191813867_li14368473181624">If no, go to <a href="#alm_18009__en-us_topic_0191813867_li572522141314">2</a>.</li></ul>
</li><li id="alm_18009__en-us_topic_0191813867_li1011493181634"><a name="alm_18009__en-us_topic_0191813867_li1011493181634"></a><a name="en-us_topic_0191813867_li1011493181634"></a>Choose <strong id="alm_18009__b31561632155213">Components</strong> > <strong id="alm_18009__b2156113216524">MapReduce</strong> > <strong id="alm_18009__b1315711320521">Service Configuration</strong>. Set <strong id="alm_18009__b31571232115211">Type</strong> to <strong id="alm_18009__b4157163217523">All</strong> and choose <strong id="alm_18009__b101571032115217">JobHistoryServer</strong> > <strong id="alm_18009__b215743210528">System</strong>. Increase the value of <strong id="alm_18009__b53993846954632">-Xmx</strong> in the <strong id="alm_18009__b96542472154632">GC_OPTS</strong> parameter as required, click <strong id="alm_18009__b194756579154632">Save Configuration</strong>, and select <strong id="alm_18009__b81679399554632">Restart the affected services or instances.</strong> Click <strong id="alm_18009__b183298471554632">OK</strong>. </li><li id="alm_18009__en-us_topic_0191813867_li11969688181637">Check whether the alarm is cleared.<ul id="alm_18009__en-us_topic_0191813867_ul51315766181641"><li id="alm_18009__en-us_topic_0191813867_li54070239181641">If yes, no further action is required.</li><li id="alm_18009__en-us_topic_0191813867_li17383192181641">If no, go to <a href="#alm_18009__en-us_topic_0191813867_li572522141314">2</a>.</li></ul>
</li></ol>
</p></li><li id="alm_18009__en-us_topic_0191813867_li572522141314"><a name="alm_18009__en-us_topic_0191813867_li572522141314"></a><a name="en-us_topic_0191813867_li572522141314"></a><span>Collect fault information.</span><p><ol type="a" id="alm_18009__en-us_topic_0191813867_en-us_topic_0191813935_ol6089206913036"><li id="alm_18009__en-us_topic_0191813867_en-us_topic_0191813935_li4478836213036">On MRS Manager, choose <span class="menucascade" id="alm_18009__menucascade49892165654632"><b><span class="uicontrol" id="alm_18009__uicontrol10533393354632">System</span></b> > <b><span class="uicontrol" id="alm_18009__uicontrol114599290454632">Export Log</span></b></span>.</li><li id="alm_18009__li18574327401">Contact technical support engineers for help. For details, see <a href="https://docs.otc.t-systems.com/en-us/public/learnmore.html" target="_blank" rel="noopener noreferrer">technical support</a>.</li></ol>
</p></li></ol>
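<p>When increasing -Xmx in GC_OPTS, a rough lower bound for the new value can be estimated from the observed heap usage and the threshold. The following Python sketch is a hedged illustration only: the helper names min_xmx_mb and set_xmx, the sample GC_OPTS string, and the 900 MB usage figure are all assumptions made for this example and are not taken from MRS. The actual value should still be chosen as required by the workload.</p>

```python
import math
import re

# Matches a JVM maximum-heap flag such as -Xmx1G or -Xmx2048M.
_XMX_PATTERN = re.compile(r"-Xmx\d+[kKmMgG]?")


def min_xmx_mb(used_mb: int, threshold_percent: float = 80.0) -> int:
    """Smallest max heap (MB) at which used_mb no longer exceeds the threshold."""
    if not 0 < threshold_percent <= 100:
        raise ValueError("threshold_percent must be in (0, 100]")
    return math.ceil(used_mb * 100 / threshold_percent)


def set_xmx(gc_opts: str, new_xmx_mb: int) -> str:
    """Return gc_opts with -Xmx replaced by (or extended with) the new value."""
    new_flag = f"-Xmx{new_xmx_mb}M"
    if _XMX_PATTERN.search(gc_opts):
        return _XMX_PATTERN.sub(new_flag, gc_opts, count=1)
    return f"{gc_opts} {new_flag}".strip()


# Hypothetical current setting and observed usage (not real MRS defaults).
current_gc_opts = "-Xms512M -Xmx1G"
observed_used_mb = 900  # about 88% of 1024 MB, above the 80% threshold

print(set_xmx(current_gc_opts, min_xmx_mb(observed_used_mb)))
# prints "-Xms512M -Xmx1125M"
```

<p>The computed value is only the minimum at which the current usage would sit exactly at the threshold. Leaving additional headroom above it is usually sensible, because heap usage may keep growing after the restart.</p>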
</div>
<div class="section" id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_section22217739"><h4 class="sectiontitle">Reference</h4><p id="alm_18009__en-us_topic_0191813867_en-us_topic_0087039367_p49822483">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_0241.html">Alarm Reference (Applicable to Versions Earlier Than MRS 3.x)</a></div>
</div>
</div>