forked from docs/doc-exports
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Reviewed-by: Rechenburg, Matthias <matthias.rechenburg@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
90 lines
13 KiB
HTML
<a name="ALM-38005"></a>
<h1 class="topictitle1">ALM-38005 GC Duration of the Broker Process Exceeds the Threshold</h1>
<div id="body54749644"><div class="section" id="ALM-38005__s1871caece54646e1a9d7e57b7bc55cb5"><h4 class="sectiontitle">Description</h4><p id="ALM-38005__en-us_topic_0070543589_p52261031">The system checks the garbage collection (GC) duration of the Broker process every 60 seconds. This alarm is generated when the GC duration exceeds the threshold (12 seconds by default) for three consecutive checks.</p>
<p id="ALM-38005__p21207955145822">When the <strong id="ALM-38005__b1855881691815">Trigger Count</strong> is 1, this alarm is cleared when the GC duration is less than or equal to the threshold. When the <strong id="ALM-38005__b1687213212193">Trigger Count</strong> is greater than 1, this alarm is cleared when the GC duration is less than or equal to 90% of the threshold.</p>
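<p>The trigger and clearing rules above can be sketched as a small state machine. This is only an illustrative model under stated assumptions, not the monitoring code used by the system: the class name <strong>GcAlarm</strong>, its fields, and the way consecutive samples are counted are hypothetical, while the 12-second threshold, the count of 3, and the 90% clearing level come from the description.</p>

```python
from dataclasses import dataclass


@dataclass
class GcAlarm:
    """Hypothetical model of the alarm rules; not the actual monitoring code."""

    threshold_s: float = 12.0  # default threshold from the description
    trigger_count: int = 3     # consecutive violations needed to raise the alarm
    active: bool = False
    _streak: int = 0           # current run of samples above the threshold

    def observe(self, gc_duration_s: float) -> bool:
        """Feed one 60-second sample and return whether the alarm is active."""
        if not self.active:
            # Count consecutive samples above the threshold.
            if gc_duration_s > self.threshold_s:
                self._streak += 1
            else:
                self._streak = 0
            if self._streak >= self.trigger_count:
                self.active = True
        else:
            # Trigger Count 1: clear at the threshold itself.
            # Trigger Count above 1: clear at 90% of the threshold.
            if self.trigger_count == 1:
                clear_level = self.threshold_s
            else:
                clear_level = 0.9 * self.threshold_s
            if gc_duration_s <= clear_level:
                self.active = False
                self._streak = 0
        return self.active


alarm = GcAlarm()
for sample in [13, 14, 15, 11.5, 10.5]:
    print(sample, alarm.observe(sample))
```

<p>In this sketch, a sample of 11.5 seconds keeps an active alarm raised because it is still above 90% of the threshold (10.8 seconds), which is the point of the lower clearing level: it avoids the alarm flapping around the threshold.</p>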
</div>
<div class="section" id="ALM-38005__sccedccceb3d24706a97a999ace64569c"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-38005__en-us_topic_0070543589_table47566430" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-38005__en-us_topic_0070543589_row50034027"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-38005__en-us_topic_0070543589_p26224387">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-38005__en-us_topic_0070543589_p43800616">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-38005__en-us_topic_0070543589_p58188993">Automatically Cleared</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-38005__en-us_topic_0070543589_row15687991"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-38005__en-us_topic_0070543589_p62767729">38005</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-38005__en-us_topic_0070543589_p51021271">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-38005__en-us_topic_0070543589_p39082323">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-38005__sd1b0d16d22ad445bbd4301fb68aad6cc"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-38005__en-us_topic_0070543589_table11551562" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-38005__en-us_topic_0070543589_row13735244"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-38005__en-us_topic_0070543589_p38812946">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-38005__en-us_topic_0070543589_p56840921">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-38005__row484685917577"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-38005__p192431315431">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-38005__p692551319435">Specifies the cluster for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-38005__en-us_topic_0070543589_row40711880"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-38005__en-us_topic_0070543589_p9328008">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-38005__en-us_topic_0070543589_p17371201">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-38005__en-us_topic_0070543589_row22123082"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-38005__en-us_topic_0070543589_p47139178">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-38005__en-us_topic_0070543589_p60177046">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-38005__en-us_topic_0070543589_row4722509"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-38005__en-us_topic_0070543589_p46978935">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-38005__en-us_topic_0070543589_p47197410">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-38005__en-us_topic_0070543589_row22123511"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-38005__en-us_topic_0070543589_p47173990">Trigger Condition</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-38005__en-us_topic_0070543589_p62996841">Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-38005__sf1f39fa3919141509ab35607cec7bc41"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-38005__en-us_topic_0070543589_p2470505">A long GC duration of the Broker process may interrupt Kafka services.</p>
</div>
<div class="section" id="ALM-38005__se1c2e69bc5204e92a7fb97c28920ba57"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-38005__en-us_topic_0070543589_p65893197">The GC duration of the Kafka instance on the node is too long, or the heap memory is inappropriately allocated, which causes frequent GCs.</p>
</div>
<div class="section" id="ALM-38005__sca801f02cfd04e54ac76ab0eb5e8524e"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-38005__en-us_topic_0070543589_p35748734"><strong id="ALM-38005__b41457767155053">Check the GC duration.</strong></p>
<ol id="ALM-38005__ol397266715512"><li id="ALM-38005__li61311336155050"><span>On the <span id="ALM-38005__text34789336432">MRS</span> Manager portal, choose <strong id="ALM-38005__b1239665219477">O&M </strong>><strong id="ALM-38005__b4396752124711"> Alarm </strong>><strong id="ALM-38005__b171801950154716"> Alarm</strong><strong id="ALM-38005__b103393225303">s</strong> > <strong id="ALM-38005__b5362230145512">GC Duration of the Broker Process Exceeds the Threshold</strong> > <strong id="ALM-38005__b8495615145612">Location</strong>.<strong id="ALM-38005__b125881710205610"> </strong>Check the host name of the instance involved in this alarm.</span></li><li id="ALM-38005__li53993914155050"><span>On the <span id="ALM-38005__text20197145910165">MRS</span> Manager portal, choose <strong id="ALM-38005__b4831185113332">Cluster</strong> > <em id="ALM-38005__i14240109123418">Name of the desired cluster</em><strong id="ALM-38005__b9831125114333"> </strong>> <strong id="ALM-38005__b14931113155050">Services</strong> > <strong id="ALM-38005__b162296155050">Kafka</strong> > <strong id="ALM-38005__b1460669155050">Instance</strong>. Click the instance for which the alarm is generated to go to the page for the instance. 
Click the drop-down list in the upper right corner of the chart area, choose <strong id="ALM-38005__b19474172052218">Customize</strong> > <strong id="ALM-38005__b16996122482220">Process</strong> > <strong id="ALM-38005__b1495213252496">Broker GC Duration per Minute</strong>, and click <strong id="ALM-38005__b3504172818136">OK</strong>.</span></li><li id="ALM-38005__li37530339155050"><span>Check whether the GC duration of the Broker process collected every minute exceeds the threshold (12 seconds by default).</span><p><ul class="subitemlist" id="ALM-38005__ul11626578155050"><li id="ALM-38005__li11430899155050">If yes, go to <a href="#ALM-38005__li759117561678">4</a>.</li><li id="ALM-38005__li53487625155050">If no, go to <a href="#ALM-38005__li3395370155050">7</a>.</li></ul>
</p></li></ol>
<p class="subitemlist" id="ALM-38005__p5572185611719"><strong id="ALM-38005__b2962804811">Check the direct memory size configured for Kafka.</strong></p>
<ol start="4" id="ALM-38005__ol125914561675"><li id="ALM-38005__li759117561678"><a name="ALM-38005__li759117561678"></a><a name="li759117561678"></a><span>On the <span id="ALM-38005__text18475188171717">MRS</span> Manager portal, choose <strong id="ALM-38005__b1859025618717">Cluster</strong> > <em id="ALM-38005__i1659013561975">Name of the desired cluster</em><strong id="ALM-38005__b195902056879"> </strong>><strong id="ALM-38005__b1059085610719"> Services</strong> > <strong id="ALM-38005__b2590115618719">Kafka</strong> > <strong id="ALM-38005__b18590115610711">Configurations</strong> > <strong id="ALM-38005__b1590145617717">All</strong> <strong id="ALM-38005__b959011565719">Configurations</strong> > <strong id="ALM-38005__b118251191588">Broker(Role) </strong>> <strong id="ALM-38005__b859085615713">Environment</strong> to increase the value of <strong id="ALM-38005__b17590135619718">-Xmx</strong> configured in the <strong id="ALM-38005__b75901756472">KAFKA_HEAP_OPTS</strong> parameter by referring to the Note.</span><p><div class="note" id="ALM-38005__note35918563712"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="ALM-38005__ul175912561371"><li id="ALM-38005__li135901856373">It is recommended that <strong id="ALM-38005__b4590155618717">-Xmx</strong> and <strong id="ALM-38005__b6590165613713">-Xms</strong> be set to the same value.</li><li id="ALM-38005__li18591556979">You are advised to set the value of <strong id="ALM-38005__b1759015614710">KAFKA_HEAP_OPTS</strong> to twice the value of <strong id="ALM-38005__b859019568716">Direct Memory Used by Kafka.</strong><p id="ALM-38005__p55919561978">On the <span id="ALM-38005__text104781638175">MRS</span> Manager portal, choose <strong id="ALM-38005__b959075617714">Cluster</strong> > <em id="ALM-38005__i1459035616710">Name of the desired cluster</em><strong id="ALM-38005__b1959025612710"> </strong>><strong id="ALM-38005__b75903561779"> Services</strong> > 
<strong id="ALM-38005__b859055615714">Kafka</strong> > <strong id="ALM-38005__b459145619719">Instance</strong>. Click the instance for which the alarm is generated to go to the page for the instance. Click the drop-down list in the upper right corner of the chart area and choose <strong id="ALM-38005__b25915561179">Customize</strong> > <strong id="ALM-38005__b1559135618718">Process</strong> > <strong id="ALM-38005__b175911656870">Kafka Direct Memory Resource Status</strong> to check the value of <strong id="ALM-38005__b13591156177">Direct Memory Used by Kafka</strong>.</p>
</li></ul>
</div></div>
</p></li><li id="ALM-38005__li1459112565710"><span>Save the configuration and restart the Kafka service.</span></li><li id="ALM-38005__li8591756074"><span>Check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-38005__ul259112569719"><li id="ALM-38005__li95919563717">If yes, no further action is required.</li><li id="ALM-38005__li35911856374">If no, go to <a href="#ALM-38005__li3395370155050">7</a>.</li></ul>
</p></li></ol>
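<p>The sizing guidance in the note can be turned into a quick calculation. The following is a minimal sketch under assumptions: the function name and parameters are hypothetical, sizes are expressed in MB, and the returned string contains only the two heap flags, whereas a real <strong>KAFKA_HEAP_OPTS</strong> value may carry additional JVM options that must be preserved.</p>

```python
def recommended_heap_opts(direct_memory_used_mb: int, current_xmx_mb: int = 0) -> str:
    """Suggest heap flags for KAFKA_HEAP_OPTS (hypothetical helper).

    Follows the note: heap is twice "Direct Memory Used by Kafka",
    and -Xmx and -Xms are set to the same value.
    """
    if direct_memory_used_mb <= 0:
        raise ValueError("direct memory usage must be positive")
    # The procedure increases -Xmx, so never suggest less than the current value.
    heap_mb = max(2 * direct_memory_used_mb, current_xmx_mb)
    return f"-Xmx{heap_mb}M -Xms{heap_mb}M"


# Example: the chart reports 3072 MB of direct memory in use, current -Xmx is 4096 MB.
print(recommended_heap_opts(3072, current_xmx_mb=4096))
```

<p>The result is only a starting point. Read the actual value of <strong>Direct Memory Used by Kafka</strong> from the chart described in the note, and confirm that the node has enough physical memory before applying a larger heap.</p>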
<p class="tableheading" id="ALM-38005__p1247907415516"><strong id="ALM-38005__b4520280415516">Collect fault information.</strong></p>
<ol start="7" id="ALM-38005__ol62190066155110"><li id="ALM-38005__li3395370155050"><a name="ALM-38005__li3395370155050"></a><a name="li3395370155050"></a><span>On the <span id="ALM-38005__text72922012121712">MRS</span> Manager portal, choose <strong id="ALM-38005__b122311218505">O&M</strong> > <strong id="ALM-38005__b1042452685014">Log </strong>><strong id="ALM-38005__b13424826105012"> Download</strong>.</span></li><li id="ALM-38005__li63993212155050"><span>Select <strong id="ALM-38005__b30558336155050">Kafka</strong> in the required cluster from the <strong id="ALM-38005__b6589571155050">Service</strong> drop-down list.</span></li><li id="ALM-38005__li1145664103113"><span>Click <span><img id="ALM-38005__image1945644173117" src="en-us_image_0000001583087269.png"></span> in the upper right corner, and set <strong id="ALM-38005__b6456941173117">Start Date</strong> and <strong id="ALM-38005__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-38005__b13456164113319">Download</strong>.</span></li><li id="ALM-38005__li58905145155050"><span>Contact the <span id="ALM-38005__text4614151421417">O&M personnel</span> and send the collected logs.</span></li></ol>
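<p>The log collection window in the steps above is simply the alarm generation time minus and plus 10 minutes. A small sketch, assuming the alarm time is available as a Python <strong>datetime</strong> (the function name is hypothetical):</p>

```python
from datetime import datetime, timedelta


def log_collection_window(alarm_time: datetime, margin_minutes: int = 10):
    """Return (start, end) covering margin_minutes before and after the alarm."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Example with an illustrative alarm time.
start, end = log_collection_window(datetime(2024, 1, 1, 12, 0))
print("Start Date:", start.isoformat(sep=" "))
print("End Date:  ", end.isoformat(sep=" "))
```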
</div>
<div class="section" id="ALM-38005__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-38005__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
<div class="section" id="ALM-38005__s99a39a1ab32945569020d23169e09333"><h4 class="sectiontitle">Related Information</h4><p id="ALM-38005__en-us_topic_0070543589_p35267730">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>