forked from docs/doc-exports
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="alm_12016"></a><a name="alm_12016"></a>
|
|
|
|
<h1 class="topictitle1">ALM-12016 CPU Usage Exceeds the Threshold</h1>
|
|
<div id="body8662426"><div class="section" id="alm_12016__en-us_topic_0191813922_section44995779104420"><h4 class="sectiontitle">Description</h4><p id="alm_12016__en-us_topic_0191813922_p59751490104414">The system checks the CPU usage every 30 seconds and compares the result with the threshold, for which a default value is provided. This alarm is generated when the CPU usage exceeds the threshold a configurable number of consecutive times (10 by default).</p>
|
|
<p id="alm_12016__en-us_topic_0191813922_p8032484104414">This alarm is cleared when the average CPU usage is less than or equal to 90% of the threshold.</p>
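<p>The generation and clearance rules described above can be sketched as a small state machine. This is an illustrative model only, not MRS code: the class name <strong>CpuAlarm</strong>, its fields, and the threshold value of 80 are assumptions made for the example, and a single sample stands in for the average CPU usage that the clearance rule refers to.</p>

```python
from dataclasses import dataclass


@dataclass
class CpuAlarm:
    """Hypothetical model of the ALM-12016 raise/clear rule (not MRS code)."""

    threshold: float          # CPU usage threshold in percent (example value)
    hit_number: int = 10      # consecutive over-threshold checks required
    clear_ratio: float = 0.9  # alarm clears at <= 90% of the threshold
    hits: int = 0
    active: bool = False

    def check(self, usage: float) -> bool:
        """Feed one periodic sample; return whether the alarm is active."""
        if usage > self.threshold:
            self.hits += 1
        else:
            self.hits = 0  # the run of consecutive hits is broken
        if not self.active and self.hits >= self.hit_number:
            self.active = True
        elif self.active and usage <= self.clear_ratio * self.threshold:
            # Assumption: one sample stands in for the average CPU usage.
            self.active = False
        return self.active


if __name__ == "__main__":
    alarm = CpuAlarm(threshold=80.0, hit_number=3)
    for sample in [85, 90, 70, 85, 86, 87, 75, 70]:
        print(sample, alarm.check(sample))
```

<p>With the documented 30-second check period and the default of 10 hits, the usage has to stay above the threshold for about 5 minutes (10 x 30 seconds) before the alarm is generated. Under this model, a usage that falls below the threshold but stays above 90% of it leaves an existing alarm active.</p>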
|
|
</div>
|
|
<div class="section" id="alm_12016__en-us_topic_0191813922_section58728046104442"><h4 class="sectiontitle"><strong id="alm_12016__b651282618315">Attribute</strong></h4>
|
|
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="alm_12016__en-us_topic_0191813922_table17210170104414" frame="border" border="1" rules="all"><thead align="left"><tr id="alm_12016__en-us_topic_0191813922_row57423022104414"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="alm_12016__en-us_topic_0191813922_p20753233104414"><strong id="alm_12016__b113101627203110">Alarm ID</strong></p>
|
|
</th>
|
|
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="alm_12016__en-us_topic_0191813922_p29612629104414"><strong id="alm_12016__b393414275316">Alarm Severity</strong></p>
|
|
</th>
|
|
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="alm_12016__en-us_topic_0191813922_p45661403104414"><strong id="alm_12016__b14678112853116">Auto Clear</strong></p>
|
|
</th>
|
|
</tr>
|
|
</thead>
|
|
<tbody><tr id="alm_12016__en-us_topic_0191813922_row7586159104414"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="alm_12016__en-us_topic_0191813922_p10499172104414">12016</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="alm_12016__en-us_topic_0191813922_p45126626104414">Major</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="alm_12016__en-us_topic_0191813922_p31378064104414">Yes</p>
|
|
</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
</div>
|
|
</div>
|
|
<div class="section" id="alm_12016__en-us_topic_0191813922_section62831052104450"><h4 class="sectiontitle">Parameters</h4>
|
|
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="alm_12016__en-us_topic_0191813922_table57594954104414" frame="border" border="1" rules="all"><thead align="left"><tr id="alm_12016__en-us_topic_0191813922_row48560076104414"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="alm_12016__en-us_topic_0191813922_p41052101104414"><strong id="alm_12016__b174114336317">Parameter</strong></p>
|
|
</th>
|
|
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="alm_12016__en-us_topic_0191813922_p63537230104414"><strong id="alm_12016__b13816153363118">Description</strong></p>
|
|
</th>
|
|
</tr>
|
|
</thead>
|
|
<tbody><tr id="alm_12016__en-us_topic_0191813922_row46241978104414"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="alm_12016__en-us_topic_0191813922_p54612763104414">ServiceName</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="alm_12016__en-us_topic_0191813922_p61557721104414">Specifies the service for which the alarm is generated.</p>
|
|
</td>
|
|
</tr>
|
|
<tr id="alm_12016__en-us_topic_0191813922_row17148582104414"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="alm_12016__en-us_topic_0191813922_p46857914104414">RoleName</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="alm_12016__en-us_topic_0191813922_p37394653104414">Specifies the role for which the alarm is generated.</p>
|
|
</td>
|
|
</tr>
|
|
<tr id="alm_12016__en-us_topic_0191813922_row1007565104414"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="alm_12016__en-us_topic_0191813922_p14503949104414">HostName</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="alm_12016__en-us_topic_0191813922_p33969201104414">Specifies the host for which the alarm is generated.</p>
|
|
</td>
|
|
</tr>
|
|
<tr id="alm_12016__en-us_topic_0191813922_row37287356104414"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="alm_12016__en-us_topic_0191813922_p377010104414">Trigger Condition</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="alm_12016__en-us_topic_0191813922_p30537856104414">Generates an alarm when the actual indicator value exceeds the specified threshold.</p>
|
|
</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
</div>
|
|
</div>
|
|
<div class="section" id="alm_12016__en-us_topic_0191813922_section49050226104458"><h4 class="sectiontitle">Impact on the System</h4><p id="alm_12016__en-us_topic_0191813922_p49063759104414">Service processes respond slowly or stop working.</p>
|
|
</div>
|
|
<div class="section" id="alm_12016__en-us_topic_0191813922_section10569495104528"><h4 class="sectiontitle">Possible Causes</h4><ul id="alm_12016__en-us_topic_0191813922_ul61620054104550"><li id="alm_12016__en-us_topic_0191813922_li32853302104550">The alarm threshold or alarm hit number is improperly configured.</li><li id="alm_12016__en-us_topic_0191813922_li63956538104550">The CPU configuration cannot meet service requirements, so the CPU usage reaches the upper limit.</li></ul>
|
|
</div>
|
|
<div class="section" id="alm_12016__en-us_topic_0191813922_section38136361104545"><h4 class="sectiontitle">Procedure</h4><ol id="alm_12016__en-us_topic_0191813922_ol17888199104659"><li id="alm_12016__en-us_topic_0191813922_li180887110470"><span>Check whether the alarm threshold or alarm hit number is properly configured.</span><p><ol type="a" id="alm_12016__en-us_topic_0191813922_ol9805591105351"><li id="alm_12016__en-us_topic_0191813922_li60485854105351">Log in to MRS Manager and change the alarm threshold and alarm hit number based on CPU usage.</li><li id="alm_12016__en-us_topic_0191813922_li32981630105351">Choose <strong id="alm_12016__b28051051183218">System</strong> > <strong id="alm_12016__b1805175116329">Threshold Configuration</strong> > <strong id="alm_12016__b0805185143218">Device</strong> > <strong id="alm_12016__b10806115103210">Host</strong> > <strong id="alm_12016__b68067513329">CPU</strong> > <strong id="alm_12016__b12806351163215">CPU Usage</strong> > <strong id="alm_12016__b1080635103216">CPU Usage</strong> and change the alarm threshold based on the actual CPU usage.</li><li id="alm_12016__en-us_topic_0191813922_li33082837105351">Choose <strong id="alm_12016__b181011648113314">System</strong> > <strong id="alm_12016__b4106194810334">Threshold Configuration</strong> > <strong id="alm_12016__b11106114853313">Device</strong> > <strong id="alm_12016__b310611489334">Host</strong> > <strong id="alm_12016__b10107184818336">CPU</strong> > <strong id="alm_12016__b4107148143317">CPU Usage</strong> > <strong id="alm_12016__b41071484334">CPU Usage</strong> and change <strong id="alm_12016__b3107194812337">hit number</strong> based on the actual CPU usage.<div class="note" id="alm_12016__en-us_topic_0191813922_note6284400105359"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="alm_12016__en-us_topic_0191813922_p56559601105359">This option defines the alarm check phase. 
<strong id="alm_12016__b11755135519331">Interval</strong> indicates the alarm check period, and <strong id="alm_12016__b776135519334">hit number</strong> indicates the number of consecutive checks in which the CPU usage must exceed the threshold before an alarm is generated.</p>
|
|
</div></div>
|
|
</li><li id="alm_12016__en-us_topic_0191813922_li10066380105416">Wait 2 minutes and check whether the alarm is automatically cleared.<ul id="alm_12016__en-us_topic_0191813922_ul12395011105420"><li id="alm_12016__en-us_topic_0191813922_li43305816105423">If yes, no further action is required.</li><li id="alm_12016__en-us_topic_0191813922_li65069531105420">If no, go to <a href="#alm_12016__en-us_topic_0191813922_li23374914104744">2</a>.</li></ul>
|
|
</li></ol>
|
|
</p></li><li id="alm_12016__en-us_topic_0191813922_li23374914104744"><a name="alm_12016__en-us_topic_0191813922_li23374914104744"></a><a name="en-us_topic_0191813922_li23374914104744"></a><span>Expand the system.</span><p><ol type="a" id="alm_12016__en-us_topic_0191813922_ol7603411105431"><li id="alm_12016__en-us_topic_0191813922_li34955479105431">Go to the MRS cluster details page. In the alarm list on the alarm management tab page, click the row that contains the alarm. In the alarm details, view the address of the node.</li><li id="alm_12016__en-us_topic_0191813922_li31915697105431">Log in to the node for which the alarm is generated.</li><li id="alm_12016__en-us_topic_0191813922_li38889105105431">Run <strong id="alm_12016__b1496324463419">cat /proc/stat | awk 'NR==1'|awk '{for(i=2;i&lt;=NF;i++)j+=$i;print "" 100 - ($5+$6) * 100 / j;}'</strong> to check the system CPU usage. The command prints the usage as a percentage, calculated from the counters that /proc/stat has accumulated since the node started.</li><li id="alm_12016__en-us_topic_0191813922_li31665825105431">If the CPU usage exceeds the threshold, expand the CPU capacity.</li><li id="alm_12016__en-us_topic_0191813922_li870081105512">Check whether the alarm is cleared.<ul id="alm_12016__en-us_topic_0191813922_ul65303542105513"><li id="alm_12016__en-us_topic_0191813922_li66385379105515">If yes, no further action is required.</li><li id="alm_12016__en-us_topic_0191813922_li52485994105513">If no, go to <a href="#alm_12016__en-us_topic_0191813922_li572522141314">3</a>.</li></ul>
|
|
</li></ol>
|
|
</p></li><li id="alm_12016__en-us_topic_0191813922_li572522141314"><a name="alm_12016__en-us_topic_0191813922_li572522141314"></a><a name="en-us_topic_0191813922_li572522141314"></a><span>Collect fault information.</span><p><ol type="a" id="alm_12016__en-us_topic_0191813922_en-us_topic_0191813935_ol6089206913036"><li id="alm_12016__en-us_topic_0191813922_en-us_topic_0191813935_li4478836213036">On MRS Manager, choose <span class="menucascade" id="alm_12016__menucascade1285131511352"><b><span class="uicontrol" id="alm_12016__uicontrol18847151356">System</span></b> > <b><span class="uicontrol" id="alm_12016__uicontrol1085101553520">Export Log</span></b></span>.</li><li id="alm_12016__li18574327401">Contact technical support engineers for help. For details, see <a href="https://docs.otc.t-systems.com/en-us/public/learnmore.html" target="_blank" rel="noopener noreferrer">technical support</a>.</li></ol>
|
|
</p></li></ol>
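<p>The CPU usage check in step 2 can also be sketched in Python. The function names below are hypothetical and are not part of MRS. <strong>usage_since_boot</strong> follows the same arithmetic as the awk command, treating the fourth and fifth counters of the aggregate <strong>cpu</strong> line (idle and iowait) as non-busy time. <strong>usage_between</strong> is an added variant, not taken from the procedure, that compares two samples and may better reflect the current load, because the counters in /proc/stat are cumulative.</p>

```python
from __future__ import annotations


def parse_cpu_line(line: str) -> list[int]:
    """Return the numeric counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def usage_since_boot(line: str) -> float:
    """Same arithmetic as the awk command: 100 - (idle + iowait) * 100 / total."""
    counters = parse_cpu_line(line)
    idle_like = counters[3] + counters[4]  # awk fields $5 (idle) and $6 (iowait)
    return 100 - idle_like * 100 / sum(counters)


def usage_between(first: str, second: str) -> float:
    """CPU usage in percent over the interval between two 'cpu' line samples."""
    old, new = parse_cpu_line(first), parse_cpu_line(second)
    delta = [b - a for a, b in zip(old, new)]
    total = sum(delta)
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return 100 - (delta[3] + delta[4]) * 100 / total


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as stat_file:
        return stat_file.readline()
```

<p>On the node, calling <strong>read_cpu_line</strong> twice a few seconds apart and passing both results to <strong>usage_between</strong> gives a value that can be compared with the configured threshold.</p>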
|
|
</div>
|
|
<div class="section" id="alm_12016__en-us_topic_0191813922_section13081136172452"><h4 class="sectiontitle"><strong id="alm_12016__b12460116193520">Reference</strong></h4><p id="alm_12016__en-us_topic_0191813922_p509006751263">None</p>
|
|
</div>
|
|
</div>
|
|
<div>
|
|
<div class="familylinks">
|
|
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_0241.html">Alarm Reference (Applicable to Versions Earlier Than MRS 3.x)</a></div>
|
|
</div>
|
|
</div>
|
|
|