doc-exports/docs/mrs/umn/ALM-12016.html
Yang, Tong 3b1f73dece MRS UMN 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-13 12:03:34 +00:00


<a name="ALM-12016"></a><a name="ALM-12016"></a>
<h1 class="topictitle1">ALM-12016 CPU Usage Exceeds the Threshold</h1>
<div id="body3068798"><div class="section" id="ALM-12016__sd2aa377cedd7428ab43926bcd0571371"><h4 class="sectiontitle">Description</h4><p id="ALM-12016__en-us_topic_0070543548_p27177474">The system checks the CPU usage every 30 seconds and compares it with the threshold, which has a default value. This alarm is generated when the CPU usage exceeds the threshold for a configurable number of consecutive checks (10 by default).</p>
<p id="ALM-12016__p20853383104938">The alarm is cleared in either of the following scenarios: <strong id="ALM-12016__b6894114712255">Trigger Count</strong> is 1 and the CPU usage is less than or equal to the threshold; or <strong id="ALM-12016__b44134084101639"><strong id="ALM-12016__b041615559258">Trigger Count</strong> </strong>is greater than 1 and the CPU usage is less than or equal to 90% of the threshold.</p>
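<p>The trigger and clear rules above can be sketched as a small state machine. This is only an illustrative model, not the FusionInsight Manager implementation: the class name <strong>CpuAlarm</strong>, the 80% threshold, and the assumption that the breach counter resets on any check at or below the threshold are hypothetical.</p>

```python
from dataclasses import dataclass


@dataclass
class CpuAlarm:
    """Hypothetical model of the ALM-12016 trigger and clear rules."""
    threshold: float          # percent; example value only, not the product default
    trigger_count: int = 10   # consecutive breaches needed to raise the alarm
    breaches: int = 0
    active: bool = False

    def check(self, cpu_usage: float) -> bool:
        """Process one 30-second sample and return whether the alarm is active."""
        if cpu_usage > self.threshold:
            self.breaches += 1
            if self.breaches >= self.trigger_count:
                self.active = True
        else:
            # Assumption: any sample at or below the threshold breaks the streak.
            self.breaches = 0
            # With Trigger Count > 1 the alarm clears only at 90% of the threshold.
            if self.trigger_count == 1:
                clear_level = self.threshold
            else:
                clear_level = 0.9 * self.threshold
            if self.active and cpu_usage <= clear_level:
                self.active = False
        return self.active


alarm = CpuAlarm(threshold=80.0, trigger_count=3)
states = [alarm.check(u) for u in (85, 85, 85, 75, 70)]
print(states)  # [False, False, True, True, False]
```

<p>In this sketch, the sample at 75% does not clear the alarm because it is still above 90% of the 80% threshold (72%); the alarm clears only once the usage drops to 70%.</p>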
</div>
<div class="section" id="ALM-12016__sed5654e0fb4744e6b4d40addf988ce76"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12016__en-us_topic_0070543548_table15263883" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12016__en-us_topic_0070543548_row2649980"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12016__en-us_topic_0070543548_p13321813">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12016__en-us_topic_0070543548_p5325067">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12016__en-us_topic_0070543548_p28677253">Auto Clear</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-12016__en-us_topic_0070543548_row41156139"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12016__en-us_topic_0070543548_p45312999">12016</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12016__en-us_topic_0070543548_p46474270">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12016__en-us_topic_0070543548_p6319546">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-12016__sba368bce011f4d36800cdf21f0be3bb8"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12016__en-us_topic_0070543548_table42121251" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12016__en-us_topic_0070543548_row29066061"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12016__en-us_topic_0070543548_p5540732">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12016__en-us_topic_0070543548_p46146172">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-12016__row17737167175520"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12016__p192431315431">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12016__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12016__en-us_topic_0070543548_row46852469"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12016__en-us_topic_0070543548_p36953614">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12016__en-us_topic_0070543548_p40452719">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12016__en-us_topic_0070543548_row28530154"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12016__en-us_topic_0070543548_p29241163">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12016__en-us_topic_0070543548_p19724041">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12016__en-us_topic_0070543548_row43298646"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12016__en-us_topic_0070543548_p17529450">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12016__en-us_topic_0070543548_p10599309">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12016__en-us_topic_0070543548_row28284925"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12016__en-us_topic_0070543548_p9377597">Trigger Condition</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12016__en-us_topic_0070543548_p21387873">Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-12016__s5446085a2a0441728a92a541f5eb95ae"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12016__en-us_topic_0070543548_p54696146">Service processes respond slowly or become unavailable.</p>
</div>
<div class="section" id="ALM-12016__s23c7881992f44efb95893912e391c0c0"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12016__en-us_topic_0070543548_ul1202807"><li id="ALM-12016__en-us_topic_0070543548_li10825264">The alarm threshold or the alarm smoothing times (Trigger Count) are set inappropriately.</li><li id="ALM-12016__en-us_topic_0070543548_li30318520">The CPU configuration cannot meet service requirements, so the CPU usage reaches the upper limit.</li></ul>
</div>
<div class="section" id="ALM-12016__s43e4003b37294857a410ff23763ad2ef"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12016__en-us_topic_0070543548_p39881087"><strong id="ALM-12016__b58386659173930">Check whether the alarm threshold and <strong id="ALM-12016__b18142175243719">Trigger Count</strong> are set correctly.</strong></p>
<ol id="ALM-12016__ol1362745417400"><li id="ALM-12016__li24816170173938"><span>Change the alarm threshold and <strong id="ALM-12016__b13281711203813">Trigger Count</strong> based on CPU usage.</span><p><p class="litext" id="ALM-12016__p6523306173938">On FusionInsight Manager, choose <strong id="ALM-12016__b73164535166">O&amp;M</strong> &gt; <strong id="ALM-12016__b1366935516171">Alarm</strong> &gt; <strong id="ALM-12016__b14318131145112">Thresholds &gt; </strong><em id="ALM-12016__i193217112515">Name of the desired cluster</em> &gt; <strong id="ALM-12016__b16357675173938">Host</strong> &gt; <strong id="ALM-12016__b13001354173938">CPU</strong> &gt; <strong id="ALM-12016__b49903330173938">Host CPU Usage</strong>, and change the alarm smoothing times (Trigger Count) based on CPU usage, as shown in <a href="#ALM-12016__fig42676420173938">Figure 1</a>.</p>
<div class="note" id="ALM-12016__note57869743173938"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="ALM-12016__p58625754173938">This option defines the alarm check phase. <strong id="ALM-12016__b74612137375">Trigger Count</strong> is the number of consecutive checks in which the CPU usage must exceed the threshold before the alarm is generated.</p>
</div></div>
<div class="fignone" id="ALM-12016__fig42676420173938"><a name="ALM-12016__fig42676420173938"></a><a name="fig42676420173938"></a><span class="figcap"><b>Figure 1 </b>Setting alarm smoothing times</span><br><span><img id="ALM-12016__image122911304588" src="en-us_image_0269383824.png"></span></div>
<p class="litext" id="ALM-12016__p21675643173938">On the <strong id="ALM-12016__b66954485173938">Host CPU Usage</strong> page, click <strong id="ALM-12016__b511919416293">Modify</strong> in the <strong id="ALM-12016__b19162174615296">Operation</strong> column to change the alarm threshold, as shown in <a href="#ALM-12016__fig30961038173938">Figure 2</a>.</p>
<div class="fignone" id="ALM-12016__fig30961038173938"><a name="ALM-12016__fig30961038173938"></a><a name="fig30961038173938"></a><span class="figcap"><b>Figure 2 </b>Setting an alarm threshold</span><br><span><img id="ALM-12016__image1615410501365" src="en-us_image_0000001440977805.png"></span></div>
</p></li><li id="ALM-12016__li29621482173938"><span>After 2 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12016__ul12793264173938"><li id="ALM-12016__li22018946173938">If yes, no further action is required.</li><li id="ALM-12016__li38704176173938">If no, go to <a href="#ALM-12016__li65266749173938">3</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12016__p48030518173938"><strong id="ALM-12016__b1326250617406">Check whether the CPU usage reaches the upper limit.</strong></p>
<ol start="3" id="ALM-12016__ol44225396174015"><li id="ALM-12016__li65266749173938"><a name="ALM-12016__li65266749173938"></a><a name="li65266749173938"></a><span>In the alarm list on FusionInsight Manager, click <span><img id="ALM-12016__image168221113135319" src="en-us_image_0269383826.png"></span> in the row where the alarm is located to view the alarm host address in the alarm details.</span></li><li id="ALM-12016__li52115308173938"><span>On the <strong id="ALM-12016__b51685932101729">Hosts</strong> page, click the node on which the alarm is reported.</span></li><li id="ALM-12016__li60590444173938"><span>Monitor the CPU usage for 5 minutes. If the CPU usage exceeds the threshold multiple times, contact the system administrator to add more CPUs.</span></li><li id="ALM-12016__li38620506173938"><span>Check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12016__ul30302958173938"><li id="ALM-12016__li8878949173938">If yes, no further action is required.</li><li id="ALM-12016__li48106238173938">If no, go to <a href="#ALM-12016__li35735451173938">7</a>.</li></ul>
</p></li></ol>
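<p>To cross-check the CPU usage shown on the <strong>Hosts</strong> page directly on the node, two snapshots of the aggregate <strong>cpu</strong> line in <strong>/proc/stat</strong> can be compared. The following is a minimal sketch that assumes a Linux host; the function name <strong>cpu_usage_percent</strong> is hypothetical, and the value may not match the metric that FusionInsight Manager reports, because how Manager computes the metric is not described here.</p>

```python
def cpu_usage_percent(before: str, after: str) -> float:
    """Return the busy CPU percentage between two aggregate 'cpu' lines of /proc/stat."""

    def split(line: str):
        # Fields after the 'cpu' label: user nice system idle iowait irq softirq steal
        fields = [int(v) for v in line.split()[1:9]]
        idle = fields[3] + fields[4]  # treat idle + iowait as non-busy time
        return idle, sum(fields)

    idle0, total0 = split(before)
    idle1, total1 = split(after)
    total_delta = total1 - total0
    if total_delta <= 0:
        return 0.0  # no time elapsed between the snapshots
    busy_delta = total_delta - (idle1 - idle0)
    return 100.0 * busy_delta / total_delta


# Example with made-up counter values taken some seconds apart.
first = "cpu  100 0 100 700 100 0 0 0 0 0"
second = "cpu  400 0 200 1200 200 0 0 0 0 0"
print(cpu_usage_percent(first, second))  # 40.0
```

<p>On a real node, the two input strings would be the first line of <strong>/proc/stat</strong> read at two different times, for example 30 seconds apart.</p>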
<p class="tableheading" id="ALM-12016__p51491657174016"><strong id="ALM-12016__b42921091174020">Collect fault information.</strong></p>
<ol start="7" id="ALM-12016__ol57964469174025"><li id="ALM-12016__li35735451173938"><a name="ALM-12016__li35735451173938"></a><a name="li35735451173938"></a><span>On FusionInsight Manager of the active cluster, choose <strong id="ALM-12016__b12040241173938">O&amp;M</strong> &gt; <strong id="ALM-12016__b41253307173938">Log &gt; Download</strong>.</span></li><li id="ALM-12016__li49036890173938"><span>Select <strong id="ALM-12016__b53183609173938">OmmServer</strong> for <strong id="ALM-12016__b477010478910">Service</strong> and click <strong id="ALM-12016__b1577112471895">OK</strong>.</span></li><li id="ALM-12016__li11141594173938"><span>In <strong id="ALM-12016__b20155417195615">Time Range</strong>, set <strong id="ALM-12016__b38678826173938">Start Date</strong> for log collection to 10 minutes before the alarm generation time and <strong id="ALM-12016__b12565117173938">End Date</strong> to 10 minutes after the alarm generation time, and then click <strong id="ALM-12016__b45977197173938">Download</strong>.</span></li><li id="ALM-12016__li495644512588"><span>Contact <span id="ALM-12016__text4614151421417">O&amp;M personnel</span> and send the collected logs.</span></li></ol>
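<p>The log collection window in the steps above is the alarm generation time plus and minus 10 minutes. As a small worked example, the window can be computed as follows; the function name <strong>log_window</strong> and the sample timestamp are hypothetical.</p>

```python
from datetime import datetime, timedelta


def log_window(alarm_time: datetime, margin_minutes: int = 10):
    """Return (start, end) for log collection around the alarm generation time."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Sample alarm generation time, for illustration only.
start, end = log_window(datetime(2024, 1, 15, 9, 30, 0))
print(start.isoformat())  # 2024-01-15T09:20:00
print(end.isoformat())    # 2024-01-15T09:40:00
```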
</div>
<div class="section" id="ALM-12016__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12016__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
<div class="section" id="ALM-12016__s8c5dd7b3b5ce47dfabf1d96c699ad06c"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12016__en-us_topic_0070543548_p58361484">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>