doc-exports/docs/mrs/umn/ALM-50211.html
Yang, Tong 5914b67d13 MRS UMN Doc 20240802 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2024-09-28 19:04:58 +00:00


<a name="ALM-50211"></a>
<h1 class="topictitle1">ALM-50211 FE Queue Length of BE Periodic Report Tasks Exceeds the Threshold</h1>
<div id="body1558602975412"><div class="section" id="ALM-50211__section60313499"><h4 class="sectiontitle"><span id="ALM-50211__text1558625720546">Alarm Description</span></h4><p id="ALM-50211__p184701641115613">Every 30 seconds, the system checks the length of the queue of BE periodic report tasks on the FE. This alarm is generated when the queue length exceeds the threshold (10 by default). The queue length is the number of report tasks waiting to be processed on the master FE node. A large value indicates that the FE processing capability is insufficient.</p>
<p id="ALM-50211__p1678319243233">This alarm is cleared when the system detects that the queue length of BE periodic report tasks on the FE is less than the threshold.</p>
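<p>The generation and clearance rules above can be sketched as a small state function. This is an illustrative model only, not the Manager implementation: the function names are hypothetical, and the behavior when the queue length equals the threshold is an assumption, because this topic defines only the "exceeds" and "less than" cases.</p>

```python
CHECK_INTERVAL_SECONDS = 30   # check period stated in this topic
DEFAULT_THRESHOLD = 10        # default threshold stated in this topic


def next_alarm_state(queue_length, active, threshold=DEFAULT_THRESHOLD):
    """Return whether the alarm is active after one check (hypothetical helper)."""
    if queue_length > threshold:
        return True       # generated: the queue length exceeds the threshold
    if threshold > queue_length:
        return False      # cleared: the queue length is less than the threshold
    return active         # equal to the threshold: ASSUMED to leave the state unchanged


def replay(samples, threshold=DEFAULT_THRESHOLD):
    """Apply one check per sample and return the alarm state after each check."""
    states = []
    active = False
    for queue_length in samples:
        active = next_alarm_state(queue_length, active, threshold)
        states.append(active)
    return states


print(replay([4, 12, 10, 9]))  # [False, True, True, False]
```

<p>In this sketch, each sample stands for one check, so consecutive samples are <code>CHECK_INTERVAL_SECONDS</code> apart.</p>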
</div>
<div class="section" id="ALM-50211__section5950580"><h4 class="sectiontitle"><span id="ALM-50211__text38748475555">Alarm Attributes</span></h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-50211__table15548096" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-50211__row49989141"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-50211__p57710042"><span id="ALM-50211__text17980150175619">Alarm ID</span></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-50211__p44001849"><span id="ALM-50211__text199471335614">Alarm Severity</span></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-50211__p7380012"><span id="ALM-50211__text152400388563">Auto Cleared</span></p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-50211__row30415758"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-50211__p47757325">50211</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-50211__p43138141">Minor</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-50211__p4528550">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-50211__section53555227"><h4 class="sectiontitle"><span id="ALM-50211__text155061195577">Alarm Parameters</span></h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-50211__table31268239" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-50211__row59179380"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-50211__p21975462"><span id="ALM-50211__text776142495720">Parameter</span></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-50211__p35182007"><span id="ALM-50211__text632018391572">Description</span></p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-50211__row12465939134110"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-50211__p17935380415">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-50211__p187931338134115">Specifies the cluster or system for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-50211__row48724307"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-50211__p54354790">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-50211__p40661878">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-50211__row30412584"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-50211__p47500221">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-50211__p22312707">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-50211__row66596640"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-50211__p25618737">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-50211__p61851848">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-50211__row19795720"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-50211__p59949472">Trigger Condition</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-50211__p24069040">Specifies the threshold for triggering the alarm.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-50211__section12235000"><h4 class="sectiontitle"><span id="ALM-50211__text152545568">Impact on the System</span></h4><p id="ALM-50211__p1855220712578">BE report tasks accumulate on the master FE node because the FE processing capability is insufficient, which slows down service queries.</p>
</div>
<div class="section" id="ALM-50211__section43006140"><h4 class="sectiontitle"><span id="ALM-50211__text16100381863">Possible Causes</span></h4><p id="ALM-50211__p1816812695119">The processing capability of the master FE node is insufficient due to a large number of concurrent service requests in the Doris cluster or insufficient memory for FE processes.</p>
</div>
<div class="section" id="ALM-50211__section333352655112"><h4 class="sectiontitle"><span id="ALM-50211__text228118610">Handling Procedure</span></h4><p class="tableheading" id="ALM-50211__p25448063"><strong id="ALM-50211__b501332858554">Check the GC duration.</strong></p>
<ol id="ALM-50211__ol18676172312505"><li id="ALM-50211__li1765632310506"><span>On FusionInsight Manager, choose <strong id="ALM-50211__b241862913216">O&amp;M</strong> &gt; <strong id="ALM-50211__b636183293220">Alarm</strong> &gt; <strong id="ALM-50211__b194401634113214">Alarms</strong>. In the alarm list, view the role name and obtain the IP address of the instance in <strong id="ALM-50211__b631445373210">Location</strong> of the alarm whose ID is <strong id="ALM-50211__b12393508339">50211</strong>.</span></li><li id="ALM-50211__li666032315507"><span>Choose <strong id="ALM-50211__b2058110351335">Cluster</strong> &gt; <strong id="ALM-50211__b4441937203316">Services</strong> &gt; <strong id="ALM-50211__b319194020334">Doris</strong> &gt; <strong id="ALM-50211__b112915417330">Instances</strong>, click the FE instance for which the alarm is generated, and click the <strong id="ALM-50211__b193491454173420">Chart</strong> tab of the instance.</span><p><div class="p" id="ALM-50211__p16660122375011">Select <strong id="ALM-50211__b1593154718361">JVM</strong> from <strong id="ALM-50211__b964245118363">Chart Category</strong> on the left, and check whether <strong id="ALM-50211__b146797294912">Accumulated GC duration of the old generation</strong> of the FE process is greater than 3 seconds.<ul id="ALM-50211__ul13659152365016"><li id="ALM-50211__li665992375010">If yes, go to <a href="#ALM-50211__li967382335015">3</a>.</li><li id="ALM-50211__li96596239508">If no, go to <a href="#ALM-50211__li196491423155013">5</a>.</li></ul>
</div>
</p></li><li id="ALM-50211__li967382335015"><a name="ALM-50211__li967382335015"></a><a name="li967382335015"></a><span>Choose <strong id="ALM-50211__b944416075012">Cluster</strong> &gt; <strong id="ALM-50211__b75571364509">Services</strong> &gt; <strong id="ALM-50211__b44531010175012">Doris</strong> &gt; <strong id="ALM-50211__b1295501465017">Configurations</strong> &gt; <strong id="ALM-50211__b1199518145017">All Configurations</strong> &gt; <strong id="ALM-50211__b1744092314504">FE(Role)</strong> &gt; <strong id="ALM-50211__b2820940165010">JVM</strong>, and increase the value of <strong id="ALM-50211__b145321913195114">-Xmx</strong> in <strong id="ALM-50211__b118391765112">FE_GC_OPTS</strong>. The default value is <strong id="ALM-50211__b458352912512">8GB</strong>.</span><p><ul id="ALM-50211__ul743953705317"><li id="ALM-50211__li203503315535">If this alarm is generated occasionally, increase the value by 50% (to 1.5 times the current value). If this alarm is generated frequently, double the value.</li><li id="ALM-50211__li012363925313">If the service volume is large and service concurrency is high, you are advised to add instances.</li></ul>
</p></li><li id="ALM-50211__li11676723115012"><span>Check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-50211__ul1167522345011"><li id="ALM-50211__li116752234504">If yes, no further action is required.</li><li id="ALM-50211__li36751823145019">If no, go to <a href="#ALM-50211__li196491423155013">5</a>.</li></ul>
</p></li></ol>
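<p>The sizing guidance in the preceding steps can be worked through as arithmetic. The following is a minimal sketch with hypothetical helper names. It assumes that the occasional case means multiplying the current <strong>-Xmx</strong> value by 1.5, and it expresses the result in megabytes so that fractional gigabyte values remain valid JVM syntax.</p>

```python
def should_tune_heap(old_gen_gc_seconds, limit_seconds=3.0):
    """True when the accumulated old-generation GC duration is greater than 3 seconds."""
    return old_gen_gc_seconds > limit_seconds


def suggested_xmx_gb(current_gb, alarm_is_frequent):
    """Return a larger heap size in GB (hypothetical helper).

    Frequent alarms: double the value.
    Occasional alarms: ASSUMED to mean 1.5 times the current value.
    """
    factor = 2.0 if alarm_is_frequent else 1.5
    return current_gb * factor


def xmx_option(size_gb):
    """Format a heap size as a JVM -Xmx option expressed in megabytes."""
    return f"-Xmx{round(size_gb * 1024)}m"


# Starting from the default of 8 GB stated in this topic:
print(xmx_option(suggested_xmx_gb(8, alarm_is_frequent=False)))  # -Xmx12288m
print(xmx_option(suggested_xmx_gb(8, alarm_is_frequent=True)))   # -Xmx16384m
```

<p>The computed value is only a starting point. Before applying it, confirm that the FE node has enough physical memory for the larger heap.</p>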
<p id="ALM-50211__p1214212184619"><strong id="ALM-50211__b315933565212">Check whether the alarm threshold or alarm trigger count is properly configured.</strong></p>
<ol start="5" id="ALM-50211__ol17655623205019"><li id="ALM-50211__li196491423155013"><a name="ALM-50211__li196491423155013"></a><a name="li196491423155013"></a><span>Log in to FusionInsight Manager, choose <strong id="ALM-50211__b97721644205212">O&amp;M</strong> &gt; <strong id="ALM-50211__b18112194835213">Alarm</strong> &gt; <strong id="ALM-50211__b538645165214">Thresholds</strong>, click the name of the desired cluster, and choose <strong id="ALM-50211__b1620877125319">Doris</strong> &gt; <strong id="ALM-50211__b379211103539">Queue</strong> &gt; <strong id="ALM-50211__b346005919538">Queue Length of BE Periodic Report Tasks on the FE (FE)</strong>.</span></li><li id="ALM-50211__li4655152395018"><span>Click the edit button next to <strong id="ALM-50211__b19389636545">Trigger Count</strong>, change the number based on site requirements, and click <strong id="ALM-50211__b13389231542">OK</strong>.</span></li><li class="litext" id="ALM-50211__li15655182318505"><span>Click <strong id="ALM-50211__b145135695417">Modify</strong> in the <strong id="ALM-50211__b135131363540">Operation</strong> column, change the alarm threshold based on site requirements, and click <strong id="ALM-50211__b3514176175414">OK</strong>.</span></li><li id="ALM-50211__li8655122312505"><span>Wait 2 minutes and check whether the alarm is automatically cleared.</span><p><ul class="subitemlist" id="ALM-50211__ul5655323135015"><li id="ALM-50211__li3655323175013">If yes, no further action is required.</li><li id="ALM-50211__li11655923105014">If no, go to <a href="#ALM-50211__li1058072415565">9</a>.</li></ul>
</p></li></ol>
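<p>To see how the threshold and the trigger count interact, the following sketch replays a series of sampled queue lengths. It assumes that <strong>Trigger Count</strong> is the number of consecutive checks that must exceed the threshold before the alarm is generated. This topic does not define that behavior, so treat the model and the function name as hypothetical.</p>

```python
def checks_until_alarm(samples, threshold, trigger_count):
    """Return the 1-based number of the check that raises the alarm, or None.

    ASSUMPTION: the alarm is raised once `trigger_count` consecutive
    samples exceed `threshold`; a sample at or below it resets the count.
    """
    consecutive = 0
    for check_number, queue_length in enumerate(samples, start=1):
        if queue_length > threshold:
            consecutive += 1
        else:
            consecutive = 0
        if consecutive >= trigger_count:
            return check_number
    return None


samples = [12, 8, 11, 13, 15]  # illustrative queue lengths, one per check
print(checks_until_alarm(samples, threshold=10, trigger_count=1))  # 1
print(checks_until_alarm(samples, threshold=10, trigger_count=3))  # 5
print(checks_until_alarm(samples, threshold=20, trigger_count=1))  # None
```

<p>Under this assumption, a higher trigger count filters out short spikes, and a higher threshold tolerates a longer steady queue.</p>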
<p id="ALM-50211__p15601739207"><strong id="ALM-50211__b3606332013">Collect fault information.</strong></p>
<ol start="9" id="ALM-50211__ol658172418563"><li id="ALM-50211__li1058072415565"><a name="ALM-50211__li1058072415565"></a><a name="li1058072415565"></a><span>On FusionInsight Manager, choose <strong id="ALM-50211__b185637197545">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-50211__b20564191917544">Log</strong> &gt; <strong id="ALM-50211__b7564161995416">Download</strong>.</span></li><li id="ALM-50211__li358010249561"><span>Expand the <strong id="ALM-50211__b14613742173914">Service</strong> drop-down list, and select <strong id="ALM-50211__b11614242193916">Doris</strong> for the target cluster.</span></li><li id="ALM-50211__li11581132420564"><span>Click the edit icon in the upper right corner, and set <strong id="ALM-50211__b10351172718543">Start Date</strong> and <strong id="ALM-50211__b12351727135414">End Date</strong> for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click <strong id="ALM-50211__b113518274548">Download</strong>.</span></li><li id="ALM-50211__li1558111246562"><span>Contact <span id="ALM-50211__text148081329125418">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
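<p>The log collection window described above can be computed from the alarm generation time as follows. This is a minimal sketch: the function name is hypothetical and the sample timestamp is made up for illustration.</p>

```python
from datetime import datetime, timedelta


def log_window(alarm_time, margin_minutes=10):
    """Return (start, end): 10 minutes before and after the alarm generation time."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Illustrative alarm generation time, not taken from a real alarm.
start, end = log_window(datetime(2024, 8, 2, 14, 5, 0))
print(start)  # 2024-08-02 13:55:00
print(end)    # 2024-08-02 14:15:00
```

<p>Enter the two resulting values as <strong>Start Date</strong> and <strong>End Date</strong>.</p>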
</div>
<div class="section" id="ALM-50211__section169311343318"><h4 class="sectiontitle"><span id="ALM-50211__text03306174617">Alarm Clearance</span></h4><p id="ALM-50211__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>
<div class="section" id="ALM-50211__section60945317"><h4 class="sectiontitle"><span id="ALM-50211__text124081820768">Related Information</span></h4><p id="ALM-50211__p10326323"><span id="ALM-50211__text19275105817121">None.</span></p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>