forked from docs/doc-exports
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="alm_24005"></a><a name="alm_24005"></a>
<h1 class="topictitle1">ALM-24005 Data Transmission by Flume Is Abnormal</h1>
<div id="body8662426"><div class="section" id="alm_24005__en-us_topic_0191813885_section19665522175625"><h4 class="sectiontitle">Description</h4><p id="alm_24005__en-us_topic_0191813885_p45861105163035">The alarm module monitors the capacity of Flume channels. This alarm is generated if the length of time that a channel stays full, or the number of times that a source fails to send data to the channel, exceeds the threshold.</p>
<p id="alm_24005__en-us_topic_0191813885_p23762025163035">Users can set the threshold as required by modifying the <strong id="alm_24005__b84235270616322">channelfullcount</strong> parameter.</p>
<p id="alm_24005__en-us_topic_0191813885_p45675886163035">This alarm is cleared after the Flume channel space is released.</p>
</div>
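<p>The trigger condition described above can be sketched as follows. This is a minimal illustration, not the alarm module's actual implementation: the class name, field names, and the use of two separate threshold arguments are assumptions made for the example. The source only states that the threshold is configurable through <strong>channelfullcount</strong>.</p>

```python
from dataclasses import dataclass


@dataclass
class ChannelStats:
    """Hypothetical per-channel counters; these names are not Flume APIs."""
    full_duration_s: float  # how long the channel has been full, in seconds
    put_failures: int       # times a source failed to send data to the channel


def should_raise_alarm(stats: ChannelStats,
                       full_duration_threshold_s: float,
                       failure_count_threshold: int) -> bool:
    """Return True if either monitored value exceeds its threshold."""
    return (stats.full_duration_s > full_duration_threshold_s
            or stats.put_failures > failure_count_threshold)
```

<p>In this sketch, either condition alone is enough to raise the alarm, which matches the "or" in the description.</p>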
<div class="section" id="alm_24005__en-us_topic_0191813885_section42254989175625"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="alm_24005__en-us_topic_0191813885_table102091175625" frame="border" border="1" rules="all"><thead align="left"><tr id="alm_24005__en-us_topic_0191813885_row31905194175625"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="alm_24005__en-us_topic_0191813885_p34183898175625"><strong id="alm_24005__b39219631175625">Alarm ID</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="alm_24005__en-us_topic_0191813885_p22673543175625"><strong id="alm_24005__b2735300175625">Alarm Severity</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="alm_24005__en-us_topic_0191813885_p20232782175625"><strong id="alm_24005__b1214581051311">Auto Clear</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="alm_24005__en-us_topic_0191813885_row52857467175625"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="alm_24005__en-us_topic_0191813885_p63628609163045">24005</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="alm_24005__en-us_topic_0191813885_p53643687163045">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="alm_24005__en-us_topic_0191813885_p50171427163045">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="alm_24005__en-us_topic_0191813885_section27218191175625"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="alm_24005__en-us_topic_0191813885_table57189892175625" frame="border" border="1" rules="all"><thead align="left"><tr id="alm_24005__en-us_topic_0191813885_row20832688175625"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="alm_24005__en-us_topic_0191813885_p9726186175625"><strong id="alm_24005__b20426813175625">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="alm_24005__en-us_topic_0191813885_p43959148175625"><strong id="alm_24005__b60088019175625">Description</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="alm_24005__en-us_topic_0191813885_row35291346175625"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="alm_24005__en-us_topic_0191813885_p32188096163058">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="alm_24005__en-us_topic_0191813885_p57098960163058">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="alm_24005__en-us_topic_0191813885_row54265439175625"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="alm_24005__en-us_topic_0191813885_p17646605163058">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="alm_24005__en-us_topic_0191813885_p20088914163058">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
<tr id="alm_24005__en-us_topic_0191813885_row5894265175625"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="alm_24005__en-us_topic_0191813885_p15086183163058">ComponentType</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="alm_24005__en-us_topic_0191813885_p14021290163058">Specifies the component type for which the alarm is generated.</p>
</td>
</tr>
<tr id="alm_24005__en-us_topic_0191813885_row60420241163054"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="alm_24005__en-us_topic_0191813885_p20973185163058">ComponentName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="alm_24005__en-us_topic_0191813885_p21106461163058">Specifies the component name for which the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="alm_24005__en-us_topic_0191813885_section23922301175625"><h4 class="sectiontitle">Impact on the System</h4><p id="alm_24005__en-us_topic_0191813885_p6406802316319">If the usage of the Flume channel continues to grow, the data transmission time increases. When the usage reaches 100%, the Flume agent process is suspended.</p>
</div>
<div class="section" id="alm_24005__en-us_topic_0191813885_section58162349175625"><h4 class="sectiontitle">Possible Causes</h4><ul id="alm_24005__en-us_topic_0191813885_ul47443458171911"><li id="alm_24005__en-us_topic_0191813885_li30320587171911">The Flume sink is faulty.</li><li id="alm_24005__en-us_topic_0191813885_li22698423171911">The network is faulty.</li></ul>
</div>
<div class="section" id="alm_24005__en-us_topic_0191813885_section51182191175625"><h4 class="sectiontitle">Procedure</h4><ol id="alm_24005__en-us_topic_0191813885_ol1841691171954"><li id="alm_24005__en-us_topic_0191813885_li59484218171954"><span>Check whether the Flume sink is normal.</span><p><ol type="a" id="alm_24005__en-us_topic_0191813885_ol22574611172029"><li id="alm_24005__en-us_topic_0191813885_li40344521172029">Check whether the Flume sink is the HDFS type.<ul id="alm_24005__en-us_topic_0191813885_ul51011390172040"><li id="alm_24005__en-us_topic_0191813885_li23729619172040">If yes, go to <a href="#alm_24005__en-us_topic_0191813885_li35603802172029">1.b</a>.</li><li id="alm_24005__en-us_topic_0191813885_li64578141172040">If no, go to <a href="#alm_24005__en-us_topic_0191813885_li17206137172029">1.c</a>.</li></ul>
</li><li id="alm_24005__en-us_topic_0191813885_li35603802172029"><a name="alm_24005__en-us_topic_0191813885_li35603802172029"></a><a name="en-us_topic_0191813885_li35603802172029"></a>On MRS Manager, check whether the ALM-14000 HDFS Service Unavailable alarm is reported and whether the HDFS service is stopped.<ul id="alm_24005__en-us_topic_0191813885_ul64885567172049"><li id="alm_24005__en-us_topic_0191813885_li66314640172049">If the alarm is reported, clear it according to the handling suggestions of "ALM-14000 HDFS Service Unavailable"; if the HDFS service is stopped, start it. Then go to <a href="#alm_24005__en-us_topic_0191813885_li1487713813414">1.g</a>.</li><li id="alm_24005__en-us_topic_0191813885_li23592559172049">If no, go to <a href="#alm_24005__en-us_topic_0191813885_li1487713813414">1.g</a>.</li></ul>
</li><li id="alm_24005__en-us_topic_0191813885_li17206137172029"><a name="alm_24005__en-us_topic_0191813885_li17206137172029"></a><a name="en-us_topic_0191813885_li17206137172029"></a>Check whether the Flume sink is the HBase type.<ul id="alm_24005__en-us_topic_0191813885_ul64031349172054"><li id="alm_24005__en-us_topic_0191813885_li45460488172054">If yes, go to <a href="#alm_24005__en-us_topic_0191813885_li23959037172029">1.d</a>.</li><li id="alm_24005__en-us_topic_0191813885_li34475393172054">If no, go to <a href="#alm_24005__en-us_topic_0191813885_li40121494172029">1.e</a>.</li></ul>
</li><li id="alm_24005__en-us_topic_0191813885_li23959037172029"><a name="alm_24005__en-us_topic_0191813885_li23959037172029"></a><a name="en-us_topic_0191813885_li23959037172029"></a>On MRS Manager, check whether the ALM-19000 HBase Service Unavailable alarm is reported and whether the HBase service is stopped.<ul id="alm_24005__en-us_topic_0191813885_ul3185967817210"><li id="alm_24005__en-us_topic_0191813885_li1954951117210">If the alarm is reported, clear it according to the handling suggestions of "ALM-19000 HBase Service Unavailable"; if the HBase service is stopped, start it. Then go to <a href="#alm_24005__en-us_topic_0191813885_li1487713813414">1.g</a>.</li><li id="alm_24005__en-us_topic_0191813885_li1930278417210">If no, go to <a href="#alm_24005__en-us_topic_0191813885_li1487713813414">1.g</a>.</li></ul>
</li><li id="alm_24005__en-us_topic_0191813885_li40121494172029">Check whether the Flume sink is the Kafka type.<ul id="alm_24005__en-us_topic_0191813885_ul5514026817214"><li id="alm_24005__en-us_topic_0191813885_li3863433517214">If yes, go to <a href="#alm_24005__en-us_topic_0191813885_li13075641172029">1.f</a>.</li><li id="alm_24005__en-us_topic_0191813885_li969569517214">If no, go to <a href="#alm_24005__en-us_topic_0191813885_li1487713813414">1.g</a>.</li></ul>
</li><li id="alm_24005__en-us_topic_0191813885_li13075641172029"><a name="alm_24005__en-us_topic_0191813885_li13075641172029"></a><a name="en-us_topic_0191813885_li13075641172029"></a>On MRS Manager, check whether the ALM-38000 Kafka Service Unavailable alarm is reported and whether the Kafka service is stopped.<ul id="alm_24005__en-us_topic_0191813885_ul4732135617219"><li id="alm_24005__en-us_topic_0191813885_li261349517219">If the alarm is reported, clear it according to the handling suggestions of "ALM-38000 Kafka Service Unavailable"; if the Kafka service is stopped, start it. Then go to <a href="#alm_24005__en-us_topic_0191813885_li1487713813414">1.g</a>.</li><li id="alm_24005__en-us_topic_0191813885_li3438110517219">If no, go to <a href="#alm_24005__en-us_topic_0191813885_li1487713813414">1.g</a>.</li></ul>
</li><li id="alm_24005__en-us_topic_0191813885_li1487713813414"><a name="alm_24005__en-us_topic_0191813885_li1487713813414"></a><a name="en-us_topic_0191813885_li1487713813414"></a>Go to the MRS cluster details page and click <strong id="alm_24005__b116261817148">Components</strong>.<div class="note" id="alm_24005__en-us_topic_0191813885_note148774381044"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="alm_24005__en-us_topic_0191813885_p38771238047">For MRS 1.7.2 or earlier, log in to MRS Manager and click <strong id="alm_24005__b1432815681119">Services</strong>.</p>
</div></div>
</li><li id="alm_24005__en-us_topic_0191813885_li51878246172029">Choose <span class="menucascade" id="alm_24005__menucascade118035208142"><b><span class="uicontrol" id="alm_24005__uicontrol13802142018148">Flume</span></b> &gt; <b><span class="uicontrol" id="alm_24005__uicontrol1380310205147">Instances</span></b></span>.</li><li id="alm_24005__en-us_topic_0191813885_li40290365172029">Click the Flume instance on the faulty node and check whether the value of <strong id="alm_24005__b842352706155555">Sink Speed Metrics</strong> is 0.<ul id="alm_24005__en-us_topic_0191813885_ul57163500172116"><li id="alm_24005__en-us_topic_0191813885_li34773663172116">If yes, go to <a href="#alm_24005__en-us_topic_0191813885_li60707704172341">2.a</a>.</li><li id="alm_24005__en-us_topic_0191813885_li46977192172116">If no, no further action is required.</li></ul>
</li></ol>
</p></li><li id="alm_24005__en-us_topic_0191813885_li22413110171954"><span>Check the status of the network between the Flume sink and the faulty node.</span><p><ol type="a" id="alm_24005__en-us_topic_0191813885_ol64852523172341"><li id="alm_24005__en-us_topic_0191813885_li60707704172341"><a name="alm_24005__en-us_topic_0191813885_li60707704172341"></a><a name="en-us_topic_0191813885_li60707704172341"></a>Check whether the Flume sink is the Avro type.<ul id="alm_24005__en-us_topic_0191813885_ul19691151172354"><li id="alm_24005__en-us_topic_0191813885_li12413425172354">If yes, go to <a href="#alm_24005__en-us_topic_0191813885_li31163561172341">2.c</a>.</li><li id="alm_24005__en-us_topic_0191813885_li41431968172354">If no, go to <a href="#alm_24005__en-us_topic_0191813885_li572522141314">3</a>.</li></ul>
</li><li id="alm_24005__en-us_topic_0191813885_li42453771111748">Log in to the host where the faulty node resides. Run the following command to switch to user <strong id="alm_24005__b84235270614127">root</strong>:<p id="alm_24005__en-us_topic_0191813885_p20779526111815"><strong id="alm_24005__en-us_topic_0191813885_b2140647111818">sudo su - root</strong></p>
</li><li id="alm_24005__en-us_topic_0191813885_li31163561172341"><a name="alm_24005__en-us_topic_0191813885_li31163561172341"></a><a name="en-us_topic_0191813885_li31163561172341"></a>Run the <strong id="alm_24005__b135722172015427">ping</strong> <em id="alm_24005__i193118694415427">Flume sink IP address</em> command to check whether the Flume sink can be pinged.<ul id="alm_24005__en-us_topic_0191813885_ul48380478172357"><li id="alm_24005__en-us_topic_0191813885_li50329611172357">If yes, go to <a href="#alm_24005__en-us_topic_0191813885_li572522141314">3</a>.</li><li id="alm_24005__en-us_topic_0191813885_li36968236172357">If no, go to <a href="#alm_24005__en-us_topic_0191813885_li35581265172341">2.d</a>.</li></ul>
</li><li id="alm_24005__en-us_topic_0191813885_li35581265172341"><a name="alm_24005__en-us_topic_0191813885_li35581265172341"></a><a name="en-us_topic_0191813885_li35581265172341"></a>Contact the network administrator to repair the network.</li><li id="alm_24005__en-us_topic_0191813885_li60934918172341">Wait for a while and check whether the alarm is cleared.<ul id="alm_24005__en-us_topic_0191813885_ul6060910817242"><li id="alm_24005__en-us_topic_0191813885_li4661271217242">If yes, no further action is required.</li><li id="alm_24005__en-us_topic_0191813885_li1091583517242">If no, go to <a href="#alm_24005__en-us_topic_0191813885_li572522141314">3</a>.</li></ul>
</li></ol>
</p></li><li id="alm_24005__en-us_topic_0191813885_li572522141314"><a name="alm_24005__en-us_topic_0191813885_li572522141314"></a><a name="en-us_topic_0191813885_li572522141314"></a><span>Collect fault information.</span><p><ol type="a" id="alm_24005__en-us_topic_0191813885_en-us_topic_0191813935_ol6089206913036"><li id="alm_24005__en-us_topic_0191813885_en-us_topic_0191813935_li4478836213036">On MRS Manager, choose <span class="menucascade" id="alm_24005__menucascade4229902311467"><b><span class="uicontrol" id="alm_24005__uicontrol10547744114625">System</span></b> > <b><span class="uicontrol" id="alm_24005__uicontrol27820839114625">Export Log</span></b></span>.</li><li id="alm_24005__li18574327401">Contact technical support engineers for help. For details, see <a href="https://docs.otc.t-systems.com/en-us/public/learnmore.html" target="_blank" rel="noopener noreferrer">technical support</a>.</li></ol>
</p></li></ol>
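<p>Step 1 pairs each sink type with the service alarm to check on MRS Manager. The lookup below is a small sketch of that pairing. The dictionary and function names are hypothetical, and the lowercase sink-type keys are an assumption of this example rather than values taken from a Flume configuration.</p>

```python
from typing import Optional

# Sink type -> service alarm to check on MRS Manager, as listed in step 1.
# The lowercase keys are illustrative labels, not Flume configuration values.
DEPENDENT_SERVICE_ALARMS = {
    "hdfs": "ALM-14000 HDFS Service Unavailable",
    "hbase": "ALM-19000 HBase Service Unavailable",
    "kafka": "ALM-38000 Kafka Service Unavailable",
}


def alarm_to_check(sink_type: str) -> Optional[str]:
    """Return the service alarm to check for this sink type, or None
    if step 1 names no dependent service for it (for example, Avro)."""
    return DEPENDENT_SERVICE_ALARMS.get(sink_type.strip().lower())
```

<p>A result of <code>None</code> corresponds to moving on to the Flume instance check without first checking a dependent service.</p>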
</div>
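<p>The <strong>ping</strong> check in step 2 can also be scripted. The sketch below assumes a Linux host with the iputils <strong>ping</strong> command, where <code>-c</code> sets the packet count and <code>-W</code> sets the reply timeout in seconds. The function names and default values are hypothetical, and the script does not replace the manual procedure.</p>

```python
import subprocess
from typing import List


def build_ping_command(sink_ip: str, count: int = 4, timeout_s: int = 2) -> List[str]:
    """Build a Linux ping command: -c is the packet count, -W the reply timeout in seconds."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), sink_ip]


def sink_is_reachable(sink_ip: str) -> bool:
    """Return True if ping exits with status 0, meaning a reply was received."""
    result = subprocess.run(build_ping_command(sink_ip),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0
```

<p>For example, <code>sink_is_reachable("192.0.2.10")</code> would run the check against a placeholder address. Replace it with the actual Flume sink IP address.</p>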
<div class="section" id="alm_24005__en-us_topic_0191813885_section20269844175625"><h4 class="sectiontitle">Related Information</h4><p id="alm_24005__en-us_topic_0191813885_p31244662175625">N/A</p>
</div>
<p id="alm_24005__en-us_topic_0191813885_p8060118"></p>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_0241.html">Alarm Reference (Applicable to Versions Earlier Than MRS 3.x)</a></div>
</div>
</div>