doc-exports/docs/mrs/umn/ALM-12052.html
Yang, Tong 3b1f73dece MRS UMN 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-13 12:03:34 +00:00


<a name="ALM-12052"></a>
<h1 class="topictitle1">ALM-12052 TCP Temporary Port Usage Exceeds the Threshold</h1>
<div id="body23731619"><div class="section" id="ALM-12052__s22ee097dfd934a9eb21054ce3d65fb29"><h4 class="sectiontitle">Description</h4><p id="ALM-12052__en-us_topic_0070543627_p23879306">The system checks the TCP temporary port usage every 30 seconds and compares it with the threshold (80% by default). This alarm is generated when the usage exceeds the threshold for several consecutive checks (5 by default).</p>
<p id="ALM-12052__en-us_topic_0070543627_p13587166">To change the threshold, choose <strong id="ALM-12052__en-us_topic_0070543619_b28886228">O&amp;M &gt; Alarm</strong> &gt; <strong id="ALM-12052__b15474819115615">Thresholds</strong> &gt; <em id="ALM-12052__i13216192517564">Name of the desired cluster</em> &gt; <strong id="ALM-12052__en-us_topic_0070543619_b52985952">Host</strong> &gt; <strong id="ALM-12052__en-us_topic_0070543627_b22126576">Network Status</strong> &gt; <strong id="ALM-12052__en-us_topic_0070543627_b47422230">TCP Ephemeral Port Usage</strong>.</p>
<p id="ALM-12052__p3523103411952">When the <strong id="ALM-12052__b48421890111935">Trigger Count</strong> is 1, this alarm is cleared when the TCP temporary port usage is less than or equal to the threshold. When the <strong id="ALM-12052__b7959182312501">Trigger Count</strong> is greater than 1, this alarm is cleared when the TCP temporary port usage is less than or equal to 90% of the threshold.</p>
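<p>The clearing rule above can be sketched as a small shell function. This is only an illustration of the rule as described: the function name <strong>clear_level</strong> and its arguments are hypothetical and are not part of FusionInsight Manager.</p>

```shell
#!/bin/sh
# Hypothetical helper: prints the usage level (in percent) at or below which
# the alarm is cleared, following the rule described in this section.
#   $1 = configured threshold in percent (for example 80)
#   $2 = Trigger Count
clear_level() {
    threshold=$1
    trigger_count=$2
    if [ "$trigger_count" -eq 1 ]; then
        # Trigger Count is 1: the alarm is cleared at the threshold itself.
        echo "$threshold"
    else
        # Trigger Count is greater than 1: cleared at 90% of the threshold.
        awk -v t="$threshold" 'BEGIN { printf "%g\n", t * 9 / 10 }'
    fi
}

clear_level 80 5   # default threshold and trigger count; prints 72
```

<p>With the default settings (threshold 80%, 5 consecutive checks), the usage would therefore need to fall to 72% or lower before the alarm is cleared.</p>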
</div>
<div class="section" id="ALM-12052__s1b70c20a78604e90a382610d5cf6a1c5"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12052__en-us_topic_0070543627_table9740802" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12052__en-us_topic_0070543627_row18296119"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12052__en-us_topic_0070543627_p5590693">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12052__en-us_topic_0070543627_p50192963">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12052__en-us_topic_0070543627_p39098169">Auto Clear</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-12052__en-us_topic_0070543627_row12835120"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12052__en-us_topic_0070543627_p33011791">12052</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12052__en-us_topic_0070543627_p56709439">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12052__en-us_topic_0070543627_p30061864">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-12052__s9cbd0eba27e94b17b7fa2b92a1b458c9"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12052__en-us_topic_0070543627_table19091889" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12052__en-us_topic_0070543627_row62257126"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12052__en-us_topic_0070543627_p9662484">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12052__en-us_topic_0070543627_p44463750">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-12052__row7115951134820"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12052__p192431315431">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12052__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12052__en-us_topic_0070543627_row44794021"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12052__en-us_topic_0070543627_p4437080">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12052__en-us_topic_0070543627_p23859208">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12052__en-us_topic_0070543627_row13406282"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12052__en-us_topic_0070543627_p12167039">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12052__en-us_topic_0070543627_p46006117">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12052__en-us_topic_0070543627_row11401874"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12052__en-us_topic_0070543627_p51136579">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12052__en-us_topic_0070543627_p48422266">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12052__en-us_topic_0070543627_row33147210"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12052__en-us_topic_0070543627_p569474">Trigger Condition</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12052__en-us_topic_0070543627_p46127410">Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-12052__s7abc3bf7aeb24ab4a146ee338cc64e4f"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12052__en-us_topic_0070543627_p45332739">If temporary ports run out, services on the host cannot establish external connections and are interrupted.</p>
</div>
<div class="section" id="ALM-12052__sb7b610c6de7745eb88b799c8579eadf1"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12052__en-us_topic_0070543627_ul48073244"><li id="ALM-12052__en-us_topic_0070543627_li30006020">The temporary port range is too small to meet the current service requirements.</li><li id="ALM-12052__en-us_topic_0070543627_li1618726">The system environment is abnormal.</li></ul>
</div>
<div class="section" id="ALM-12052__s6decbfe8b04e489d9cf8766a9aa9271f"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12052__en-us_topic_0070543627_p64008009"><strong id="ALM-12052__b36299953151424">Expand the temporary port number range.</strong></p>
<ol id="ALM-12052__ol4904735151436"><li id="ALM-12052__li53454689151427"><span>On FusionInsight Manager, click <span><img id="ALM-12052__image168221113135319" src="en-us_image_0269383880.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12052__li34862525151427"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12052__b6057410214588">omm</strong>.</span></li><li id="ALM-12052__li5292302151427"><span>Run the <strong id="ALM-12052__b6549450511048">cat /proc/sys/net/ipv4/ip_local_port_range | cut -f 1</strong> command to obtain the start port, and run the <strong id="ALM-12052__b1596735623311">cat /proc/sys/net/ipv4/ip_local_port_range | cut -f 2</strong> command to obtain the end port. The total number of temporary ports is the end port minus the start port. If the total number of temporary ports is smaller than 28,232, the random port range of the OS is too narrow. Contact the system administrator to increase the port range.</span></li><li id="ALM-12052__li235192813711"><span>Run the <strong id="ALM-12052__b1571566811118">ss -ant 2&gt;/dev/null | grep -v LISTEN | awk 'NR &gt; 2 {print $4}' | cut -d ':' -f 2 | awk '$1 &gt; </strong><i><span class="varname" id="ALM-12052__varname1665926611118">Value of the start port</span></i><strong id="ALM-12052__b722328511118"> {print $1}' | sort -u | wc -l</strong> command to calculate the number of used temporary ports.</span></li><li id="ALM-12052__li47630726151427"><span>Calculate the temporary port usage: Temporary port usage = (Number of used temporary ports/Total number of temporary ports) x 100%.
Check whether the temporary port usage exceeds the threshold.</span><p><ul id="ALM-12052__ul22547539165328"><li id="ALM-12052__li56893717165328">If yes, go to <a href="#ALM-12052__li39311997145458">7</a>.</li><li id="ALM-12052__li20178347165328">If no, go to <a href="#ALM-12052__li61526456151427">6</a>.</li></ul>
</p></li><li id="ALM-12052__li61526456151427"><a name="ALM-12052__li61526456151427"></a><a name="li61526456151427"></a><span>Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12052__ul46327333151427"><li id="ALM-12052__li26023356151427">If yes, no further action is required.</li><li id="ALM-12052__li27517102151427">If no, go to <a href="#ALM-12052__li39311997145458">7</a>.</li></ul>
</p></li></ol>
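<p>The port count and the usage formula from the steps above can be combined into one script. The following is a minimal sketch, not the check that FusionInsight Manager itself runs: the function names are hypothetical, the sample <strong>ss -ant</strong> output is made up for illustration, and the sketch assumes that the port is the part after the last colon of the local address and that ports are compared numerically with the start port.</p>

```shell
#!/bin/sh
# Illustrative helpers; the function names are not part of the product.

# Count distinct local ports above the start port.
#   $1    = start port of the temporary port range
#   stdin = output of "ss -ant"
count_used_ports() {
    grep -v LISTEN |
        awk -v lo="$1" '{
            n = split($4, a, ":")            # port follows the last colon
            port = a[n]
            if (port !~ /^[0-9]+$/) next     # skip the header line
            if (port + 0 > lo + 0) print port
        }' |
        sort -u | wc -l | tr -d ' '
}

# Usage = used / (end port - start port) x 100, printed with two decimals.
#   $1 = used ports, $2 = start port, $3 = end port
port_usage_percent() {
    awk -v used="$1" -v lo="$2" -v hi="$3" 'BEGIN {
        total = hi - lo
        if (!(total > 0)) exit 1             # empty or reversed range
        printf "%.2f\n", used / total * 100
    }'
}

# Made-up sample input in the column layout of "ss -ant".
sample='State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 0 10.120.85.154:45433 10.120.85.154:9866
ESTAB 0 0 10.120.85.154:45434 10.120.85.154:9866
ESTAB 0 0 10.120.85.154:9866 10.120.85.154:45433
LISTEN 0 128 0.0.0.0:9866 0.0.0.0:*'

used=$(printf '%s\n' "$sample" | count_used_ports 32768)
echo "used ports: $used"
port_usage_percent "$used" 32768 60999
```

<p>On a live host, the start and end ports would come from <strong>/proc/sys/net/ipv4/ip_local_port_range</strong>, and the input would be piped in, for example <strong>ss -ant | count_used_ports 32768</strong>. The result can then be compared with the configured threshold.</p>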
<p class="tableheading" id="ALM-12052__p14292813151427"><strong id="ALM-12052__b49945765151444">Check whether the system environment is abnormal.</strong></p>
<ol start="7" id="ALM-12052__ol1400862151455"><li id="ALM-12052__li39311997145458"><a name="ALM-12052__li39311997145458"></a><a name="li39311997145458"></a><span>Run the following command to write the current TCP connections to a temporary file, and then view the frequently used ports in the <strong id="ALM-12052__b14425612153010">port_result.txt</strong> file:</span><p><p id="ALM-12052__p51428543103538"><strong id="ALM-12052__b2416961153019">netstat -tnp | sort &gt; $BIGDATA_HOME/tmp/port_result.txt</strong></p>
<pre class="screen" id="ALM-12052__screen60203709103538">netstat -tnp|sort
Active Internet connections (w/o servers)
Proto Recv Send LocalAddress ForeignAddress State PID/ProgramName
tcp 0 0 10-120-85-154:45433 10-120-85-154:9866 CLOSE_WAIT 94237/java
tcp 0 0 10-120-85-154:45434 10-120-85-154:9866 CLOSE_WAIT 94237/java
tcp 0 0 10-120-85-154:45435 10-120-85-154:9866 CLOSE_WAIT 94237/java
...</pre>
</p></li><li id="ALM-12052__li28677351151427"><span>Run the following command to view the processes that occupy a large number of ports:</span><p><p id="ALM-12052__p44662294103538"><strong id="ALM-12052__b63447338115822">ps -ef |grep </strong><i><span class="varname" id="ALM-12052__varname14506244115822">PID</span></i></p>
<div class="note" id="ALM-12052__note63551870114625"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="ALM-12052__ul4800253115836"><li id="ALM-12052__li21930883172113">PID is the process ID obtained in <a href="#ALM-12052__li39311997145458">7</a>.</li><li id="ALM-12052__li33161866115836">To save information about all processes to a file and check which processes occupy a large number of ports, run the following command:<p id="ALM-12052__p66552038104152"><a name="ALM-12052__li33161866115836"></a><a name="li33161866115836"></a><strong id="ALM-12052__b656814811435">ps -ef &gt; $BIGDATA_HOME/tmp/ps_result.txt</strong></p>
</li></ul>
</div></div>
</p></li><li id="ALM-12052__li785710172156"><span>After obtaining the administrator's approval, stop the processes that occupy a large number of ports. Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12052__ul45958539151427"><li id="ALM-12052__li56769573151427">If yes, no further action is required.</li><li id="ALM-12052__li34932666151427">If no, go to <a href="#ALM-12052__li57585220151427">10</a>.</li></ul>
</p></li></ol>
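<p>To find which processes hold the most connections, the saved <strong>netstat -tnp</strong> output can be grouped by its last column. The following is a rough sketch: the function name <strong>top_port_owners</strong> is hypothetical, and it assumes that the seventh column holds the PID/program name, as in the sample output above.</p>

```shell
#!/bin/sh
# Hypothetical helper: prints "count PID/program" for the owners that hold
# the most TCP connections, largest first.
#   $1    = number of entries to print (defaults to 5)
#   stdin = output of "netstat -tnp", for example port_result.txt
top_port_owners() {
    awk '$1 ~ /^tcp/ { print $7 }' |       # keep connection rows only
        sort | uniq -c | sort -rn |        # count rows per PID/program
        head -n "${1:-5}" |
        awk '{ print $1, $2 }'             # normalize the spacing
}

# The three sample rows shown in step 7.
printf '%s\n' \
    'tcp 0 0 10-120-85-154:45433 10-120-85-154:9866 CLOSE_WAIT 94237/java' \
    'tcp 0 0 10-120-85-154:45434 10-120-85-154:9866 CLOSE_WAIT 94237/java' \
    'tcp 0 0 10-120-85-154:45435 10-120-85-154:9866 CLOSE_WAIT 94237/java' |
    top_port_owners 1
```

<p>For the three sample rows this prints <strong>3 94237/java</strong>. The PID before the slash is the value to pass to <strong>ps -ef | grep</strong> in step 8. On the host, the function could read the saved file directly, for example <strong>top_port_owners 5 &lt; $BIGDATA_HOME/tmp/port_result.txt</strong>.</p>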
<p class="tableheading" id="ALM-12052__p10973675151427"><strong id="ALM-12052__b3641674915155">Collect fault information.</strong></p>
<ol start="10" id="ALM-12052__ol5485290715150"><li id="ALM-12052__li57585220151427"><a name="ALM-12052__li57585220151427"></a><a name="li57585220151427"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12052__b39977366113627">O&amp;M</strong> &gt; <strong id="ALM-12052__b24251979113627">Log &gt; Download</strong>.</span></li><li id="ALM-12052__li60837487151427"><span>Select <strong id="ALM-12052__b1352831932712">OMS</strong> for <strong id="ALM-12052__b33891259151427">Service</strong> and click <strong id="ALM-12052__b3991118545">OK</strong>.</span></li><li id="ALM-12052__li28889415151427"><span>Set <strong id="ALM-12052__b10666475151427">Host</strong> to the node for which the alarm is generated and the active OMS node.</span></li><li id="ALM-12052__li1145664103113"><span>Click <span><img id="ALM-12052__image1945644173117" src="en-us_image_0269383881.png"></span> in the upper right corner, and set <strong id="ALM-12052__b6456941173117">Start Date</strong> and <strong id="ALM-12052__b11456154113318">End Date</strong> for log collection to 30 minutes before and 30 minutes after the alarm generation time, respectively. Then, click <strong id="ALM-12052__b13456164113319">Download</strong>.</span></li><li id="ALM-12052__li495644512588"><span>Contact the <span id="ALM-12052__text4614151421417">O&amp;M personnel</span> and send them the collected logs together with the <strong id="ALM-12052__b201061554424">port_result.txt</strong> and <strong id="ALM-12052__b1210685412211">ps_result.txt</strong> files. Then, delete these two temporary files from the host.</span></li></ol>
</div>
<div class="section" id="ALM-12052__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12052__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
<div class="section" id="ALM-12052__s357966ed15d54556867886aeb8fb1d67"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12052__en-us_topic_0070543627_p12799849">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>