Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="ALM-12066"></a><a name="ALM-12066"></a>
<h1 class="topictitle1">ALM-12066 Trust Relationships Between Nodes Become Invalid</h1>
<div id="body1547168128796"><div class="section" id="ALM-12066__section10369415133116"><h4 class="sectiontitle">Description</h4><p id="ALM-12066__p324232317301">Every hour, the system checks whether the trust relationship between the active OMS node and the other Agent nodes is normal. This alarm is generated if mutual trust fails and is automatically cleared after the fault is rectified.</p>
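<p>The following is a minimal sketch of a manual check that approximates the hourly test described above; it is not the product's own implementation. It assumes that user <strong>omm</strong> runs it on the active OMS node. The function name <strong>probe_trust</strong>, the <strong>SSH</strong> override variable, and the host names are hypothetical.</p>

```shell
#!/bin/sh
# Hypothetical helper: try a non-interactive SSH login to each given host
# and report whether it succeeded.
# BatchMode=yes makes ssh fail instead of prompting for a password or
# passphrase; ConnectTimeout bounds the wait for unreachable hosts.
# The SSH variable is an assumption of this sketch: it lets the client be
# replaced (for example, in a dry run) and defaults to the ssh on PATH.
probe_trust() {
    failed=0
    for host in "$@"; do
        if "${SSH:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
            echo "$host ok"
        else
            echo "$host FAILED"
            failed=1
        fi
    done
    return "$failed"
}
```

<p>For example, <strong>probe_trust host2 host3</strong> prints one line per host and returns a non-zero status if any non-interactive login fails.</p>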
</div>
<div class="section" id="ALM-12066__section8323192410322"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12066__table1479793583212" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12066__row107991735133210"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12066__p18799183583212">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12066__p1680123511326">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12066__p1980173523217">Auto Clear</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-12066__row880183517329"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12066__p108014356328">12066</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12066__p19802163593213">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12066__p880215356323">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-12066__section652875914327"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12066__table1090459143316" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12066__row190429173313"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12066__p129062911339">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12066__p10906093332">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-12066__row1035763317362"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12066__p17935380415">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12066__p187931338134115">Specifies the cluster or system for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12066__row18907109203311"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12066__p99095916333">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12066__p4909159173310">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12066__row4910691332"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12066__p39101953320">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12066__p5911189173310">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12066__row59118923315"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12066__p0912169123319">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12066__p169131916332">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-12066__section2990133614335"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12066__p531812513564">Some operations on the management plane may be abnormal.</p>
</div>
<div class="section" id="ALM-12066__section950130153414"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12066__ul913288183510"><li id="ALM-12066__li713414815352">The <strong id="ALM-12066__b22461400518">/etc/ssh/sshd_config</strong> configuration file is damaged.</li><li id="ALM-12066__li131351185357">The password of user <strong id="ALM-12066__b10643161513517">omm</strong> has expired.</li></ul>
</div>
<div class="section" id="ALM-12066__section071212121445"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12066__p14212204913111"><strong id="ALM-12066__b4515327657">Check the status of the /etc/ssh/sshd_config configuration file.</strong></p>
<ol id="ALM-12066__ol363257182811"><li id="ALM-12066__li263016792816"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm and click <span><img id="ALM-12066__image1663017722814" src="en-us_image_0263895789.png"></span> to view the host list in the alarm details.</span></li><li id="ALM-12066__li17631167192814"><span>Log in to the active OMS node as user <strong id="ALM-12066__b173458362104930">omm</strong>. <span id="ALM-12066__text38540585518"></span></span></li><li id="ALM-12066__li17631374283"><span>Run the <strong id="ALM-12066__b8591193761511">ssh</strong> command to connect to each node listed in the alarm details, for example, <strong id="ALM-12066__b1611013111616">ssh</strong> <strong id="ALM-12066__b461113131618"><em id="ALM-12066__i8702204181616">host2</em></strong>, and check whether the connection fails. (<em id="ALM-12066__i1032492071610"><strong id="ALM-12066__b18558144131812">host2</strong></em> is a node in the alarm details other than the OMS node.)</span><p><ul id="ALM-12066__ul1963111718289"><li id="ALM-12066__li363117710285">If yes, go to <a href="#ALM-12066__li176321676280">4</a>.</li><li id="ALM-12066__li136319782815">If no, go to <a href="#ALM-12066__li9148131091317">6</a>.</li></ul>
</p></li><li id="ALM-12066__li176321676280"><a name="ALM-12066__li176321676280"></a><a name="li176321676280"></a><span>Open the <strong id="ALM-12066__b19350203172016">/etc/ssh/sshd_config</strong> configuration file on host2 and check whether <strong id="ALM-12066__b497416449207">AllowUsers</strong> or <strong id="ALM-12066__b683084712203">DenyUsers</strong> is configured for other nodes.</span><p><ul id="ALM-12066__ul263219711285"><li id="ALM-12066__li66323716289">If yes, go to <a href="#ALM-12066__li846318425575">5</a>.</li><li id="ALM-12066__li1763211732817">If no, contact OS experts.</li></ul>
</p></li><li id="ALM-12066__li846318425575"><a name="ALM-12066__li846318425575"></a><a name="li846318425575"></a><span>Modify the whitelist (<strong>AllowUsers</strong>) or blacklist (<strong>DenyUsers</strong>) so that user <strong id="ALM-12066__b5862624122211">omm</strong> is included in the whitelist or excluded from the blacklist. Then check whether the alarm is cleared.</span><p><ul id="ALM-12066__ul111918318587"><li id="ALM-12066__li17191331165814">If yes, no further action is required.</li><li id="ALM-12066__li15858237195817">If no, go to <a href="#ALM-12066__li9148131091317">6</a>.</li></ul>
</p></li></ol>
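<p>The configuration check in the steps above can be sketched as follows. This is only an illustration: the function names <strong>list_user_rules</strong> and <strong>rule_mentions_user</strong> are hypothetical, and the name match is a simplification that does not evaluate wildcard or <strong>user@host</strong> patterns the way sshd does.</p>

```shell
#!/bin/sh
# Hypothetical helper: print the active (uncommented) AllowUsers and
# DenyUsers directives from an sshd configuration file.
# Keywords in sshd_config are case-insensitive, hence grep -i.
list_user_rules() {
    config_file="${1:-/etc/ssh/sshd_config}"
    grep -Ei '^[[:space:]]*(AllowUsers|DenyUsers)[[:space:]]' "$config_file"
}

# Hypothetical helper: succeed if the user name appears as a whole word
# in any active AllowUsers/DenyUsers directive.
# Simplification: wildcard patterns are not expanded, so a human still
# has to read the directives printed by list_user_rules.
rule_mentions_user() {
    list_user_rules "$2" | grep -qw -- "$1"
}
```

<p>For example, running <strong>list_user_rules</strong> on host2 shows the directives to review, and <strong>rule_mentions_user omm</strong> indicates whether user <strong>omm</strong> is named in any of them.</p>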
<p id="ALM-12066__p31281710101318"><strong id="ALM-12066__b580872612411">Check the status of the password of user omm.</strong></p>
<ol start="6" id="ALM-12066__ol19148181010138"><li id="ALM-12066__li9148131091317"><a name="ALM-12066__li9148131091317"></a><a name="li9148131091317"></a><span>Check the prompt displayed by the <strong id="ALM-12066__b17968171562512">ssh</strong> command.</span><p><ul class="subitemlist" id="ALM-12066__ul181481910161315"><li id="ALM-12066__li13148310111319">If you are prompted for the password of user <strong id="ALM-12066__b1022313330252">omm</strong>, go to <a href="#ALM-12066__li81482101138">7</a>.</li><li id="ALM-12066__li121483102136">If the message "Enter passphrase for key '/home/omm/.ssh/id_rsa':" is displayed, go to <a href="#ALM-12066__li106306742813">9</a>.</li></ul>
</p></li><li class="subitemlist" id="ALM-12066__li81482101138"><a name="ALM-12066__li81482101138"></a><a name="li81482101138"></a><span>On both the OMS node and host2, check whether the trust list (<strong id="ALM-12066__b75785322610">/home/omm/.ssh/authorized_keys</strong>) of user <strong id="ALM-12066__b730655672611">omm</strong> contains the public key (<strong id="ALM-12066__b1756913136278">/home/omm/.ssh/id_rsa.pub</strong>) of user <strong id="ALM-12066__b34871732719">omm</strong> on the peer host.</span><p><ul id="ALM-12066__ul6148151021318"><li id="ALM-12066__li614861061316">If yes, contact OS experts.</li><li id="ALM-12066__li11482010131312">If no, add the public key of user <strong id="ALM-12066__b663884152710">omm</strong> of the peer host to the trust list of the local host.</li></ul>
</p></li><li id="ALM-12066__li19341633125911"><span>Add the public key of user <strong id="ALM-12066__b0377113310287">omm</strong> of the peer host to the trust list of the local host. Then run the <strong id="ALM-12066__b1737092382919">ssh</strong> command to connect to each node listed in the alarm details, for example, <strong id="ALM-12066__b6889113012290">ssh host2</strong>, and check whether the connection fails. (<em id="ALM-12066__i81833373014"><strong id="ALM-12066__b0720241270">host2</strong></em> is a node in the alarm details other than the OMS node.)</span><p><ul id="ALM-12066__ul137211213508"><li id="ALM-12066__li153121714307">If yes, go to <a href="#ALM-12066__li106306742813">9</a>.</li><li id="ALM-12066__li7313414402">If no, check whether the alarm is cleared. If the alarm is cleared, no further action is required; otherwise, go to <a href="#ALM-12066__li106306742813">9</a>.</li></ul>
</p></li></ol>
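<p>The trust list check and update in the steps above can be sketched as follows. The sketch assumes that the peer host's <strong>id_rsa.pub</strong> file has already been copied to the local host; the function names <strong>key_is_trusted</strong> and <strong>trust_key</strong> are hypothetical.</p>

```shell
#!/bin/sh
# Hypothetical helper: succeed if the key material (the second,
# base64-encoded field) of a public key file appears in a trust list.
# Matching on the key material ignores differences in the trailing comment.
key_is_trusted() {
    pub_file="$1"
    auth_file="$2"
    key_blob=$(awk 'NR == 1 { print $2 }' "$pub_file")
    [ -n "$key_blob" ] && grep -qF -- "$key_blob" "$auth_file"
}

# Hypothetical helper: append the public key to the trust list only if
# it is not already present, so repeated runs do not add duplicates.
trust_key() {
    key_is_trusted "$1" "$2" || cat "$1" >> "$2"
}
```

<p>For example, <strong>trust_key /tmp/host2_id_rsa.pub /home/omm/.ssh/authorized_keys</strong> would add the key from host2 if it is missing. The path under <strong>/tmp</strong> is an assumption made for illustration.</p>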
<p id="ALM-12066__p124132216288"><strong id="ALM-12066__b1967293410811">Collect the fault information.</strong></p>
<ol start="9" id="ALM-12066__ol146302742816"><li class="subitemlist" id="ALM-12066__li106306742813"><a name="ALM-12066__li106306742813"></a><a name="li106306742813"></a><span>On FusionInsight Manager, choose <strong id="ALM-12066__b140942549104930">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12066__b180541324104930">Log</strong> &gt; <strong id="ALM-12066__b1225148528104930">Download</strong>.</span></li><li id="ALM-12066__li06301476283"><span>Select <strong id="ALM-12066__b192996136104930">Controller</strong> for <strong id="ALM-12066__b345013368916">Service</strong> and click <strong id="ALM-12066__b1962404791104930">OK</strong>.</span></li><li id="ALM-12066__li126301173286"><span>Click <span><img id="ALM-12066__image863057122812" src="en-us_image_0263895540.png"></span> in the upper right corner to set the log collection time range. Generally, the range covers from 10 minutes before to 10 minutes after the alarm generation time. Click <strong id="ALM-12066__b575409479104930">Download</strong>.</span></li><li id="ALM-12066__li2630274284"><span>Contact <span id="ALM-12066__text1793615574113">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
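<p>The log collection time range mentioned above can be worked out with a short sketch such as the following. It assumes GNU <strong>date</strong> (for the <strong>-d</strong> option), and the function name <strong>log_window</strong> is hypothetical.</p>

```shell
#!/bin/sh
# Hypothetical helper: print the start and end of a window that spans
# 10 minutes (600 seconds) before and after the alarm generation time.
# Assumes GNU date; the time is interpreted in the current time zone.
log_window() {
    alarm_epoch=$(date -d "$1" +%s) || return 1
    date -d "@$((alarm_epoch - 600))" '+%Y-%m-%d %H:%M:%S'
    date -d "@$((alarm_epoch + 600))" '+%Y-%m-%d %H:%M:%S'
}
```

<p>For example, <strong>log_window '2024-01-01 12:00:00'</strong> prints the start time on the first line and the end time on the second line.</p>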
</div>
<div class="section" id="ALM-12066__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12066__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>
<div class="section" id="ALM-12066__section8222143110380"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12066__p4686124105919">Perform the following steps to handle abnormal trust relationships between nodes:</p>
<div class="notice" id="ALM-12066__note64991413518"><span class="noticetitle"><img src="public_sys-resources/notice_3.0-en-us.png"> </span><div class="noticebody"><ul id="ALM-12066__ul19616958163514"><li id="ALM-12066__li1616145863512">Perform this operation as user <strong id="ALM-12066__b2165161015166">omm</strong>.</li><li id="ALM-12066__li861655833518">If the network between nodes is disconnected, rectify the network fault first. Check whether the two nodes belong to the same security group and whether access rules are configured in <strong id="ALM-12066__b196521759121612">hosts.deny</strong> and <strong id="ALM-12066__b1613616201712">hosts.allow</strong>.</li></ul>
</div></div>
<ol id="ALM-12066__ol1978732155814"><li id="ALM-12066__li597853215581">Run the <strong id="ALM-12066__b186632016173">ssh-add -l</strong> command on both nodes to check whether any identities exist.<p id="ALM-12066__p392110588248"><span><img id="ALM-12066__image8432143962413" src="en-us_image_0000001226576418.png"></span></p>
<ul id="ALM-12066__ul122791263414"><li id="ALM-12066__li122797214348">If yes, go to <a href="#ALM-12066__li09782325586">4</a>.</li><li id="ALM-12066__li14378713415">If no, go to <a href="#ALM-12066__li16978123275815">2</a>.</li></ul>
</li><li id="ALM-12066__li16978123275815"><a name="ALM-12066__li16978123275815"></a><a name="li16978123275815"></a>If no identities are displayed, run the <strong id="ALM-12066__b6267121682419">ps -ef|grep ssh-agent</strong> command to find the <strong id="ALM-12066__b1666702220243">ssh-agent</strong> process, stop the process, and wait for the process to automatically restart.<p id="ALM-12066__p629941492510"><span><img id="ALM-12066__image138828117259" src="en-us_image_0000001227056330.png"></span></p>
</li><li id="ALM-12066__li1997863215584">Run the <strong id="ALM-12066__b18989588253">ssh-add -l</strong> command to check whether the identities have been added. If yes, manually run the <strong id="ALM-12066__b559031413264">ssh</strong> command to check whether the trust relationship is normal.<p id="ALM-12066__p492712369259"><span><img id="ALM-12066__image1579143210257" src="en-us_image_0000001271536445.png"></span></p>
</li><li id="ALM-12066__li09782325586"><a name="ALM-12066__li09782325586"></a><a name="li09782325586"></a>If identities exist, check whether the <span class="filepath" id="ALM-12066__filepath1443720119218"><b>/home/omm/.ssh/authorized_keys</b></span> file contains the information in the <span class="filepath" id="ALM-12066__filepath693611119214"><b>/home/omm/.ssh/id_rsa.pub</b></span> file of the peer node. If it does not, manually add the information.</li><li id="ALM-12066__li497914322582">Check whether the permissions on the files in the <strong id="ALM-12066__b152771124143011">/home/omm/.ssh</strong> directory have been modified.</li><li id="ALM-12066__li8979193218587">Check the <strong id="ALM-12066__b2982446153018">/var/log/Bigdata/nodeagent/scriptlog/ssh-agent-monitor.log</strong> file for error information.</li><li id="ALM-12066__li3979632105814">If the <strong id="ALM-12066__b09816214325">/home</strong> directory of user <strong id="ALM-12066__b1171105173213">omm</strong> has been deleted, contact MRS support personnel for assistance.</li></ol>
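<p>The agent and permission checks in the steps above can be sketched as follows. The function names <strong>agent_status</strong> and <strong>loose_ssh_entries</strong> are hypothetical, the permission rules are general OpenSSH expectations rather than values stated in this document, and <strong>find -perm /mode</strong> assumes GNU find.</p>

```shell
#!/bin/sh
# Hypothetical helper: summarize the result of "ssh-add -l" using its
# documented exit status (0: identities listed, 1: command failed, for
# example because no identities are loaded, 2: agent not reachable).
agent_status() {
    rc=0
    ssh-add -l >/dev/null 2>&1 || rc=$?
    case "$rc" in
        0) echo "identities loaded" ;;
        1) echo "agent reachable, no identities listed" ;;
        2) echo "agent not reachable" ;;
        *) echo "unexpected exit status $rc" ;;
    esac
}

# Hypothetical helper: list entries that commonly cause OpenSSH to
# reject key-based login: anything writable by group or others, and a
# private key with any group or other permission bit set.
loose_ssh_entries() {
    ssh_dir="${1:-/home/omm/.ssh}"
    {
        find "$ssh_dir" -perm /022
        find "$ssh_dir" -name id_rsa -perm /077
    } | sort -u
}
```

<p>An empty result from <strong>loose_ssh_entries</strong> only means that none of these general conditions were found; it does not confirm that the permissions match the values set at installation.</p>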
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>