Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Reviewed-by: Rechenburg, Matthias <matthias.rechenburg@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="ALM-38010"></a><a name="ALM-38010"></a>
<h1 class="topictitle1">ALM-38010 Topics with Single Replica</h1>
<div id="body1557814980027"><div class="section" id="ALM-38010__section11741729163717"><h4 class="sectiontitle">Description</h4><p id="ALM-38010__p11256338133714">The system checks the number of replicas of each topic every 60 seconds on the node where the Kafka Controller resides. This alarm is generated when a topic has only one replica.</p>
</div>
<div class="section" id="ALM-38010__section593323953414"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-38010__table52765276110" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-38010__row17455727516"><th align="left" class="cellrowborder" valign="top" width="33.333333333333336%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-38010__p104553271711">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.333333333333336%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-38010__p34551127813">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.333333333333336%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-38010__p1945512710114">Automatically Cleared</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-38010__row245552718110"><td class="cellrowborder" valign="top" width="33.333333333333336%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-38010__p345582713114">38010</p>
</td>
<td class="cellrowborder" valign="top" width="33.333333333333336%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-38010__p9455182712120">Warning</p>
</td>
<td class="cellrowborder" valign="top" width="33.333333333333336%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-38010__p114551271214">No</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-38010__section6381153211354"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-38010__table1328820271811" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-38010__row545511271613"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-38010__p1045512712112">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-38010__p114551127219">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-38010__row677219234574"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-38010__p192431315431">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-38010__p692551319435">Specifies the cluster for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-38010__row154552271019"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-38010__p9455132713113">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-38010__p1345513270115">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-38010__row645514279110"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-38010__p645514272111">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-38010__p184550271316">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-38010__row14455182715116"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-38010__p1945519271514">TopicName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-38010__p682213710377">Specifies the list of topics for which the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-38010__section1629217271913"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-38010__p11236144316382">A topic with only one replica is a single point of failure (SPOF). If the node where the replica resides becomes abnormal, the partition has no leader, and services that use the topic are affected.</p>
</div>
<div class="section" id="ALM-38010__section729222715115"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-38010__ul93765703912"><li id="ALM-38010__li1621810219405">The number of replicas for the topic is incorrectly configured.</li></ul>
</div>
<div class="section" id="ALM-38010__section61801227174019"><h4 class="sectiontitle">Procedure</h4><p id="ALM-38010__p881510795715"><strong id="ALM-38010__b17734202545719">Check the number of replicas for the topic.</strong></p>
<ol id="ALM-38010__ol17527195434817"><li id="ALM-38010__li85278542481"><span>On <span id="ALM-38010__text34789336432">MRS</span> Manager, choose <strong id="ALM-38010__b17504113497">O&amp;M</strong> &gt; <strong id="ALM-38010__b194971159830">Alarm </strong>&gt;<strong id="ALM-38010__b94988595320"> <strong id="ALM-38010__b149814593314">Alarms</strong></strong>, click <span><img id="ALM-38010__image12740153301" src="en-us_image_0000001532448170.png"></span> of this alarm, and view the <strong id="ALM-38010__b1699515250564">TopicName </strong>list in <strong id="ALM-38010__b34312363594">Location</strong>.</span></li><li id="ALM-38010__li71212330522"><span>Check whether replicas need to be added for the topic for which the alarm is generated.</span><p><ul id="ALM-38010__ul1891113427523"><li id="ALM-38010__li13911194213529">If yes, go to <a href="#ALM-38010__li135931325311">3</a>.</li><li id="ALM-38010__li291154285210">If no, go to <a href="#ALM-38010__li477744715546">5</a>.</li></ul>
</p></li><li id="ALM-38010__li135931325311"><a name="ALM-38010__li135931325311"></a><a name="li135931325311"></a><span>On the <span id="ALM-38010__text3259104871714">MRS</span> client, re-plan the topic replicas and describe the new partition distribution of the topic in the <strong id="ALM-38010__b993192016538">add-replicas-reassignment.json</strong> file in the following format, where "replicas" lists the IDs of the brokers that are to hold the partition: {"partitions":[{"topic": "<em id="ALM-38010__i1993132085313">topic name</em>","partition": 1,"replicas": [1,2] }],"version":1}. Then, run the following command to add replicas:</span><p><p id="ALM-38010__p174173905"><strong id="ALM-38010__b7741193408">kafka-reassign-partitions.sh </strong><strong id="ALM-38010__b13741143602">--zookeeper </strong><em id="ALM-38010__i47411735017">{zk_host}:{port}</em><strong id="ALM-38010__b137411315013">/kafka</strong> <strong id="ALM-38010__b17741183401">--reassignment-json-file<em id="ALM-38010__i167412031409"> </em></strong><em id="ALM-38010__i187411931501">{manual assignment json file path}</em> <strong id="ALM-38010__b17411433014">--</strong><strong id="ALM-38010__b47414318010">execute</strong></p>
<p id="ALM-38010__p064754675314">For example:</p>
<p id="ALM-38010__p42913491539"><strong id="ALM-38010__b184831656175317"><span id="ALM-38010__ph1171442081313">/opt/client</span>/Kafka/kafka/bin/kafka-reassign-partitions.sh --zookeeper 192.168.0.90:2181,192.168.0.91:2181,192.168.0.92:2181/kafka --reassignment-json-file add-replicas-reassignment.json --execute</strong></p>
</p></li><li id="ALM-38010__li77044584539"><span>Run the following command to check the task execution progress:</span><p><p id="ALM-38010__p47838247418"><strong id="ALM-38010__b2078314241412">kafka-reassign-partitions.sh </strong><strong id="ALM-38010__b1378313241748">--zookeeper </strong><em id="ALM-38010__i1783142413419">{zk_host}:{port}</em><strong id="ALM-38010__b18783102418411">/kafka</strong> <strong id="ALM-38010__b187831824647">--reassignment-json-file<em id="ALM-38010__i47836241542"> </em></strong><em id="ALM-38010__i117831224542">{manual assignment json file path}</em> <strong id="ALM-38010__b2078314241240">--verify</strong></p>
<p id="ALM-38010__p3601153213543">For example:</p>
<p id="ALM-38010__p652564015547"><strong id="ALM-38010__b4525740125412"><span id="ALM-38010__ph381512063917">/opt/client</span>/Kafka/kafka/bin/kafka-reassign-partitions.sh --zookeeper 192.168.0.90:2181,192.168.0.91:2181,192.168.0.92:2181/kafka --reassignment-json-file add-replicas-reassignment.json --verify</strong></p>
</p></li><li id="ALM-38010__li477744715546"><a name="ALM-38010__li477744715546"></a><a name="li477744715546"></a><span>After completing the handling operations or confirming that the alarm has no impact, manually clear the alarm on <span id="ALM-38010__text15424250161718">MRS</span> Manager.</span></li><li id="ALM-38010__li126844814588"><span>After a period of time, check whether the alarm is cleared.</span><p><ul id="ALM-38010__ul177035543414"><li id="ALM-38010__li1877065513418">If it is, no further action is required.</li><li id="ALM-38010__li87708553342">If it is not, go to <a href="#ALM-38010__li266761095919">7</a>.</li></ul>
</p></li></ol>
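<p>The reassignment file described in step 3 can also be generated with a script when a topic has several partitions. The following is a minimal Python sketch, not part of MRS or Kafka. The function name build_reassignment, the topic name example-topic, and the broker IDs are hypothetical and serve only as an illustration.</p>

```python
import json


def build_reassignment(topic, assignments):
    """Build a reassignment plan in the format shown in step 3.

    assignments maps a partition number to the list of broker IDs
    that are to hold the replicas of that partition.
    """
    partitions = []
    for partition, brokers in sorted(assignments.items()):
        # A partition cannot have two replicas on the same broker.
        if len(set(brokers)) != len(brokers):
            raise ValueError(f"partition {partition}: duplicate broker ID")
        # The goal is to get rid of single-replica partitions.
        if len(brokers) in (0, 1):
            raise ValueError(f"partition {partition}: needs at least two replicas")
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": list(brokers)}
        )
    return {"partitions": partitions, "version": 1}


# Hypothetical topic name and broker IDs; replace them with real values.
plan = build_reassignment("example-topic", {0: [1, 2], 1: [2, 3]})
print(json.dumps(plan))
```

<p>Saving the printed JSON as add-replicas-reassignment.json gives a file that can be passed to the --reassignment-json-file option used in step 3.</p>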
<p class="tableheading" id="ALM-38010__p86548109592"><strong id="ALM-38010__b55150482154417">Collect fault information.</strong></p>
<ol start="7" id="ALM-38010__ol3667181045915"><li id="ALM-38010__li266761095919"><a name="ALM-38010__li266761095919"></a><a name="li266761095919"></a><span>On <span id="ALM-38010__text118791551101718">MRS</span> Manager, choose <strong id="ALM-38010__b966741016594">O&amp;M </strong>&gt;<strong id="ALM-38010__b1666711005918"> Log</strong> &gt; <strong id="ALM-38010__b0667191025914">Download</strong>.</span></li><li id="ALM-38010__li14667131017593"><span>In the <strong id="ALM-38010__b26671105590">Service </strong>area, select <strong id="ALM-38010__b16671610145910">Kafka</strong> in the required cluster.</span></li><li id="ALM-38010__li9667151005910"><span>Click <span><img id="ALM-38010__image2066761014594" src="en-us_image_0000001582927549.png"></span> in the upper right corner, and set <strong id="ALM-38010__b1667101011598">Start Date</strong> and <strong id="ALM-38010__b4667141045920">End Date</strong> for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click <strong id="ALM-38010__b9667310205913">Download</strong>.</span></li><li id="ALM-38010__li15667151018595"><span>Contact the <span id="ALM-38010__text46671510125918">O&amp;M personnel</span> and send them the collected logs.</span></li></ol>
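<p>The log collection window is the alarm generation time minus and plus 10 minutes. As a small illustration, the following Python sketch computes both values. The function name log_window and the alarm time are hypothetical.</p>

```python
from datetime import datetime, timedelta


def log_window(alarm_time, margin_minutes=10):
    """Return the (start, end) pair surrounding the alarm generation time."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Hypothetical alarm generation time; use the time shown for the alarm.
alarm_time = datetime(2024, 1, 15, 14, 5)
start, end = log_window(alarm_time)
print("Start Date:", start.strftime("%Y-%m-%d %H:%M"))  # 2024-01-15 13:55
print("End Date:", end.strftime("%Y-%m-%d %H:%M"))      # 2024-01-15 14:15
```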
</div>
<div class="section" id="ALM-38010__section8671310154820"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-38010__p1667131014489">If the alarm has no impact, manually clear the alarm.</p>
</div>
<div class="section" id="ALM-38010__section153180271714"><h4 class="sectiontitle">Related Information</h4><p id="ALM-38010__p17459172711115">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>