doc-exports/docs/css/umn/css_01_0427.html
zhengxiu 2125539080 css umn 25.1.0 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: zhengxiu <zhengxiu@huawei.com>
Co-committed-by: zhengxiu <zhengxiu@huawei.com>
2025-07-04 09:10:17 +00:00


<a name="css_01_0427"></a><a name="css_01_0427"></a>
<h1 class="topictitle1">Configuring Kernel Monitoring for an Elasticsearch Cluster</h1>
<div id="body0000001386314441"><div class="section" id="css_01_0427__section1466155714198"><h4 class="sectiontitle">Scenario</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="css_01_0427__table9434132712317" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Introduction to cluster kernel monitoring</caption><thead align="left"><tr id="css_01_0427__row4434527102316"><th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.1.2.2.5.1.1"><p id="css_01_0427__p11434172716231">Enhanced Monitoring Feature</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="40%" id="mcps1.3.1.2.2.5.1.2"><p id="css_01_0427__p29081437172318">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.1.2.2.5.1.3"><p id="css_01_0427__p10434182720238">Cluster Version</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.1.2.2.5.1.4"><p id="css_01_0427__p2434927122310">Details</p>
</th>
</tr>
</thead>
<tbody><tr id="css_01_0427__row1943412715235"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.1.2.2.5.1.1 "><p id="css_01_0427__p10434122752316">P99 latency</p>
</td>
<td class="cellrowborder" valign="top" width="40%" headers="mcps1.3.1.2.2.5.1.2 "><p id="css_01_0427__p17345101319367">Open-source Elasticsearch reports only the average latency of search requests, which may not accurately reflect the actual search performance of a cluster. CSS adds a P99 latency metric that tracks the 99th percentile of search latency for each cluster.</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.1.2.2.5.1.3 "><p id="css_01_0427__p143531918718">Elasticsearch 7.6.2, Elasticsearch 7.10.2</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.1.2.2.5.1.4 "><p id="css_01_0427__p1434192702317"><a href="#css_01_0427__section7236204762019">Monitoring P99 Latency</a></p>
</td>
</tr>
<tr id="css_01_0427__row12434132792314"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.1.2.2.5.1.1 "><p id="css_01_0427__p1743482710239">HTTP status codes</p>
</td>
<td class="cellrowborder" valign="top" width="40%" headers="mcps1.3.1.2.2.5.1.2 "><p id="css_01_0427__p5549314412">Requests sent to Elasticsearch over HTTP return HTTP status codes, but open-source Elasticsearch does not collect statistics on them. CSS adds HTTP status code monitoring so that you can track how often each status code is returned and assess how the service is running.</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.1.2.2.5.1.3 "><p id="css_01_0427__p1576663611254">Elasticsearch 7.6.2, Elasticsearch 7.10.2</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.1.2.2.5.1.4 "><p id="css_01_0427__p843422782312"><a href="#css_01_0427__section1367052018215">Monitoring HTTP Status Codes</a></p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="css_01_0427__section14450183207"><h4 class="sectiontitle">Accessing a Cluster</h4><ol id="css_01_0427__css_01_0405_ol124574417618"><li id="css_01_0427__css_01_0405_en-us_topic_0000001223594408_li1274916552817">Log in to the CSS management console.</li><li id="css_01_0427__css_01_0405_en-us_topic_0000001223594408_li1274935516811">On the <strong id="css_01_0427__css_01_0405_b83771825103117">Clusters</strong> page, locate the target cluster, and click <strong id="css_01_0427__css_01_0405_b189171427495">Access Kibana</strong> in the <strong id="css_01_0427__css_01_0405_b19896191614314">Operation</strong> column.</li><li id="css_01_0427__css_01_0405_en-us_topic_0000001223594408_li927171291011">In Kibana, click <strong id="css_01_0427__css_01_0405_b199439323210">Dev Tools</strong> in the navigation tree on the left.</li></ol>
</div>
<div class="section" id="css_01_0427__section7236204762019"><a name="css_01_0427__section7236204762019"></a><a name="section7236204762019"></a><h4 class="sectiontitle">Monitoring P99 Latency</h4><div class="p" id="css_01_0427__p18466175123417">Run the following command to obtain the P99 latency of the current cluster:<pre class="screen" id="css_01_0427__en-us_topic_0000001335994820_screen79142157521">GET /search/stats/percentile </pre>
</div>
<p id="css_01_0427__en-us_topic_0000001335994820_p1291461510526">An example output is as follows:</p>
<pre class="screen" id="css_01_0427__en-us_topic_0000001335994820_screen179141815135212">{
  "overall" : {
    "1.0" : 2.0,
    "5.0" : 2.0,
    "25.0" : 6.5,
    "50.0" : 19.5,
    "75.0" : 111.0,
    "95.0" : 169.0,
    "99.0" : 169.0,
    "max" : 169.0,
    "min" : 2.0
  },
  "last_one_day" : {
    "1.0" : 2.0,
    "5.0" : 2.0,
    "25.0" : 6.5,
    "50.0" : 19.5,
    "75.0" : 111.0,
    "95.0" : 169.0,
    "99.0" : 169.0,
    "max" : 169.0,
    "min" : 2.0
  },
  "latest" : {
    "1.0" : 26.0,
    "5.0" : 26.0,
    "25.0" : 26.0,
    "50.0" : 26.0,
    "75.0" : 26.0,
    "95.0" : 26.0,
    "99.0" : 26.0,
    "max" : 26.0,
    "min" : 26.0
  }
}</pre>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="css_01_0427__table203542219296" frame="border" border="1" rules="all"><caption><b>Table 2 </b>Response parameters</caption><thead align="left"><tr id="css_01_0427__row18351322182915"><th align="left" class="cellrowborder" valign="top" width="30%" id="mcps1.3.3.5.2.3.1.1"><p id="css_01_0427__p1436202212918">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="70%" id="mcps1.3.3.5.2.3.1.2"><p id="css_01_0427__p14362022182917">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="css_01_0427__row236152212295"><td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.3.5.2.3.1.1 "><p id="css_01_0427__p43617227292">overall</p>
</td>
<td class="cellrowborder" valign="top" width="70%" headers="mcps1.3.3.5.2.3.1.2 "><p id="css_01_0427__p836102213292">Statistics collected from cluster startup to the current time.</p>
</td>
</tr>
<tr id="css_01_0427__row1036192292917"><td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.3.5.2.3.1.1 "><p id="css_01_0427__p153617228299">last_one_day</p>
</td>
<td class="cellrowborder" valign="top" width="70%" headers="mcps1.3.3.5.2.3.1.2 "><p id="css_01_0427__p123617225295">Statistics for the most recent day.</p>
</td>
</tr>
<tr id="css_01_0427__row9362228298"><td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.3.5.2.3.1.1 "><p id="css_01_0427__p03618223291">latest</p>
</td>
<td class="cellrowborder" valign="top" width="70%" headers="mcps1.3.3.5.2.3.1.2 "><p id="css_01_0427__p1336202217299">Statistics collected from the most recent reset to the current time.</p>
</td>
</tr>
</tbody>
</table>
</div>
<div class="note" id="css_01_0427__en-us_topic_0000001335994820_note1324917115539"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="css_01_0427__en-us_topic_0000001335994820_ul1711710675312"><li id="css_01_0427__en-us_topic_0000001335994820_li1184510523535">The calculated P99 latency is an estimate, but it is more precise than the P50 latency.</li><li id="css_01_0427__en-us_topic_0000001335994820_li11211650135617">Restarting a cluster clears its P99 latency data. Measurement starts over after the cluster has restarted successfully.</li></ul>
</div></div>
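<p>As a rough illustration of how the response parameters in Table 2 can be read programmatically, the following Python sketch extracts one percentile from each statistics window. The sample data is taken from the example output above; the helper name <b>percentile_by_window</b> is hypothetical and is not part of CSS or Elasticsearch.</p>

```python
import json

# Sample response copied from the example output above (trimmed to a few keys).
SAMPLE = """
{
  "overall": {"50.0": 19.5, "99.0": 169.0, "max": 169.0, "min": 2.0},
  "last_one_day": {"50.0": 19.5, "99.0": 169.0, "max": 169.0, "min": 2.0},
  "latest": {"50.0": 26.0, "99.0": 26.0, "max": 26.0, "min": 26.0}
}
"""


def percentile_by_window(response, percentile="99.0"):
    """Return {window: latency} for every window that reports the percentile.

    Hypothetical helper for illustration only. Keys such as "99.0" are
    strings because that is how they appear in the example output.
    """
    return {
        window: stats[percentile]
        for window, stats in response.items()
        if isinstance(stats, dict) and percentile in stats
    }


p99 = percentile_by_window(json.loads(SAMPLE))
print(p99)  # {'overall': 169.0, 'last_one_day': 169.0, 'latest': 26.0}
```

<p>Comparing <b>latest</b> with <b>overall</b> in this way shows whether latency since the most recent reset differs from the long-running figure.</p>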
<p id="css_01_0427__p183422303345">The P99 latency monitoring commands also support the following options:</p>
<ul id="css_01_0427__en-us_topic_0000001335994820_ul45331944145717"><li id="css_01_0427__en-us_topic_0000001335994820_li153314425717">You can customize the percentile of latency to be monitored.<div class="p" id="css_01_0427__en-us_topic_0000001335994820_p48411355165811"><a name="css_01_0427__en-us_topic_0000001335994820_li153314425717"></a><a name="en-us_topic_0000001335994820_li153314425717"></a>For example, run the following command to show the P1, P50, and P90 latency values:<pre class="screen" id="css_01_0427__en-us_topic_0000001335994820_screen841112813593">GET /search/stats/percentile
{
"percents": [1, 50, 90]
}</pre>
</div>
</li><li id="css_01_0427__en-us_topic_0000001335994820_li557220521582">You can manually reset the <strong id="css_01_0427__b32921054173710">latest</strong> statistics.<div class="p" id="css_01_0427__en-us_topic_0000001335994820_p38695386591">Run the following command to reset the <strong id="css_01_0427__b16758635123719">latest</strong> statistics:<pre class="screen" id="css_01_0427__en-us_topic_0000001335994820_screen151133713593">POST /search/stats/reset</pre>
</div>
<p id="css_01_0427__en-us_topic_0000001335994820_p1351153720595">If <span class="parmvalue" id="css_01_0427__parmvalue1931440183711"><b>ok</b></span> is returned, the reset is successful.</p>
<pre class="screen" id="css_01_0427__en-us_topic_0000001335994820_screen19511437115920">{
  "nodes" : {
    "css-c9c8-ess-esn-1-1" : "ok"
  }
}</pre>
</li></ul>
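<p>The two options above can be sketched in Python as follows. This is a minimal illustration, not an official client: both helper names are hypothetical, and the accepted range for percentile values is an assumption because it is not stated here.</p>

```python
def percentile_request_body(percents):
    """Build the JSON body sent with GET /search/stats/percentile."""
    values = list(percents)
    for p in values:
        # Assumption: percentiles lie in [0, 100]; the accepted range is
        # not documented in this topic.
        if not 0 <= p <= 100:
            raise ValueError(f"percentile out of range: {p}")
    return {"percents": values}


def nodes_not_reset(response):
    """Return the sorted names of nodes whose reset result is not "ok"."""
    return sorted(
        name
        for name, result in response.get("nodes", {}).items()
        if result != "ok"
    )


body = percentile_request_body([1, 50, 90])
print(body)  # {'percents': [1, 50, 90]}

# Reset response copied from the example above.
reset_response = {"nodes": {"css-c9c8-ess-esn-1-1": "ok"}}
print(nodes_not_reset(reset_response))  # []
```

<p>An empty list means that every node listed in the response reported <b>ok</b>.</p>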
</div>
<div class="section" id="css_01_0427__section1367052018215"><a name="css_01_0427__section1367052018215"></a><a name="section1367052018215"></a><h4 class="sectiontitle">Monitoring HTTP Status Codes</h4><div class="p" id="css_01_0427__p116201516123919">The command used for monitoring HTTP status codes depends on the cluster version.<ul id="css_01_0427__en-us_topic_0000001336314380_ul1428164221417"><li id="css_01_0427__en-us_topic_0000001336314380_li2028124241413">In an Elasticsearch 7.6.2 cluster, run the following command to obtain statistics on HTTP status codes:<pre class="screen" id="css_01_0427__en-us_topic_0000001336314380_screen1157216529158">GET /_nodes/http_stats</pre>
<p id="css_01_0427__en-us_topic_0000001336314380_p314812429194">Example response:</p>
<pre class="screen" id="css_01_0427__en-us_topic_0000001336314380_screen16192181232119">{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "css-8362",
  "nodes" : {
    "F9IFdQPARaOJI7oL7HOXtQ" : {
      "http_code" : {
        "200" : 114,
        "201" : 5,
        "429" : 0,
        "400" : 7,
        "404" : 0,
        "405" : 0
      }
    }
  }
}</pre>
</li><li id="css_01_0427__en-us_topic_0000001336314380_li13175163410154">In an Elasticsearch 7.10.2 cluster, run the following command to obtain statistics on HTTP status codes:<pre class="screen" id="css_01_0427__en-us_topic_0000001336314380_screen9264102616180">GET /_nodes/stats/http</pre>
<p id="css_01_0427__en-us_topic_0000001336314380_p7264226121819">Example response:</p>
<pre class="screen" id="css_01_0427__en-us_topic_0000001336314380_screen19264172611181">{
  ......
  "cluster_name" : "css-2985",
  "nodes" : {
    ......
    "omvR9_W-TsGApraMApREjA" : {
      ......
      "http" : {
        "current_open" : 4,
        "total_opened" : 37,
        "http_code" : {
          "200" : 25,
          "201" : 7,
          "429" : 0,
          "400" : 3,
          "404" : 0,
          "405" : 0
        }
      }
    }
  }
}</pre>
</li></ul>
</div>
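<p>The two versions place <b>http_code</b> at different depths in the response: directly under each node in 7.6.2, and under the node's <b>http</b> object in 7.10.2. The following Python sketch, offered only as an illustration, totals the counters across nodes for either layout. The sample data is copied from the example responses above with the elided parts left out; the helper names are hypothetical.</p>

```python
from collections import Counter

# Samples copied from the example responses above (elided parts omitted).
V762 = {"nodes": {"F9IFdQPARaOJI7oL7HOXtQ": {"http_code": {
    "200": 114, "201": 5, "429": 0, "400": 7, "404": 0, "405": 0}}}}
V7102 = {"nodes": {"omvR9_W-TsGApraMApREjA": {"http": {
    "current_open": 4, "total_opened": 37, "http_code": {
        "200": 25, "201": 7, "429": 0, "400": 3, "404": 0, "405": 0}}}}}


def http_code_totals(response):
    """Sum the per-node http_code counters for either response layout."""
    totals = Counter()
    for node in response.get("nodes", {}).values():
        # 7.6.2 layout: nodes.<id>.http_code
        # 7.10.2 layout: nodes.<id>.http.http_code
        codes = node.get("http_code") or node.get("http", {}).get("http_code", {})
        for code, count in codes.items():
            totals[code] += count
    return totals


def client_error_share(totals):
    """Return the fraction of counted responses whose code starts with 4."""
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return sum(n for code, n in totals.items() if code.startswith("4")) / total


for name, sample in (("7.6.2", V762), ("7.10.2", V7102)):
    totals = http_code_totals(sample)
    print(name, dict(totals), round(client_error_share(totals), 4))
```

<p>For the 7.6.2 sample, 7 of the 126 counted responses have a 4xx code. For the 7.10.2 sample, the figure is 3 of 35.</p>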
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="css_01_0426.html">Configuring Elasticsearch Cluster Monitoring</a></div>
</div>
</div>