forked from docs/doc-exports
Reviewed-by: Gergo-Bence Lorincz <a200452876@noreply.gitea.eco.tsi-dev.otc-service.com> Co-authored-by: qiujiandong1 <qiujiandong1@huawei.com> Co-committed-by: qiujiandong1 <qiujiandong1@huawei.com>
82 lines
23 KiB
HTML
<a name="cce_10_0809"></a><a name="cce_10_0809"></a>
<h1 class="topictitle1">Logging FAQ</h1>
<div id="body0000001756936041"><div class="section" id="cce_10_0809__section147717212212"><h4 class="sectiontitle">Indexes</h4><ul id="cce_10_0809__ul13834112714211"><li id="cce_10_0809__li14834927325"><a href="#cce_10_0809__section79501212173811">How Do I Disable Logging?</a></li><li id="cce_10_0809__li828117421724"><a href="#cce_10_0809__section4325193116367">What Can I Do If All Components Except log-operator Are Not Ready?</a></li><li id="cce_10_0809__li5899354927"><a href="#cce_10_0809__section1674113819365">How Do I Handle the Error in Stdout Logs of log-operator?</a></li><li id="cce_10_0809__li818121015315"><a href="#cce_10_0809__section15385204383613">What Can I Do If Container File Logs Cannot Be Collected When Docker Is Used as the Container Engine?</a></li><li id="cce_10_0809__li8349194714616"><a href="#cce_10_0809__section011012583364">What Can I Do If Container File Logs Cannot Be Collected Due to the Wildcard in the Collection Directory?</a></li><li id="cce_10_0809__li1285959963"><a href="#cce_10_0809__section1880322183714">What Can I Do If fluent-bit Pod Keeps Restarting?</a></li><li id="cce_10_0809__li12970142416711"><a href="#cce_10_0809__section97887823715">What Can I Do If Job Logs Cannot Be Collected?</a></li><li id="cce_10_0809__li12993113914718"><a href="#cce_10_0809__section1571017353814">What Can I Do If the Cloud Native Log Collection Add-on Is Running Normally but Some Log Collection Policies Do Not Take Effect?</a></li><li id="cce_10_0809__li1674181112813"><a href="#cce_10_0809__section1990316243587">What Can I Do If Some Pod Information Is Missing During Log Collection Due to Excessive Node Load?</a></li><li id="cce_10_0809__li9982231385"><a href="#cce_10_0809__section1053145216227">How Do I Change the Log Storage Period on Logging?</a></li><li id="cce_10_0809__li571281610710"><a href="#cce_10_0809__section494412903313">What Can I Do If the Log Group or Stream Specified in the Log Collection Policy Does Not Exist?</a></li><li 
id="cce_10_0809__li3971151281316"><a href="#cce_10_0809__section242432981511">What Can I Do If FailedAssignENI Is Generated When a Node Is Created in a CCE Turbo Cluster?</a></li></ul>
</div>
<div class="section" id="cce_10_0809__section79501212173811"><a name="cce_10_0809__section79501212173811"></a><a name="section79501212173811"></a><h4 class="sectiontitle">How Do I Disable <span id="cce_10_0809__text1164315336613">Logging</span>?</h4><p id="cce_10_0809__p14431133894612"><strong id="cce_10_0809__b5605310104715">Disabling container log and Kubernetes event collection</strong></p>
<p id="cce_10_0809__p16958145254910">Method 1: Log in to the CCE console and click the cluster name to access the cluster console. In the navigation pane, choose <strong id="cce_10_0809__b186941731428">Logging</strong>. In the upper right corner, click <strong id="cce_10_0809__b8326426204912">View Log Policy</strong>. Then, locate and delete the corresponding log collection policy. </p>
<p id="cce_10_0809__p322142415210">Method 2: Access the <strong id="cce_10_0809__b14510473508">Add-ons</strong> page and uninstall the Cloud Native Log Collection add-on. <strong id="cce_10_0809__b1352681335119">Note that after you uninstall this add-on, it will no longer report Kubernetes events to AOM.</strong></p>
<p id="cce_10_0809__p521703113532"></p>
<p id="cce_10_0809__p433984175319"><strong id="cce_10_0809__b18399205610525">Disabling log collection for control plane components</strong></p>
<p id="cce_10_0809__p174109935514">Choose <strong id="cce_10_0809__b1036841015418">Logging</strong> &gt; <strong id="cce_10_0809__b18402272549">Control Plane Logs</strong> and deselect one or more components whose logs do not need to be collected.</p>
<p id="cce_10_0809__p10155246105511"><strong id="cce_10_0809__b15786124685210">Disabling <span id="cce_10_0809__ph127879615124">Kubernetes audit log</span> collection</strong></p>
<p id="cce_10_0809__p1783671625613">Choose <span id="cce_10_0809__ph1142692911101"><strong id="cce_10_0809__b1470817560524">Logging</strong> &gt; <strong id="cce_10_0809__b97081456105211">Kubernetes Audit Logs</strong></span> and deselect the component whose logs do not need to be collected.</p>
</div>
<div class="section" id="cce_10_0809__section4325193116367"><a name="cce_10_0809__section4325193116367"></a><a name="section4325193116367"></a><h4 class="sectiontitle">What Can I Do If All Components Except log-operator Are Not Ready?</h4><p id="cce_10_0809__p78701837173415"><strong id="cce_10_0809__b96601129203015">Symptom</strong>: All components except log-operator are not ready, and the volume failed to be mounted to the node.</p>
<p id="cce_10_0809__p16325163113363"><strong id="cce_10_0809__b1032553119366">Solution</strong>: Check the logs of log-operator. During add-on installation, log-operator generates the configuration files that the other components require. If these configuration files are invalid, none of the other components can start.</p>
<p id="cce_10_0809__p17325193114368">The log information is as follows:</p>
<pre class="screen" id="cce_10_0809__screen332583111366">MountVolume.SetUp failed for volume "otel-collector-config-vol":configmap "log-agent-otel-collector-config" not found</pre>
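<p>The same check can be sketched from the command line. This assumes kubectl access to the cluster and that the add-on runs in the <strong>monitoring</strong> namespace, which is the namespace this FAQ uses for log-operator. That the ConfigMap lives in the same namespace is an assumption.</p>
<pre class="screen"># Show recent stdout logs of log-operator (deployment name taken from this FAQ).
kubectl logs -n monitoring deploy/log-agent-log-operator --tail=100

# Check whether the ConfigMap named in the error has been generated
# (namespace is an assumption).
kubectl get configmap -n monitoring log-agent-otel-collector-config</pre>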
</div>
<div class="section" id="cce_10_0809__section1674113819365"><a name="cce_10_0809__section1674113819365"></a><a name="section1674113819365"></a><h4 class="sectiontitle">How Do I Handle the Error in Stdout Logs of log-operator?</h4><p id="cce_10_0809__p16673638133616"><strong id="cce_10_0809__b134531892301">Symptom</strong>:</p>
<pre class="screen" id="cce_10_0809__screen267353818368">2023/05/05 12:17:20.799 [E] call 3 times failed, reason: create group failed, projectID: xxx, groupName: k8s-log-xxx, err: create groups status code: 400, response: {"error_code":"LTS.0104","error_msg":"<strong id="cce_10_0809__b182871725133512">Failed to create log group, the number of log groups exceeds the quota</strong>"}, url: https://lts.***.com/v2/xxx/groups, process will retry after 45s</pre>
<p id="cce_10_0809__p6673143812360"><strong id="cce_10_0809__b12673838163619">Solution</strong>: LTS limits the number of log groups that can be created. If this error occurs, go to the LTS console and delete log groups that are no longer needed.</p>
</div>
<div class="section" id="cce_10_0809__section15385204383613"><a name="cce_10_0809__section15385204383613"></a><a name="section15385204383613"></a><h4 class="sectiontitle">What Can I Do If Container File Logs Cannot Be Collected When Docker Is Used as the Container Engine?</h4><p id="cce_10_0809__p265116593616"><strong id="cce_10_0809__b271415159302">Symptom</strong>:</p>
<p id="cce_10_0809__p1036207103612">A container file path is configured but is not mounted to the container, and Docker is used as the container engine. As a result, logs cannot be collected.</p>
<p id="cce_10_0809__p19384243123619"><strong id="cce_10_0809__b17384743133619">Solution</strong>:</p>
<p id="cce_10_0809__p53847438364">Check whether the node where the workload resides uses the Device Mapper storage driver. Device Mapper does not support text log collection. (This restriction is displayed when you create a log collection policy.) To check this, perform the following operations:</p>
<ol id="cce_10_0809__ol1438454353620"><li id="cce_10_0809__li4384114312367">Log in to the node where the workload resides.</li><li id="cce_10_0809__li938410436365">Run the <strong id="cce_10_0809__b18384174333612">docker info | grep "Storage Driver"</strong> command.</li><li id="cce_10_0809__li10384174303611">If the value of <strong id="cce_10_0809__b51131696451933">Storage Driver</strong> is <strong id="cce_10_0809__b203609159651933">Device Mapper</strong> (shown as <strong>devicemapper</strong> in the command output), text logs cannot be collected.</li></ol>
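<p>As an alternative to step 2, the following minimal sketch prints only the driver name. It assumes a Docker version whose <strong>docker info</strong> command supports the <strong>--format</strong> option.</p>
<pre class="screen"># Print only the storage driver used by Docker on this node.
docker info --format '{{.Driver}}'</pre>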
</div>
<div class="section" id="cce_10_0809__section011012583364"><a name="cce_10_0809__section011012583364"></a><a name="section011012583364"></a><h4 class="sectiontitle">What Can I Do If Container File Logs Cannot Be Collected Due to the Wildcard in the Collection Directory?</h4><p id="cce_10_0809__p16109165818369"><strong id="cce_10_0809__b7109155816364">Troubleshooting</strong>: Check the volume mounting status in the workload configuration. If a volume is attached to the data directory of a service container, this add-on cannot collect data from the parent directory of that data directory. Set the collection directory to the complete data directory instead. For example, if the data volume is attached to the <strong id="cce_10_0809__b50012183651933">/var/log/service</strong> directory, logs cannot be collected from the <strong id="cce_10_0809__b122804444751933">/var/log</strong> or <strong id="cce_10_0809__b5580291551933">/var/log/*</strong> directory, so set the collection directory to <strong id="cce_10_0809__b137921912751933">/var/log/service</strong>.</p>
<p id="cce_10_0809__p1910916588362"><strong id="cce_10_0809__b89459745051933">Solution</strong>: If the log generation directory is <strong id="cce_10_0809__b110417886851933">/application/logs/</strong><em id="cce_10_0809__i145965667651933">{Application name}</em><strong id="cce_10_0809__b83595131851933">/*.log</strong>, attach the data volume to the <strong id="cce_10_0809__b40141966351933">/application/logs</strong> directory and set the collection directory in the log collection policy to <strong id="cce_10_0809__b143512884051933">/application/logs/*/*.log</strong>.</p>
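<p>To see which directories have volumes mounted in a pod, a sketch such as the following can help. <strong>POD</strong> and <strong>NS</strong> are placeholder variables for the pod name and namespace and are not values from this FAQ.</p>
<pre class="screen">POD=my-app-pod   # hypothetical pod name
NS=default       # hypothetical namespace

# List the mount paths of all containers in the pod.
kubectl get pod "$POD" -n "$NS" -o jsonpath='{.spec.containers[*].volumeMounts[*].mountPath}'</pre>
<p>If one of the printed paths is, for example, /var/log/service, a collection directory of /var/log or /var/log/* will not work, as described above.</p>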
</div>
<div class="section" id="cce_10_0809__section1880322183714"><a name="cce_10_0809__section1880322183714"></a><a name="section1880322183714"></a><h4 class="sectiontitle">What Can I Do If fluent-bit Pod Keeps Restarting?</h4><p id="cce_10_0809__p98021423375"><strong id="cce_10_0809__b42256419951933">Troubleshooting</strong>: Run the <strong id="cce_10_0809__b82134431551933">kubectl describe pod</strong> command to check the fluent-bit pod. The output shows that the pod was restarted due to out of memory (OOM). This happens when there are a large number of evicted pods on the node where fluent-bit resides: the evicted pods occupy resources, which causes OOM.</p>
<p id="cce_10_0809__p980215273712"><strong id="cce_10_0809__b38021926375">Solution</strong>: Delete the evicted pods from the node.</p>
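<p>The checks and the cleanup can be sketched with kubectl as follows. <strong>NODE</strong> and <strong>POD</strong> are placeholders. The <strong>monitoring</strong> namespace is taken from the log-operator command in this FAQ and is assumed to also hold the fluent-bit pods.</p>
<pre class="screen">NODE=my-node                     # hypothetical node name
POD=log-agent-fluent-bit-xxxxx   # hypothetical pod name

# Show why the containers of the pod last terminated (OOMKilled indicates OOM).
kubectl get pod "$POD" -n monitoring -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'

# List evicted pods on the node.
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName="$NODE" | grep Evicted

# Delete failed pods on the node. This removes every pod in the Failed phase,
# which includes evicted pods but may also include other failed pods.
kubectl delete pods --all-namespaces --field-selector status.phase=Failed,spec.nodeName="$NODE"</pre>
<p>Review the list before running the delete command, because it is not limited to evicted pods.</p>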
</div>
<div class="section" id="cce_10_0809__section97887823715"><a name="cce_10_0809__section97887823715"></a><a name="section97887823715"></a><h4 class="sectiontitle">What Can I Do If Job Logs Cannot Be Collected?</h4><p id="cce_10_0809__p197881687373"><strong id="cce_10_0809__b478818133716">Troubleshooting</strong>: Check the job lifetime. If the job lifetime is less than 1 minute, the pod is destroyed before its logs are collected, so the logs cannot be collected.</p>
<p id="cce_10_0809__p17788208143719"><strong id="cce_10_0809__b778814820377">Solution</strong>: Prolong the job lifetime.</p>
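<p>To check how long a job ran, its start and completion times can be compared. The following is a minimal sketch in which <strong>JOB</strong> and <strong>NS</strong> are placeholders, not values from this FAQ.</p>
<pre class="screen">JOB=my-job   # hypothetical job name
NS=default   # hypothetical namespace

# Print the start time and completion time of the job.
kubectl get job "$JOB" -n "$NS" -o jsonpath='{.status.startTime}{" "}{.status.completionTime}{"\n"}'</pre>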
</div>
<div class="section" id="cce_10_0809__section1571017353814"><a name="cce_10_0809__section1571017353814"></a><a name="section1571017353814"></a><h4 class="sectiontitle">What Can I Do If the Cloud Native Log Collection Add-on Is Running Normally but Some Log Collection Policies Do Not Take Effect?</h4><p id="cce_10_0809__p13345103216388"><strong id="cce_10_0809__b184931954153811">Solution</strong>:</p>
<ul id="cce_10_0809__ul1847611202519"><li id="cce_10_0809__li1476120155119">If a log collection policy of the event type does not take effect, or the add-on version is earlier than 1.5.0, check the stdout of the log-agent-otel-collector workload.<p id="cce_10_0809__p64001157549"><a name="cce_10_0809__li1476120155119"></a><a name="li1476120155119"></a>Go to the <strong id="cce_10_0809__b543416353268">Add-ons</strong> page and click <strong id="cce_10_0809__b114341835172618">Cloud Native Log Collection</strong>. Then, click the <strong id="cce_10_0809__b543413354266">Pods</strong> tab, locate <strong id="cce_10_0809__b1363017742916">log-agent-otel-collector</strong>, and choose <strong id="cce_10_0809__b643413517269">More</strong> &gt; <strong id="cce_10_0809__b14434173519264">View Log</strong> in the <strong id="cce_10_0809__b1843433520262">Operation</strong> column.</p>
<p id="cce_10_0809__p453410472122"></p>
</li><li id="cce_10_0809__li10476122018518">If a log collection policy of another type does not take effect and the add-on version is later than 1.5.0, check the log of log-agent-fluent-bit on the node where the container to be monitored is running.<p id="cce_10_0809__p137081818121114"><a name="cce_10_0809__li10476122018518"></a><a name="li10476122018518"></a>Go to the <strong id="cce_10_0809__b104997287306">Add-ons</strong> page and click <strong id="cce_10_0809__b1499122893017">Cloud Native Log Collection</strong>. Then, click the <strong id="cce_10_0809__b11499122853019">Pods</strong> tab, locate <strong id="cce_10_0809__b145721333123012">log-agent-fluent-bit</strong>, and choose <strong id="cce_10_0809__b155001528153016">More</strong> &gt; <strong id="cce_10_0809__b55008283305">View Log</strong> in the <strong id="cce_10_0809__b6500328173012">Operation</strong> column.</p>
<p id="cce_10_0809__p1162216324137"></p>
<p id="cce_10_0809__p9388145419910">Select the fluent-bit container, search for the keyword "fail to push {event/log} data via lts exporter" in the log, and view the error message.</p>
<div class="p" id="cce_10_0809__p771714202310"><ol id="cce_10_0809__ol01414104720"><li id="cce_10_0809__li1114111010712">If the error message "The log streamId does not exist." is displayed, the log group or log stream does not exist. In this case, choose <strong id="cce_10_0809__b75787542287">Logging</strong> &gt; <strong id="cce_10_0809__b1218610192915">View Log Policy</strong>. Then, either edit the log collection policy, or delete it and create a new one, so that the policy uses an existing log group or log stream.</li><li id="cce_10_0809__li17141151018720">For other errors, look up the error code in LTS to identify the cause.</li></ol>
</div>
</li></ul>
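<p>The log checks above can also be sketched with kubectl instead of the console. This assumes that the add-on pods run in the <strong>monitoring</strong> namespace (the namespace this FAQ uses for log-operator) and that pod names start with the workload names given above. <strong>POD</strong> is a placeholder.</p>
<pre class="screen"># Find the add-on pods and the nodes they run on.
kubectl get pods -n monitoring -o wide | grep log-agent

POD=log-agent-fluent-bit-xxxxx   # hypothetical pod name copied from the output above

# Search the fluent-bit container log for push failures.
kubectl logs "$POD" -n monitoring -c fluent-bit | grep -E "fail to push (event|log) data via lts exporter"</pre>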
</div>
<div class="section" id="cce_10_0809__section1990316243587"><a name="cce_10_0809__section1990316243587"></a><a name="section1990316243587"></a><h4 class="sectiontitle">What Can I Do If Some Pod Information Is Missing During Log Collection Due to Excessive Node Load?</h4><p id="cce_10_0809__p165142035145819">When the Cloud Native Log Collection add-on version is later than 1.5.0, some pod information, such as the pod ID and name, is missing from container file logs or stdout logs.</p>
<p id="cce_10_0809__p1826165419589"><strong id="cce_10_0809__b7358233122020">Troubleshooting</strong>:</p>
<p id="cce_10_0809__p8760301512">Go to the <strong id="cce_10_0809__b895551143412">Add-ons</strong> page and click <strong id="cce_10_0809__b78621158131310">Cloud Native Log Collection</strong>. Then, click the <strong id="cce_10_0809__b82131604618">Pods</strong> tab, locate <strong id="cce_10_0809__b12651151861615">log-agent-fluent-bit</strong>, and choose <strong id="cce_10_0809__b78611435341">More</strong> &gt; <strong id="cce_10_0809__b159581145193414">View Log</strong> in the <strong id="cce_10_0809__b138781993018">Operation</strong> column.</p>
<p id="cce_10_0809__p165047581144"></p>
<p id="cce_10_0809__p11774195112110">Select the fluent-bit container and search for the keyword "cannot increase buffer: current=512000 requested=*** max=512000" in the log.</p>
<p id="cce_10_0809__p1699211814320"><strong id="cce_10_0809__b0181692313">Solution</strong>:</p>
<p id="cce_10_0809__p199354389313">Run the <strong id="cce_10_0809__b20649359123516">kubectl edit deploy -n monitoring log-agent-log-operator</strong> command on the node and add <strong id="cce_10_0809__b1988703414361">--kubernetes-buffer-size=20MB</strong> to the command line parameters of the log-operator container. The default value is <strong id="cce_10_0809__b14818172053717">16MB</strong>. Estimate a suitable value based on the total size of pod information on the node. A value of <strong id="cce_10_0809__b12307194643713">0</strong> means no limit.</p>
<div class="caution" id="cce_10_0809__note10419131920314"><span class="cautiontitle"><img src="public_sys-resources/caution_3.0-en-us.png"> </span><div class="cautionbody"><p id="cce_10_0809__p12419171916311">If the Cloud Native Log Collection add-on is upgraded, you need to reconfigure <strong id="cce_10_0809__b1618118973515">kubernetes-buffer-size</strong>.</p>
</div></div>
<div class="fignone" id="cce_10_0809__fig9612111983220"><span class="figcap"><b>Figure 1 </b>Modifying the command line parameter of the log-operator container</span><br><span><img id="cce_10_0809__image03591217596" src="en-us_image_0000002516198997.png"></span></div>
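<p>Both the symptom and the change can be checked from the command line. The following is a minimal sketch in which <strong>POD</strong> is a placeholder and the fluent-bit pods are assumed to run in the <strong>monitoring</strong> namespace, like log-operator.</p>
<pre class="screen">POD=log-agent-fluent-bit-xxxxx   # hypothetical pod name

# Look for the buffer error in the fluent-bit container log.
kubectl logs "$POD" -n monitoring -c fluent-bit | grep "cannot increase buffer"

# After editing the deployment, confirm that the parameter is present.
kubectl get deploy log-agent-log-operator -n monitoring -o yaml | grep "kubernetes-buffer-size"</pre>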
</div>
<div class="section" id="cce_10_0809__section1053145216227"><a name="cce_10_0809__section1053145216227"></a><a name="section1053145216227"></a><h4 class="sectiontitle">How Do I Change the Log Storage Period on Logging?</h4><ol id="cce_10_0809__ol4716141315243"><li id="cce_10_0809__li171671372411"><span>Log in to the <span id="cce_10_0809__ph0698201721013">CCE console</span> and choose <strong id="cce_10_0809__b77711410114520">Clusters</strong>. On the displayed page, hover the cursor over the cluster name to view the current cluster ID.</span><p><p id="cce_10_0809__p117221029145215"></p>
</p></li><li id="cce_10_0809__li1671717134245"><span>Log in to the <span id="cce_10_0809__ph148011451196">LTS console</span>. In the navigation pane, choose <strong id="cce_10_0809__b526874494018">Log Management</strong>. In <strong id="cce_10_0809__b191488184315">Log Groups</strong>, select a search criterion. Then, query the log group and log stream by cluster ID.</span></li><li id="cce_10_0809__li371717136244"><span>Locate the log group and click <strong id="cce_10_0809__b15544174316243">Modify</strong> to configure the log storage period.</span></li></ol>
</div>
<div class="section" id="cce_10_0809__section494412903313"><a name="cce_10_0809__section494412903313"></a><a name="section494412903313"></a><h4 class="sectiontitle">What Can I Do If the Log Group or Stream Specified in the Log Collection Policy Does Not Exist?</h4><ul id="cce_10_0809__ul11884194315910"><li id="cce_10_0809__li5884443294"><strong id="cce_10_0809__b188424151165">Scenario 1: The default log group or stream does not exist.</strong><p id="cce_10_0809__p2100242799">Take Kubernetes events as an example. If the default log group or stream does not exist, a message will be displayed on the Kubernetes events page of the console. You can click <strong id="cce_10_0809__b9888511202617">Create Log Collection Policy</strong> to create a log group or stream.</p>
<p id="cce_10_0809__p111341911171312">After the log group or stream is created, the ID of the default log group or stream changes, and the existing log collection policy that uses the default log group or stream no longer takes effect. In this case, rectify the fault by referring to <a href="#cce_10_0809__li146683521096">Scenario 2</a>.</p>
</li><li id="cce_10_0809__li146683521096"><a name="cce_10_0809__li146683521096"></a><a name="li146683521096"></a><strong id="cce_10_0809__b10427421175718">Scenario 2: The default log group or stream exists but is inconsistent with that specified in the log collection policy.</strong><ul id="cce_10_0809__ul1768165014918"><li id="cce_10_0809__li16768195014911">The log collection policy, for example, <strong id="cce_10_0809__b14450113565811">default-stdout</strong>, can be modified as follows:<ol id="cce_10_0809__ol1772525212180"><li id="cce_10_0809__li11421236791">Log in to the <span id="cce_10_0809__ph7429367915">CCE console</span> and click the cluster name to access the cluster console. In the navigation pane, choose <strong id="cce_10_0809__b6151144913469">Logging</strong>.</li><li id="cce_10_0809__li18839418171920">In the upper right corner, click <strong id="cce_10_0809__b10269864187">View Log Policy</strong>. Then, locate the log collection policy and click <strong id="cce_10_0809__b132701262180">Edit</strong> in the <strong id="cce_10_0809__b1627096181819">Operation</strong> column.</li><li id="cce_10_0809__li1662488131911">Select <strong id="cce_10_0809__b173256338537">Custom</strong> and configure the default log group or stream.</li></ol>
</li><li id="cce_10_0809__li157511259899">If a log collection policy cannot be modified, for example, <strong id="cce_10_0809__b190613441102">default-event</strong>, you need to re-create a log collection policy as follows:<ol id="cce_10_0809__ol1583144112215"><li id="cce_10_0809__li1550312322913">Log in to the <span id="cce_10_0809__ph0388522463">CCE console</span> and click the cluster name to access the cluster console. In the navigation pane, choose <strong id="cce_10_0809__b23895284613">Logging</strong>.</li><li id="cce_10_0809__li958316416228">In the upper right corner, click <strong id="cce_10_0809__b4961142610109">View Log Policy</strong>. Then, locate the log collection policy and click <strong id="cce_10_0809__b496152618105">Delete</strong> in the <strong id="cce_10_0809__b1596102691012">Operation</strong> column.</li><li id="cce_10_0809__li258315416221">Click <strong id="cce_10_0809__b616945922613">Create Log Collection Policy</strong>. Then, select <strong id="cce_10_0809__b1516945972611">Kubernetes events</strong> and click <strong id="cce_10_0809__b716965911261">OK</strong>.</li></ol>
</li></ul>
</li><li id="cce_10_0809__li18631413105"><strong id="cce_10_0809__b106127344174">Scenario 3: The custom log group (stream) does not exist.</strong><p id="cce_10_0809__p1461149109">CCE does not support the creation of non-default log groups (streams). You can create a non-default log group (stream) on the LTS console.</p>
<p id="cce_10_0809__p5631461011">After the creation is complete, take the following steps:</p>
<ol id="cce_10_0809__ol13205114682310"><li id="cce_10_0809__li1120512463239">Log in to the <span id="cce_10_0809__ph15571522465">CCE console</span> and click the cluster name to access the cluster console. In the navigation pane, choose <strong id="cce_10_0809__b65714524469">Logging</strong>.</li><li id="cce_10_0809__li20205946122311">In the upper right corner, click <strong id="cce_10_0809__b778615346519">View Log Policy</strong>. Then, locate the log collection policy and click <strong id="cce_10_0809__b55241526756">Edit</strong> in the <strong id="cce_10_0809__b268414291355">Operation</strong> column.</li><li id="cce_10_0809__li13205154632314">Select <strong id="cce_10_0809__b14591140105310">Custom</strong> and configure a log group or stream.</li></ol>
</li></ul>
</div>
<div class="section" id="cce_10_0809__section242432981511"><a name="cce_10_0809__section242432981511"></a><a name="section242432981511"></a><h4 class="sectiontitle">What Can I Do If FailedAssignENI Is Generated When a Node Is Created in a CCE Turbo Cluster?</h4><p id="cce_10_0809__p17424729161515">If the log add-on has been installed in a CCE Turbo cluster and is running normally, when a node is created, the log-agent-fluent component on the node may report the FailedAssignENI alarm because the network interface of the new node is not ready. The alarm will not be reported again after 5 seconds. The log collection function is not affected.</p>
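<p>To confirm that the alarm has stopped, recent occurrences can be listed. This sketch assumes that FailedAssignENI is recorded as a Kubernetes event reason, which this FAQ does not state explicitly.</p>
<pre class="screen"># List FailedAssignENI events in all namespaces, oldest first (assumes it is an event reason).
kubectl get events --all-namespaces --field-selector reason=FailedAssignENI --sort-by=.lastTimestamp</pre>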
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="cce_10_0799.html">O&amp;M FAQ</a></div>
</div>
</div>