doc-exports/docs/dataartsstudio/umn/dataartsstudio_03_0618.html
chenxiaoxiong f9e2808b7c DataArts UMN 20250810 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: chenxiaoxiong <chenxiaoxiong@huawei.com>
Co-committed-by: chenxiaoxiong <chenxiaoxiong@huawei.com>
2025-09-02 10:44:13 +00:00


<a name="dataartsstudio_03_0618"></a>
<h1 class="topictitle1">What Should I Do If a Shell/Python Node Fails and Error "session is down" Is Reported?</h1>
<div id="body8662426"><p id="dataartsstudio_03_0618__p288253072217">This section uses the Shell node as an example.</p>
<div class="section" id="dataartsstudio_03_0618__en-us_topic_0000001224321725_en-us_topic_0034215906_section1348975111056"><h4 class="sectiontitle">Symptom</h4><p id="dataartsstudio_03_0618__en-us_topic_0000001224321725_p8397165618324">The Shell node is reported as failed and the error "session is down" is displayed, even though the Shell script itself runs successfully.</p>
</div>
<div class="section" id="dataartsstudio_03_0618__en-us_topic_0000001224321725_en-us_topic_0034215906_section538174341119"><h4 class="sectiontitle">Possible Causes</h4><ol id="dataartsstudio_03_0618__ol20540111416118"><li id="dataartsstudio_03_0618__li175406141112">Obtain the run logs of the Shell node.<pre class="screen" id="dataartsstudio_03_0618__screen368215211380">[2021/11/17 02:00:36 GMT+0800] [INFO] No job-level agency is set, Workspace-level agency is dlg_agency, Execute job use agency dlg_agency, job id is 07572F197E4642E5BE549C2B656F157Ctm7cHkHd
[2021/11/17 02:00:36 GMT+0800] [DEBUG] ===============================================
[2021/11/17 02:00:36 GMT+0800] [INFO] Get response from agent when try to submit shell running job :
[2021/11/17 02:00:36 GMT+0800] [INFO]
{
"jobResultList":[
{
"jobId":"a567f7f5-3c9e-4dfc-a464-bd477ac5b1ea",
"status":"created",
"errorCode":0,
"failCount":0,
"result":[
]
}
],
"agentId":"614853ee-c1c6-456d-9aa6-fc84ad1281ed"
}
[2021/11/17 02:00:36 GMT+0800] [DEBUG] ===============================================
[2021/11/17 02:05:56 GMT+0800] [DEBUG] ===============================================
[2021/11/17 02:05:56 GMT+0800] [INFO] Job Run finish , the raw output is :
[2021/11/17 02:05:56 GMT+0800] [INFO]
{
"jobId":"a567f7f5-3c9e-4dfc-a464-bd477ac5b1ea",
"status":"failed",
"errorCode":3427,
"errorMessage":"<strong id="dataartsstudio_03_0618__b754281219392">Shell script job execute failed</strong>.",
"failCount":0,
"result":[
{
"is_success":false,
"exeTime":300.609
}
]
}
[2021/11/17 02:05:56 GMT+0800] [DEBUG] ===============================================
[2021/11/17 02:05:56 GMT+0800] [DEBUG] ===============================================
[2021/11/17 02:05:56 GMT+0800] [INFO] The return code is : [-1].
[2021/11/17 02:05:56 GMT+0800] [DEBUG] ===============================================
[2021/11/17 02:05:56 GMT+0800] [INFO] <strong id="dataartsstudio_03_0618__b192213173912">Execute shell script job finished</strong><strong id="dataartsstudio_03_0618__b848623410391">.</strong>
[2021/11/17 02:05:56 GMT+0800] [ERROR] Shell exit code is not 0
[2021/11/17 02:05:56 GMT+0800] [DEBUG] ===============================================
[2021/11/17 02:05:56 GMT+0800] [ERROR] <strong id="dataartsstudio_03_0618__b3506122474015">Shell script job execute failed. Please contact ECS Service.</strong>
[2021/11/17 02:05:56 GMT+0800] [ERROR] Exception message: RuntimeException: Shell script job execute failed. Please contact ECS Service.
[2021/11/17 02:05:56 GMT+0800] [ERROR] Root Cause message:RuntimeException: Shell script job execute failed. Please contact ECS Service.</pre>
</li><li id="dataartsstudio_03_0618__li0361268129">Check whether the values of the following parameters in the <strong id="dataartsstudio_03_0618__b471583231715">sshd_config</strong> file of the ECS are as shown in the following figure.<p id="dataartsstudio_03_0618__en-us_topic_0000001224321725_p681592173514"><span><img id="dataartsstudio_03_0618__en-us_topic_0000001224321725_image1186612293511" src="en-us_image_0000002269195801.png"></span></p>
<p id="dataartsstudio_03_0618__en-us_topic_0000001224321725_p1343332310357">Cause: If the values are different, the SSH session can time out and be disconnected while the script is still running. As a result, the Shell node fails.</p>
</li></ol>
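<p>To read the values from the command line instead of opening the file, a helper along the following lines could be used. This is a minimal sketch: the function name <strong>show_client_alive</strong> is hypothetical, and the default path is assumed to be <strong>/etc/ssh/sshd_config</strong>, the file referred to in the solution.</p>
<pre class="screen"># Hypothetical helper: print the ClientAliveInterval and ClientAliveCountMax
# lines that are explicitly set (not commented out) in an sshd_config-style file.
# The file defaults to /etc/ssh/sshd_config when no argument is given.
show_client_alive() {
    grep -Ei '^[[:space:]]*ClientAlive(Interval|CountMax)[[:space:]]' "${1:-/etc/ssh/sshd_config}"
}</pre>
<p>If the function prints nothing, neither parameter is set explicitly in the file and sshd uses its default values.</p>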
</div>
<div class="section" id="dataartsstudio_03_0618__en-us_topic_0000001224321725_section116381019479"><h4 class="sectiontitle">Solution</h4><ol id="dataartsstudio_03_0618__ol14681143161417"><li id="dataartsstudio_03_0618__li1068174314146">Open the <strong id="dataartsstudio_03_0618__b1861655791613">/etc/ssh/sshd_config</strong> file of the ECS and add or update the following parameter values:<p id="dataartsstudio_03_0618__en-us_topic_0000001224321725_p15462245153616">ClientAliveInterval 300</p>
<p id="dataartsstudio_03_0618__en-us_topic_0000001224321725_p20462745123610">ClientAliveCountMax 3</p>
<div class="note" id="dataartsstudio_03_0618__note1273261819159"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="dataartsstudio_03_0618__p7732101851511"><strong>ClientAliveInterval</strong> specifies the interval, in seconds, at which the server sends a request to a client from which it has received no data. The default value is <strong id="dataartsstudio_03_0618__b1378002014421">0</strong>, indicating that the server does not send such requests. Value <strong id="dataartsstudio_03_0618__b10258816124218">300</strong> indicates that the server sends a request every five minutes and the client responds, which keeps the connection alive. <strong id="dataartsstudio_03_0618__b5780154414422">ClientAliveCountMax</strong> specifies how many of these requests can go unanswered. Its default value is <strong id="dataartsstudio_03_0618__b9548148134210">3</strong>. When the number of unanswered requests reaches this value, the server disconnects the client. With the values above, an unresponsive client is disconnected after approximately 900 seconds (300 &times; 3). Normally, the client responds and the connection is retained.</p>
</div></div>
</li><li id="dataartsstudio_03_0618__li1087545791719">After the modification, run the following command to restart the sshd service of the ECS:<pre class="screen" id="dataartsstudio_03_0618__screen1187282617482">systemctl restart sshd.service</pre>
<p id="dataartsstudio_03_0618__en-us_topic_0000001224321725_p1481101113815"><span><img id="dataartsstudio_03_0618__en-us_topic_0000001224321725_image4110819183813" src="en-us_image_0000002234236364.png" title="Click to enlarge" class="imgResize"></span></p>
</li><li id="dataartsstudio_03_0618__li72180521820">Check whether sshd has started successfully. The following figure shows an example in which sshd has started successfully.<p id="dataartsstudio_03_0618__en-us_topic_0000001224321725_p4309193683819"><a name="dataartsstudio_03_0618__li72180521820"></a><a name="li72180521820"></a><span><img id="dataartsstudio_03_0618__en-us_topic_0000001224321725_image15380836183810" src="en-us_image_0000002269195813.png" title="Click to enlarge" class="imgResize"></span></p>
</li></ol>
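<p>The edit in step 1 can also be scripted. The following is a minimal sketch under stated assumptions, not an official procedure: the function name <strong>set_sshd_option</strong> is hypothetical, GNU sed is assumed (for the <strong>-i</strong> and <strong>-E</strong> options), and any line that starts with the parameter name is rewritten, including a commented-out one.</p>
<pre class="screen"># Hypothetical helper: set KEY to VALUE in an sshd_config-style FILE.
# Assumes GNU sed. A line that starts with KEY (optionally commented out with #)
# is rewritten in place; if there is no such line, "KEY VALUE" is appended.
set_sshd_option() {
    file=$1
    key=$2
    value=$3
    if grep -Eq "^[#[:space:]]*${key}[[:space:]]" "$file"; then
        sed -i -E "s|^[#[:space:]]*${key}[[:space:]].*|${key} ${value}|" "$file"
    else
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}</pre>
<p>For example, running <strong>set_sshd_option /etc/ssh/sshd_config ClientAliveInterval 300</strong> and <strong>set_sshd_option /etc/ssh/sshd_config ClientAliveCountMax 3</strong> as user <strong>root</strong> applies the values from step 1. Back up the file before changing it. Running <strong>sshd -t</strong> before the restart in step 2 checks the modified file for syntax errors.</p>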
</div>
</div>