doc-exports/docs/cce/umn/cce_10_0193.html
qiujiandong1 23019c5a63 CCE UMN 20250415 version
Reviewed-by: Gergo-Bence Lorincz <a200452876@noreply.gitea.eco.tsi-dev.otc-service.com>
Co-authored-by: qiujiandong1 <qiujiandong1@huawei.com>
Co-committed-by: qiujiandong1 <qiujiandong1@huawei.com>
2026-04-16 09:37:48 +00:00


<a name="cce_10_0193"></a><a name="cce_10_0193"></a>
<h1 class="topictitle1">Volcano Scheduler</h1>
<div id="body1561601911583"><div class="section" id="cce_10_0193__section173631312185614"><h4 class="sectiontitle">Introduction</h4><p id="cce_10_0193__p171356133587"><a href="https://volcano.sh/en/docs/" target="_blank" rel="noopener noreferrer">Volcano</a> is a batch processing platform based on Kubernetes. It provides a series of features required by machine learning, deep learning, bioinformatics, genomics, and other big data applications, as a powerful supplement to Kubernetes capabilities.</p>
<p id="cce_10_0193__p153070610592">Volcano provides general computing capabilities such as high-performance job scheduling, heterogeneous chip management, and job running management. It integrates with computing frameworks used in fields such as AI, big data, genomics, and rendering, and can schedule up to 1000 pods per second for end users, greatly improving scheduling efficiency and resource utilization.</p>
<p id="cce_10_0193__p146815415012">Volcano provides job scheduling, job management, and queue management for computing applications. Its main features are as follows:</p>
<ul id="cce_10_0193__ul188217451061"><li id="cce_10_0193__li1082545564">Diverse computing frameworks, such as TensorFlow, MPI, and Spark, can run on Kubernetes in containers. Volcano provides common APIs for batch computing jobs through CRDs, together with various plugins and advanced job lifecycle management.</li><li id="cce_10_0193__li1082134517617">Advanced scheduling capabilities are provided for batch computing and high-performance computing scenarios, including group scheduling, preemptive priority scheduling, packing, resource reservation, and task topology.</li><li id="cce_10_0193__li12821845461">Queues can be effectively managed for scheduling jobs. Complex job scheduling capabilities such as queue priority and multi-level queues are supported.</li></ul>
<p id="cce_10_0193__p168654503">Volcano is open source and hosted on GitHub at <a href="https://github.com/volcano-sh/volcano" target="_blank" rel="noopener noreferrer">https://github.com/volcano-sh/volcano</a>.</p>
<p id="cce_10_0193__p0466184510195">Install and configure the Volcano add-on in CCE clusters. For details, see <a href="cce_10_0423.html">Volcano Scheduling</a>.</p>
<div class="note" id="cce_10_0193__note886421151316"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="cce_10_0193__p58642151318">When using Volcano as a scheduler, use it to schedule all workloads in the cluster. This prevents resource scheduling conflicts caused by multiple schedulers working at the same time.</p>
</div></div>
</div>
<div class="section" id="cce_10_0193__section18807108172913"><h4 class="sectiontitle">Notes and Constraints</h4><p id="cce_10_0193__p767593103316">If the Volcano Scheduler add-on is upgraded from 1.4.7 or earlier to a version later than 1.4.7, the <span class="uicontrol" id="cce_10_0193__uicontrol252115673319"><b>webhooks.admissionReviewVersions</b></span> field in the new version may be incompatible with that in the old version. As a result, VolcanoJob (vcjob) resources cannot be created.</p>
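<p>The version condition above can be checked before an upgrade. The following is a minimal sketch, not part of the add-on: the function names <code>parse_version</code> and <code>crosses_webhook_boundary</code> and the sample version numbers other than 1.4.7 are hypothetical, and the sketch assumes purely numeric dotted versions.</p>

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '1.4.7' into a comparable tuple (1, 4, 7)."""
    return tuple(int(part) for part in text.split("."))


def crosses_webhook_boundary(current: str, target: str, boundary: str = "1.4.7") -> bool:
    """Return True when an upgrade starts at or below `boundary` and ends above it."""
    limit = parse_version(boundary)
    return parse_version(current) <= limit < parse_version(target)


if __name__ == "__main__":
    # Sample upgrade paths; only 1.4.7 comes from the constraint above.
    for current, target in [("1.4.5", "1.7.1"), ("1.4.7", "1.4.8"), ("1.6.5", "1.7.1")]:
        print(current, "->", target, crosses_webhook_boundary(current, target))
```

<p>An upgrade path for which the function returns <code>True</code> is one where VolcanoJob creation should be verified after the upgrade.</p>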
</div>
<div class="section" id="cce_10_0193__section564214328158"><h4 class="sectiontitle">Installing the Add-on</h4><ol id="cce_10_0193__ol13949124616422"><li id="cce_10_0193__li330462393220"><span>Log in to the <span id="cce_10_0193__cce_10_0004_ph18314322182">CCE console</span> and click the cluster name to access the cluster console.</span></li><li id="cce_10_0193__li13183153352515"><span>In the navigation pane, choose <strong id="cce_10_0193__b145717352539"><span id="cce_10_0193__text125714351539">Add-ons</span></strong>. Locate <strong id="cce_10_0193__b95743514539">Volcano Scheduler</strong> on the right and click <strong id="cce_10_0193__b13575358538">Install</strong>.</span></li><li id="cce_10_0193__li15556183414307"><span>On the <strong id="cce_10_0193__b2153171085917">Install Add-on</strong> page, configure the specifications as needed.</span><p><ul id="cce_10_0193__ul14526143113393"><li id="cce_10_0193__li953119336397">If you select <span class="uicontrol" id="cce_10_0193__uicontrol18901859103915"><b>Preset</b></span>, the system configures the number of pods and resource quotas for the add-on based on the preset specifications. You can view the configurations on the console.</li><li id="cce_10_0193__li2526203153919">If you select <strong id="cce_10_0193__b1813373391913">Custom</strong>, you can adjust the number of pods and resource quotas as needed. A single pod does not provide high availability. If an error occurs on the node where the add-on instance runs, the add-on will fail.<p id="cce_10_0193__p182561730185012">The resource quotas of the volcano-admission component depend on the cluster scale. For details, see <a href="#cce_10_0193__table16382122344317">Table 1</a>. The resource quotas of volcano-controller and volcano-scheduler depend on the number of cluster nodes and pods. The recommended values are as follows:</p>
<ul id="cce_10_0193__ul6257163065016"><li id="cce_10_0193__li9257123012507">If the number of nodes is less than 100, retain the default configuration. The requested vCPUs are 500m, and the limit is 2000m. The requested memory is 500 MiB, and the limit is 2000 MiB.</li><li id="cce_10_0193__li152571330175015">If the number of nodes is greater than 100, increase the requested vCPUs by 500m and the requested memory by 1000 MiB for every additional 100 nodes (10,000 pods). Set the vCPU limit 1500m higher than the requested vCPUs and the memory limit 1000 MiB higher than the requested memory, consistent with the values in Table 2.<div class="note" id="cce_10_0193__note16257930165010"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="cce_10_0193__p132571730145017">Recommended formulas for calculating the requested values:</p>
<ul id="cce_10_0193__ul19257123017500"><li id="cce_10_0193__li11257830175012">Requested vCPUs: Multiply the number of target nodes by the number of target pods. In <a href="#cce_10_0193__table443465675016">Table 2</a>, find the row whose product (Number of cluster nodes × Number of pods) is closest to that value, rounding up, and use the request and limit values of that row.<p id="cce_10_0193__p1325719307502">For example, for 2000 nodes and 20,000 pods, Number of target nodes × Number of target pods = 40 million, which rounds up to the 700/70,000 specification (Number of cluster nodes × Number of pods = 49 million). Based on that row, set the requested vCPUs to 4000m and the limit to 5500m.</p>
</li><li id="cce_10_0193__li17257330155013">Requested memory: Allocate 2.4 GiB of memory for every 1000 nodes and 1 GiB of memory for every 10,000 pods. The requested memory is the sum of these two values. (The result may differ from the recommended value in <a href="#cce_10_0193__table443465675016">Table 2</a>. You can use either of them.)<p id="cce_10_0193__p82574304501">Requested memory = Number of target nodes/1000 × 2.4 GiB + Number of target pods/10,000 × 1 GiB</p>
<p id="cce_10_0193__p8257173018501">For example, for 2000 nodes and 20,000 pods, the requested memory is 6.8 GiB (2000/1000 × 2.4 GiB + 20,000/10,000 × 1 GiB).</p>
</li></ul>
</div></div>
</li></ul>
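<p>The requested-memory formula in the note can be evaluated with a few lines of Python. This is a minimal sketch for illustration only. The function name <code>requested_memory_gib</code> and the constant names are hypothetical, while the coefficients (2.4 GiB per 1000 nodes, 1 GiB per 10,000 pods) come from the note.</p>

```python
GIB_PER_1000_NODES = 2.4   # coefficient taken from the note
GIB_PER_10000_PODS = 1.0   # coefficient taken from the note


def requested_memory_gib(target_nodes: int, target_pods: int) -> float:
    """Requested memory = nodes/1000 x 2.4 GiB + pods/10,000 x 1 GiB."""
    node_share = target_nodes / 1000 * GIB_PER_1000_NODES
    pod_share = target_pods / 10_000 * GIB_PER_10000_PODS
    return node_share + pod_share


if __name__ == "__main__":
    # Same inputs as the worked example in the note (2000 nodes, 20,000 pods).
    memory = requested_memory_gib(2000, 20_000)
    print(f"{memory:.1f} GiB")
```

<p>For 2000 nodes and 20,000 pods, the two terms are 4.8 GiB and 2 GiB, which matches the 6.8 GiB in the example.</p>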
<div class="tablenoborder"><a name="cce_10_0193__table16382122344317"></a><a name="table16382122344317"></a><table cellpadding="4" cellspacing="0" summary="" id="cce_10_0193__table16382122344317" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Recommended requested resources and resource limits for volcano-admission</caption><thead align="left"><tr id="cce_10_0193__row121622711434"><th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.3.2.1.2.4.2.6.1.1"><p id="cce_10_0193__p0216227174312">Cluster Scale</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.3.2.1.2.4.2.6.1.2"><p id="cce_10_0193__p192168275433">CPU Request (m)</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.3.2.1.2.4.2.6.1.3"><p id="cce_10_0193__p8216172717430">CPU Limit (m)</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.3.2.1.2.4.2.6.1.4"><p id="cce_10_0193__p14216132754311">Memory Request (MiB)</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.3.2.1.2.4.2.6.1.5"><p id="cce_10_0193__p18216627134314">Memory Limit (MiB)</p>
</th>
</tr>
</thead>
<tbody><tr id="cce_10_0193__row92962277439"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.1 "><p id="cce_10_0193__p152957272436">50 nodes</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.2 "><p id="cce_10_0193__p2295427104316">200</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.3 "><p id="cce_10_0193__p19295132774315">500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.4 "><p id="cce_10_0193__p10296827164315">500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.5 "><p id="cce_10_0193__p16296527144310">500</p>
</td>
</tr>
<tr id="cce_10_0193__row20296827124320"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.1 "><p id="cce_10_0193__p82961827184311">200 nodes</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.2 "><p id="cce_10_0193__p229672714438">500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.3 "><p id="cce_10_0193__p62961227124312">1000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.4 "><p id="cce_10_0193__p22961927184312">1000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.5 "><p id="cce_10_0193__p16296327104312">2000</p>
</td>
</tr>
<tr id="cce_10_0193__row13297192754315"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.1 "><p id="cce_10_0193__p152961327204319">1000 or more nodes</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.2 "><p id="cce_10_0193__p22979279430">1500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.3 "><p id="cce_10_0193__p1929715279430">2500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.4 "><p id="cce_10_0193__p8297727114311">3000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.4.2.6.1.5 "><p id="cce_10_0193__p9297152717439">4000</p>
</td>
</tr>
</tbody>
</table>
</div>
<div class="tablenoborder"><a name="cce_10_0193__table443465675016"></a><a name="table443465675016"></a><table cellpadding="4" cellspacing="0" summary="" id="cce_10_0193__table443465675016" frame="border" border="1" rules="all"><caption><b>Table 2 </b>Recommended requested resources and resource limits for volcano-controller and volcano-scheduler</caption><thead align="left"><tr id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_row07431329145911"><th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.3.2.1.2.5.2.6.1.1"><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p187511737155914">Nodes/Pods in a Cluster</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.3.2.1.2.5.2.6.1.2"><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p10751123712590">CPU Request (m)</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.3.2.1.2.5.2.6.1.3"><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p975163785912">CPU Limit (m)</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.3.2.1.2.5.2.6.1.4"><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p7751437195915">Memory Request (MiB)</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.3.2.1.2.5.2.6.1.5"><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p15751737185918">Memory Limit (MiB)</p>
</th>
</tr>
</thead>
<tbody><tr id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_row107432029175915"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.1 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p17440104313599">50/5000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.2 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p144064365912">500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.3 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p1244034395911">2000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.4 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p844034310592">500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.5 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p4440124305918">2000</p>
</td>
</tr>
<tr id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_row67438296598"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.1 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p144034313592">100/10000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.2 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p1344018435599">1000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.3 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p24401243175910">2500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.4 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p134409438593">1500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.5 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p544018434594">2500</p>
</td>
</tr>
<tr id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_row1974318297599"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.1 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p8440843175911">200/20000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.2 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p16440104310592">1500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.3 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p124401439592">3000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.4 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p11440843175915">2500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.5 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p184403436591">3500</p>
</td>
</tr>
<tr id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_row8743202985917"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.1 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p14401343115919">300/30000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.2 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p7440174320597">2000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.3 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p7440184311599">3500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.4 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p74401543125911">3500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.5 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p2440543155915">4500</p>
</td>
</tr>
<tr id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_row67432029155914"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.1 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p2440204375918">400/40000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.2 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p114401043105917">2500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.3 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p14401543195915">4000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.4 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p154401435593">4500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.5 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p10440164316595">5500</p>
</td>
</tr>
<tr id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_row12117414275"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.1 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p1777011146326">500/50000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.2 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p27706142326">3000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.3 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p1877081413213">4500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.4 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p9770121416326">5500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.5 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p6770114113216">6500</p>
</td>
</tr>
<tr id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_row7769171212719"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.1 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p9770201403218">600/60000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.2 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p3770514113213">3500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.3 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p2770121493213">5000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.4 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p157701014193211">6500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.5 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p107701714143212">7500</p>
</td>
</tr>
<tr id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_row1635881442720"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.1 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p3770181416322">700/70000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.2 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p677010149327">4000</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.3 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p17700148328">5500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.4 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p18770121453214">7500</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.3.2.1.2.5.2.6.1.5 "><p id="cce_10_0193__cce_faq_00429_en-us_topic_0000001199021208_p0770151433214">8500</p>
</td>
</tr>
</tbody>
</table>
</div>
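<p>The vCPU lookup described in the note can be sketched as a search over the rows of Table 2. This is one possible reading of "round up": it picks the first row whose nodes × pods product is at least the target product. The names <code>TABLE_2</code> and <code>recommended_cpu</code> are hypothetical, and the fallback to the largest row for targets beyond the table is an assumption.</p>

```python
# Rows of Table 2:
# (nodes, pods, cpu_request_m, cpu_limit_m, mem_request_mib, mem_limit_mib)
TABLE_2 = [
    (50, 5_000, 500, 2000, 500, 2000),
    (100, 10_000, 1000, 2500, 1500, 2500),
    (200, 20_000, 1500, 3000, 2500, 3500),
    (300, 30_000, 2000, 3500, 3500, 4500),
    (400, 40_000, 2500, 4000, 4500, 5500),
    (500, 50_000, 3000, 4500, 5500, 6500),
    (600, 60_000, 3500, 5000, 6500, 7500),
    (700, 70_000, 4000, 5500, 7500, 8500),
]


def recommended_cpu(target_nodes: int, target_pods: int) -> tuple[int, int]:
    """Return (cpu_request_m, cpu_limit_m) for the target scale.

    Picks the first row whose nodes x pods product is at least the target
    product. Assumption: targets larger than every row use the last row.
    """
    target = target_nodes * target_pods
    for nodes, pods, cpu_request, cpu_limit, _, _ in TABLE_2:
        if nodes * pods >= target:
            return cpu_request, cpu_limit
    return TABLE_2[-1][2], TABLE_2[-1][3]


if __name__ == "__main__":
    # Worked example from the note: 2000 nodes and 20,000 pods.
    print(recommended_cpu(2000, 20_000))
```

<p>With 2000 nodes and 20,000 pods the target product is 40 million, so the sketch selects the 700/70000 row (49 million), in line with the example in the note.</p>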
</li></ul>
</p></li><li id="cce_10_0193__li1066432619340"><span>Configure the extended functions supported by the add-on.</span><p><ul id="cce_10_0193__ul4665523123418"><li id="cce_10_0193__li17192855153413"><strong id="cce_10_0193__b1631424172115">Descheduling</strong>: After this function is enabled, the volcano-descheduler component is automatically deployed. The scheduler will evict and reschedule pods that do not meet your policy configuration requirements. This helps to balance cluster load and reduce resource fragmentation. For details, see <a href="cce_10_0766.html">Descheduling</a>.</li><li id="cce_10_0193__li151520912359"><strong id="cce_10_0193__b11256012162320">NUMA Topology Scheduling</strong>: After this function is enabled, the resource-exporter component is automatically deployed. The scheduler will schedule pods in NUMA affinity mode, which enhances the performance of high-performance training jobs. For details, see <a href="cce_10_0425.html">NUMA Affinity Scheduling</a>.</li></ul>
</p></li><li id="cce_10_0193__li155851217011"><span>Configure deployment policies for the add-on pods.</span><p><div class="note" id="cce_10_0193__en-us_topic_0000001199341168_note32098410561"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="cce_10_0193__en-us_topic_0000001199341168_ul220911419567"><li id="cce_10_0193__en-us_topic_0000001199341168_li152095435618">Scheduling policies do not take effect on add-on pods of the DaemonSet type.</li><li id="cce_10_0193__en-us_topic_0000001199341168_li1720914445612">When configuring multi-AZ deployment or node affinity, ensure that there are nodes meeting the scheduling policy and that resources are sufficient in the cluster. Otherwise, the add-on cannot run.</li></ul>
</div></div>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="cce_10_0193__en-us_topic_0000001199341168_table52109416562" frame="border" border="1" rules="all"><caption><b>Table 3 </b>Configurations for add-on scheduling</caption><thead align="left"><tr id="cce_10_0193__en-us_topic_0000001199341168_row521016413569"><th align="left" class="cellrowborder" valign="top" width="24%" id="mcps1.3.3.2.5.2.2.2.3.1.1"><p id="cce_10_0193__en-us_topic_0000001199341168_p15210124175611">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="76%" id="mcps1.3.3.2.5.2.2.2.3.1.2"><p id="cce_10_0193__en-us_topic_0000001199341168_p13210142565">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="cce_10_0193__en-us_topic_0000001199341168_row162102049564"><td class="cellrowborder" valign="top" width="24%" headers="mcps1.3.3.2.5.2.2.2.3.1.1 "><p id="cce_10_0193__en-us_topic_0000001199341168_p421019416569">Multi-AZ Deployment</p>
</td>
<td class="cellrowborder" valign="top" width="76%" headers="mcps1.3.3.2.5.2.2.2.3.1.2 "><ul id="cce_10_0193__en-us_topic_0000001199341168_ul122101425619"><li id="cce_10_0193__en-us_topic_0000001199341168_li142101342560"><strong id="cce_10_0193__b161466894464917">Preferred</strong>: Deployment pods of the add-on will be preferentially scheduled to nodes in different AZs. If all the nodes in the cluster are deployed in the same AZ, the pods will be scheduled to different nodes in that AZ.</li><li id="cce_10_0193__en-us_topic_0000001199341168_li3210440562"><strong id="cce_10_0193__b15271147101612">Forcible</strong>: Deployment pods of the add-on are forcibly scheduled to nodes in different AZs. There can be at most one pod in each AZ. If nodes in a cluster are not in different AZs, some add-on pods cannot run properly. If a node is faulty, add-on pods on it may fail to be migrated.</li></ul>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001199341168_row1121010416566"><td class="cellrowborder" valign="top" width="24%" headers="mcps1.3.3.2.5.2.2.2.3.1.1 "><p id="cce_10_0193__en-us_topic_0000001199341168_p12210114165612">Node Affinity</p>
</td>
<td class="cellrowborder" valign="top" width="76%" headers="mcps1.3.3.2.5.2.2.2.3.1.2 "><ul id="cce_10_0193__en-us_topic_0000001199341168_ul1621054145617"><li id="cce_10_0193__en-us_topic_0000001199341168_li1721017413562"><strong id="cce_10_0193__b23924179457">Not configured</strong>: Node affinity is disabled for the add-on.</li><li id="cce_10_0193__en-us_topic_0000001199341168_li52109417563"><strong id="cce_10_0193__b276545691620">Specify node</strong>: Specify the nodes where the add-on is deployed. If you do not specify the nodes, the add-on will be randomly scheduled based on the default cluster scheduling policy.</li><li id="cce_10_0193__en-us_topic_0000001199341168_li1421015415561"><strong id="cce_10_0193__b107095991615">Specify node pool</strong>: Specify the node pool where the add-on is deployed. If you do not specify the node pools, the add-on will be randomly scheduled based on the default cluster scheduling policy.</li><li id="cce_10_0193__en-us_topic_0000001199341168_li92101542568"><strong id="cce_10_0193__b1075017315175">Customize affinity</strong>: Enter the labels of the nodes where the add-on is to be deployed for more flexible scheduling policies. If you do not specify node labels, the add-on will be randomly scheduled based on the default cluster scheduling policy.<p id="cce_10_0193__en-us_topic_0000001199341168_p19210104145617">If multiple custom affinity policies are configured, ensure that there are nodes that meet all the affinity policies in the cluster. Otherwise, the add-on cannot run.</p>
</li></ul>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001199341168_row3210645563"><td class="cellrowborder" valign="top" width="24%" headers="mcps1.3.3.2.5.2.2.2.3.1.1 "><p id="cce_10_0193__en-us_topic_0000001199341168_p1821012465613">Toleration</p>
</td>
<td class="cellrowborder" valign="top" width="76%" headers="mcps1.3.3.2.5.2.2.2.3.1.2 "><p id="cce_10_0193__en-us_topic_0000001199341168_p11210164125619">Tolerations work together with taints. They allow, but do not force, the add-on Deployment to be scheduled to nodes with matching taints, and they control how the Deployment is evicted after the node where it runs is tainted.</p>
<p id="cce_10_0193__en-us_topic_0000001199341168_p19210174185613">By default, the add-on adds a tolerance policy for each of the <strong id="cce_10_0193__b1134619444017">node.kubernetes.io/not-ready</strong> and <strong id="cce_10_0193__b1334714410403">node.kubernetes.io/unreachable</strong> taints. The tolerance time window is 60s.</p>
<p id="cce_10_0193__en-us_topic_0000001199341168_p2210144135620">For details, see <a href="cce_10_0728.html">Configuring Tolerance Policies</a>.</p>
</td>
</tr>
</tbody>
</table>
</div>
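<p>The default tolerance policy described in Table 3 can be pictured as two toleration entries in a pod specification. The sketch below builds them as plain Python data and prints them as JSON. The taint keys and the 60-second window come from the table. The <code>operator</code> and <code>effect</code> values are assumptions modeled on the tolerations Kubernetes adds for these taints, and the add-on's actual manifest may differ. The function name <code>default_tolerations</code> is hypothetical.</p>

```python
import json

# Taint keys named in Table 3.
DEFAULT_TAINT_KEYS = (
    "node.kubernetes.io/not-ready",
    "node.kubernetes.io/unreachable",
)


def default_tolerations(seconds: int = 60) -> list[dict]:
    """Build toleration entries in the shape used by a Kubernetes pod spec.

    Assumption: operator "Exists" and effect "NoExecute" are illustrative
    choices; only the keys and the 60s window are stated in the table.
    """
    return [
        {
            "key": key,
            "operator": "Exists",
            "effect": "NoExecute",
            "tolerationSeconds": seconds,
        }
        for key in DEFAULT_TAINT_KEYS
    ]


if __name__ == "__main__":
    print(json.dumps({"tolerations": default_tolerations()}, indent=2))
```

<p>With <code>tolerationSeconds</code> set to 60, a pod stays bound to a node carrying one of these taints for up to 60 seconds before it is evicted.</p>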
</p></li><li id="cce_10_0193__li9455819152615"><span>Click <span class="uicontrol" id="cce_10_0193__uicontrol48361825335"><b>Install</b></span>.</span><p><div class="p" id="cce_10_0193__p13209422162913">After the add-on is installed, you can choose <strong id="cce_10_0193__b105601245165612">Settings</strong> in the navigation pane, switch to the <strong id="cce_10_0193__b7560745115611">Scheduling</strong> tab, and find the expert mode. You can customize advanced scheduling policies based on actual service scenarios. The following is an example:<pre class="screen" id="cce_10_0193__screen4274121212304">admission_kube_api_qps: 200
admissions: /jobs/mutate,/jobs/validate,/podgroups/mutate,/pods/validate,/pods/mutate,/queues/mutate,/queues/validate,/eas/pods/mutate,/eas/pods/validate,/npu/jobs/validate,/resource/validate,/resource/mutate,/workloadbalancer/balancer/validate,/workloadbalancer/balancerpolicytemplate/validate
annotations: {}
colocation_enable: 'false'
controller_kube_api_qps: 200
default_scheduler_conf:
  actions: allocate, backfill, preempt
  metrics:
    interval: 30s
    type: ''
  tiers:
    - plugins:
        - name: priority
        - enableJobStarving: false
          enablePreemptable: false
          name: gang
        - name: conformance
    - plugins:
        - enablePreemptable: false
          name: drf
        - name: predicates
        - name: nodeorder
    - plugins:
        - name: cce-gpu-topology-predicate
        - name: cce-gpu-topology-priority
        - name: xgpu
    - plugins:
        - name: nodelocalvolume
        - name: nodeemptydirvolume
        - name: nodeCSIscheduling
        - name: networkresource
deschedulerPolicy:
  profiles:
    - name: ProfileName
      pluginConfig:
        - args:
            nodeFit: true
          name: DefaultEvictor
        - args:
            evictableNamespaces:
              exclude:
                - kube-system
            thresholds:
              cpu: 20
              memory: 20
          name: HighNodeUtilization
        - args:
            evictableNamespaces:
              exclude:
                - kube-system
            metrics:
              type: prometheus_adaptor
            nodeFit: true
            targetThresholds:
              cpu: 80
              memory: 85
            thresholds:
              cpu: 30
              memory: 30
          name: LoadAware
      plugins:
        balance:
          enabled: null
descheduler_enable: 'false'
deschedulingInterval: 10m
enable_workload_balancer: false
oversubscription_method: nodeResource
oversubscription_profile_period: 300
oversubscription_ratio: 60
recommendation_enable: ''
scheduler_kube_api_qps: 200
update_pod_status_qps: 50
workload_balancer_score_annotation_key: ''
workload_balancer_third_party_types: ''</pre>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="cce_10_0193__table1227561220306" frame="border" border="1" rules="all"><caption><b>Table 4 </b>Advanced Volcano configuration parameters</caption><thead align="left"><tr id="cce_10_0193__row4275201273013"><th align="left" class="cellrowborder" valign="top" width="13.541354135413542%" id="mcps1.3.3.2.6.2.1.4.2.5.1.1"><p id="cce_10_0193__p1280519151905">Category</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="17.25172517251725%" id="mcps1.3.3.2.6.2.1.4.2.5.1.2"><p id="cce_10_0193__p1927515127309">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="28.442844284428443%" id="mcps1.3.3.2.6.2.1.4.2.5.1.3"><p id="cce_10_0193__p1927521219307">Function</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="40.764076407640765%" id="mcps1.3.3.2.6.2.1.4.2.5.1.4"><p id="cce_10_0193__p172751012103011">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="cce_10_0193__row949715353597"><td class="cellrowborder" rowspan="7" valign="top" width="13.541354135413542%" headers="mcps1.3.3.2.6.2.1.4.2.5.1.1 "><p id="cce_10_0193__p1510818141458">Basic scheduling functions</p>
<p id="cce_10_0193__p19928152221415"></p>
<p id="cce_10_0193__p192814220141"></p>
<p id="cce_10_0193__p0124222717"></p>
</td>
<td class="cellrowborder" valign="top" width="17.25172517251725%" headers="mcps1.3.3.2.6.2.1.4.2.5.1.2 "><p id="cce_10_0193__p1101323561">admission_kube_api_qps</p>
</td>
<td class="cellrowborder" valign="top" width="28.442844284428443%" headers="mcps1.3.3.2.6.2.1.4.2.5.1.3 "><p id="cce_10_0193__p193617201464">QPS of requests sent by volcano-admission to the Kubernetes API server</p>
</td>
<td class="cellrowborder" valign="top" width="40.764076407640765%" headers="mcps1.3.3.2.6.2.1.4.2.5.1.4 "><p id="cce_10_0193__p11989122311617">Default value: <strong id="cce_10_0193__b18843241123015">200</strong>; parameter type: float</p>
</td>
</tr>
<tr id="cce_10_0193__row13497435135917"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.1 "><p id="cce_10_0193__p14536125311619">controller_kube_api_qps</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.2 "><p id="cce_10_0193__p1386717474615">QPS of requests sent by volcano-controller to the Kubernetes API server</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.3 "><p id="cce_10_0193__p1674570123010">Default value: <strong id="cce_10_0193__b415014423015">200</strong>; parameter type: float</p>
</td>
</tr>
<tr id="cce_10_0193__row396582642614"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.1 "><p id="cce_10_0193__p10829733142618">scheduler_kube_api_qps</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.2 "><p id="cce_10_0193__p1482918335269">QPS of requests sent by volcano-scheduler to the Kubernetes API server</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.3 "><p id="cce_10_0193__p17002193010">Default value: <strong id="cce_10_0193__b1552874610302">200</strong>; parameter type: float</p>
</td>
</tr>
<tr id="cce_10_0193__row169661226192617"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.1 "><p id="cce_10_0193__p09576526264">update_pod_status_qps</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.2 "><p id="cce_10_0193__p109661326182614">QPS of pod status update requests sent by volcano-scheduler</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.3 "><p id="cce_10_0193__p1577531303">Default value: <strong id="cce_10_0193__b18790749153017">50</strong>; parameter type: float</p>
</td>
</tr>
<tr id="cce_10_0193__row1249812353591"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.1 "><p id="cce_10_0193__p164981035185918">default_scheduler_conf</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.2 "><p id="cce_10_0193__p194981135145915">Scheduling policy used to schedule pods. It consists of a series of actions and plugins and is highly extensible. You can specify and implement actions and plugins based on your requirements.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.3 "><p id="cce_10_0193__p1340031311145">It consists of:</p>
<ul id="cce_10_0193__ul34001513151415"><li id="cce_10_0193__li84007139148"><strong id="cce_10_0193__b5628151875717">actions</strong>: defines the types and sequence of actions to be executed by the scheduler.</li><li id="cce_10_0193__li4400201351414"><strong id="cce_10_0193__b151714277576">tiers</strong>: configures the plugin list.</li></ul>
</td>
</tr>
<tr id="cce_10_0193__row992862215146"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.1 "><p id="cce_10_0193__p16928172281419">default_scheduler_conf.actions</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.2 "><p id="cce_10_0193__p12373117151717">Actions to be executed in each scheduling phase. The configured action sequence is the scheduler execution sequence. For details, see <a href="https://volcano.sh/en/docs/actions/" target="_blank" rel="noopener noreferrer">Actions</a>.</p>
<p id="cce_10_0193__p15373147111717">The scheduler traverses all jobs to be scheduled and performs actions such as enqueue, allocate, preempt, and backfill in the configured sequence to find the most appropriate node for each job.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.3 "><p id="cce_10_0193__p15926102310177"><strong id="cce_10_0193__b411512455814">The following options are supported:</strong></p>
<ul id="cce_10_0193__ul99261623171714"><li id="cce_10_0193__li15926172320177"><strong id="cce_10_0193__b29809914586">enqueue</strong>: uses a series of filtering algorithms to select the tasks that are eligible for scheduling and adds them to the queue to wait for scheduling. After this action, the task status changes from <strong id="cce_10_0193__b1528584525118">pending</strong> to <strong id="cce_10_0193__b4610174735118">inqueue</strong>.</li><li id="cce_10_0193__li11926102381713"><strong id="cce_10_0193__b199144511515">allocate</strong>: selects the most suitable node for each task based on a series of pre-selection and selection algorithms.</li><li id="cce_10_0193__li692642320170"><strong id="cce_10_0193__b589211415217">preempt</strong>: performs preemption scheduling for tasks with higher priorities in the same queue based on priority rules.</li><li id="cce_10_0193__li1392662320171"><strong id="cce_10_0193__b1629131135915">backfill</strong>: schedules as many pending tasks as possible to maximize node resource utilization.</li></ul>
<p id="cce_10_0193__p197051312189"><strong id="cce_10_0193__b9143105021820">Example</strong>:</p>
<pre class="screen" id="cce_10_0193__screen19269458177">actions: 'allocate, backfill, preempt'</pre>
<div class="note" id="cce_10_0193__note35265343185"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="cce_10_0193__p195265349186">When configuring <strong id="cce_10_0193__b279015343819">actions</strong>, use either <strong id="cce_10_0193__b1790183193812">preempt</strong> or <strong id="cce_10_0193__b379018318389">enqueue</strong>.</p>
</div></div>
</td>
</tr>
<tr id="cce_10_0193__row15928922101420"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.1 "><p id="cce_10_0193__p29281522131410">default_scheduler_conf.tier.plugin</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.2 "><p id="cce_10_0193__p69289226143">Implementation details of algorithms in actions based on different scenarios. For details, see <a href="https://volcano.sh/en/docs/plugins/" target="_blank" rel="noopener noreferrer">Plugins</a>.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.3 "><p id="cce_10_0193__p189289227145">For details, see <a href="#cce_10_0193__table1227612123306">Table 5</a>.</p>
</td>
</tr>
<tr id="cce_10_0193__row0880131516253"><td class="cellrowborder" rowspan="3" valign="top" width="13.541354135413542%" headers="mcps1.3.3.2.6.2.1.4.2.5.1.1 "><p id="cce_10_0193__p208801315202520"></p>
<p id="cce_10_0193__p7568105414198"><a href="cce_10_0766.html">Descheduling</a></p>
</td>
<td class="cellrowborder" valign="top" width="17.25172517251725%" headers="mcps1.3.3.2.6.2.1.4.2.5.1.2 "><p id="cce_10_0193__p224220118100">descheduler_enable</p>
</td>
<td class="cellrowborder" valign="top" width="28.442844284428443%" headers="mcps1.3.3.2.6.2.1.4.2.5.1.3 "><p id="cce_10_0193__p63252537202">Used to enable descheduling.</p>
</td>
<td class="cellrowborder" valign="top" width="40.764076407640765%" headers="mcps1.3.3.2.6.2.1.4.2.5.1.4 "><p id="cce_10_0193__p142106772516">This function is disabled by default. Options:</p>
<ul id="cce_10_0193__ul1421013714256"><li id="cce_10_0193__li18210167182515"><strong id="cce_10_0193__b1627216615011">true</strong>: The function is enabled.</li><li id="cce_10_0193__li152111279253"><strong id="cce_10_0193__b770116920018">false</strong> or empty: The function is disabled.</li></ul>
</td>
</tr>
<tr id="cce_10_0193__row13401446111913"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.1 "><p id="cce_10_0193__p858494171013">deschedulerPolicy</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.2 "><p id="cce_10_0193__p18584154181010">Descheduling policy</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.3 "><p id="cce_10_0193__p75841414107">For details about the parameters, see <a href="cce_10_0766.html#cce_10_0766__table18576915101217">Table 2</a>.</p>
</td>
</tr>
<tr id="cce_10_0193__row11402104612195"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.1 "><p id="cce_10_0193__p84021946201913">deschedulingInterval</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.2 "><p id="cce_10_0193__p2402204671914">Descheduling period</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.3 "><p id="cce_10_0193__p261201943520">Value range: &gt; 0s; parameter type: time</p>
</td>
</tr>
<tr id="cce_10_0193__row159224281959"><td class="cellrowborder" rowspan="4" valign="top" width="13.541354135413542%" headers="mcps1.3.3.2.6.2.1.4.2.5.1.1 "><p id="cce_10_0193__p82016451251"><a href="cce_10_0709.html">Cloud native hybrid deployment</a></p>
</td>
<td class="cellrowborder" valign="top" width="17.25172517251725%" headers="mcps1.3.3.2.6.2.1.4.2.5.1.2 "><p id="cce_10_0193__p1592316281859">colocation_enable</p>
</td>
<td class="cellrowborder" valign="top" width="28.442844284428443%" headers="mcps1.3.3.2.6.2.1.4.2.5.1.3 "><p id="cce_10_0193__p26241930191112">Used to enable cloud native hybrid deployment.</p>
</td>
<td class="cellrowborder" valign="top" width="40.764076407640765%" headers="mcps1.3.3.2.6.2.1.4.2.5.1.4 "><p id="cce_10_0193__p42083564244">This function is disabled by default. Options:</p>
<ul id="cce_10_0193__ul10325105312018"><li id="cce_10_0193__li13325553122019"><strong id="cce_10_0193__b432842910">true</strong>: The function is enabled.</li><li id="cce_10_0193__li17325195312018"><strong id="cce_10_0193__b1651011720019">false</strong> or empty: The function is disabled.</li></ul>
</td>
</tr>
<tr id="cce_10_0193__row89239282513"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.1 "><p id="cce_10_0193__p19231228754">oversubscription_method</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.2 "><p id="cce_10_0193__p59239281356">Method for calculating the oversubscription</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.3 "><p id="cce_10_0193__p18923182813515"><strong id="cce_10_0193__b339185015213">nodeResource</strong> and <strong id="cce_10_0193__b814915215217">podProfile</strong> are supported. The default value is <strong id="cce_10_0193__b2777195312214">nodeResource</strong>.</p>
<ul id="cce_10_0193__ul791894691318"><li id="cce_10_0193__li6918204651318"><strong id="cce_10_0193__b13821413732">nodeResource</strong>: calculates the oversubscription based on the node resource usage.</li><li id="cce_10_0193__li712542610149"><strong id="cce_10_0193__b148661481796">podProfile</strong>: calculates the oversubscription based on pod profiling.</li></ul>
</td>
</tr>
<tr id="cce_10_0193__row29231328355"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.1 "><p id="cce_10_0193__p20923122815513">oversubscription_ratio</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.2 "><p id="cce_10_0193__p17923528051">Percentage of idle resource oversubscription of a node</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.3 "><p id="cce_10_0193__p11195745123716">Value range: 1 to 100; parameter type: int</p>
<p id="cce_10_0193__p11884844474">For example, <strong id="cce_10_0193__b348219315814">60</strong> indicates that the maximum amount of oversubscribed resources on a node is calculated as <span class="uicontrol" id="cce_10_0193__uicontrol3287111912319"><b>60% × Idle resources on the node</b></span>.</p>
</td>
</tr>
<tr id="cce_10_0193__row15923182817516"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.1 "><p id="cce_10_0193__p1392352814515">oversubscription_profile_period</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.2 "><p id="cce_10_0193__p10923192816510">Period of pod profiling</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.6.2.1.4.2.5.1.3 "><p id="cce_10_0193__p12923122810511">Value range: 60 to 2592000, in seconds, that is, from 1 minute to 1 month. If a pod's metrics are not collected for the entire period, the node's resources will be evaluated according to the resources requested by the pod.</p>
<p id="cce_10_0193__p594592211171">When the oversubscription algorithm based on pod profiling is enabled for the first time, the amount of collected data may not be sufficient to cover the entire period. In this case, the oversubscription on the node is temporarily 0 due to lack of initialization data. After the data of the first period is collected, the oversubscription is updated to the actual value.</p>
</td>
</tr>
</tbody>
</table>
</div>
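<p>The <strong>default_scheduler_conf.actions</strong> value in Table 4 is a comma-separated string whose order is the execution order. The following is a minimal sketch of how such a string could be checked before it is submitted. The function names are hypothetical, the supported action names are the four listed in Table 4, and the note on <strong>preempt</strong> and <strong>enqueue</strong> is read here as "do not configure both", which is an assumption about its intent.</p>

```python
# Hypothetical helpers for illustration; they are not provided by Volcano or CCE.
SUPPORTED_ACTIONS = ("enqueue", "allocate", "preempt", "backfill")


def parse_actions(value):
    """Split a comma-separated actions string into an ordered list of names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def check_actions(value):
    """Return a list of problems found in an actions string."""
    actions = parse_actions(value)
    problems = [
        f"unsupported action: {name}"
        for name in actions
        if name not in SUPPORTED_ACTIONS
    ]
    # Assumption: the note in Table 4 means the two actions are not combined.
    if "preempt" in actions and "enqueue" in actions:
        problems.append("configure either preempt or enqueue, not both")
    return problems


print(parse_actions("allocate, backfill, preempt"))
# ['allocate', 'backfill', 'preempt']
print(check_actions("enqueue, allocate, preempt"))
# ['configure either preempt or enqueue, not both']
```

<p>The list returned by <strong>parse_actions</strong> preserves the configured order, which matters because the scheduler executes the actions in that sequence.</p>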
<div class="tablenoborder"><a name="cce_10_0193__table1227612123306"></a><a name="table1227612123306"></a><table cellpadding="4" cellspacing="0" summary="" id="cce_10_0193__table1227612123306" frame="border" border="1" rules="all"><caption><b>Table 5 </b>Supported plugins</caption><thead align="left"><tr id="cce_10_0193__row8276171212301"><th align="left" class="cellrowborder" valign="top" width="12.778722127787221%" id="mcps1.3.3.2.6.2.1.5.2.5.1.1"><p id="cce_10_0193__p52761412133012">Plugins</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="22.4977502249775%" id="mcps1.3.3.2.6.2.1.5.2.5.1.2"><p id="cce_10_0193__p1727761253011">Function</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.38666133386661%" id="mcps1.3.3.2.6.2.1.5.2.5.1.3"><p id="cce_10_0193__p127791263011">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="31.336866313368667%" id="mcps1.3.3.2.6.2.1.5.2.5.1.4"><p id="cce_10_0193__p1027718123306">Demonstration</p>
</th>
</tr>
</thead>
<tbody><tr id="cce_10_0193__row8277191253016"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p19277141273013">binpack</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p202773123306">Schedule pods to nodes with high resource usage, rather than to lightly loaded nodes, to reduce resource fragmentation.</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p42771812153011"><strong id="cce_10_0193__b1762881412566">arguments</strong>:</p>
<ul id="cce_10_0193__ul19277131213300"><li id="cce_10_0193__li927716122303"><strong id="cce_10_0193__b207143623242447">binpack.weight</strong>: weight of the binpack plugin.</li><li id="cce_10_0193__li227761273014"><strong id="cce_10_0193__b3341536195615">binpack.cpu</strong>: ratio of CPUs to all resources. The parameter value defaults to <strong id="cce_10_0193__b23411336175613">1</strong>.</li><li id="cce_10_0193__li3277412153019"><strong id="cce_10_0193__b16938118239">binpack.memory</strong>: ratio of memory resources to all resources. The parameter value defaults to <strong id="cce_10_0193__b29351132317">1</strong>.</li><li id="cce_10_0193__li19277712143017"><strong id="cce_10_0193__b124319075815">binpack.resources</strong>: other custom resource types requested by the pod, for example, <strong id="cce_10_0193__b1410716716118">nvidia.com/gpu</strong>. Multiple types can be configured, separated by commas (,).</li><li id="cce_10_0193__li11277181217304"><strong id="cce_10_0193__b561811341415">binpack.resources.</strong><i><span class="varname" id="cce_10_0193__varname5192299319">&lt;your_resource&gt;</span></i>: weight of your custom resource among all resources. Multiple types of resources can be added. <i><span class="varname" id="cce_10_0193__varname3391545164">&lt;your_resource&gt;</span></i> indicates a resource type defined in <strong id="cce_10_0193__b8305132619176">binpack.resources</strong>, for example, <strong id="cce_10_0193__b025713376171">binpack.resources.nvidia.com/gpu</strong>.</li></ul>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen172771612193018">- plugins:
- <strong id="cce_10_0193__b62777120309">name: binpack</strong>
arguments:
binpack.weight: 10
binpack.cpu: 1
binpack.memory: 1
binpack.resources: nvidia.com/gpu, example.com/foo
binpack.resources.nvidia.com/gpu: 2
binpack.resources.example.com/foo: 3</pre>
</td>
</tr>
<tr id="cce_10_0193__row027712126308"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p4277171243013">conformance</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p12277121215306">Prevent key pods, such as the pods in the <strong id="cce_10_0193__b47148763742516">kube-system</strong> namespace, from being preempted.</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p1527721214308">None</p>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen02773129300">- plugins:
- name: 'priority'
- name: 'gang'
enablePreemptable: false
- <strong id="cce_10_0193__b72771112133014">name: 'conformance'</strong></pre>
</td>
</tr>
<tr id="cce_10_0193__row1127771263019"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p1927701215302">lifecycle</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p11277712143011">Collects statistics on service scaling patterns and preferentially schedules pods with similar lifecycles to the same node. Combined with the horizontal scaling capability of the Autoscaler, this allows resources to be quickly scaled in and released, reducing costs and improving resource utilization.</p>
<p id="cce_10_0193__p627721219303">1. Collects statistics on the lifecycle of pods in the service load and schedules pods with similar lifecycles to the same node.</p>
<p id="cce_10_0193__p1627717122305">2. For a cluster configured with an automatic scaling policy, adjusts the scale-in annotations of nodes so that nodes with low usage are scaled in preferentially.</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><div class="p" id="cce_10_0193__p9277812103019"><strong id="cce_10_0193__b14754837194016">arguments</strong>:<ul id="cce_10_0193__ul15277412153014"><li id="cce_10_0193__li1277412193019"><strong id="cce_10_0193__b74989164117">lifecycle.WindowSize</strong>: The value is an integer greater than or equal to 1 and defaults to <strong id="cce_10_0193__b6311154814413">10</strong>.<p id="cce_10_0193__p142771912153014">Number of replica count changes to record. If the load changes regularly and periodically, decrease the value. If the load changes irregularly and the number of replicas changes frequently, increase the value. If the value is too large, the learning period is prolonged and too many events are recorded.</p>
</li><li id="cce_10_0193__li1527731210302"><strong id="cce_10_0193__b658902417459">lifecycle.MaxGrade</strong>: The value is an integer greater than or equal to 3 and defaults to <strong id="cce_10_0193__b566211574711">3</strong>.<p id="cce_10_0193__p12277141211301">Number of levels into which replicas are classified. For example, if the value is set to <strong id="cce_10_0193__b1744288144913">3</strong>, the replicas are classified into three levels. If the load changes regularly and periodically, decrease the value. If the load changes irregularly, increase the value. Setting an excessively small value may result in inaccurate lifecycle forecasts.</p>
</li><li id="cce_10_0193__li112772012113011"><strong id="cce_10_0193__b13193838185410">lifecycle.MaxScore</strong>: float64 floating point number. The value must be greater than or equal to 50.0. The default value is <strong id="cce_10_0193__b82511751175417">200.0</strong>.<p id="cce_10_0193__p527710129307">Maximum score (equivalent to the weight) of the lifecycle plugin.</p>
</li><li id="cce_10_0193__li162773126304"><strong id="cce_10_0193__b827533055711">lifecycle.SaturatedTresh</strong>: float64 floating point number. If the value is less than 0.5, use <strong id="cce_10_0193__b1968512418584">0.5</strong>. If the value is greater than 1, use <strong id="cce_10_0193__b564214235820">1</strong>. The default value is <strong id="cce_10_0193__b55300719581">0.8</strong>.<p id="cce_10_0193__p6277412173011">Threshold for determining whether the node usage is too high. If the node usage exceeds the threshold, the scheduler preferentially schedules jobs to other nodes.</p>
</li></ul>
</div>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen2277171273016">- plugins:
- name: priority
- name: gang
enablePreemptable: false
- name: conformance
- <strong id="cce_10_0193__b1527814127304">name: lifecycle</strong>
arguments:
lifecycle.MaxGrade: 3
lifecycle.MaxScore: 200.0
lifecycle.SaturatedTresh: 0.8
lifecycle.WindowSize: 10</pre>
<div class="note" id="cce_10_0193__note12278412113016"><span class="notetitle"> NOTE: </span><div class="notebody"><ul id="cce_10_0193__ul02784121309"><li id="cce_10_0193__li16278131233010">To keep a node from being scaled in, manually mark it as a long-period node and add the annotation <strong id="cce_10_0193__b611893955913">volcano.sh/long-lifecycle-node: true</strong> to it. For an unmarked node, the lifecycle plugin automatically marks the node based on the lifecycle of the load on the node.</li><li id="cce_10_0193__li142783128306">The default value of <strong id="cce_10_0193__b338565417011">MaxScore</strong> is <strong id="cce_10_0193__b210015223017">200.0</strong>, which is twice the weight of other plugins. If the lifecycle plugin has no obvious effect or conflicts with other plugins, disable the other plugins or increase the value of <strong id="cce_10_0193__b1473175114016">MaxScore</strong>.</li><li id="cce_10_0193__li11278171293010">After the scheduler is restarted, the lifecycle plugin needs to record the load changes again. The optimal scheduling effect can be achieved only after statistics have been collected for several periods.</li></ul>
</div></div>
</td>
</tr>
<tr id="cce_10_0193__row927861215309"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p17278191211305">gang</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p11278201223011">Consider a group of pods as a whole for resource allocation. This plugin checks whether the number of scheduled pods in a job meets the minimum requirements for running the job. If yes, all pods in the job will be scheduled. If no, the pods will not be scheduled.</p>
<div class="note" id="cce_10_0193__note0278712143012"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="cce_10_0193__p127871210308">When a gang scheduling policy is used, Autoscaler scale-out is not triggered if the remaining resources in the cluster are greater than or equal to half of the minimum resources required to run a job but less than that minimum.</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><ul id="cce_10_0193__ul152781112143013"><li id="cce_10_0193__li827821253011"><strong id="cce_10_0193__b7261259287">enablePreemptable</strong>:<ul id="cce_10_0193__ul927891273016"><li id="cce_10_0193__li427891213016"><strong id="cce_10_0193__b155006782813">true</strong>: Preemption enabled</li><li id="cce_10_0193__li12278201263016"><strong id="cce_10_0193__b348273010283">false</strong>: Preemption not enabled</li></ul>
</li><li id="cce_10_0193__li927851253010"><strong id="cce_10_0193__b1639582711115">enableJobStarving</strong>:<ul id="cce_10_0193__ul1927810129306"><li id="cce_10_0193__li2278191217304"><strong id="cce_10_0193__b1976675165013">true</strong>: Resources are preempted based on the <strong id="cce_10_0193__b13207711195115">minAvailable</strong> setting of jobs.</li><li id="cce_10_0193__li127891213307"><strong id="cce_10_0193__b211925375118">false</strong>: Resources are preempted based on job replicas.</li></ul>
<div class="note" id="cce_10_0193__note827821293019"><span class="notetitle"> NOTE: </span><div class="notebody"><ul id="cce_10_0193__ul82787123307"><li id="cce_10_0193__li10278912163011">The default value of <strong id="cce_10_0193__b15341101685210">minAvailable</strong> for Kubernetes-native workloads (such as Deployments) is <strong id="cce_10_0193__b232517264522">1</strong>. It is a good practice to set <strong id="cce_10_0193__b15663450165213">enableJobStarving</strong> to <strong id="cce_10_0193__b146371535521">false</strong>.</li><li id="cce_10_0193__li42781112113016">In AI and big data scenarios, you can specify the <strong id="cce_10_0193__b1983208195417">minAvailable</strong> value when creating a vcjob. It is a good practice to set <strong id="cce_10_0193__b194544449548">enableJobStarving</strong> to <strong id="cce_10_0193__b959154955416">true</strong>.</li><li id="cce_10_0193__li527821293015">In Volcano versions earlier than v1.11.5, <strong id="cce_10_0193__b33421528205513">enableJobStarving</strong> is set to <strong id="cce_10_0193__b10768113010553">true</strong> by default. In Volcano versions later than v1.11.5, <strong id="cce_10_0193__b52798560557">enableJobStarving</strong> is set to <strong id="cce_10_0193__b2050314585556">false</strong> by default.</li></ul>
</div></div>
</li></ul>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen62782124307">- plugins:
- name: priority
<strong id="cce_10_0193__b727841216308"> - name: gang</strong>
<strong id="cce_10_0193__b0278512153019"> enablePreemptable: false</strong>
<strong id="cce_10_0193__b7278312133014">enableJobStarving: false</strong>
- name: conformance</pre>
</td>
</tr>
<tr id="cce_10_0193__row1627817127305"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p1227891293015">priority</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p5278712143015">Schedule pods based on custom workload priorities.</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p6278612123010">None</p>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen19278191216307">- plugins:
- <strong id="cce_10_0193__b72791112113018">name: priority</strong>
- name: gang
enablePreemptable: false
- name: conformance</pre>
</td>
</tr>
<tr id="cce_10_0193__row18279141214303"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p1027911214307">overcommit</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p4279812123020">Multiplies the cluster's resources by an inflation factor when workloads are admitted to the queue, which improves workload enqueuing efficiency. If all workloads are Deployments, remove this plugin or set the inflation factor to <strong id="cce_10_0193__b159954622942558">2.0</strong>.</p>
<div class="note" id="cce_10_0193__note14279191233018"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="cce_10_0193__p927915121303">This plugin is supported in Volcano 1.6.5 and later versions.</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p132795120305"><strong id="cce_10_0193__b556610173569">arguments</strong>:</p>
<ul id="cce_10_0193__ul1727941212306"><li id="cce_10_0193__li6279201283013"><strong id="cce_10_0193__b3781113044618">overcommit-factor</strong>: inflation factor, which defaults to <strong id="cce_10_0193__b10781143013469">1.2</strong>.</li></ul>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen16279312123011">- plugins:
- <strong id="cce_10_0193__b1227941218302">name: overcommit</strong>
arguments:
overcommit-factor: 2.0</pre>
</td>
</tr>
<tr id="cce_10_0193__row327914123307"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p1027961214309">drf</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p6279161213307">The Dominant Resource Fairness (DRF) scheduling algorithm, which schedules jobs based on their dominant resource share. Jobs with a smaller resource share will be scheduled with a higher priority.</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p16279171210303">None</p>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen19279131223016">- plugins:
- <strong id="cce_10_0193__b1427941213019">name: 'drf'</strong>
- name: 'predicates'
- name: 'nodeorder'</pre>
</td>
</tr>
<tr id="cce_10_0193__row1327912129305"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p92791812183010">predicates</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p1927915122308">Determine whether a task can be bound to a node by using a series of evaluation algorithms, such as node/pod affinity, taint tolerance, node repetition, volume limits, and volume zone matching.</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p2279312103010">None</p>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen127914123301">- plugins:
- name: 'drf'
- <strong id="cce_10_0193__b182799123307">name: 'predicates'</strong>
- name: 'nodeorder'</pre>
</td>
</tr>
<tr id="cce_10_0193__row1627941213305"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p10279161283015">nodeorder</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p1427951213011">A common algorithm for selecting nodes. Nodes are scored in simulated resource allocation to find the most suitable node for the current job.</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p827991210309">Scoring parameters:</p>
<ul id="cce_10_0193__ul202796124309"><li id="cce_10_0193__li13279612153011"><strong id="cce_10_0193__b202507265242655">nodeaffinity.weight</strong>: Pods are scheduled based on node affinity. This parameter defaults to <strong id="cce_10_0193__b9523119842655">2</strong>.</li><li id="cce_10_0193__li5279131217304"><strong id="cce_10_0193__b2177163615912">podaffinity.weight</strong>: Pods are scheduled based on pod affinity. This parameter defaults to <strong id="cce_10_0193__b1717811361794">2</strong>.</li><li id="cce_10_0193__li627971253015"><strong id="cce_10_0193__b42769641742714">leastrequested.weight</strong>: Pods are scheduled to the node with the least requested resources. This parameter defaults to <strong id="cce_10_0193__b201873097842714">1</strong>.</li><li id="cce_10_0193__li427931213012"><strong id="cce_10_0193__b136116848842725">balancedresource.weight</strong>: Pods are scheduled to the node with balanced resource allocation. This parameter defaults to <strong id="cce_10_0193__b211551392442725">1</strong>.</li><li id="cce_10_0193__li6279101293012"><strong id="cce_10_0193__b19280154063719">mostrequested.weight</strong>: Pods are scheduled to the node with the most requested resources. This parameter defaults to <strong id="cce_10_0193__b122801640123711">0</strong>.</li><li id="cce_10_0193__li1527971210308"><strong id="cce_10_0193__b182899724342747">tainttoleration.weight</strong>: Pods are scheduled to the node with a high taint tolerance. This parameter defaults to <strong id="cce_10_0193__b68931403942747">3</strong>.</li><li id="cce_10_0193__li17280171223015"><strong id="cce_10_0193__b175772432114">imagelocality.weight</strong>: Pods are scheduled to the node where the required images exist. This parameter defaults to <strong id="cce_10_0193__b1057774314112">1</strong>.</li><li id="cce_10_0193__li5280712183018"><strong id="cce_10_0193__b19313131381218">podtopologyspread.weight</strong>: Pods are scheduled based on the pod topology. 
This parameter defaults to <strong id="cce_10_0193__b133139138120">2</strong>.</li></ul>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen1128071213017">- plugins:
  - <strong id="cce_10_0193__b14280912183017">name: nodeorder</strong>
    arguments:
      leastrequested.weight: 1
      mostrequested.weight: 0
      nodeaffinity.weight: 2
      podaffinity.weight: 2
      balancedresource.weight: 1
      tainttoleration.weight: 3
      imagelocality.weight: 1
      podtopologyspread.weight: 2</pre>
</td>
</tr>
<tr id="cce_10_0193__row1280181253010"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p10280121217309">cce-gpu-topology-predicate</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p6280712133011">GPU-topology scheduling preselection algorithm</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p19280612103018">None</p>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen0280712133020">- plugins:
  - <strong id="cce_10_0193__b728014121303">name: 'cce-gpu-topology-predicate'</strong>
  - name: 'cce-gpu-topology-priority'
  - name: 'xgpu'</pre>
</td>
</tr>
<tr id="cce_10_0193__row1728012122302"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p17280151213302">cce-gpu-topology-priority</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p14280111210309">GPU-topology scheduling priority algorithm</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p1328021214305">None</p>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen5280111263014">- plugins:
  - name: 'cce-gpu-topology-predicate'
  - <strong id="cce_10_0193__b728015124300">name: 'cce-gpu-topology-priority'</strong>
  - name: 'xgpu'</pre>
</td>
</tr>
<tr id="cce_10_0193__row10280111213018"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p162801712193014">cce-gpu</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p1728051211302">GPU resource allocation. Working with the CCE AI Suite (NVIDIA GPU) add-on, this plugin supports fractional (decimal) GPU configurations.</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p112801512133015">None</p>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen3280151214304">- plugins:
  - name: 'cce-gpu-topology-predicate'
  - name: 'cce-gpu-topology-priority'
  - <strong id="cce_10_0193__b132811812133012">name: 'cce-gpu'</strong></pre>
</td>
</tr>
<tr id="cce_10_0193__row142821124309"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p122821712133011">numa-aware</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p1828219129307">NUMA affinity scheduling. For details, see <a href="cce_10_0425.html">NUMA Affinity Scheduling</a>.</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p9282121213303"><strong id="cce_10_0193__b7770133114420">arguments</strong>:</p>
<ul id="cce_10_0193__ul6282512183012"><li id="cce_10_0193__li182821112133019"><strong id="cce_10_0193__b628175914114">weight</strong>: weight of the numa-aware plugin</li></ul>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen18282312103017">- plugins:
  - name: 'nodelocalvolume'
  - name: 'nodeemptydirvolume'
  - name: 'nodeCSIscheduling'
  - name: 'networkresource'
    arguments:
      NetworkType: vpc-router
  - <strong id="cce_10_0193__b328313122305">name: numa-aware</strong>
    arguments:
      weight: 10</pre>
</td>
</tr>
<tr id="cce_10_0193__row13283812193016"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p428341213303">networkresource</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p1028311216306">Filter out nodes that do not meet the elastic network interface (ENI) requirements. The parameters are passed in by CCE and do not need to be manually configured.</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p92832012133012"><strong id="cce_10_0193__b16568171795619">arguments</strong>:</p>
<ul id="cce_10_0193__ul10283412183017"><li id="cce_10_0193__li18283121210307"><strong id="cce_10_0193__b44841755174219">NetworkType</strong>: network type (<strong id="cce_10_0193__b104841355164214">eni</strong> or <strong id="cce_10_0193__b248425574212">vpc-router</strong>)</li></ul>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen192833121301">- plugins:
  - name: 'nodelocalvolume'
  - name: 'nodeemptydirvolume'
  - name: 'nodeCSIscheduling'
  - <strong id="cce_10_0193__b132831712103020">name: 'networkresource'</strong>
    arguments:
      NetworkType: vpc-router</pre>
</td>
</tr>
<tr id="cce_10_0193__row13283151253012"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p5283121215307">nodelocalvolume</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p12831512143019">Filter out nodes that do not meet local volume requirements.</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p162831122307">None</p>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen42831912153012">- plugins:
  - <strong id="cce_10_0193__b928313123307">name: 'nodelocalvolume'</strong>
  - name: 'nodeemptydirvolume'
  - name: 'nodeCSIscheduling'
  - name: 'networkresource'</pre>
</td>
</tr>
<tr id="cce_10_0193__row428471211308"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p12841412113015">nodeemptydirvolume</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p528461243016">Filter out nodes that do not meet the emptyDir requirements.</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p1128431243017">None</p>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen1328418129305">- plugins:
  - name: 'nodelocalvolume'
  - <strong id="cce_10_0193__b1728421211306">name: 'nodeemptydirvolume'</strong>
  - name: 'nodeCSIscheduling'
  - name: 'networkresource'</pre>
</td>
</tr>
<tr id="cce_10_0193__row2284812103018"><td class="cellrowborder" valign="top" width="12.778722127787221%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.1 "><p id="cce_10_0193__p4284181219306">nodeCSIscheduling</p>
</td>
<td class="cellrowborder" valign="top" width="22.4977502249775%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.2 "><p id="cce_10_0193__p428451273020">Filter out nodes where the Everest add-on is malfunctioning.</p>
</td>
<td class="cellrowborder" valign="top" width="33.38666133386661%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.3 "><p id="cce_10_0193__p14284111283018">None</p>
</td>
<td class="cellrowborder" valign="top" width="31.336866313368667%" headers="mcps1.3.3.2.6.2.1.5.2.5.1.4 "><pre class="screen" id="cce_10_0193__screen42840122301">- plugins:
  - name: 'nodelocalvolume'
  - name: 'nodeemptydirvolume'
  - <strong id="cce_10_0193__b7284151283016">name: 'nodeCSIscheduling'</strong>
  - name: 'networkresource'</pre>
</td>
</tr>
</tbody>
</table>
</div>
</div>
</p></li></ol>
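<p>As a hedged illustration of the nodeorder scoring parameters listed in the preceding table, the following sketch shifts scoring toward packing pods onto nodes that already have more requested resources. It assumes that a larger weight gives the corresponding scoring item more influence on the final node score. The values <strong>0</strong> and <strong>2</strong> are illustrative only, not recommended settings, and the parameters that are not listed are assumed to keep their defaults.</p>
<pre class="screen">- plugins:
  - name: nodeorder
    arguments:
      leastrequested.weight: 0   # Illustrative value: stop favoring nodes with the least requested resources
      mostrequested.weight: 2    # Illustrative value: favor nodes with the most requested resources</pre>
<p>Verify any weight change in a test cluster first, because the effect depends on the other plugins that are enabled in the same configuration.</p>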
</div>
<div class="section" id="cce_10_0193__section0377457163618"><h4 class="sectiontitle">Components</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="cce_10_0193__table1965341035819" frame="border" border="1" rules="all"><caption><b>Table 6 </b>Add-on components</caption><thead align="left"><tr id="cce_10_0193__row1565319102582"><th align="left" class="cellrowborder" valign="top" width="20.2020202020202%" id="mcps1.3.4.2.2.4.1.1"><p id="cce_10_0193__p14653141018584">Component</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="64.64646464646464%" id="mcps1.3.4.2.2.4.1.2"><p id="cce_10_0193__p065391025820">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="15.151515151515149%" id="mcps1.3.4.2.2.4.1.3"><p id="cce_10_0193__p5653111015587">Resource Type</p>
</th>
</tr>
</thead>
<tbody><tr id="cce_10_0193__row872889165919"><td class="cellrowborder" valign="top" width="20.2020202020202%" headers="mcps1.3.4.2.2.4.1.1 "><p id="cce_10_0193__p187250372445">volcano-scheduler</p>
</td>
<td class="cellrowborder" valign="top" width="64.64646464646464%" headers="mcps1.3.4.2.2.4.1.2 "><p id="cce_10_0193__p1455113183920">Schedule pods.</p>
</td>
<td class="cellrowborder" valign="top" width="15.151515151515149%" headers="mcps1.3.4.2.2.4.1.3 "><p id="cce_10_0193__p772869115917">Deployment</p>
</td>
</tr>
<tr id="cce_10_0193__row28661422184415"><td class="cellrowborder" valign="top" width="20.2020202020202%" headers="mcps1.3.4.2.2.4.1.1 "><p id="cce_10_0193__p986602284410">volcano-controller</p>
</td>
<td class="cellrowborder" valign="top" width="64.64646464646464%" headers="mcps1.3.4.2.2.4.1.2 "><p id="cce_10_0193__p3866142214412">Synchronize CRDs.</p>
</td>
<td class="cellrowborder" valign="top" width="15.151515151515149%" headers="mcps1.3.4.2.2.4.1.3 "><p id="cce_10_0193__p452083112444">Deployment</p>
</td>
</tr>
<tr id="cce_10_0193__row1490910194447"><td class="cellrowborder" valign="top" width="20.2020202020202%" headers="mcps1.3.4.2.2.4.1.1 "><p id="cce_10_0193__p13909101919443">volcano-admission</p>
</td>
<td class="cellrowborder" valign="top" width="64.64646464646464%" headers="mcps1.3.4.2.2.4.1.2 "><p id="cce_10_0193__p1590991914418">Webhook server, which verifies and modifies resources such as pods and jobs</p>
</td>
<td class="cellrowborder" valign="top" width="15.151515151515149%" headers="mcps1.3.4.2.2.4.1.3 "><p id="cce_10_0193__p55511731144410">Deployment</p>
</td>
</tr>
<tr id="cce_10_0193__row2653710135812"><td class="cellrowborder" valign="top" width="20.2020202020202%" headers="mcps1.3.4.2.2.4.1.1 "><p id="cce_10_0193__p1246895384410">volcano-agent</p>
</td>
<td class="cellrowborder" valign="top" width="64.64646464646464%" headers="mcps1.3.4.2.2.4.1.2 "><p id="cce_10_0193__p37055414113">Cloud native hybrid agent, which is used for node QoS assurance, CPU burst, and dynamic resource oversubscription</p>
</td>
<td class="cellrowborder" valign="top" width="15.151515151515149%" headers="mcps1.3.4.2.2.4.1.3 "><p id="cce_10_0193__p365411016585">DaemonSet</p>
</td>
</tr>
<tr id="cce_10_0193__row12328654165713"><td class="cellrowborder" valign="top" width="20.2020202020202%" headers="mcps1.3.4.2.2.4.1.1 "><p id="cce_10_0193__p3330165495715">resource-exporter</p>
</td>
<td class="cellrowborder" valign="top" width="64.64646464646464%" headers="mcps1.3.4.2.2.4.1.2 "><p id="cce_10_0193__p193306546577">Report the NUMA topology information of nodes.</p>
</td>
<td class="cellrowborder" valign="top" width="15.151515151515149%" headers="mcps1.3.4.2.2.4.1.3 "><p id="cce_10_0193__p1833012543573">DaemonSet</p>
</td>
</tr>
<tr id="cce_10_0193__row3972337183014"><td class="cellrowborder" valign="top" width="20.2020202020202%" headers="mcps1.3.4.2.2.4.1.1 "><p id="cce_10_0193__p7972537183017">volcano-descheduler</p>
</td>
<td class="cellrowborder" valign="top" width="64.64646464646464%" headers="mcps1.3.4.2.2.4.1.2 "><p id="cce_10_0193__p14972437153017">Reschedule pods in a cluster. After the rescheduling capability is enabled, pods will be automatically deployed on nodes.</p>
</td>
<td class="cellrowborder" valign="top" width="15.151515151515149%" headers="mcps1.3.4.2.2.4.1.3 "><p id="cce_10_0193__p189729375303">Deployment</p>
</td>
</tr>
<tr id="cce_10_0193__row48152038153012"><td class="cellrowborder" valign="top" width="20.2020202020202%" headers="mcps1.3.4.2.2.4.1.1 "><p id="cce_10_0193__p381543815301">volcano-recommender</p>
</td>
<td class="cellrowborder" valign="top" width="64.64646464646464%" headers="mcps1.3.4.2.2.4.1.2 "><p id="cce_10_0193__p1281583816307">Generate recommendations for CPU and memory requests based on the historical CPU and memory usage of a container.</p>
</td>
<td class="cellrowborder" valign="top" width="15.151515151515149%" headers="mcps1.3.4.2.2.4.1.3 "><p id="cce_10_0193__p181518380305">Deployment</p>
</td>
</tr>
<tr id="cce_10_0193__row1843963910305"><td class="cellrowborder" valign="top" width="20.2020202020202%" headers="mcps1.3.4.2.2.4.1.1 "><p id="cce_10_0193__p3439153933019">volcano-recommender-prometheus-adapter</p>
</td>
<td class="cellrowborder" valign="top" width="64.64646464646464%" headers="mcps1.3.4.2.2.4.1.2 "><p id="cce_10_0193__p20439439153010">Collect historical CPU and memory metrics of containers from Prometheus.</p>
</td>
<td class="cellrowborder" valign="top" width="15.151515151515149%" headers="mcps1.3.4.2.2.4.1.3 "><p id="cce_10_0193__p13597130163120">Deployment</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="cce_10_0193__section03891044143310"><h4 class="sectiontitle">Modifying the volcano-scheduler Configurations Using the Console</h4><p id="cce_10_0193__p128991128182">volcano-scheduler is the component responsible for pod scheduling. It consists of a series of actions and plugins. Actions define what is executed at each step of scheduling, and plugins provide the algorithm details for those actions in different scenarios. volcano-scheduler is highly extensible: you can specify and implement actions and plugins based on your requirements.</p>
<p id="cce_10_0193__p12255183843218">After the add-on is installed, you can choose <strong id="cce_10_0193__b17113156458">Settings</strong> in the navigation pane, switch to the <strong id="cce_10_0193__b56685919515">Scheduling</strong> tab, and configure the basic scheduling capabilities. You can also use the expert mode to customize advanced scheduling policies based on service scenarios.</p>
<p id="cce_10_0193__p6646145622517">This section describes how to configure volcano-scheduler.</p>
<div class="note" id="cce_10_0193__note13388133393710"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="cce_10_0193__p83326372378">Only Volcano v1.7.1 and later versions support this function.</p>
</div></div>
<p id="cce_10_0193__p195053623613">Log in to the CCE console and click the cluster name to access the cluster console. In the navigation pane, choose <strong id="cce_10_0193__b081319114263">Settings</strong> and click the <strong id="cce_10_0193__b081317162615">Scheduling</strong> tab. In the <strong id="cce_10_0193__b6813815264">Default Cluster Scheduler</strong> area, find the expert mode and click <strong id="cce_10_0193__b198133152611">Try Now</strong>.</p>
<ul id="cce_10_0193__ul6676425408"><li id="cce_10_0193__li46762264018">Using <strong id="cce_10_0193__b073919506719">resource_exporter</strong>:<pre class="screen" id="cce_10_0193__screen7651947143817">...
"default_scheduler_conf": {
  "actions": "allocate, backfill, preempt",
  "tiers": [
    {
      "plugins": [
        {
          "name": "priority"
        },
        {
          "name": "gang"
        },
        {
          "name": "conformance"
        }
      ]
    },
    {
      "plugins": [
        {
          "name": "drf"
        },
        {
          "name": "predicates"
        },
        {
          "name": "nodeorder"
        }
      ]
    },
    {
      "plugins": [
        {
          "name": "cce-gpu-topology-predicate"
        },
        {
          "name": "cce-gpu-topology-priority"
        },
        {
          "name": "cce-gpu"
        },
        {
          "name": "numa-aware" # Adding this plugin also enables resource_exporter.
        }
      ]
    },
    {
      "plugins": [
        {
          "name": "nodelocalvolume"
        },
        {
          "name": "nodeemptydirvolume"
        },
        {
          "name": "nodeCSIscheduling"
        },
        {
          "name": "networkresource"
        }
      ]
    }
  ]
},
...</pre>
<p id="cce_10_0193__p20961144974013">After this function is enabled, you can use the functions of both numa-aware and resource_exporter.</p>
</li></ul>
</div>
<div class="section" id="cce_10_0193__section480816812444"><h4 class="sectiontitle">Collecting Prometheus Metrics</h4><p id="cce_10_0193__p1125971444418">volcano-scheduler exposes Prometheus metrics through port 8080. You can build a Prometheus collector to identify and obtain volcano-scheduler scheduling metrics from <strong id="cce_10_0193__b188888972316">http://{{<em id="cce_10_0193__i19794204610239">volcano-schedulerPodIP</em>}}:{{<em id="cce_10_0193__i12387204919231">volcano-schedulerPodPort</em>}}/metrics</strong>.</p>
<div class="note" id="cce_10_0193__note1129170194511"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="cce_10_0193__p3302034511">Prometheus metrics can be exposed only by the Volcano add-on of version 1.8.5 or later.</p>
</div></div>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="cce_10_0193__table142921923164517" frame="border" border="1" rules="all"><caption><b>Table 7 </b>Key metrics</caption><thead align="left"><tr id="cce_10_0193__row132921223144513"><th align="left" class="cellrowborder" valign="top" width="21%" id="mcps1.3.6.4.2.5.1.1"><p id="cce_10_0193__p129382314515">Metric</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="14.000000000000002%" id="mcps1.3.6.4.2.5.1.2"><p id="cce_10_0193__p429352324520">Type</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="39%" id="mcps1.3.6.4.2.5.1.3"><p id="cce_10_0193__p142939237452">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="26%" id="mcps1.3.6.4.2.5.1.4"><p id="cce_10_0193__p1361935519493">Label</p>
</th>
</tr>
</thead>
<tbody><tr id="cce_10_0193__row13293122313457"><td class="cellrowborder" valign="top" width="21%" headers="mcps1.3.6.4.2.5.1.1 "><p id="cce_10_0193__p32935235456">e2e_scheduling_latency_milliseconds</p>
</td>
<td class="cellrowborder" valign="top" width="14.000000000000002%" headers="mcps1.3.6.4.2.5.1.2 "><p id="cce_10_0193__p19293152354519">Histogram</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.6.4.2.5.1.3 "><p id="cce_10_0193__p3293182311459">E2E scheduling latency (ms) (scheduling algorithm + binding)</p>
</td>
<td class="cellrowborder" valign="top" width="26%" headers="mcps1.3.6.4.2.5.1.4 "><p id="cce_10_0193__p161965513490">None</p>
</td>
</tr>
<tr id="cce_10_0193__row229322319459"><td class="cellrowborder" valign="top" width="21%" headers="mcps1.3.6.4.2.5.1.1 "><p id="cce_10_0193__p1829352354514">e2e_job_scheduling_latency_milliseconds</p>
</td>
<td class="cellrowborder" valign="top" width="14.000000000000002%" headers="mcps1.3.6.4.2.5.1.2 "><p id="cce_10_0193__p122933239454">Histogram</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.6.4.2.5.1.3 "><p id="cce_10_0193__p1729372354513">E2E job scheduling latency (ms)</p>
</td>
<td class="cellrowborder" valign="top" width="26%" headers="mcps1.3.6.4.2.5.1.4 "><p id="cce_10_0193__p062045513495">None</p>
</td>
</tr>
<tr id="cce_10_0193__row19293162324518"><td class="cellrowborder" valign="top" width="21%" headers="mcps1.3.6.4.2.5.1.1 "><p id="cce_10_0193__p2029372394511">e2e_job_scheduling_duration</p>
</td>
<td class="cellrowborder" valign="top" width="14.000000000000002%" headers="mcps1.3.6.4.2.5.1.2 "><p id="cce_10_0193__p1629312314514">Gauge</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.6.4.2.5.1.3 "><p id="cce_10_0193__p1429302394516">E2E job scheduling duration</p>
</td>
<td class="cellrowborder" valign="top" width="26%" headers="mcps1.3.6.4.2.5.1.4 "><p id="cce_10_0193__p1162045564916">labels=["job_name", "queue", "job_namespace"]</p>
</td>
</tr>
<tr id="cce_10_0193__row6293112318458"><td class="cellrowborder" valign="top" width="21%" headers="mcps1.3.6.4.2.5.1.1 "><p id="cce_10_0193__p1629322310454">plugin_scheduling_latency_microseconds</p>
</td>
<td class="cellrowborder" valign="top" width="14.000000000000002%" headers="mcps1.3.6.4.2.5.1.2 "><p id="cce_10_0193__p12293112364514">Histogram</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.6.4.2.5.1.3 "><p id="cce_10_0193__p12293923174518">Plugin scheduling latency (&micro;s)</p>
</td>
<td class="cellrowborder" valign="top" width="26%" headers="mcps1.3.6.4.2.5.1.4 "><p id="cce_10_0193__p962019553499">labels=["plugin", "OnSession"]</p>
</td>
</tr>
<tr id="cce_10_0193__row92933231456"><td class="cellrowborder" valign="top" width="21%" headers="mcps1.3.6.4.2.5.1.1 "><p id="cce_10_0193__p429352354519">action_scheduling_latency_microseconds</p>
</td>
<td class="cellrowborder" valign="top" width="14.000000000000002%" headers="mcps1.3.6.4.2.5.1.2 "><p id="cce_10_0193__p22936235455">Histogram</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.6.4.2.5.1.3 "><p id="cce_10_0193__p16293723174517">Action scheduling latency (&micro;s)</p>
</td>
<td class="cellrowborder" valign="top" width="26%" headers="mcps1.3.6.4.2.5.1.4 "><p id="cce_10_0193__p5620105594918">labels=["action"]</p>
</td>
</tr>
<tr id="cce_10_0193__row1529310236453"><td class="cellrowborder" valign="top" width="21%" headers="mcps1.3.6.4.2.5.1.1 "><p id="cce_10_0193__p1729362364510">task_scheduling_latency_milliseconds</p>
</td>
<td class="cellrowborder" valign="top" width="14.000000000000002%" headers="mcps1.3.6.4.2.5.1.2 "><p id="cce_10_0193__p10293823114519">Histogram</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.6.4.2.5.1.3 "><p id="cce_10_0193__p8293172354519">Task scheduling latency (ms)</p>
</td>
<td class="cellrowborder" valign="top" width="26%" headers="mcps1.3.6.4.2.5.1.4 "><p id="cce_10_0193__p1662019553492">None</p>
</td>
</tr>
<tr id="cce_10_0193__row8293142394510"><td class="cellrowborder" valign="top" width="21%" headers="mcps1.3.6.4.2.5.1.1 "><p id="cce_10_0193__p529316233458">schedule_attempts_total</p>
</td>
<td class="cellrowborder" valign="top" width="14.000000000000002%" headers="mcps1.3.6.4.2.5.1.2 "><p id="cce_10_0193__p13294323104520">Counter</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.6.4.2.5.1.3 "><p id="cce_10_0193__p794644164716">Number of pod scheduling attempts. <strong id="cce_10_0193__b18656101511301">unschedulable</strong> indicates that the pod could not be scheduled, and <strong id="cce_10_0193__b12239143915305">error</strong> indicates that an internal scheduler error occurred.</p>
</td>
<td class="cellrowborder" valign="top" width="26%" headers="mcps1.3.6.4.2.5.1.4 "><p id="cce_10_0193__p16217557495">labels=["result"]</p>
</td>
</tr>
<tr id="cce_10_0193__row6885928194219"><td class="cellrowborder" valign="top" width="21%" headers="mcps1.3.6.4.2.5.1.1 "><p id="cce_10_0193__p78855289423">pod_preemption_victims</p>
</td>
<td class="cellrowborder" valign="top" width="14.000000000000002%" headers="mcps1.3.6.4.2.5.1.2 "><p id="cce_10_0193__p434641618433">Gauge</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.6.4.2.5.1.3 "><p id="cce_10_0193__p1294612414714">Number of selected preemption victims</p>
</td>
<td class="cellrowborder" valign="top" width="26%" headers="mcps1.3.6.4.2.5.1.4 "><p id="cce_10_0193__p662115594918">None</p>
</td>
</tr>
<tr id="cce_10_0193__row1656312296421"><td class="cellrowborder" valign="top" width="21%" headers="mcps1.3.6.4.2.5.1.1 "><p id="cce_10_0193__p956322912427">total_preemption_attempts</p>
</td>
<td class="cellrowborder" valign="top" width="14.000000000000002%" headers="mcps1.3.6.4.2.5.1.2 "><p id="cce_10_0193__p1756316292429">Counter</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.6.4.2.5.1.3 "><p id="cce_10_0193__p856316298429">Total number of preemption attempts in a cluster</p>
</td>
<td class="cellrowborder" valign="top" width="26%" headers="mcps1.3.6.4.2.5.1.4 "><p id="cce_10_0193__p162113557494">None</p>
</td>
</tr>
<tr id="cce_10_0193__row1495410293427"><td class="cellrowborder" valign="top" width="21%" headers="mcps1.3.6.4.2.5.1.1 "><p id="cce_10_0193__p4954122924218">unschedule_task_count</p>
</td>
<td class="cellrowborder" valign="top" width="14.000000000000002%" headers="mcps1.3.6.4.2.5.1.2 "><p id="cce_10_0193__p11484119194310">Gauge</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.6.4.2.5.1.3 "><p id="cce_10_0193__p1295422919429">Number of unschedulable tasks</p>
</td>
<td class="cellrowborder" valign="top" width="26%" headers="mcps1.3.6.4.2.5.1.4 "><p id="cce_10_0193__p36212552495">labels=["job_id"]</p>
</td>
</tr>
<tr id="cce_10_0193__row83731230164210"><td class="cellrowborder" valign="top" width="21%" headers="mcps1.3.6.4.2.5.1.1 "><p id="cce_10_0193__p23734300424">unschedule_job_count</p>
</td>
<td class="cellrowborder" valign="top" width="14.000000000000002%" headers="mcps1.3.6.4.2.5.1.2 "><p id="cce_10_0193__p097971974312">Gauge</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.6.4.2.5.1.3 "><p id="cce_10_0193__p10374203054215">Number of unschedulable jobs</p>
</td>
<td class="cellrowborder" valign="top" width="26%" headers="mcps1.3.6.4.2.5.1.4 "><p id="cce_10_0193__p146211455174910">None</p>
</td>
</tr>
<tr id="cce_10_0193__row2124162119433"><td class="cellrowborder" valign="top" width="21%" headers="mcps1.3.6.4.2.5.1.1 "><p id="cce_10_0193__p71242213435">job_retry_counts</p>
</td>
<td class="cellrowborder" valign="top" width="14.000000000000002%" headers="mcps1.3.6.4.2.5.1.2 "><p id="cce_10_0193__p912412134310">Counter</p>
</td>
<td class="cellrowborder" valign="top" width="39%" headers="mcps1.3.6.4.2.5.1.3 "><p id="cce_10_0193__p16125142154316">Number of job retries</p>
</td>
<td class="cellrowborder" valign="top" width="26%" headers="mcps1.3.6.4.2.5.1.4 "><p id="cce_10_0193__p3621175524911">labels=["job_id"]</p>
</td>
</tr>
</tbody>
</table>
</div>
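<p>The collection described above can be sketched as a Prometheus scrape job. This is a minimal example under stated assumptions, not a verified configuration for this add-on: the job name, the <strong>kube-system</strong> namespace, and the <strong>app=volcano-scheduler</strong> pod label are hypothetical and must be adapted to your cluster. Only port 8080 and the <strong>/metrics</strong> path come from this section.</p>
<pre class="screen">scrape_configs:
  - job_name: volcano-scheduler          # Hypothetical job name
    metrics_path: /metrics
    kubernetes_sd_configs:
      - role: pod                        # Discover pods through the Kubernetes API
        namespaces:
          names:
            - kube-system                # Assumption: namespace where volcano-scheduler runs
    relabel_configs:
      # Keep only pods that carry the assumed label app=volcano-scheduler.
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: volcano-scheduler
        action: keep
      # Scrape each pod IP on port 8080, the metrics port stated in this section.
      - source_labels: [__meta_kubernetes_pod_ip]
        regex: (.+)
        target_label: __address__
        replacement: ${1}:8080</pre>
<p>After the job is loaded, the Histogram metrics in the preceding table can be queried through their bucket series, for example to estimate a latency percentile. Depending on the add-on version, the exposed metric names may carry a prefix, so check the raw <strong>/metrics</strong> output before writing queries.</p>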
</div>
<div class="section" id="cce_10_0193__section1684473112533"><h4 class="sectiontitle">Uninstalling the Volcano Add-on</h4><p id="cce_10_0193__p184973334533">After the add-on is uninstalled, all custom Volcano resources (<a href="#cce_10_0193__table148801381540">Table 8</a>) will be deleted, including those you have created. Reinstalling the add-on does not restore the tasks that existed before the uninstallation. It is a good practice to uninstall the Volcano add-on only when no custom Volcano resources are in use in the cluster.</p>
<div class="tablenoborder"><a name="cce_10_0193__table148801381540"></a><a name="table148801381540"></a><table cellpadding="4" cellspacing="0" summary="" id="cce_10_0193__table148801381540" frame="border" border="1" rules="all"><caption><b>Table 8 </b>Custom Volcano resources</caption><thead align="left"><tr id="cce_10_0193__row158804380548"><th align="left" class="cellrowborder" valign="top" width="22%" id="mcps1.3.7.3.2.5.1.1"><p id="cce_10_0193__p128801438185419">Item</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="28.000000000000004%" id="mcps1.3.7.3.2.5.1.2"><p id="cce_10_0193__p1988063814547">API Group</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.7.3.2.5.1.3"><p id="cce_10_0193__p58801538185415">API Version</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.7.3.2.5.1.4"><p id="cce_10_0193__p10880113816543">Resource Level</p>
</th>
</tr>
</thead>
<tbody><tr id="cce_10_0193__row3880163885413"><td class="cellrowborder" valign="top" width="22%" headers="mcps1.3.7.3.2.5.1.1 "><p id="cce_10_0193__p888083818549">Command</p>
</td>
<td class="cellrowborder" valign="top" width="28.000000000000004%" headers="mcps1.3.7.3.2.5.1.2 "><p id="cce_10_0193__p3880153835419">bus.volcano.sh</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.7.3.2.5.1.3 "><p id="cce_10_0193__p18801938145413">v1alpha1</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.7.3.2.5.1.4 "><p id="cce_10_0193__p188018383541">Namespaced</p>
</td>
</tr>
<tr id="cce_10_0193__row1588073819544"><td class="cellrowborder" valign="top" width="22%" headers="mcps1.3.7.3.2.5.1.1 "><p id="cce_10_0193__p188023855412">Job</p>
</td>
<td class="cellrowborder" valign="top" width="28.000000000000004%" headers="mcps1.3.7.3.2.5.1.2 "><p id="cce_10_0193__p1588043815418">batch.volcano.sh</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.7.3.2.5.1.3 "><p id="cce_10_0193__p10880638135411">v1alpha1</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.7.3.2.5.1.4 "><p id="cce_10_0193__p12880183818548">Namespaced</p>
</td>
</tr>
<tr id="cce_10_0193__row1088015381545"><td class="cellrowborder" valign="top" width="22%" headers="mcps1.3.7.3.2.5.1.1 "><p id="cce_10_0193__p38807382545">Numatopology</p>
</td>
<td class="cellrowborder" valign="top" width="28.000000000000004%" headers="mcps1.3.7.3.2.5.1.2 "><p id="cce_10_0193__p1588020383545">nodeinfo.volcano.sh</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.7.3.2.5.1.3 "><p id="cce_10_0193__p8880638105414">v1alpha1</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.7.3.2.5.1.4 "><p id="cce_10_0193__p4880143812548">Cluster</p>
</td>
</tr>
<tr id="cce_10_0193__row417317553556"><td class="cellrowborder" valign="top" width="22%" headers="mcps1.3.7.3.2.5.1.1 "><p id="cce_10_0193__p617412554556">PodGroup</p>
</td>
<td class="cellrowborder" valign="top" width="28.000000000000004%" headers="mcps1.3.7.3.2.5.1.2 "><p id="cce_10_0193__p717425515511">scheduling.volcano.sh</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.7.3.2.5.1.3 "><p id="cce_10_0193__p1517435565518">v1beta1</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.7.3.2.5.1.4 "><p id="cce_10_0193__p17174135545510">Namespaced</p>
</td>
</tr>
<tr id="cce_10_0193__row2373210125620"><td class="cellrowborder" valign="top" width="22%" headers="mcps1.3.7.3.2.5.1.1 "><p id="cce_10_0193__p19373181012562">Queue</p>
</td>
<td class="cellrowborder" valign="top" width="28.000000000000004%" headers="mcps1.3.7.3.2.5.1.2 "><p id="cce_10_0193__p1373111013562">scheduling.volcano.sh</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.7.3.2.5.1.3 "><p id="cce_10_0193__p11373151016568">v1beta1</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.7.3.2.5.1.4 "><p id="cce_10_0193__p17373171035613">Cluster</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="cce_10_0193__section183121449435"><h4 class="sectiontitle">Release History</h4><div class="notice" id="cce_10_0193__note1531165814374"><span class="noticetitle"><img src="public_sys-resources/notice_3.0-en-us.png"> </span><div class="noticebody"><p id="cce_10_0193__p1496155911309">It is a good practice to upgrade Volcano to the latest version that is supported by the cluster.</p>
</div></div>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="cce_10_0193__table88489551792" frame="border" border="1" rules="all"><caption><b>Table 9 </b>Release history of the Volcano Scheduler add-on</caption><thead align="left"><tr id="cce_10_0193__en-us_topic_0000001609894173_row139251455994"><th align="left" class="cellrowborder" valign="top" width="15.21%" id="mcps1.3.8.3.2.4.1.1"><p id="cce_10_0193__en-us_topic_0000001609894173_p13601510205420">Add-on Version</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.37%" id="mcps1.3.8.3.2.4.1.2"><p id="cce_10_0193__en-us_topic_0000001609894173_p156011107542">Supported Cluster Version</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="51.42%" id="mcps1.3.8.3.2.4.1.3"><p id="cce_10_0193__en-us_topic_0000001609894173_p1160131045411">Updates</p>
</th>
</tr>
</thead>
<tbody><tr id="cce_10_0193__en-us_topic_0000001609894173_row193005482128"><td class="cellrowborder" valign="top" width="15.21%" headers="mcps1.3.8.3.2.4.1.1 "><p id="cce_10_0193__en-us_topic_0000001609894173_p1930044861212">1.21.2</p>
</td>
<td class="cellrowborder" valign="top" width="33.37%" headers="mcps1.3.8.3.2.4.1.2 "><p id="cce_10_0193__en-us_topic_0000001609894173_p1935419821314">v1.28</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p635419810134">v1.29</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p16354881134">v1.30</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p03549821310">v1.31</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p3354208121311">v1.32</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p183542820138">v1.33</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p535417811132">v1.34</p>
</td>
<td class="cellrowborder" valign="top" width="51.42%" headers="mcps1.3.8.3.2.4.1.3 "><ul id="cce_10_0193__en-us_topic_0000001609894173_ul168004226159"><li id="cce_10_0193__en-us_topic_0000001609894173_li16800102281511">Supported inference workload management.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li181410310155">Supported multi-dimensional parallelism and collaborative scheduling across multi-level network topologies for training tasks.</li></ul>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001609894173_row58451619174212"><td class="cellrowborder" valign="top" width="15.21%" headers="mcps1.3.8.3.2.4.1.1 "><p id="cce_10_0193__en-us_topic_0000001609894173_p18260162894210">1.19.6</p>
</td>
<td class="cellrowborder" valign="top" width="33.37%" headers="mcps1.3.8.3.2.4.1.2 "><p id="cce_10_0193__en-us_topic_0000001609894173_p18260152834219">v1.27</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p62601128184213">v1.28</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p192607283428">v1.29</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p526042817427">v1.30</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p17260182816423">v1.31</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p8260152810426">v1.32</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1426012819422">v1.33</p>
</td>
<td class="cellrowborder" valign="top" width="51.42%" headers="mcps1.3.8.3.2.4.1.3 "><ul id="cce_10_0193__en-us_topic_0000001609894173_ul5704550114210"><li id="cce_10_0193__en-us_topic_0000001609894173_li1270417507422">Added support for scalable scheduling of logical NPU pools.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li1704175010425">Added support for two-level group network topology-aware scheduling in training jobs.</li></ul>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001609894173_row17503815619"><td class="cellrowborder" valign="top" width="15.21%" headers="mcps1.3.8.3.2.4.1.1 "><p id="cce_10_0193__en-us_topic_0000001609894173_p55183815617">1.18.3</p>
</td>
<td class="cellrowborder" valign="top" width="33.37%" headers="mcps1.3.8.3.2.4.1.2 "><p id="cce_10_0193__en-us_topic_0000001609894173_p11973115315569">v1.27</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1697365335617">v1.28</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p597395395616">v1.29</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p897335375611">v1.30</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p197325314563">v1.31</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p99731536566">v1.32</p>
</td>
<td class="cellrowborder" valign="top" width="51.42%" headers="mcps1.3.8.3.2.4.1.3 "><ul id="cce_10_0193__en-us_topic_0000001609894173_ul19963296575"><li id="cce_10_0193__en-us_topic_0000001609894173_li16963189175717">Supported multi-xGPU preemption.</li></ul>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001609894173_row5570113616"><td class="cellrowborder" valign="top" width="15.21%" headers="mcps1.3.8.3.2.4.1.1 "><p id="cce_10_0193__en-us_topic_0000001609894173_p1432359363">1.16.17</p>
</td>
<td class="cellrowborder" valign="top" width="33.37%" headers="mcps1.3.8.3.2.4.1.2 "><p id="cce_10_0193__en-us_topic_0000001609894173_p11327543620">v1.25</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p183313563615">v1.27</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p13337533610">v1.28</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p3331555368">v1.29</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p13331655366">v1.30</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p3332553616">v1.31</p>
</td>
<td class="cellrowborder" valign="top" width="51.42%" headers="mcps1.3.8.3.2.4.1.3 "><p id="cce_10_0193__en-us_topic_0000001609894173_p13973151703611">Supported even scheduling across virtual GPUs.</p>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001609894173_row3439105011257"><td class="cellrowborder" valign="top" width="15.21%" headers="mcps1.3.8.3.2.4.1.1 "><p id="cce_10_0193__en-us_topic_0000001609894173_p6440155062519">1.15.11</p>
</td>
<td class="cellrowborder" valign="top" width="33.37%" headers="mcps1.3.8.3.2.4.1.2 "><p id="cce_10_0193__en-us_topic_0000001609894173_p1579410810262">v1.23</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p137945815265">v1.25</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p37941815268">v1.27</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p15794188142618">v1.28</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1579414819268">v1.29</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1189122042616">v1.30</p>
</td>
<td class="cellrowborder" valign="top" width="51.42%" headers="mcps1.3.8.3.2.4.1.3 "><p id="cce_10_0193__en-us_topic_0000001609894173_p6147153212265">Fixed some issues.</p>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001609894173_row577110132033"><td class="cellrowborder" valign="top" width="15.21%" headers="mcps1.3.8.3.2.4.1.1 "><p id="cce_10_0193__en-us_topic_0000001609894173_p163891031533">1.15.6</p>
</td>
<td class="cellrowborder" valign="top" width="33.37%" headers="mcps1.3.8.3.2.4.1.2 "><p id="cce_10_0193__en-us_topic_0000001609894173_p186131183317">v1.23</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p19613518234">v1.25</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p461391812316">v1.27</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1461315184319">v1.28</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p961310181037">v1.29</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p18613191815319">v1.30</p>
</td>
<td class="cellrowborder" valign="top" width="51.42%" headers="mcps1.3.8.3.2.4.1.3 "><p id="cce_10_0193__en-us_topic_0000001609894173_p13197144813319">Supported resource oversubscription based on pod profiling.</p>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001609894173_row787273535716"><td class="cellrowborder" valign="top" width="15.21%" headers="mcps1.3.8.3.2.4.1.1 "><p id="cce_10_0193__en-us_topic_0000001609894173_p1355213919578">1.13.3</p>
</td>
<td class="cellrowborder" valign="top" width="33.37%" headers="mcps1.3.8.3.2.4.1.2 "><p id="cce_10_0193__en-us_topic_0000001609894173_p1455223919578">v1.21</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p8553939115716">v1.23</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1855313914575">v1.25</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p17553139155714">v1.27</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1555323917572">v1.28</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p15553143914573">v1.29</p>
</td>
<td class="cellrowborder" valign="top" width="51.42%" headers="mcps1.3.8.3.2.4.1.3 "><ul id="cce_10_0193__en-us_topic_0000001609894173_ul66881155195718"><li id="cce_10_0193__en-us_topic_0000001609894173_li126886552578">Supported scale-in of custom resources based on node priorities.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li56881955165710">Optimized the association between preemption and node scale-out.</li></ul>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001609894173_row8615102613918"><td class="cellrowborder" valign="top" width="15.21%" headers="mcps1.3.8.3.2.4.1.1 "><p id="cce_10_0193__en-us_topic_0000001609894173_p98061347391">1.12.1</p>
</td>
<td class="cellrowborder" valign="top" width="33.37%" headers="mcps1.3.8.3.2.4.1.2 "><p id="cce_10_0193__en-us_topic_0000001609894173_p58061334183917">v1.19.16</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p08065347395">v1.21</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1806734103917">v1.23</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p20806634143913">v1.25</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1680693418392">v1.27</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p8806133419399">v1.28</p>
</td>
<td class="cellrowborder" valign="top" width="51.42%" headers="mcps1.3.8.3.2.4.1.3 "><p id="cce_10_0193__en-us_topic_0000001609894173_p26158265394">Optimized application auto scaling performance.</p>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001609894173_row64767043110"><td class="cellrowborder" valign="top" width="15.21%" headers="mcps1.3.8.3.2.4.1.1 "><p id="cce_10_0193__en-us_topic_0000001609894173_p1147715013310">1.11.21</p>
</td>
<td class="cellrowborder" valign="top" width="33.37%" headers="mcps1.3.8.3.2.4.1.2 "><p id="cce_10_0193__en-us_topic_0000001609894173_p67421188313">v1.19.16</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1774219814319">v1.21</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1274215813317">v1.23</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1674268133112">v1.25</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p474210820312">v1.27</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p17428863113">v1.28</p>
</td>
<td class="cellrowborder" valign="top" width="51.42%" headers="mcps1.3.8.3.2.4.1.3 "><ul id="cce_10_0193__en-us_topic_0000001609894173_ul176081014153116"><li id="cce_10_0193__en-us_topic_0000001609894173_li4608141443119">Supported Kubernetes v1.28.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li3608114193113">Supported load-aware scheduling.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li760821417312">Changed the image OS to <span id="cce_10_0193__en-us_topic_0000001609894173_ph14203823133013">HCE OS 2.0</span>.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li176081014123118">Optimized CSI resource preemption.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li166081714103120">Optimized load-aware rescheduling.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li1608141453118">Optimized preemption in hybrid deployment scenarios.</li></ul>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001609894173_row292137208"><td class="cellrowborder" valign="top" width="15.21%" headers="mcps1.3.8.3.2.4.1.1 "><p id="cce_10_0193__en-us_topic_0000001609894173_p82755117013">1.11.6</p>
</td>
<td class="cellrowborder" valign="top" width="33.37%" headers="mcps1.3.8.3.2.4.1.2 "><p id="cce_10_0193__en-us_topic_0000001609894173_p2276511904">v1.19.16</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p32761611902">v1.21</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1427651110020">v1.23</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p152761119019">v1.25</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p06631925908">v1.27</p>
</td>
<td class="cellrowborder" valign="top" width="51.42%" headers="mcps1.3.8.3.2.4.1.3 "><ul id="cce_10_0193__en-us_topic_0000001609894173_ul071414332003"><li id="cce_10_0193__en-us_topic_0000001609894173_li19714103315017">Supported Kubernetes v1.27.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li187141633201">Supported rescheduling.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li3715163318016">Supported affinity scheduling of nodes in node pools.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li107157331202">Optimized the scheduling performance.</li></ul>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001609894173_row546617333191"><td class="cellrowborder" valign="top" width="15.21%" headers="mcps1.3.8.3.2.4.1.1 "><p id="cce_10_0193__en-us_topic_0000001609894173_p246663341914">1.9.1</p>
</td>
<td class="cellrowborder" valign="top" width="33.37%" headers="mcps1.3.8.3.2.4.1.2 "><p id="cce_10_0193__en-us_topic_0000001609894173_p282274411918">v1.19.16</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1391061692212">v1.21</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p4910191692212">v1.23</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p091021662211">v1.25</p>
</td>
<td class="cellrowborder" valign="top" width="51.42%" headers="mcps1.3.8.3.2.4.1.3 "><ul id="cce_10_0193__en-us_topic_0000001609894173_ul72541248141916"><li id="cce_10_0193__en-us_topic_0000001609894173_li122546488191">Fixed the issue where the counting pipeline pod of the networkresource plugin occupies supplementary network interfaces.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li16254248191912">Fixed the issue where the binpack plugin scores nodes with insufficient resources.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li8254204861913">Fixed an issue in processing the resources of pods whose final status is unknown.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li2255448151914">Optimized event output.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li1525519484195">Supported HA deployment by default.</li></ul>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001609894173_row328716461974"><td class="cellrowborder" valign="top" width="15.21%" headers="mcps1.3.8.3.2.4.1.1 "><p id="cce_10_0193__en-us_topic_0000001609894173_p132879461770">1.7.1</p>
</td>
<td class="cellrowborder" valign="top" width="33.37%" headers="mcps1.3.8.3.2.4.1.2 "><p id="cce_10_0193__en-us_topic_0000001609894173_p68811572319">v1.19.16</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p28818515231">v1.21</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p18825132313">v1.23</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p888165152320">v1.25</p>
</td>
<td class="cellrowborder" valign="top" width="51.42%" headers="mcps1.3.8.3.2.4.1.3 "><p id="cce_10_0193__en-us_topic_0000001609894173_p13744731163113">Supported Kubernetes v1.25.</p>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001609894173_row340212477539"><td class="cellrowborder" valign="top" width="15.21%" headers="mcps1.3.8.3.2.4.1.1 "><p id="cce_10_0193__en-us_topic_0000001609894173_p2402947155311">1.4.5</p>
</td>
<td class="cellrowborder" valign="top" width="33.37%" headers="mcps1.3.8.3.2.4.1.2 "><p id="cce_10_0193__en-us_topic_0000001609894173_p1240224745315">v1.17</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p15763524122317">v1.19</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p8763132442320">v1.21</p>
</td>
<td class="cellrowborder" valign="top" width="51.42%" headers="mcps1.3.8.3.2.4.1.3 "><p id="cce_10_0193__en-us_topic_0000001609894173_p1423247151310">Changed the deployment mode of volcano-scheduler from StatefulSet to Deployment, and fixed the issue where pods could not be automatically migrated when a node became abnormal.</p>
</td>
</tr>
<tr id="cce_10_0193__en-us_topic_0000001609894173_row11369153312315"><td class="cellrowborder" valign="top" width="15.21%" headers="mcps1.3.8.3.2.4.1.1 "><p id="cce_10_0193__en-us_topic_0000001609894173_p1811415316237">1.3.7</p>
</td>
<td class="cellrowborder" valign="top" width="33.37%" headers="mcps1.3.8.3.2.4.1.2 "><p id="cce_10_0193__en-us_topic_0000001609894173_p1566303814231">v1.15</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p46638387233">v1.17</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p1366373832317">v1.19</p>
<p id="cce_10_0193__en-us_topic_0000001609894173_p2663338182315">v1.21</p>
</td>
<td class="cellrowborder" valign="top" width="51.42%" headers="mcps1.3.8.3.2.4.1.3 "><ul id="cce_10_0193__en-us_topic_0000001609894173_ul67351117245"><li id="cce_10_0193__en-us_topic_0000001609894173_li47318111245">Supported hybrid deployment of online and offline jobs and resource oversubscription.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li373611102419">Optimized the scheduling throughput for clusters.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li1573151114240">Fixed the issue where the scheduler panics in certain scenarios.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li673161112248">Fixed the issue where the <strong id="cce_10_0193__en-us_topic_0000001609894173_b59611444181419">volumes.secret</strong> verification of Volcano jobs fails in CCE clusters v1.15.</li><li id="cce_10_0193__en-us_topic_0000001609894173_li17314117246">Fixed the issue where jobs fail to be scheduled when volumes are mounted.</li></ul>
</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="cce_10_0907.html">Scheduling and Elasticity Add-ons</a></div>
</div>
</div>