doc-exports/docs/cce/umn/cce_10_0777.html
qiujiandong1 71d5c814e7 CCE UMN 20250311 version
Reviewed-by: Eotvos, Oliver <oliver.eotvos@t-systems.com>
Co-authored-by: qiujiandong1 <qiujiandong1@huawei.com>
Co-committed-by: qiujiandong1 <qiujiandong1@huawei.com>
2025-06-16 14:58:53 +00:00


<a name="cce_10_0777"></a>
<h1 class="topictitle1">DRF</h1>
<div id="body0000001719075297"><p id="cce_10_0777__p73914110206">Dominant Resource Fairness (DRF) is a scheduling algorithm based on the dominant resource of a container group. DRF scheduling can be used to enhance the service throughput of a cluster, shorten the overall service execution time, and improve service running performance. It is suitable for batch AI training and big data jobs.</p>
<div class="section" id="cce_10_0777__section66128486204"><h4 class="sectiontitle">Prerequisites</h4><ul id="cce_10_0777__ul336494616217"><li id="cce_10_0777__li11364346142116">A cluster of v1.19 or later is available. For details, see <a href="cce_10_0028.html">Creating a CCE Standard/Turbo Cluster</a>.</li><li id="cce_10_0777__li1536419463211">The Volcano add-on has been installed. For details, see <a href="cce_10_0193.html">Volcano Scheduler</a>.</li></ul>
</div>
<div class="section" id="cce_10_0777__section75462794511"><h4 class="sectiontitle">How It Works</h4><p id="cce_10_0777__p33913142018">In practice, limited cluster resources are often shared by multiple users. Each user has an equal right to resources, but the amount of resources each user needs may differ, so allocating resources fairly is crucial. A common scheduling algorithm is max-min fairness, which first satisfies the smallest demands as far as possible and then divides the remaining resources equally among the users whose demands are not yet met. The rules are as follows:</p>
<ol id="cce_10_0777__ol539181162015"><li id="cce_10_0777__li113915113208">Resources are allocated in order of increasing demand.</li><li id="cce_10_0777__li339131172014">No user gets a resource share larger than its demand.</li><li id="cce_10_0777__li143912122017">Users with unsatisfied demands get an equal share of the resource.</li></ol>
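<p>The three rules above can be illustrated with a minimal Python sketch for a single resource. It is not the Volcano implementation; the function name <code>max_min_fair_share</code> and its parameters are hypothetical and chosen only for illustration.</p>

```python
def max_min_fair_share(capacity, demands):
    """Split one resource among users by max-min fairness.

    capacity: total amount of the resource (hypothetical parameter).
    demands:  dict mapping a user name to the amount that user requests.
    """
    allocation = {user: 0.0 for user in demands}
    remaining = float(capacity)
    # Rule 1: serve users in order of increasing demand.
    pending = sorted(demands, key=demands.get)
    while pending:
        # Rule 3: users that are still unsatisfied split what is left equally.
        fair_share = remaining / len(pending)
        smallest = pending[0]
        if demands[smallest] <= fair_share:
            # Rule 2: a user never receives more than it asked for.
            allocation[smallest] = float(demands[smallest])
            remaining -= demands[smallest]
            pending.pop(0)
        else:
            # Every remaining demand exceeds the equal share, so all
            # remaining users are capped at that share.
            for user in pending:
                allocation[user] = fair_share
            break
    return allocation
```

<p>For example, with a capacity of 10 and demands of 2, 2.6, 4, and 5, the two smallest demands are met in full and the other two users each receive about 2.7.</p>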
</div>
<p id="cce_10_0777__p2391181172012">The max-min fairness algorithm applies to single-resource scenarios, where all jobs request the same type of resource. In practice, however, jobs usually request multiple types of resources, such as CPU, memory, and GPU. DRF addresses this case. It generalizes max-min fairness to multiple resource types by applying the max-min fairness requirement to the dominant resource of each user.</p>
<p id="cce_10_0777__p2073237619">The share value of each resource that a job requests is calculated using the following formula:</p>
<p id="cce_10_0777__p1718201616615"><strong id="cce_10_0777__b629299716">Share = Total amount of the resource requested by the job / Total amount of the resource in the cluster</strong></p>
<p id="cce_10_0777__p63420447713">If a job requests multiple types of resources, the resource with the largest share value is the job's dominant resource. Jobs are prioritized by the share value of their dominant resource: a job with a smaller dominant resource share is scheduled first.</p>
<p id="cce_10_0777__p6391181172010">For example, assume that there are two jobs, job 1 and job 2. The following figure shows the resources requested by the two jobs. After DRF calculation, the dominant resource of job 1 is memory, with a share value of 0.4, and the dominant resource of job 2 is CPU, with a share value of 0.5. Because the dominant resource share of job 1 is less than that of job 2, job 1 is scheduled before job 2 according to the max-min fairness policy.</p>
<div class="fignone" id="cce_10_0777__fig1617020329418"><span class="figcap"><b>Figure 1 </b>DRF scheduling</span><br><span><img class="eddx" id="cce_10_0777__image14171143213411" src="en-us_image_0000002253620425.png"></span></div>
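<p>The share calculation and the resulting job order can be sketched in Python as follows. The cluster capacity and the per-job requests are assumed values, picked only so that the dominant shares match the 0.4 and 0.5 in the example; they are not taken from the figure, and <code>dominant_share</code> is a hypothetical helper, not a Volcano API.</p>

```python
def dominant_share(requested, cluster):
    """Return (resource name, share) of a job's dominant resource.

    Share = amount requested by the job / amount available in the cluster.
    """
    shares = {name: requested.get(name, 0) / cluster[name] for name in cluster}
    dominant = max(shares, key=shares.get)
    return dominant, shares[dominant]


# Assumed values for illustration only (not taken from the figure).
cluster = {"cpu": 10, "memory": 20}        # 10 CPU cores, 20 GiB of memory
jobs = {
    "job1": {"cpu": 3, "memory": 8},       # CPU share 0.3, memory share 0.4
    "job2": {"cpu": 5, "memory": 6},       # CPU share 0.5, memory share 0.3
}

# A job with a smaller dominant resource share is scheduled first.
order = sorted(jobs, key=lambda name: dominant_share(jobs[name], cluster)[1])

for name in order:
    resource, share = dominant_share(jobs[name], cluster)
    print(f"{name}: dominant resource = {resource}, share = {share:.1f}")
```

<p>Under these assumed values, job 1 (memory, 0.4) is ordered before job 2 (CPU, 0.5), which matches the outcome described in the example.</p>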
<div class="section" id="cce_10_0777__section157122005014"><h4 class="sectiontitle">Configuring DRF</h4><p id="cce_10_0777__p18391121132011">After Volcano is installed, you can enable or disable DRF scheduling on the <strong id="cce_10_0777__b12960173134318">Scheduling</strong> page. This function is enabled by default.</p>
<ol id="cce_10_0777__ol082013495339"><li id="cce_10_0777__li118206499335"><span>Log in to the CCE console.</span></li><li id="cce_10_0777__li12624191243420"><span>Click the cluster name to access the cluster console. Choose <span class="uicontrol" id="cce_10_0777__uicontrol13363325534167"><b>Settings</b></span> in the navigation pane. In the right pane, click the <strong id="cce_10_0777__b11137530004167">Scheduling</strong> tab.</span></li><li id="cce_10_0777__li17388143951012"><span>In the <strong id="cce_10_0777__b14532969375332">AI task performance enhanced scheduling</strong> pane, select whether to enable DRF.</span><p><p id="cce_10_0777__p4153185621210">This function helps you enhance the service throughput of the cluster and improve service running performance.</p>
</p></li><li id="cce_10_0777__li612818546220"><span>Click <strong id="cce_10_0777__b85203635542042">Confirm</strong>.</span></li></ol>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="cce_10_0776.html">AI Performance-based Scheduling</a></div>
</div>
</div>