doc-exports/docs/mrs/umn/admin_guide_000409.html
yangtong c285e88a17 MRS UMN 20250806 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: yangtong <yangtong2@huawei.com>
Co-committed-by: yangtong <yangtong2@huawei.com>
2025-09-02 10:43:57 +00:00


<a name="admin_guide_000409"></a><a name="admin_guide_000409"></a>
<h1 class="topictitle1">Adding an SQL Inspection</h1>
<div id="body0000001971075810"><div class="section" id="admin_guide_000409__en-us_topic_0000001662442869_section893021811223"><h4 class="sectiontitle">Scenario</h4><p id="admin_guide_000409__en-us_topic_0000001662442869_p16912142613228">You can add rules for specified tenants and SQL engines on MRS Manager. The system will display hints on, intercept, or block SQL requests matched by the rules.</p>
<div class="note" id="admin_guide_000409__en-us_topic_0000001662442869_note10353738185310"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="admin_guide_000409__en-us_topic_0000001662442869_p535313813539">Exercise caution when you add, modify, or enable a SQL inspection rule for a cluster or set its threshold. An improper rule may interrupt upper-layer services.</p>
</div></div>
</div>
<div class="section" id="admin_guide_000409__en-us_topic_0000001662442869_section121644305262"><h4 class="sectiontitle">Adding a Rule</h4><ol id="admin_guide_000409__en-us_topic_0000001662442869_ol18912336114318"><li id="admin_guide_000409__en-us_topic_0000001662442869_li16912123616434"><span>Log in to MRS Manager as a user with Manager administrator rights.</span></li><li id="admin_guide_000409__en-us_topic_0000001662442869_li1276264713432"><span>Click <strong id="admin_guide_000409__en-us_topic_0000001662442869_b2246124117469">Cluster</strong> and choose <strong id="admin_guide_000409__en-us_topic_0000001662442869_b84737393477">SQL Inspector</strong>. The <strong id="admin_guide_000409__en-us_topic_0000001662442869_b15023214472">SQL Inspector</strong> page is displayed.</span><p><p id="admin_guide_000409__en-us_topic_0000001662442869_p06651098527">You can click <strong id="admin_guide_000409__en-us_topic_0000001662442869_b19538123114815">View Supported Rules</strong> to view all SQL inspection rules supported by the current cluster.</p>
</p></li><li id="admin_guide_000409__en-us_topic_0000001662442869_li1241714358493"><span>Click <strong id="admin_guide_000409__en-us_topic_0000001662442869_b329713151498">Add Rule</strong>. After the password of the current user is verified, the <strong id="admin_guide_000409__en-us_topic_0000001662442869_b693119287492">Add Rule</strong> page is displayed.</span></li><li id="admin_guide_000409__en-us_topic_0000001662442869_li10801174411448"><span>Set the required parameters and click <strong id="admin_guide_000409__en-us_topic_0000001662442869_b540233044912">OK</strong>.</span><p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="admin_guide_000409__en-us_topic_0000001662442869_table733222512455" frame="border" border="1" rules="all"><thead align="left"><tr id="admin_guide_000409__en-us_topic_0000001662442869_row63338258452"><th align="left" class="cellrowborder" valign="top" width="39.32%" id="mcps1.3.2.2.4.2.1.1.3.1.1"><p id="admin_guide_000409__en-us_topic_0000001662442869_p3333132544512">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="60.68%" id="mcps1.3.2.2.4.2.1.1.3.1.2"><p id="admin_guide_000409__en-us_topic_0000001662442869_p9334182514520">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="admin_guide_000409__en-us_topic_0000001662442869_row193346254451"><td class="cellrowborder" valign="top" width="39.32%" headers="mcps1.3.2.2.4.2.1.1.3.1.1 "><p id="admin_guide_000409__en-us_topic_0000001662442869_p63342255457">Name</p>
</td>
<td class="cellrowborder" valign="top" width="60.68%" headers="mcps1.3.2.2.4.2.1.1.3.1.2 "><p id="admin_guide_000409__en-us_topic_0000001662442869_p1533432515453">Name of a SQL inspection rule</p>
</td>
</tr>
<tr id="admin_guide_000409__en-us_topic_0000001662442869_row1533462534512"><td class="cellrowborder" valign="top" width="39.32%" headers="mcps1.3.2.2.4.2.1.1.3.1.1 "><p id="admin_guide_000409__en-us_topic_0000001662442869_p18334122564511">ID</p>
</td>
<td class="cellrowborder" valign="top" width="60.68%" headers="mcps1.3.2.2.4.2.1.1.3.1.2 "><p id="admin_guide_000409__en-us_topic_0000001662442869_p18963930154718">Rule ID</p>
<p id="admin_guide_000409__en-us_topic_0000001662442869_p1256064155315">For the meaning of the rule that corresponds to each ID, see <a href="#admin_guide_000409__table1960601265610">Table 1</a>.</p>
</td>
</tr>
<tr id="admin_guide_000409__en-us_topic_0000001662442869_row63341259451"><td class="cellrowborder" valign="top" width="39.32%" headers="mcps1.3.2.2.4.2.1.1.3.1.1 "><p id="admin_guide_000409__en-us_topic_0000001662442869_p2033452519453">Tenant</p>
</td>
<td class="cellrowborder" valign="top" width="60.68%" headers="mcps1.3.2.2.4.2.1.1.3.1.2 "><p id="admin_guide_000409__en-us_topic_0000001662442869_p1256112617484">Click <strong id="admin_guide_000409__en-us_topic_0000001662442869_b1603171185318">Add</strong> to select the name of the tenant with which the current rule will be associated.</p>
<p id="admin_guide_000409__en-us_topic_0000001662442869_p2239181806">If you need to add a new tenant, plan and create a cluster tenant by referring to <a href="admin_guide_000087.html">Tenant Resources</a>.</p>
</td>
</tr>
<tr id="admin_guide_000409__en-us_topic_0000001662442869_row11334102520456"><td class="cellrowborder" valign="top" width="39.32%" headers="mcps1.3.2.2.4.2.1.1.3.1.1 "><p id="admin_guide_000409__en-us_topic_0000001662442869_p33343255457">Services and Actions</p>
</td>
<td class="cellrowborder" valign="top" width="60.68%" headers="mcps1.3.2.2.4.2.1.1.3.1.2 "><p id="admin_guide_000409__en-us_topic_0000001662442869_p1742741313210">Click <strong id="admin_guide_000409__en-us_topic_0000001662442869_b7612134175417">Add</strong> to specify the SQL engine with which this rule will be associated and set the threshold parameters of the rule.</p>
<p id="admin_guide_000409__en-us_topic_0000001662442869_p8180423114913">Each rule can be associated with only one SQL engine. To apply the same check to another SQL engine, add a separate rule for that engine.</p>
<ul id="admin_guide_000409__en-us_topic_0000001662442869_ul10280205119496"><li id="admin_guide_000409__en-us_topic_0000001662442869_li8280451174918"><strong id="admin_guide_000409__en-us_topic_0000001662442869_b17670238194918">Service</strong>: Select the SQL engine associated with the current rule.</li><li id="admin_guide_000409__en-us_topic_0000001662442869_li1583934315617">If a SQL request meets the rule, the system performs one of the following operations:<ul id="admin_guide_000409__en-us_topic_0000001662442869_ul1445514501361"><li id="admin_guide_000409__en-us_topic_0000001662442869_li346531411529"><strong id="admin_guide_000409__en-us_topic_0000001662442869_b1512484534913">Hint</strong>: Record logs and display a hint for handling the SQL request. If the rule has parameters, you need to configure the threshold.</li><li id="admin_guide_000409__en-us_topic_0000001662442869_li7282112911526"><strong id="admin_guide_000409__en-us_topic_0000001662442869_b1178743145012">Intercept</strong>: Intercept the SQL request that meets the rule. If the rule has parameters, you need to configure the threshold.</li><li id="admin_guide_000409__en-us_topic_0000001662442869_li152341121768"><strong id="admin_guide_000409__en-us_topic_0000001662442869_b31001449185013">Block</strong>: Block the SQL request that meets the rule. If the rule has parameters, you need to configure the threshold.<div class="note" id="admin_guide_000409__en-us_topic_0000001662442869_note1934815319818"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="admin_guide_000409__en-us_topic_0000001662442869_p1597117131618">For static and dynamic interception rules, <strong id="admin_guide_000409__en-us_topic_0000001662442869_b12406144055417">Hint</strong> and <strong id="admin_guide_000409__en-us_topic_0000001662442869_b51511390545">Block</strong> operations are supported. 
For blocking rules, only the <strong id="admin_guide_000409__en-us_topic_0000001662442869_b488335175512">Block</strong> operation is supported.</p>
</div></div>
</li></ul>
</li></ul>
</td>
</tr>
</tbody>
</table>
</div>
</p></li><li id="admin_guide_000409__en-us_topic_0000001662442869_li175116711302"><span>View the added inspection rule on the <strong id="admin_guide_000409__en-us_topic_0000001662442869_b5642423466">SQL Inspector</strong> page. The rule takes effect dynamically.</span><p><p id="admin_guide_000409__en-us_topic_0000001662442869_p163618873019">To adjust the current rule, click <strong id="admin_guide_000409__en-us_topic_0000001662442869_b1634272834619">Modify</strong> in the <strong id="admin_guide_000409__en-us_topic_0000001662442869_b970983344615">Operation</strong> column of the row that contains the target rule. After the user password is verified, you can modify the rule parameters.</p>
<div class="fignone" id="admin_guide_000409__en-us_topic_0000001662442869_fig01314297165"><span class="figcap"><b>Figure 1 </b>Viewing SQL inspection rules</span><br><span><img id="admin_guide_000409__en-us_topic_0000001662442869_image479362481612" src="en-us_image_0000001971077930.png"></span></div>
</p></li></ol>
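<p>The way a threshold-based rule combines a tenant, a SQL engine, an action, and a threshold can be illustrated with a minimal sketch. It is not the MRS implementation: the <strong>Rule</strong> class, its field names, the <strong>inspect</strong> function, and the regular-expression check are all hypothetical, and the sample tenant name is made up. The sketch assumes a rule that counts <strong>count(distinct)</strong> functions and treats "exceeds the threshold" as strictly greater than.</p>

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """Hypothetical model of the Add Rule parameters (not an MRS API)."""
    name: str
    rule_id: str
    tenant: str
    service: str                      # SQL engine, one per rule
    action: str                       # "Hint", "Intercept", or "Block"
    threshold: Optional[int] = None   # only for rules that have parameters


# Simple pattern match; a real inspector would parse the SQL statement.
COUNT_DISTINCT = re.compile(r"count\s*\(\s*distinct\b", re.IGNORECASE)


def count_distinct_calls(sql: str) -> int:
    """Count occurrences of count(distinct ...) in a SQL string."""
    return len(COUNT_DISTINCT.findall(sql))


def inspect(rule: Rule, tenant: str, service: str, sql: str) -> Optional[str]:
    """Return the rule's action if the request matches the rule, else None."""
    if tenant != rule.tenant or service != rule.service:
        return None  # rule is bound to one tenant and one SQL engine
    if rule.threshold is None:
        raise ValueError("this sketch only handles threshold-based rules")
    if count_distinct_calls(sql) > rule.threshold:
        return rule.action
    return None


if __name__ == "__main__":
    rule = Rule("limit-count-distinct", "static_0001", "tenant_a", "Hive",
                "Hint", threshold=1)
    sql = ("SELECT COUNT(DISTINCT deviceId), COUNT(DISTINCT collDeviceId) "
           "FROM t GROUP BY deviceName")
    print(inspect(rule, "tenant_a", "Hive", sql))
```

<p>In this sketch, a request from another tenant or another SQL engine is never matched, which mirrors the requirement to add a separate rule for each engine.</p>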
</div>
<div class="section" id="admin_guide_000409__en-us_topic_0000001662442869_section19510043143814"><a name="admin_guide_000409__en-us_topic_0000001662442869_section19510043143814"></a><a name="en-us_topic_0000001662442869_section19510043143814"></a><h4 class="sectiontitle">MRS SQL Inspection Rules</h4>
<div class="tablenoborder"><a name="admin_guide_000409__table1960601265610"></a><a name="table1960601265610"></a><table cellpadding="4" cellspacing="0" summary="" id="admin_guide_000409__table1960601265610" frame="border" border="1" rules="all"><caption><b>Table 1 </b>MRS SQL inspection rules</caption><thead align="left"><tr id="admin_guide_000409__row7879161265614"><th align="left" class="cellrowborder" valign="top" width="12.5%" id="mcps1.3.3.2.2.7.1.1"><p id="admin_guide_000409__p5879141214560">ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.5%" id="mcps1.3.3.2.2.7.1.2"><p id="admin_guide_000409__p5879191211568">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.5%" id="mcps1.3.3.2.2.7.1.3"><p id="admin_guide_000409__p16879141217569">Engine</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.5%" id="mcps1.3.3.2.2.7.1.4"><p id="admin_guide_000409__p19879181210567">Threshold</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.3.2.2.7.1.5"><p id="admin_guide_000409__p417615384374">Example SQL Statement</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.3.2.2.7.1.6"><p id="admin_guide_000409__p1373734617111">Impact</p>
</th>
</tr>
</thead>
<tbody><tr id="admin_guide_000409__row1087971210565"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p28791012125612">static_0001</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p1187921215561">Check whether the number of the <strong id="admin_guide_000409__b99205217393">count(distinct)</strong> functions used in a SQL statement exceeds the preconfigured threshold.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><ul id="admin_guide_000409__ul837312815586"><li id="admin_guide_000409__li19373158145815">Hive</li><li id="admin_guide_000409__li437312810583">Spark</li><li id="admin_guide_000409__li3374138195817">HetuEngine</li><li id="admin_guide_000409__li155031134182919">Doris</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p12881112115614">Number of the <strong id="admin_guide_000409__b01702246391">count(distinct)</strong> functions</p>
<p id="admin_guide_000409__p1372034318410">Recommended value: <strong id="admin_guide_000409__b20644126566">10</strong></p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p19301235445">SELECT COUNT(DISTINCT deviceId), COUNT(DISTINCT collDeviceId)</p>
<p id="admin_guide_000409__p14480152711440">FROM table</p>
<p id="admin_guide_000409__p10177143817372">GROUP BY deviceName, collDeviceName, collCurrentVersion;</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p6737184612117">The <strong id="admin_guide_000409__b3109546437">select count(distinct)</strong> syntax generates only one Reduce task. When a large table is processed, the volume of data to shuffle is large and execution is slow. If a statement contains multiple <strong id="admin_guide_000409__b134285812426">count(distinct)</strong> functions, each input record is expanded into multiple records for shuffling, which increases the shuffle volume and slows down job execution.</p>
</td>
</tr>
<tr id="admin_guide_000409__row2881151225614"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p18881131220565">static_0002</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p588118125564">Check whether <strong id="admin_guide_000409__b537931783914">not in &lt;subquery&gt;</strong> is used in a SQL statement.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><ul id="admin_guide_000409__ul797413110367"><li id="admin_guide_000409__li1397411143615">Hive</li><li id="admin_guide_000409__li797418117367">Spark</li><li id="admin_guide_000409__li2974114367">HetuEngine</li><li id="admin_guide_000409__li19756184613293">Doris</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p72966451574">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p7415184414443">SELECT *</p>
<p id="admin_guide_000409__p10884104618443">FROM Orders o</p>
<p id="admin_guide_000409__p1726495611447">WHERE o.Order_ID not in (Select Order_ID</p>
<p id="admin_guide_000409__p17273902452">FROM HeldOrders h</p>
<p id="admin_guide_000409__p121771438133716">where h.order_id = o.order_id);</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p1773714621119">The <strong id="admin_guide_000409__b357117484416">not in</strong> subquery performs poorly. If the values listed in the <strong id="admin_guide_000409__b1615072684418">not in</strong> clause contain <strong id="admin_guide_000409__b18419027144418">null</strong>, the query returns no data.</p>
</td>
</tr>
<tr id="admin_guide_000409__row2881141275620"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p6882512185619">static_0003</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p88829128568">Check whether the number of joins in a SQL statement exceeds the threshold.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><ul id="admin_guide_000409__ul145610233614"><li id="admin_guide_000409__li4561426360">Hive</li><li id="admin_guide_000409__li8562213611">Spark</li><li id="admin_guide_000409__li4561223614">HetuEngine</li><li id="admin_guide_000409__li1268085172919">Doris</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p688219129561">Number of joins</p>
<p id="admin_guide_000409__p186864574474">Recommended value: <strong id="admin_guide_000409__b76357140719">20</strong></p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p19582183555215">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p873794619113">The more tables are joined, the more files, partitions, and data are scanned. As a result, the SQL statement occupies too much memory, affecting cluster stability.</p>
</td>
</tr>
<tr id="admin_guide_000409__row28821127562"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p388271285613">static_0004</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p1188214127565">Check whether the number of the <strong id="admin_guide_000409__b12499163394020">union all</strong> operators in a SQL statement exceeds the threshold.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><ul id="admin_guide_000409__ul96292133619"><li id="admin_guide_000409__li116215214361">Hive</li><li id="admin_guide_000409__li16282103619">Spark</li><li id="admin_guide_000409__li8624213613">HetuEngine</li><li id="admin_guide_000409__li15599115719297">Doris</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p11882912105618">Number of the <strong id="admin_guide_000409__b96361310204111">union all</strong> operators in a statement</p>
<p id="admin_guide_000409__p1513815124815">Recommended value: <strong id="admin_guide_000409__b758795713719">20</strong></p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p112256955517">select * from tables t1</p>
<p id="admin_guide_000409__p4225296552">union all select * from tables t2</p>
<p id="admin_guide_000409__p18225129125515">union all select * from tables t3</p>
<p id="admin_guide_000409__p5225199195518">union all select * from tables t4</p>
<p id="admin_guide_000409__p1822549135511">union all select * from tables t5</p>
<p id="admin_guide_000409__p1922514995518">union all select * from tables t6</p>
<p id="admin_guide_000409__p1122513965516">union all select * from tables t7</p>
<p id="admin_guide_000409__p102255945516">union all select * from tables t8</p>
<p id="admin_guide_000409__p162253945512">union all select * from tables t9;</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p187371646101114">A large number of <strong id="admin_guide_000409__b194749174411">union all</strong> operations may generate very large result sets. As a result, a large amount of HDFS and YARN resources is occupied during shuffling.</p>
</td>
</tr>
<tr id="admin_guide_000409__row1988281295617"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p1488211215610">static_0005</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p6882151214564">Check whether the number of nested subqueries exceeds the threshold.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><ul id="admin_guide_000409__ul6698243614"><li id="admin_guide_000409__li146918203616">Hive</li><li id="admin_guide_000409__li166972113611">Spark</li><li id="admin_guide_000409__li96911253610">HetuEngine</li><li id="admin_guide_000409__li9166201193014">Doris</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p1088201225615">Number of nested subqueries</p>
<p id="admin_guide_000409__p18428121619495">Recommended value: <strong id="admin_guide_000409__b1734473417813">20</strong></p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p16891855165810">select * from (</p>
<p id="admin_guide_000409__p154113245916">with temp1 as (select * from tables)</p>
<p id="admin_guide_000409__p51771238143713">select * from temp1);</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p273754615115">If a SQL statement has too many nesting layers, temporary data is generated multiple times, and the statement is difficult to maintain and modify. Avoid deeply nested queries to improve execution efficiency and SQL maintainability.</p>
</td>
</tr>
<tr id="admin_guide_000409__row28821712165617"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p12882181255618">static_0006</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p1883121215614">Check whether the length of a SQL statement exceeds the threshold.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><ul id="admin_guide_000409__ul1375326368"><li id="admin_guide_000409__li167518210367">Hive</li><li id="admin_guide_000409__li167511218361">Spark</li><li id="admin_guide_000409__li1756293620">HetuEngine</li><li id="admin_guide_000409__li22842717308">Doris</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p988361235616">Length of the SQL statement, in KB</p>
<p id="admin_guide_000409__p593512074917">Recommended value: <strong id="admin_guide_000409__b5521421292">10</strong></p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p9177738143712">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p177371466112">If a SQL string is too long, the SQL statement may be too complex, which may cause memory and performance problems. In addition, the SQL statement is difficult to maintain.</p>
</td>
</tr>
<tr id="admin_guide_000409__row1488301210561"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p1288391265611">static_0007</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p788321218568">Check whether a Cartesian product is generated when multiple tables are joined.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><ul id="admin_guide_000409__ul15208101315360"><li id="admin_guide_000409__li1020812138366">Hive</li><li id="admin_guide_000409__li2208131303615">Spark</li><li id="admin_guide_000409__li1320891323619">HetuEngine</li><li id="admin_guide_000409__li175107916309">Doris</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p163026216113">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p1817773813374">select * from A,B;</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p373874681116">The Cartesian product causes data expansion. When a task is running, a large amount of HDFS space and YARN resources may be occupied, affecting the execution of other tasks.</p>
</td>
</tr>
<tr id="admin_guide_000409__row188831612135617"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p198831812175614">static_0008</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p738919204392">Check whether <strong id="admin_guide_000409__b1138330104314">alter table update</strong> is performed at the cluster level (<strong id="admin_guide_000409__b19402643144312">on cluster</strong>).</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><p id="admin_guide_000409__p147491220103618">ClickHouse</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p83111821413">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p9661162914398">alter table testtb1 on cluster default_cluster update price=10.0 where id='100'</p>
</td>
<td class="cellrowborder" rowspan="2" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p15897151711300">Updating and deleting data consume a large amount of CPU and memory resources. Cluster-level operations put heavy pressure on the database. As a result, tasks on some nodes may fail, time out, or stop responding for a long time, affecting cluster stability.</p>
</td>
</tr>
<tr id="admin_guide_000409__row118841120561"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p20884812115610">static_0009</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p1599116713710">Check whether <strong id="admin_guide_000409__b195091532174111">alter table delete</strong> is performed at the cluster level (<strong id="admin_guide_000409__b250943212416">on cluster</strong>).</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.2.7.1.3 "><p id="admin_guide_000409__p6198102410366">ClickHouse</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p153151226114">N/A</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p186551153143916">alter table testtb1 on cluster default_cluster delete where id ='10'</p>
</td>
</tr>
<tr id="admin_guide_000409__row2088417126564"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p1588413127563">static_0010</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p1136164112719">Check whether <strong id="admin_guide_000409__b78491812144414">alter table add column</strong> is performed at the cluster level (<strong id="admin_guide_000409__b1924231714419">on cluster</strong>).</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><p id="admin_guide_000409__p721192418366">ClickHouse</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p13181421513">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p6166131684017">alter table testtb1 on cluster default_cluster add column testc String</p>
</td>
<td class="cellrowborder" rowspan="2" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p5738194661114">Adding and deleting columns consume a large amount of CPU and memory resources. Cluster-level operations put heavy pressure on the database. As a result, tasks on some nodes may fail, time out, or stop responding for a long time, causing metadata inconsistency and affecting cluster stability.</p>
</td>
</tr>
<tr id="admin_guide_000409__row9885012135618"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p20885121245610">static_0011</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p198853126562">Check whether <strong id="admin_guide_000409__b1725710259445">alter table drop column</strong> is performed at the cluster level (<strong id="admin_guide_000409__b18980123384414">on cluster</strong>).</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.2.7.1.3 "><p id="admin_guide_000409__p1721615240361">ClickHouse</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p13321625115">N/A</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p19456344174013">alter table testtb1 on cluster default_cluster drop column testc</p>
</td>
</tr>
<tr id="admin_guide_000409__row148856124560"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p3885191225613">static_0012</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p45042585409">Check whether <strong id="admin_guide_000409__b15841194113444">optimize final</strong> is performed at the cluster level (<strong id="admin_guide_000409__b43394713444">on cluster</strong>).</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><p id="admin_guide_000409__p52226246364">ClickHouse</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p103252217116">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p11631481419">optimize table testtb1 on cluster default_cluster final</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p97386465118">Manually merging data parts consumes a large amount of CPU, memory, and disk I/O resources when the table data volume is large. Cluster-level operations put heavy pressure on the database. As a result, tasks on some nodes may fail, time out, or stop responding for a long time, affecting cluster stability.</p>
</td>
</tr>
<tr id="admin_guide_000409__row108851312115613"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p4885812155616">static_0013</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p988521265616">Check whether <strong id="admin_guide_000409__b1520415556443">drop</strong> is performed at the cluster level (<strong id="admin_guide_000409__b419813514457">on cluster</strong>).</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><p id="admin_guide_000409__p62281524183617">ClickHouse</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p8330723111">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p966511597103">drop table/database test on cluster default_cluster;</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p17738146131120">Dropping tables consumes a large amount of CPU, memory, and disk I/O resources when the metadata volume and data volume are large. Cluster-level operations put heavy pressure on the database. As a result, tasks on some nodes may fail, time out, or stop responding for a long time, affecting cluster stability.</p>
</td>
</tr>
<tr id="admin_guide_000409__row38866124569"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p6886141215612">static_0014</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p388613122561">Check whether <strong id="admin_guide_000409__b193907142450">truncate table</strong> is performed at the cluster level (<strong id="admin_guide_000409__b0868219134514">on cluster</strong>).</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><p id="admin_guide_000409__p72333246368">ClickHouse</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p16334629116">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p12653445411">truncate table testtb1 on cluster default_cluster;</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p1473894617114">Deleting table data consumes a large amount of CPU, memory, and disk I/O resources when the table data volume is large. Cluster-level operations put heavy pressure on the database. As a result, tasks on some nodes may fail, time out, or stop responding for a long time, affecting cluster stability.</p>
</td>
</tr>
<tr id="admin_guide_000409__row11886121225616"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p2886012105610">dynamic_0001</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p888631275611">Check whether the number of scanned files exceeds the threshold.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><ul id="admin_guide_000409__ul9263101020253"><li id="admin_guide_000409__li8263210202519">Hive</li><li id="admin_guide_000409__li826310101257">Spark</li><li id="admin_guide_000409__li132630102254">HetuEngine</li><li id="admin_guide_000409__li1123911303301">Doris</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p178867123562">Number of files that will be scanned or have been scanned</p>
<p id="admin_guide_000409__p12757152813492">Recommended value: <strong id="admin_guide_000409__b2449195319164">100,000</strong></p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p149615513449">SELECT ss_ticket_number FROM store_sales WHERE ss_ticket_number=72291252 LIMIT 10;</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p9738546201111">Scanning a large number of files with a SQL statement can generate a large number of slices, overloading HiveServer memory and potentially causing the instance to crash. This can also consume a significant amount of cluster resources, delaying other tasks.</p>
</td>
</tr>
<tr id="admin_guide_000409__row3886121219566"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p158862121567">dynamic_0002</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p1886151217560">Check whether the number of partitions involved in a table operation (select, delete, update, or alter) exceeds the threshold.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><ul id="admin_guide_000409__ul395973615368"><li id="admin_guide_000409__li295911363364">Hive</li><li id="admin_guide_000409__li095963663612">Spark</li><li id="admin_guide_000409__li595973633612">HetuEngine</li><li id="admin_guide_000409__li3211124017264">ClickHouse</li><li id="admin_guide_000409__li6129532143017">Doris</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p8886151225613">Number of partitions involved in the table operation (select, delete, update, or alter)</p>
<p id="admin_guide_000409__p26251437124919">Recommended value: <strong id="admin_guide_000409__b14581612193">10,000</strong></p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p3562453114412">DELETE FROM table_name WHERE column_name = value</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p1073894610111">Scanning too many partitions can overload the database, causing it to run slowly and consume excessive memory in HiveServer and MetaStore. This can lead to frequent GC, disrupting other tasks and potentially causing the instance to restart unexpectedly.</p>
</td>
</tr>
<tr id="admin_guide_000409__row1188613129561"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p8886201210567">dynamic_0003</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p1088771225616">When the right table of a join is a distributed table, check whether the data volume of the right table exceeds the threshold.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><p id="admin_guide_000409__p18475113816366">ClickHouse</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p1888761218562">Number of rows in the right table of a join</p>
<p id="admin_guide_000409__p15736848184919">Recommended value: <strong id="admin_guide_000409__b1710012011247">100,000,000</strong></p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p167093419452">SELECT name, text FROM table_1 JOIN table_2 ON table_1.Id = table_2.Id</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p18738124612118">Large data volumes in the right table can cause the join operation to consume excessive memory, potentially leading to memory insufficiency, service failure, and cluster instability.</p>
</td>
</tr>
<tr id="admin_guide_000409__row1376814247586"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p548303185820">dynamic_0004</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p1977116245586">Check whether a SQL statement overwrites the same table that it reads data from.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><ul id="admin_guide_000409__ul1866451116019"><li id="admin_guide_000409__li66643112011">Hive</li><li id="admin_guide_000409__li66647114016">Spark</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p748613360011">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p14566195217011">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p13772192405814">Such SQL statements may cause data loss or inconsistency.</p>
</td>
</tr>
<tr id="admin_guide_000409__row158879120562"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p4887151215617">running_0001</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p1388715129568">Check whether the number of rows returned by a <strong id="admin_guide_000409__b15170191044915">Select</strong> statement to the client exceeds the threshold.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><ul id="admin_guide_000409__ul165713652519"><li id="admin_guide_000409__li557117618256">Hive</li><li id="admin_guide_000409__li69466492613">Spark</li><li id="admin_guide_000409__li35729614253">HetuEngine</li><li id="admin_guide_000409__li65721765256">ClickHouse</li><li id="admin_guide_000409__li1784125710300">Doris</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p1588720127560">Number of rows in the query result</p>
<p id="admin_guide_000409__p1558175184918">Recommended value: <strong id="admin_guide_000409__b25686595243">100,000</strong></p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p11177153893710">select * from table</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p1273814651113">Large query results can overload server memory, leading to OOM exceptions and instability. Returning excessive results also reduces query efficiency.</p>
</td>
</tr>
<tr id="admin_guide_000409__row8887161265612"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p13293672918">running_0002</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p198871712145611">Check whether the peak memory usage of a SQL statement exceeds the threshold (absolute value).</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><ul id="admin_guide_000409__ul964125215360"><li id="admin_guide_000409__li56414523368">Hive</li><li id="admin_guide_000409__li0645522361">Spark</li><li id="admin_guide_000409__li11641652193611">HetuEngine</li><li id="admin_guide_000409__li1364145233617">ClickHouse</li><li id="admin_guide_000409__li1458045953014">Doris</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p3887141295619">Memory occupied by a SQL statement during runtime, in MB</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p1217773815379">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p17381646151110">Long-running tasks consume cluster resources, slowing down other tasks and overall performance.</p>
</td>
</tr>
<tr id="admin_guide_000409__row188885123562"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p1988718122561">running_0003</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p48881123563">Check whether the running duration of a SQL statement exceeds the threshold.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><ul id="admin_guide_000409__ul468135213611"><li id="admin_guide_000409__li468165233614">Hive</li><li id="admin_guide_000409__li1068195243617">Spark</li><li id="admin_guide_000409__li106815523361">HetuEngine</li><li id="admin_guide_000409__li16805263620">ClickHouse</li><li id="admin_guide_000409__li1787414123116">Doris</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p08881512185612">Running duration of a SQL statement, in seconds</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p017733817378">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p1738144619118">Long-running tasks can monopolize cluster resources, reducing overall utilization. They also generate a significant amount of intermediate data. To avoid delays, tasks that fail to meet expectations should be adjusted promptly.</p>
</td>
</tr>
<tr id="admin_guide_000409__row288881214560"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p1588841215613">running_0004</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p138881212105616">Check whether the size of data scanned by a SQL statement exceeds the threshold.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><ul id="admin_guide_000409__ul147255216362"><li id="admin_guide_000409__li17235253616">Hive</li><li id="admin_guide_000409__li1772452193616">Spark</li><li id="admin_guide_000409__li14721352173614">HetuEngine</li><li id="admin_guide_000409__li1672195212361">ClickHouse</li></ul>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p198891212195612">Data scanned by a SQL statement, in GB</p>
<p id="admin_guide_000409__p13159182712506">Recommended value: <strong id="admin_guide_000409__b1165543011228">10,240</strong></p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p12177123823711">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p10738134671115">Large datasets can consume significant memory resources, impacting other tasks' performance. Intermediate data can also fill disk space, compromising cluster stability.</p>
</td>
</tr>
<tr id="admin_guide_000409__row1843037122318"><td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.1 "><p id="admin_guide_000409__p04314782313">running_0005</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.2 "><p id="admin_guide_000409__p743137132311">Check whether the amount of shuffle data that has been written by a SQL statement exceeds the threshold.</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.3 "><p id="admin_guide_000409__p465815364309">Spark</p>
</td>
<td class="cellrowborder" valign="top" width="12.5%" headers="mcps1.3.3.2.2.7.1.4 "><p id="admin_guide_000409__p176820587273">Amount of shuffle data written by a SQL statement, in GB</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.5 "><p id="admin_guide_000409__p743177172311">N/A</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.3.2.2.7.1.6 "><p id="admin_guide_000409__p073894611115">When executing SQL statements with operators like join and aggregation, a significant amount of data is shuffled, leading to high disk usage and potentially causing disk space exhaustion, which can compromise cluster stability.</p>
</td>
</tr>
</tbody>
</table>
</div>
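<p>The table above gives no example for rule dynamic_0004. As an illustrative sketch only, a Hive or Spark statement of the following shape reads from and overwrites the same table. The table names <strong>orders</strong> and <strong>orders_staging</strong> and the column <strong>order_date</strong> are hypothetical and are not taken from the rule definition.</p>
<pre class="screen">-- Hypothetical pattern: orders is both the source and the target of the overwrite.
INSERT OVERWRITE TABLE orders
SELECT * FROM orders WHERE order_date &gt;= '2025-01-01';

-- One possible alternative, assuming a staging table can be created:
-- write the filtered rows to a separate table, then overwrite from it.
CREATE TABLE orders_staging AS
SELECT * FROM orders WHERE order_date &gt;= '2025-01-01';

INSERT OVERWRITE TABLE orders
SELECT * FROM orders_staging;</pre>
<p>Splitting the work into two statements means that no single statement both reads and overwrites <strong>orders</strong>. Whether this fits a given workload depends on the available storage and on how the job is scheduled.</p>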
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="admin_guide_000407.html">SQL Inspector</a></div>
</div>
</div>