doc-exports/docs/dws/dev/dws_04_0459.html
luhuayi 177cd61a57 DWS DEVG 910.211 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: luhuayi <luhuayi@huawei.com>
Co-committed-by: luhuayi <luhuayi@huawei.com>
2025-05-05 07:44:03 +00:00


<a name="EN-US_TOPIC_0000002080670190"></a>
<h1 class="topictitle1">Stream Operation Hints</h1>
<div id="body8662426"><div class="section" id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_section290819468377"><h4 class="sectiontitle">Function</h4><p id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_p6952105124120">Specifies the stream method, which can be broadcast or redistribute, or specifies the distribution key for <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b1371453865616">Agg</strong> redistribution.</p>
<div class="note" id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_note1490818191354"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_p139081919751">The hint that specifies the distribution columns for Agg redistribution is supported only by clusters of version 8.1.3.100 or later.</p>
</div></div>
</div>
<div class="section" id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_section17783121184117"><h4 class="sectiontitle">Syntax</h4><div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_screen682272534119"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><span class="p">[</span><span class="k">no</span><span class="p">]</span><span class="w"> </span><span class="n">broadcast</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">redistribute</span><span class="p">([</span><span class="o">@</span><span class="n">block_name</span><span class="p">]</span><span class="w"> </span><span class="n">table_list</span><span class="p">)</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">redistribute</span><span class="w"> </span><span class="p">([</span><span class="o">@</span><span class="n">block_name</span><span class="p">]</span><span class="w"> </span><span class="p">(</span><span class="o">*</span><span class="p">)</span><span class="w"> </span><span class="p">(</span><span class="n">columns</span><span class="p">))</span>
</pre></div></td></tr></table></div>
</div>
</div>
<div class="section" id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_section3914251214383"><h4 class="sectiontitle">Parameter Description</h4><ul id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_ul14347162691914"><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li143471926121911"><strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b227216306198">no</strong> indicates that the hinted stream method is not used. When the hint is specified for the distribution columns in the <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b73711148415">Agg</strong> redistribution, <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b23628288412">no</strong> is invalid.</li><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li128261809226"><em id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_en-us_topic_0000001510402197_i172885331829">block_name</em> indicates the block name of the statement block. For details, see <a href="dws_04_0456.html#EN-US_TOPIC_0000002080515442__en-us_topic_0000001460722632_li99021444551">block_name</a>.</li><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li138266017229"><em id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_i25311812172811">table_list</em> specifies the tables to be joined. 
For details, see <a href="dws_04_0457.html#EN-US_TOPIC_0000002116194489__en-us_topic_0000001510402197_section35948678143011">Parameter Description</a>.</li><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li58269012215">When the hint specifies distribution columns, the asterisk (*) is a fixed literal and a table name cannot be specified in its place.</li><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li17826506225"><strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b876423114010">columns</strong> specifies one or more columns in the <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b197641131194014">GROUP BY</strong> clause. When there is no <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b147656313409">GROUP BY</strong> clause, it can specify columns in the <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b1176514319403">DISTINCT</strong> clause.<div class="note" id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_note133365763113"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_ul3357282517550"><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li68881926161517">Each distribution column must be specified by its column sequence number or column name in <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b98191337174011">group by</strong> or <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b13820133754019">distinct</strong>. 
The columns in <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b13820837124016">count(distinct)</strong> can only be specified using column names.</li><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li959018694111">For a multi-layer query, you can specify the distribution column hint at each layer. The hint takes effect only at the corresponding layer.</li><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li16824237154319">The column specified in <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b66043551225">count(distinct)</strong> takes effect only for two-level hashagg plans. Otherwise, the specified distribution column is invalid.</li><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li189951539204017">If the optimizer finds that redistribution is not required after estimation, the specified distribution column is invalid.</li></ul>
</div></div>
</li></ul>
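<p>The multi-layer and <strong>count(distinct)</strong> rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not verified output: it assumes the table <strong>t1</strong> with columns <strong>a1</strong>, <strong>b1</strong>, <strong>c1</strong>, and <strong>d1</strong> that appears in the examples of this topic, and the subquery alias <strong>s</strong> and column alias <strong>cnt</strong> are hypothetical. Whether each hint takes effect depends on the plan that the optimizer generates, so check the <strong>explain</strong> output for warnings.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- Assumption: t1(a1, b1, c1, d1) as in the examples of this topic.
-- Outer layer: distribute on the first GROUP BY column (a1), given by sequence number.
-- Inner layer: distribute on b1, given by column name.
explain (verbose on, costs off, nodes off)
select /*+ redistribute ((*) (1)) */ a1, sum(cnt)
from (
    select /*+ redistribute ((*) (b1)) */ a1, b1, count(c1) as cnt
    from t1
    group by a1, b1
) s
group by a1;

-- count(distinct): the column is given by name, and the hint is expected
-- to apply only when the plan uses two-level hashagg.
explain (verbose on, costs off, nodes off)
select /*+ redistribute ((*) (c1)) */ a1, count(distinct c1)
from t1
group by a1;
</pre></div></div>
<p>Each hint is written in the query layer whose aggregation it targets, which matches the rule that a hint takes effect only at its own layer.</p>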
</div>
<div class="section" id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_section99281150122819"><h4 class="sectiontitle">Tips</h4><ul id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_ul19827192311618"><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li208275234618">Generally, the optimizer uses statistics to select a group of distribution keys without data skew for redistribution. If the keys it selects by default are skewed, you can manually specify the distribution columns to avoid data skew.</li><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li11250352919">When selecting a distribution key, choose a group of columns with a large number of distinct values, based on the data distribution. In this way, data can be evenly distributed to each DN after redistribution.</li><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li463512618717">After writing hints, you can run <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b17472185774611">explain verbose</strong> to print the execution plan and check whether the specified distribution key is valid. If the specified distribution key is invalid, a warning is displayed.</li></ul>
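<p>One way to apply these tips is to inspect the candidate columns before writing the hint. The queries below are a sketch in standard SQL, assuming the table <strong>t1</strong> with columns <strong>a1</strong>, <strong>b1</strong>, and <strong>c1</strong> from the examples of this topic. The aliases and the choice of (<strong>b1</strong>, <strong>c1</strong>) as the candidate key are illustrative only.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- Count the distinct values of each candidate column.
-- Columns with more distinct values tend to spread rows more evenly.
select count(distinct a1) as a1_ndv,
       count(distinct b1) as b1_ndv,
       count(distinct c1) as c1_ndv
from t1;

-- List the largest groups for the candidate key (b1, c1).
-- A few very large groups suggest that the key is skewed.
select b1, c1, count(*) as row_cnt
from t1
group by b1, c1
order by row_cnt desc
limit 10;
</pre></div></div>
<p>If the largest groups are of similar size, the candidate columns are a reasonable choice for the hint. The plan printed by <strong>explain verbose</strong> then shows whether the hint was accepted.</p>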
</div>
<div class="section" id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_section1127715590585"><h4 class="sectiontitle">Example</h4><ul id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_ul1188819236385"><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li20889723173819">Hint the query plan in <a href="dws_04_0455.html#EN-US_TOPIC_0000002053159594__en-us_topic_0000001658028034_section671421102912">Examples</a> as follows:<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_screen86017464385"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">explain</span>
<span class="k">select</span><span class="w"> </span><span class="cm">/*+ no redistribute(store_sales store_returns item store) leading(((store_sales store_returns item store) customer)) */</span><span class="w"> </span><span class="n">i_product_name</span><span class="w"> </span><span class="n">product_name</span><span class="w"> </span><span class="p">...</span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_p18889162315383">In the original plan, the join result of <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b14473528114717">store_sales</strong>, <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b8474028164717">store_returns</strong>, <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b7474122884715">item</strong>, and <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b6474142834710">store</strong> is redistributed before it is joined with <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b15474928154715">customer</strong>. With the hint, the redistribution is disabled and the join order is retained. The optimized plan is as follows:</p>
<p id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_p288982313816"><span><img id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_image313117593911" src="figure/en-us_image_0000001460882748.png"></span></p>
</li></ul>
</div>
<ul id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_ul4727174618186"><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li185061372569">Specifies the distribution columns for Agg redistribution.<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_screen1347124316443"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">explain</span><span class="w"> </span><span class="p">(</span><span class="k">verbose</span><span class="w"> </span><span class="k">on</span><span class="p">,</span><span class="w"> </span><span class="n">costs</span><span class="w"> </span><span class="k">off</span><span class="p">,</span><span class="w"> </span><span class="n">nodes</span><span class="w"> </span><span class="k">off</span><span class="p">)</span>
<span class="k">select</span><span class="w"> </span><span class="cm">/*+ redistribute ((*) (2 3)) */</span><span class="w"> </span><span class="n">a1</span><span class="p">,</span><span class="w"> </span><span class="n">b1</span><span class="p">,</span><span class="w"> </span><span class="n">c1</span><span class="p">,</span><span class="w"> </span><span class="k">count</span><span class="p">(</span><span class="n">c1</span><span class="p">)</span><span class="w"> </span><span class="k">from</span><span class="w"> </span><span class="n">t1</span><span class="w"> </span><span class="k">group</span><span class="w"> </span><span class="k">by</span><span class="w"> </span><span class="n">a1</span><span class="p">,</span><span class="w"> </span><span class="n">b1</span><span class="p">,</span><span class="w"> </span><span class="n">c1</span><span class="w"> </span><span class="k">having</span><span class="w"> </span><span class="k">count</span><span class="p">(</span><span class="n">c1</span><span class="p">)</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">10</span><span class="w"> </span><span class="k">and</span><span class="w"> </span><span class="k">sum</span><span class="p">(</span><span class="n">d1</span><span class="p">)</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">100</span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_p188611642125516">In this example, the second and third <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b95311441639">GROUP BY</strong> columns (b1 and c1) are used as the distribution keys.</p>
<p id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_p8113111745216"><span><img id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_image10799112145914" src="figure/en-us_image_0000001510402789.png"></span></p>
</li><li id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_li48281631155310">If the statement does not contain a <strong id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_b3568796552">GROUP BY</strong> clause, specify columns in the DISTINCT clause as the distribution columns.<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_screen1451032819214"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">explain</span><span class="w"> </span><span class="p">(</span><span class="k">verbose</span><span class="w"> </span><span class="k">on</span><span class="p">,</span><span class="w"> </span><span class="n">costs</span><span class="w"> </span><span class="k">off</span><span class="p">,</span><span class="w"> </span><span class="n">nodes</span><span class="w"> </span><span class="k">off</span><span class="p">)</span>
<span class="k">select</span><span class="w"> </span><span class="cm">/*+ redistribute ((*) (3 1)) */</span><span class="w"> </span><span class="k">distinct</span><span class="w"> </span><span class="n">a1</span><span class="p">,</span><span class="w"> </span><span class="n">b1</span><span class="p">,</span><span class="w"> </span><span class="n">c1</span><span class="w"> </span><span class="k">from</span><span class="w"> </span><span class="n">t1</span><span class="p">;</span>
</pre></div></td></tr></table></div>
</div>
<p id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_p129281951213"><span><img id="EN-US_TOPIC_0000002080670190__en-us_topic_0000001510283513_image136481681929" src="figure/en-us_image_0000001460563240.png"></span></p>
</li></ul>
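<p>The examples above cover <strong>no redistribute</strong> and the distribution column form. For completeness, the <strong>broadcast</strong> form of the syntax might be used as sketched below. This is an assumption-based illustration: <strong>t2</strong> with columns <strong>a2</strong> and <strong>b2</strong> is a hypothetical table, and the resulting plans are not shown because they depend on the data and statistics.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- Assumption: t2(a2, b2) is a hypothetical table joined to t1.
-- Ask the optimizer to broadcast t2 for the join.
explain
select /*+ broadcast(t2) */ t1.a1, t2.b2
from t1 join t2 on t1.b1 = t2.a2;

-- Ask the optimizer not to broadcast t2 for the join.
explain
select /*+ no broadcast(t2) */ t1.a1, t2.b2
from t1 join t2 on t1.b1 = t2.a2;
</pre></div></div>
<p>Comparing the two plans shows which stream operator the optimizer selects for <strong>t2</strong> in each case.</p>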
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dws_04_0454.html">Hint-based Tuning</a></div>
</div>
</div>