doc-exports/docs/dws/dev/dws_06_0329.html
Lu, Huayi e6fa411af0 DWS DEV 830.201 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: Lu, Huayi <luhuayi@huawei.com>
Co-committed-by: Lu, Huayi <luhuayi@huawei.com>
2024-05-16 07:24:04 +00:00

<a name="EN-US_TOPIC_0000001495586257"></a><a name="EN-US_TOPIC_0000001495586257"></a>
<h1 class="topictitle1">Aggregate Functions</h1>
<div id="body0000001495586257"><div class="section" id="EN-US_TOPIC_0000001495586257__section92414131400"><h4 class="sectiontitle">hll_add_agg(hll_hashval)</h4><p id="EN-US_TOPIC_0000001495586257__p1723914131803">Description: Groups hashed data into HLL.</p>
<p id="EN-US_TOPIC_0000001495586257__p923914131404">Return type: hll</p>
<p id="EN-US_TOPIC_0000001495586257__p723914139012">Example:</p>
<ol id="EN-US_TOPIC_0000001495586257__ol1112410742418"><li id="EN-US_TOPIC_0000001495586257__li01242711242">Prepare data.<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001495586257__screen1239105214248"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">t_id</span><span class="p">(</span><span class="n">id</span><span class="w"> </span><span class="nb">int</span><span class="p">);</span>
<span class="k">INSERT</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="n">t_id</span><span class="w"> </span><span class="k">VALUES</span><span class="p">(</span><span class="n">generate_series</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="mi">500</span><span class="p">));</span>
<span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">t_data</span><span class="p">(</span><span class="n">a</span><span class="w"> </span><span class="nb">int</span><span class="p">,</span><span class="w"> </span><span class="k">c</span><span class="w"> </span><span class="nb">text</span><span class="p">);</span>
<span class="k">INSERT</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="n">t_data</span><span class="w"> </span><span class="k">SELECT</span><span class="w"> </span><span class="k">mod</span><span class="p">(</span><span class="n">id</span><span class="p">,</span><span class="mi">2</span><span class="p">),</span><span class="w"> </span><span class="n">id</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">t_id</span><span class="p">;</span>
</pre></div></td></tr></table></div>
</div>
</li><li id="EN-US_TOPIC_0000001495586257__li015711702412">Create another table and specify an HLL column:<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001495586257__screen62521259132412"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">t_a_c_hll</span><span class="p">(</span><span class="n">a</span><span class="w"> </span><span class="nb">int</span><span class="p">,</span><span class="w"> </span><span class="k">c</span><span class="w"> </span><span class="n">hll</span><span class="p">);</span>
</pre></div></td></tr></table></div>
</div>
</li><li id="EN-US_TOPIC_0000001495586257__li56871722142419">Use <strong id="EN-US_TOPIC_0000001495586257__b2058053465816">GROUP BY</strong> on column <strong id="EN-US_TOPIC_0000001495586257__b958043416580">a</strong> to group the data, and insert the aggregated result into the HLL column:<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001495586257__screen9319196192515"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">INSERT</span><span class="w"> </span><span class="k">INTO</span><span class="w"> </span><span class="n">t_a_c_hll</span><span class="w"> </span><span class="k">SELECT</span><span class="w"> </span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="n">hll_add_agg</span><span class="p">(</span><span class="n">hll_hash_text</span><span class="p">(</span><span class="k">c</span><span class="p">))</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">t_data</span><span class="w"> </span><span class="k">GROUP</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">a</span><span class="p">;</span>
</pre></div></td></tr></table></div>
</div>
</li><li id="EN-US_TOPIC_0000001495586257__li1091233132414">Calculate the number of distinct values for each group in the HLL column:<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001495586257__screen6936181017255"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span>
<span class="normal">6</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="o">#</span><span class="k">c</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="k">cardinality</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">t_a_c_hll</span><span class="w"> </span><span class="k">order</span><span class="w"> </span><span class="k">by</span><span class="w"> </span><span class="n">a</span><span class="p">;</span>
<span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="k">cardinality</span><span class="w"> </span>
<span class="c1">---+------------------</span>
<span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="mi">250</span><span class="p">.</span><span class="mi">741759091658</span>
<span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="mi">250</span><span class="p">.</span><span class="mi">741759091658</span>
<span class="p">(</span><span class="mi">2</span><span class="w"> </span><span class="k">rows</span><span class="p">)</span>
</pre></div></td></tr></table></div>
</div>
</li></ol>
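<p>As a cross-check, the exact distinct count for each group can be computed with standard SQL and compared with the HLL estimate. The query below is a minimal sketch that assumes the <strong>t_data</strong> table prepared in step 1. Because the 500 <strong>id</strong> values are all distinct and split evenly by <strong>mod(id,2)</strong>, each group holds exactly 250 distinct values, so the estimate of about 250.74 shown above is off by roughly 0.3%.</p>
<div class="codecoloring" codetype="Sql"><pre>-- Exact distinct count per group, for comparison with the HLL estimate.
SELECT a, count(DISTINCT c) AS exact_count FROM t_data GROUP BY a ORDER BY a;
</pre></div>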
</div>
<div class="section" id="EN-US_TOPIC_0000001495586257__section9120324107"><h4 class="sectiontitle">hll_add_agg(hll_hashval, int32 log2m)</h4><p id="EN-US_TOPIC_0000001495586257__p1987284112595">Description: Groups hashed data into HLL and sets the <strong id="EN-US_TOPIC_0000001495586257__b78113703093957">log2m</strong> parameter. The parameter value ranges from 10 to 16.</p>
<p id="EN-US_TOPIC_0000001495586257__p6872204112591">Return type: hll</p>
<p id="EN-US_TOPIC_0000001495586257__p1087204113597">Example:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001495586257__screen1144945161113"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="w"> </span><span class="k">SELECT</span><span class="w"> </span><span class="n">hll_cardinality</span><span class="p">(</span><span class="n">hll_add_agg</span><span class="p">(</span><span class="n">hll_hash_text</span><span class="p">(</span><span class="k">c</span><span class="p">),</span><span class="w"> </span><span class="mi">10</span><span class="p">))</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">t_data</span><span class="p">;</span>
<span class="w"> </span><span class="n">hll_cardinality</span><span class="w"> </span>
<span class="c1">------------------</span>
<span class="w"> </span><span class="mi">503</span><span class="p">.</span><span class="mi">932348927339</span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span>
</pre></div></td></tr></table></div>
</div>
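<p>In HyperLogLog implementations in general, <strong>log2m</strong> is the base-2 logarithm of the number of registers, so a larger value uses more memory and tends to reduce the estimation error. The following sketch, which assumes the <strong>t_data</strong> table from the first example, computes the estimate at both ends of the documented range so that the two results can be compared. The column aliases are illustrative, and no output is shown because the values depend on the data and the cluster version.</p>
<div class="codecoloring" codetype="Sql"><pre>-- Estimate the same column with the smallest and largest documented log2m values.
SELECT hll_cardinality(hll_add_agg(hll_hash_text(c), 10)) AS est_log2m_10,
       hll_cardinality(hll_add_agg(hll_hash_text(c), 16)) AS est_log2m_16
FROM t_data;
</pre></div>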
</div>
<div class="section" id="EN-US_TOPIC_0000001495586257__section1673473711013"><h4 class="sectiontitle">hll_add_agg(hll_hashval, int32 log2m, int32 regwidth)</h4><p id="EN-US_TOPIC_0000001495586257__p1423816181547">Description: Groups hashed data into HLL and sets the <strong id="EN-US_TOPIC_0000001495586257__b47379490093957">log2m</strong> and <strong id="EN-US_TOPIC_0000001495586257__b71275575793957">regwidth</strong> parameters in sequence. The value of <strong id="EN-US_TOPIC_0000001495586257__b111949971993957">regwidth</strong> ranges from 1 to 5.</p>
<p id="EN-US_TOPIC_0000001495586257__p1023881817415">Return type: hll</p>
<p id="EN-US_TOPIC_0000001495586257__p122380181943">Example:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001495586257__screen13205143014228"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="n">hll_cardinality</span><span class="p">(</span><span class="n">hll_add_agg</span><span class="p">(</span><span class="n">hll_hash_text</span><span class="p">(</span><span class="k">c</span><span class="p">),</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w"> </span><span class="mi">1</span><span class="p">))</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">t_data</span><span class="p">;</span>
<span class="w"> </span><span class="n">hll_cardinality</span><span class="w"> </span>
<span class="c1">------------------</span>
<span class="w"> </span><span class="mi">496</span><span class="p">.</span><span class="mi">628982624022</span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span>
</pre></div></td></tr></table></div>
</div>
</div>
<div class="section" id="EN-US_TOPIC_0000001495586257__section91109351606"><h4 class="sectiontitle">hll_add_agg(hll_hashval, int32 log2m, int32 regwidth, int64 expthresh)</h4><p id="EN-US_TOPIC_0000001495586257__p1053925568">Description: Groups hashed data into HLL and sets the parameters <strong id="EN-US_TOPIC_0000001495586257__b144342646493957">log2m</strong>, <strong id="EN-US_TOPIC_0000001495586257__b21317714893957">regwidth</strong>, and <strong id="EN-US_TOPIC_0000001495586257__b189500032093957">expthresh</strong> in sequence. The value of <strong id="EN-US_TOPIC_0000001495586257__b31570864393957">expthresh</strong> is an integer ranging from -1 to 7. <strong id="EN-US_TOPIC_0000001495586257__b209361223693957">expthresh</strong> specifies the threshold for switching from the <strong id="EN-US_TOPIC_0000001495586257__b189331756593957">explicit</strong> mode to the <strong id="EN-US_TOPIC_0000001495586257__b199014624693957">sparse</strong> mode. <strong id="EN-US_TOPIC_0000001495586257__b179764081993957">-1</strong> indicates the auto mode; <strong id="EN-US_TOPIC_0000001495586257__b97448187293957">0</strong> indicates that the <strong id="EN-US_TOPIC_0000001495586257__b163653046193957">explicit</strong> mode is skipped; a value from 1 to 7 indicates that the mode is switched when the number of distinct values reaches 2<sup id="EN-US_TOPIC_0000001495586257__sup21327631493957"><strong id="EN-US_TOPIC_0000001495586257__b122930258193957">expthresh</strong></sup>.</p>
<p id="EN-US_TOPIC_0000001495586257__p13531025367">Return type: hll</p>
<p id="EN-US_TOPIC_0000001495586257__p8534254610">Example:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001495586257__screen99971539142319"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="w"> </span><span class="k">SELECT</span><span class="w"> </span><span class="n">hll_cardinality</span><span class="p">(</span><span class="n">hll_add_agg</span><span class="p">(</span><span class="n">hll_hash_text</span><span class="p">(</span><span class="k">c</span><span class="p">),</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="mi">4</span><span class="p">))</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">t_data</span><span class="p">;</span>
<span class="w"> </span><span class="n">hll_cardinality</span><span class="w"> </span>
<span class="c1">------------------</span>
<span class="w"> </span><span class="mi">496</span><span class="p">.</span><span class="mi">628982624022</span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span>
</pre></div></td></tr></table></div>
</div>
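<p>As a worked instance of the threshold rule, an <strong>expthresh</strong> of 4 switches modes once 2<sup>4</sup> = 16 distinct values have been seen, and an <strong>expthresh</strong> of 7 switches at 2<sup>7</sup> = 128. The sketch below assumes the <strong>t_data</strong> table from the first example and reuses the <strong>log2m</strong> and <strong>regwidth</strong> arguments of the example above. It is meant for comparing the settings side by side, so no output is shown.</p>
<div class="codecoloring" codetype="Sql"><pre>-- expthresh = 0: the explicit mode is skipped.
SELECT hll_cardinality(hll_add_agg(hll_hash_text(c), NULL, 1, 0)) FROM t_data;
-- expthresh = 4: the mode is switched at 2^4 = 16 distinct values.
SELECT hll_cardinality(hll_add_agg(hll_hash_text(c), NULL, 1, 4)) FROM t_data;
-- expthresh = 7: the mode is switched at 2^7 = 128 distinct values.
SELECT hll_cardinality(hll_add_agg(hll_hash_text(c), NULL, 1, 7)) FROM t_data;
</pre></div>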
</div>
<div class="section" id="EN-US_TOPIC_0000001495586257__section7234032809"><h4 class="sectiontitle">hll_add_agg(hll_hashval, int32 log2m, int32 regwidth, int64 expthresh, int32 sparseon)</h4><p id="EN-US_TOPIC_0000001495586257__p142701419188">Description: Groups hashed data into HLL and sets the parameters <strong id="EN-US_TOPIC_0000001495586257__b133713550596">log2m</strong>, <strong id="EN-US_TOPIC_0000001495586257__b333717552594">regwidth</strong>, <strong id="EN-US_TOPIC_0000001495586257__b19337115516595">expthresh</strong>, and <strong id="EN-US_TOPIC_0000001495586257__b522722518019">sparseon</strong> in sequence. The value of <strong id="EN-US_TOPIC_0000001495586257__b21321948729">sparseon</strong> is 0 or 1.</p>
<p id="EN-US_TOPIC_0000001495586257__p102701919181">Return type: hll</p>
<p id="EN-US_TOPIC_0000001495586257__p1027014194811">Example:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001495586257__screen4897191820299"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="w"> </span><span class="k">SELECT</span><span class="w"> </span><span class="n">hll_cardinality</span><span class="p">(</span><span class="n">hll_add_agg</span><span class="p">(</span><span class="n">hll_hash_text</span><span class="p">(</span><span class="k">c</span><span class="p">),</span><span class="w"> </span><span class="k">NULL</span><span class="p">,</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="mi">4</span><span class="p">,</span><span class="w"> </span><span class="mi">0</span><span class="p">))</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">t_data</span><span class="p">;</span>
<span class="w"> </span><span class="n">hll_cardinality</span><span class="w"> </span>
<span class="c1">------------------</span>
<span class="w"> </span><span class="mi">496</span><span class="p">.</span><span class="mi">628982624022</span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span>
</pre></div></td></tr></table></div>
</div>
</div>
<div class="section" id="EN-US_TOPIC_0000001495586257__section16542172910010"><h4 class="sectiontitle">hll_union_agg(hll)</h4><p id="EN-US_TOPIC_0000001495586257__p1619444182">Description: Performs the <strong id="EN-US_TOPIC_0000001495586257__b23215531393957">UNION</strong> operation on multiple values of the hll type to obtain one HLL.</p>
<p id="EN-US_TOPIC_0000001495586257__p8894142313186">Return type: hll</p>
<p id="EN-US_TOPIC_0000001495586257__p10483122972212">Example:</p>
<p id="EN-US_TOPIC_0000001495586257__p7206133211186">Perform the <strong id="EN-US_TOPIC_0000001495586257__b1055414498212">UNION</strong> operation on data of the HLL type in each group to obtain one HLL, and calculate the number of distinct values:</p>
<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000001495586257__screen1668214125305"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span>
<span class="normal">2</span>
<span class="normal">3</span>
<span class="normal">4</span>
<span class="normal">5</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">SELECT</span><span class="w"> </span><span class="o">#</span><span class="n">hll_union_agg</span><span class="p">(</span><span class="k">c</span><span class="p">)</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="k">cardinality</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">t_a_c_hll</span><span class="p">;</span>
<span class="w"> </span><span class="k">cardinality</span><span class="w"> </span>
<span class="c1">------------------</span>
<span class="w"> </span><span class="mi">496</span><span class="p">.</span><span class="mi">628982624022</span>
<span class="p">(</span><span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">)</span>
</pre></div></td></tr></table></div>
</div>
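<p>Because <strong>t_a_c_hll</strong> was built from the rows of <strong>t_data</strong>, the union of the per-group HLLs can be compared with an HLL built directly over all rows. The following is a sketch under that assumption, with both tables taken from the first example and illustrative column aliases. This document does not state whether the two estimates match exactly, so no output is shown.</p>
<div class="codecoloring" codetype="Sql"><pre>-- Estimate obtained by merging the stored per-group HLLs.
SELECT hll_cardinality(hll_union_agg(c)) AS union_estimate FROM t_a_c_hll;
-- Estimate obtained by aggregating all source rows into a single HLL.
SELECT hll_cardinality(hll_add_agg(hll_hash_text(c))) AS direct_estimate FROM t_data;
</pre></div>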
</div>
<div class="note" id="EN-US_TOPIC_0000001495586257__note42005853413"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="EN-US_TOPIC_0000001495586257__p02075810346">To perform <strong id="EN-US_TOPIC_0000001495586257__b162005173393957">UNION</strong> on data in multiple HLLs, ensure that the HLLs have the same precision. Otherwise, <strong id="EN-US_TOPIC_0000001495586257__b189159259393957">UNION</strong> cannot be performed. This restriction also applies to the hll_union(hll, hll) function.</p>
</div></div>
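<p>One way to satisfy the precision requirement is to build every HLL with the same explicit parameters. The sketch below assumes that "precision" refers to the parameters set through <strong>hll_add_agg</strong>, such as <strong>log2m</strong>, and that the <strong>t_data</strong> table from the first example exists. The value 12 is an arbitrary choice within the documented range of 10 to 16, and the aliases <strong>h0</strong>, <strong>h1</strong>, and <strong>h</strong> are illustrative.</p>
<div class="codecoloring" codetype="Sql"><pre>-- Both inputs are built with log2m = 12, so they can be merged with hll_union.
SELECT hll_cardinality(hll_union(h0.h, h1.h)) AS cardinality
FROM (SELECT hll_add_agg(hll_hash_text(c), 12) AS h FROM t_data WHERE a = 0) h0,
     (SELECT hll_add_agg(hll_hash_text(c), 12) AS h FROM t_data WHERE a = 1) h1;
</pre></div>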
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dws_06_0042.html">HLL Functions and Operators</a></div>
</div>
</div>