Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Su, Xiaomeng <suxiaomeng1@huawei.com> Co-committed-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
<a name="dli_08_15072"></a><a name="dli_08_15072"></a>
<h1 class="topictitle1">Window Top-N</h1>
<div id="body0000001870793697"><div class="section" id="dli_08_15072__section1352713692319"><h4 class="sectiontitle">Function</h4><p id="dli_08_15072__p1727618612214">Window Top-N is a special form of Top-N that returns the N smallest or largest values for each window and each additional partition key.</p>
<p id="dli_08_15072__p1334225512429">Unlike regular Top-N on continuous tables, Window Top-N does not emit intermediate results. It emits only the final result, the top N records of each window, when the window ends. Window Top-N also purges all intermediate state once it is no longer needed.</p>
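<p>For contrast, a regular Top-N query might look like the following sketch. It assumes a <strong>Bid</strong> table like the one used in the examples on this page. Because its PARTITION BY clause does not contain window columns, the result is updated continuously as new records arrive instead of being emitted once per window.</p>
<pre class="screen">-- Illustrative sketch only: regular (non-windowed) Top-N on the assumed Bid table
SELECT supplier_id, item, price, rownum
FROM (
  SELECT supplier_id, item, price,
    ROW_NUMBER() OVER (PARTITION BY supplier_id ORDER BY price DESC) AS rownum
  FROM Bid)
WHERE rownum &lt;= 3;</pre>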
<p id="dli_08_15072__p611725904212">Window Top-N queries perform better than regular Top-N queries when results do not need to be updated for every record. Window Top-N is usually applied directly to <a href="dli_08_15070.html#dli_08_15070__section3516193316120">Windowing Table-Valued Functions (Windowing TVFs)</a>. It can also be combined with other operations based on <a href="dli_08_15070.html#dli_08_15070__section3516193316120">Windowing Table-Valued Functions (Windowing TVFs)</a>, such as Window Aggregation, Window Top-N, and Window Join.</p>
<p id="dli_08_15072__p827746122211">Window Top-N uses the same syntax as regular Top-N; see the Top-N documentation for more information. In addition, Window Top-N requires the <strong id="dli_08_15072__b61561034516">PARTITION BY</strong> clause to contain the <strong id="dli_08_15072__b696339517">window_start</strong> and <strong id="dli_08_15072__b137410421114">window_end</strong> columns of the relation to which a Windowing TVF or Window Aggregation has been applied. Otherwise, the optimizer cannot translate the query.</p>
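<p>As a minimal sketch of a PARTITION BY clause that carries an additional key, the following query returns the single highest-priced bid of each supplier in each 10-minute tumbling window. The <strong>Bid</strong> table and its columns are taken from the examples on this page; the choice of <strong>supplier_id</strong> as the extra key is only an illustration.</p>
<pre class="screen">-- Illustrative sketch only: window columns plus one extra partition key
SELECT *
FROM (
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY window_start, window_end, supplier_id
      ORDER BY price DESC) AS rownum
  FROM TABLE(
    TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES))
) WHERE rownum &lt;= 1;</pre>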
<p id="dli_08_15072__p0535194614238">For more information, see <a href="https://nightlies.apache.org/flink/flink-docs-release-1.15/zh/docs/dev/table/sql/queries/window-topn/" target="_blank" rel="noopener noreferrer">Window Top-N</a>.</p>
</div>
<div class="section" id="dli_08_15072__section18618142518248"><h4 class="sectiontitle">Syntax</h4><pre class="screen" id="dli_08_15072__screen27199303248">SELECT [column_list]
FROM (
   SELECT [column_list],
     ROW_NUMBER() OVER (PARTITION BY window_start, window_end [, col_key1...]
       ORDER BY col1 [asc|desc][, col2 [asc|desc]...]) AS rownum
   FROM table_name) -- relation applied windowing TVF
WHERE rownum &lt;= N [AND conditions]</pre>
</div>
<div class="section" id="dli_08_15072__section711153872620"><h4 class="sectiontitle">Caveats</h4><p id="dli_08_15072__p163991401268">Flink supports Window Top-N only after a Windowing TVF that uses Tumble Windows, Hop Windows, or Cumulate Windows.</p>
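<p>The tumble case is covered in the examples on this page. As a hedged sketch of the other two supported window types, the queries below apply the same Top-N pattern after HOP and CUMULATE. The <strong>Bid</strong> table comes from the examples on this page; the slide, step, and size intervals are arbitrary values chosen for illustration.</p>
<pre class="screen">-- Illustrative sketch only: 10-minute hop windows that slide every 5 minutes
SELECT *
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY price DESC) AS rownum
  FROM TABLE(
    HOP(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '5' MINUTES, INTERVAL '10' MINUTES))
) WHERE rownum &lt;= 3;

-- Illustrative sketch only: cumulate windows that grow in 2-minute steps up to 10 minutes
SELECT *
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY price DESC) AS rownum
  FROM TABLE(
    CUMULATE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '2' MINUTES, INTERVAL '10' MINUTES))
) WHERE rownum &lt;= 3;</pre>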
</div>
<div class="section" id="dli_08_15072__section74251944112410"><h4 class="sectiontitle">Example</h4><p id="dli_08_15072__p7201174210259"><strong id="dli_08_15072__b883175311317">Window Top-N After Window Aggregation</strong></p>
<p id="dli_08_15072__p16274655202415">The following example calculates the top 3 suppliers with the highest sales in each 10-minute tumbling window.</p>
<pre class="screen" id="dli_08_15072__screen490352312253">-- tables must have time attribute, e.g. `bidtime` in this table
Flink SQL> desc Bid;
+-------------+------------------------+------+-----+--------+---------------------------------+
|        name |                   type | null | key | extras |                       watermark |
+-------------+------------------------+------+-----+--------+---------------------------------+
|     bidtime | TIMESTAMP(3) *ROWTIME* | true |     |        | `bidtime` - INTERVAL '1' SECOND |
|       price |         DECIMAL(10, 2) | true |     |        |                                 |
|        item |                 STRING | true |     |        |                                 |
| supplier_id |                 STRING | true |     |        |                                 |
+-------------+------------------------+------+-----+--------+---------------------------------+

Flink SQL> SELECT * FROM Bid;
+------------------+-------+------+-------------+
|          bidtime | price | item | supplier_id |
+------------------+-------+------+-------------+
| 2020-04-15 08:05 |  4.00 |    A |   supplier1 |
| 2020-04-15 08:06 |  4.00 |    C |   supplier2 |
| 2020-04-15 08:07 |  2.00 |    G |   supplier1 |
| 2020-04-15 08:08 |  2.00 |    B |   supplier3 |
| 2020-04-15 08:09 |  5.00 |    D |   supplier4 |
| 2020-04-15 08:11 |  2.00 |    B |   supplier3 |
| 2020-04-15 08:13 |  1.00 |    E |   supplier1 |
| 2020-04-15 08:15 |  3.00 |    H |   supplier2 |
| 2020-04-15 08:17 |  6.00 |    F |   supplier5 |
+------------------+-------+------+-------------+

Flink SQL> SELECT *
  FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY price DESC) as rownum
    FROM (
      SELECT window_start, window_end, supplier_id, SUM(price) as price, COUNT(*) as cnt
      FROM TABLE(
        TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES))
      GROUP BY window_start, window_end, supplier_id
    )
  ) WHERE rownum &lt;= 3;
+------------------+------------------+-------------+-------+-----+--------+
|     window_start |       window_end | supplier_id | price | cnt | rownum |
+------------------+------------------+-------------+-------+-----+--------+
| 2020-04-15 08:00 | 2020-04-15 08:10 |   supplier1 |  6.00 |   2 |      1 |
| 2020-04-15 08:00 | 2020-04-15 08:10 |   supplier4 |  5.00 |   1 |      2 |
| 2020-04-15 08:00 | 2020-04-15 08:10 |   supplier2 |  4.00 |   1 |      3 |
| 2020-04-15 08:10 | 2020-04-15 08:20 |   supplier5 |  6.00 |   1 |      1 |
| 2020-04-15 08:10 | 2020-04-15 08:20 |   supplier2 |  3.00 |   1 |      2 |
| 2020-04-15 08:10 | 2020-04-15 08:20 |   supplier3 |  2.00 |   1 |      3 |
+------------------+------------------+-------------+-------+-----+--------+</pre>
</div>
<p id="dli_08_15072__p999716559253"><strong id="dli_08_15072__b172251334648">Window Top-N After Windowing TVF</strong></p>
<p id="dli_08_15072__p1573116580254">The following example calculates the top 3 items with the highest price in each 10-minute tumbling window.</p>
<pre class="screen" id="dli_08_15072__screen138514415268">Flink SQL> SELECT *
  FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY price DESC) as rownum
    FROM TABLE(
      TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES))
  ) WHERE rownum &lt;= 3;
+------------------+-------+------+-------------+------------------+------------------+--------+
|          bidtime | price | item | supplier_id |     window_start |       window_end | rownum |
+------------------+-------+------+-------------+------------------+------------------+--------+
| 2020-04-15 08:05 |  4.00 |    A |   supplier1 | 2020-04-15 08:00 | 2020-04-15 08:10 |      2 |
| 2020-04-15 08:06 |  4.00 |    C |   supplier2 | 2020-04-15 08:00 | 2020-04-15 08:10 |      3 |
| 2020-04-15 08:09 |  5.00 |    D |   supplier4 | 2020-04-15 08:00 | 2020-04-15 08:10 |      1 |
| 2020-04-15 08:11 |  2.00 |    B |   supplier3 | 2020-04-15 08:10 | 2020-04-15 08:20 |      3 |
| 2020-04-15 08:15 |  3.00 |    H |   supplier2 | 2020-04-15 08:10 | 2020-04-15 08:20 |      2 |
| 2020-04-15 08:17 |  6.00 |    F |   supplier5 | 2020-04-15 08:10 | 2020-04-15 08:20 |      1 |
+------------------+-------+------+-------------+------------------+------------------+--------+</pre>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dli_08_15069.html">Window</a></div>
</div>
</div>