doc-exports/docs/dws/dev/dws_04_0436.html
luhuayi 177cd61a57 DWS DEVG 910.211 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: luhuayi <luhuayi@huawei.com>
Co-committed-by: luhuayi <luhuayi@huawei.com>
2025-05-05 07:44:03 +00:00


<a name="EN-US_TOPIC_0000002052655454"></a><a name="EN-US_TOPIC_0000002052655454"></a>
<h1 class="topictitle1">Updating Statistics</h1>
<div id="body8662426"><p id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_p78113337546">In a database, statistics are the source data that the planner uses to generate execution plans. If statistics are unavailable or out of date, the planner may choose a much worse execution plan, leading to poor performance.</p>
<div class="section" id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_sdbde8a9d41484527a9706162eaec0ea1"><h4 class="sectiontitle">Scenario</h4><p id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_p9599144717409">The <strong id="EN-US_TOPIC_0000002052655454__b76141331203818">ANALYZE</strong> statement collects statistics on database table contents. These statistics will be stored in the <strong id="EN-US_TOPIC_0000002052655454__b6614193112382">PG_STATISTIC</strong> system catalog. Then, the query optimizer uses the statistics to work out the most efficient execution plan.</p>
<p id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_p41451454144012">After batch <strong id="EN-US_TOPIC_0000002052655454__b17435154723817">INSERT</strong> and <strong id="EN-US_TOPIC_0000002052655454__b243574723813">DELETE</strong> operations, you are advised to run the <strong id="EN-US_TOPIC_0000002052655454__b15435144773811">ANALYZE</strong> statement on the table or the entire database to update statistics. By default, 30,000 rows are sampled: the GUC parameter <strong id="EN-US_TOPIC_0000002052655454__b199154110395">default_statistics_target</strong> defaults to <strong id="EN-US_TOPIC_0000002052655454__b18915316398">100</strong>, and the sample size is 300 times this value. If the table contains more than 1,600,000 rows, you are advised to set <strong id="EN-US_TOPIC_0000002052655454__b199158123914">default_statistics_target</strong> to <strong id="EN-US_TOPIC_0000002052655454__b199150103916">-2</strong>, which samples 2% of the rows.</p>
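<p>As a minimal sketch of the recommendation above, the following statements switch to 2% sampling before analyzing a large table. The name <strong>tablename</strong> is a placeholder, and the sketch assumes that changing the parameter with <strong>SET</strong> for the current session only is acceptable in your environment.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- Sample 2% of the rows instead of the default 30,000 rows (current session only).
SET default_statistics_target = -2;

-- Collect statistics for the large table with the new sampling setting.
ANALYZE tablename;

-- Restore the default value for the rest of the session.
RESET default_statistics_target;
</pre></div></div>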
<p id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_aa4f59ffe8c4e4520a777b61db2ff0dde">For an intermediate table generated during the execution of scripts or stored procedures in batch, you also need to run the <strong id="EN-US_TOPIC_0000002052655454__b205181727153913">ANALYZE</strong> statement.</p>
<p id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_p83021639060">If a table contains multiple correlated columns and queries filter or group on these columns together, collect multi-column statistics on them so that the query optimizer can accurately estimate the number of rows and generate an effective execution plan.</p>
</div>
<div class="section" id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_s2df36e544a5e4ccea6342438f9d1f6ce"><h4 class="sectiontitle">Generating Statistics</h4><ul id="EN-US_TOPIC_0000002052655454__ul31868592515"><li id="EN-US_TOPIC_0000002052655454__li51869514259">Update statistics on a single table.<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000002052655454__screen525117145238"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">ANALYZE</span><span class="w"> </span><span class="n">tablename</span><span class="p">;</span><span class="w"> </span>
</pre></div></td></tr></table></div>
</div>
</li><li id="EN-US_TOPIC_0000002052655454__li418655192514">Update the statistics of the entire database.<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_s8328d559ffa349d3bffcc9ad313676c3"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">ANALYZE</span><span class="p">;</span><span class="w"> </span>
</pre></div></td></tr></table></div>
</div>
</li></ul>
</div>
<ul id="EN-US_TOPIC_0000002052655454__ul10778152192911"><li id="EN-US_TOPIC_0000002052655454__li147781728296">Collect statistics from multiple columns.<ul id="EN-US_TOPIC_0000002052655454__ul191931136297"><li id="EN-US_TOPIC_0000002052655454__li20193141342913">Collect statistics on the <strong id="EN-US_TOPIC_0000002052655454__b160894908171823">column_1</strong> and <strong id="EN-US_TOPIC_0000002052655454__b1677891871823">column_2</strong> columns of the <strong id="EN-US_TOPIC_0000002052655454__b58470297371823">tablename</strong> table.<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000002052655454__screen111718192267"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">ANALYZE</span><span class="w"> </span><span class="n">tablename</span><span class="w"> </span><span class="p">((</span><span class="n">column_1</span><span class="p">,</span><span class="w"> </span><span class="n">column_2</span><span class="p">));</span><span class="w"> </span>
</pre></div></td></tr></table></div>
</div>
</li><li id="EN-US_TOPIC_0000002052655454__li101931313122920">Add statistics declarations for the <strong id="EN-US_TOPIC_0000002052655454__b95236334111">column_1</strong> and <strong id="EN-US_TOPIC_0000002052655454__b1952343184118">column_2</strong> columns of the <strong id="EN-US_TOPIC_0000002052655454__b852311313412">tablename</strong> table.<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000002052655454__screen547495315264"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">tablename</span><span class="w"> </span><span class="k">ADD</span><span class="w"> </span><span class="k">STATISTICS</span><span class="w"> </span><span class="p">((</span><span class="n">column_1</span><span class="p">,</span><span class="w"> </span><span class="n">column_2</span><span class="p">));</span>
</pre></div></td></tr></table></div>
</div>
</li><li id="EN-US_TOPIC_0000002052655454__li1519361362918">Collect the statistics of a single column and statistics of multiple declared columns.<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000002052655454__screen141625611273"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">ANALYZE</span><span class="w"> </span><span class="n">tablename</span><span class="p">;</span><span class="w"> </span>
</pre></div></td></tr></table></div>
</div>
</li><li id="EN-US_TOPIC_0000002052655454__li171931137298">Delete the statistics of <strong id="EN-US_TOPIC_0000002052655454__b747720714617">column_1</strong> and <strong id="EN-US_TOPIC_0000002052655454__b18918103465">column_2</strong> in the <strong id="EN-US_TOPIC_0000002052655454__b2033219156460">tablename</strong> table or their declarations.<div class="codecoloring" codetype="Sql" id="EN-US_TOPIC_0000002052655454__screen13443142318278"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">ALTER</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">tablename</span><span class="w"> </span><span class="k">DELETE</span><span class="w"> </span><span class="k">STATISTICS</span><span class="w"> </span><span class="p">((</span><span class="n">column_1</span><span class="p">,</span><span class="w"> </span><span class="n">column_2</span><span class="p">));</span>
</pre></div></td></tr></table></div>
</div>
</li></ul>
</li></ul>
<div class="notice" id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_note121972410486"><span class="noticetitle"><img src="public_sys-resources/notice_3.0-en-us.png"> </span><div class="noticebody"><ul id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_ul3627143841712"><li id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_li166271938171714">Declaring multi-column statistics with the <strong id="EN-US_TOPIC_0000002052655454__b16349578249">ALTER TABLE</strong> <em id="EN-US_TOPIC_0000002052655454__i189157082518">tablename</em> <strong id="EN-US_TOPIC_0000002052655454__b149010518250">ADD STATISTICS</strong> statement does not collect them immediately. The system collects statistics on these columns the next time <strong id="EN-US_TOPIC_0000002052655454__b76462160273">ANALYZE</strong> is performed on the table or the entire database. To collect them right away, run the <strong id="EN-US_TOPIC_0000002052655454__b4128114294716">ANALYZE</strong> statement.</li><li id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_li103621941141714">Use <strong id="EN-US_TOPIC_0000002052655454__b842352706154058">EXPLAIN</strong> to show the execution plan of an SQL statement. If <strong id="EN-US_TOPIC_0000002052655454__b842352706154428">rows=10</strong> (the default value, which probably indicates that the table has not been analyzed) is displayed in the <strong id="EN-US_TOPIC_0000002052655454__b84235270615448">SEQ SCAN</strong> output of a table, run the <strong id="EN-US_TOPIC_0000002052655454__b842352706154516">ANALYZE</strong> statement on this table.</li></ul>
</div></div>
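<p>The check described in the notice could look like the following sketch. The name <strong>tablename</strong> is a placeholder, and the plan output is omitted because it depends on the table and the cluster.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- Inspect the plan and look at the rows= estimate on the scan node of the table.
EXPLAIN SELECT * FROM tablename;

-- If the scan node shows rows=10, the table has probably not been analyzed.
-- Collect statistics, then check the plan again.
ANALYZE tablename;
EXPLAIN SELECT * FROM tablename;
</pre></div></div>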
<div class="section" id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_section5205338131912"><h4 class="sectiontitle">Improving the Quality of Statistics</h4><p id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_p1250062882013"><strong id="EN-US_TOPIC_0000002052655454__b164120974171823">ANALYZE</strong> samples data from a table based on the random sampling algorithm and calculates table data features based on the samples. The number of samples can be specified by the <strong id="EN-US_TOPIC_0000002052655454__b647411214489">default_statistics_target</strong> parameter. The value of <strong id="EN-US_TOPIC_0000002052655454__b16474121204817">default_statistics_target</strong> ranges from <strong id="EN-US_TOPIC_0000002052655454__b141371432184819">-100</strong> to <strong id="EN-US_TOPIC_0000002052655454__b17972123417487">10000</strong> and the default value is <strong id="EN-US_TOPIC_0000002052655454__b10795193744818">100</strong>.</p>
<ul id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_ul8507135112424"><li id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_li105072051144216">If the value of <strong id="EN-US_TOPIC_0000002052655454__b226604815485">default_statistics_target</strong> is greater than <strong id="EN-US_TOPIC_0000002052655454__b1266448104813">0</strong>, the number of samples is 300 x <strong id="EN-US_TOPIC_0000002052655454__b8266848124819">default_statistics_target</strong>. A larger value of <strong id="EN-US_TOPIC_0000002052655454__b6266144864810">default_statistics_target</strong> therefore means more samples, more memory occupied by the samples, and a longer time to calculate statistics.</li><li id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_li19633105615427">If the value of <strong id="EN-US_TOPIC_0000002052655454__b15206152164817">default_statistics_target</strong> is smaller than <strong id="EN-US_TOPIC_0000002052655454__b120685211489">0</strong>, the number of samples is |<strong id="EN-US_TOPIC_0000002052655454__b132061652164818">default_statistics_target</strong>|/100 x Total number of rows in the table. That is, the absolute value is the percentage of rows sampled, so a smaller (more negative) value of <strong id="EN-US_TOPIC_0000002052655454__b192072052134814">default_statistics_target</strong> indicates a larger number of samples. In this case, the sampled data is written to disk and does not occupy memory. However, the calculation can still take a long time because the sample size is large.<p id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_p155021286202">Sampling with a negative <strong id="EN-US_TOPIC_0000002052655454__b155437385111">default_statistics_target</strong> is also known as percentage sampling.</p>
</li></ul>
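<p>The two sampling rules can be checked with simple arithmetic. The sketch below uses the default value <strong>100</strong> and the value <strong>-2</strong> on a table of 1,600,000 rows (the row count mentioned in the Scenario part). It takes the absolute value of the negative setting, on the assumption that <strong>-2</strong> means 2% as stated above. The statements only compute numbers and do not touch any table.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- Positive setting: 300 x default_statistics_target rows are sampled.
SELECT 300 * 100 AS sampled_rows;                -- 30000

-- Negative setting: |default_statistics_target|/100 x total rows are sampled.
SELECT 1600000 * abs(-2) / 100 AS sampled_rows;  -- 32000
</pre></div></div>
<p>At about 1,600,000 rows, 2% sampling yields roughly the same sample size as the default, which is consistent with recommending percentage sampling for tables above that size.</p>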
</div>
<div class="section" id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_section8123134612210"><h4 class="sectiontitle">Automatic Statistics Collection</h4><p id="EN-US_TOPIC_0000002052655454__en-us_topic_0000001233681757_p7891087250">When the <strong id="EN-US_TOPIC_0000002052655454__b14463192212521">autoanalyze</strong> parameter is turned on, the optimizer will automatically collect statistics if it finds that there are no statistics in the table or if the data changes exceed a certain threshold. This ensures that the optimizer has the information it needs to make precise decisions.</p>
<p id="EN-US_TOPIC_0000002052655454__p1023415267529">In a cost-based optimizer (CBO) model, statistics largely determine which query plan is generated. Timely and accurate statistics are therefore essential.</p>
<ul id="EN-US_TOPIC_0000002052655454__ul19956152115619"><li id="EN-US_TOPIC_0000002052655454__li995645275614">Table-level statistics are stored in the <strong id="EN-US_TOPIC_0000002052655454__b6558163725316">relpages</strong> and <strong id="EN-US_TOPIC_0000002052655454__b2963400535">reltuples</strong> columns of <strong id="EN-US_TOPIC_0000002052655454__b64321242125310">pg_class</strong>.</li><li id="EN-US_TOPIC_0000002052655454__li15956155285620">Column-level statistics are stored in <strong id="EN-US_TOPIC_0000002052655454__b155401553105417">pg_statistic</strong> and accessible through the <strong id="EN-US_TOPIC_0000002052655454__b204911656165412">pg_stats</strong> view. They include the percentage of <strong id="EN-US_TOPIC_0000002052655454__b46669305513">NULL</strong> values, the percentage of distinct values, the most common values (MCVs), and histograms.</li></ul>
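<p>A hedged sketch of how these statistics might be inspected is shown below. It assumes the PostgreSQL-style <strong>pg_stats</strong> view with its usual column names (<strong>null_frac</strong>, <strong>n_distinct</strong>, <strong>most_common_vals</strong>, <strong>histogram_bounds</strong>). The name <strong>tablename</strong> is a placeholder.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- Table-level statistics: estimated page count and row count.
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname = 'tablename';

-- Column-level statistics: NULL fraction, distinct values, MCVs, and histogram.
SELECT attname, null_frac, n_distinct, most_common_vals, histogram_bounds
FROM pg_stats
WHERE tablename = 'tablename';
</pre></div></div>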
<p id="EN-US_TOPIC_0000002052655454__p927052035720">Collection condition: If there is a substantial change in data volume (default threshold is <strong id="EN-US_TOPIC_0000002052655454__b1155615125912">10%</strong>), indicating a shift in data characteristics, the system will initiate the collection of statistics again.</p>
<p id="EN-US_TOPIC_0000002052655454__p927092065714">Overall policy: The system uses dynamic sampling to collect statistics promptly and polling sampling to persist them. For queries that must respond within seconds, manual sampling is recommended.</p>
</div>
<div class="section" id="EN-US_TOPIC_0000002052655454__section693011385521"><h4 class="sectiontitle">Basic Rules</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="EN-US_TOPIC_0000002052655454__table1968035713527" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Typical sampling methods</caption><thead align="left"><tr id="EN-US_TOPIC_0000002052655454__row2681155713528"><th align="left" class="cellrowborder" valign="top" width="12.32%" id="mcps1.3.8.2.2.5.1.1"><p id="EN-US_TOPIC_0000002052655454__p8468193814539"><strong id="EN-US_TOPIC_0000002052655454__en-us_topic_0100839675_b1323192812918">Function</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="30.15%" id="mcps1.3.8.2.2.5.1.2"><p id="EN-US_TOPIC_0000002052655454__p16469173819535"><strong id="EN-US_TOPIC_0000002052655454__b436512517316">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="42.53%" id="mcps1.3.8.2.2.5.1.3"><p id="EN-US_TOPIC_0000002052655454__p846917381531"><strong id="EN-US_TOPIC_0000002052655454__b25985231835">Feature</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="15%" id="mcps1.3.8.2.2.5.1.4"><p id="EN-US_TOPIC_0000002052655454__p5469438145319"><strong id="EN-US_TOPIC_0000002052655454__b481014251531">Constraint</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="EN-US_TOPIC_0000002052655454__row206818578528"><td class="cellrowborder" valign="top" width="12.32%" headers="mcps1.3.8.2.2.5.1.1 "><p id="EN-US_TOPIC_0000002052655454__p24696386534">Manual sampling</p>
</td>
<td class="cellrowborder" valign="top" width="30.15%" headers="mcps1.3.8.2.2.5.1.2 "><p id="EN-US_TOPIC_0000002052655454__p114697381538">After making significant changes to the data in a job, you need to manually run the <strong id="EN-US_TOPIC_0000002052655454__b13284434951">ANALYZE</strong> command.</p>
</td>
<td class="cellrowborder" valign="top" width="42.53%" headers="mcps1.3.8.2.2.5.1.3 "><ul id="EN-US_TOPIC_0000002052655454__ul091951161814"><li id="EN-US_TOPIC_0000002052655454__li11911151191814">In normal mode, statistics are stored in system catalogs and shared globally. A level-4 lock is applied, preventing concurrent operations on a table.</li><li id="EN-US_TOPIC_0000002052655454__li7911151171810">In light mode, statistics are stored in memory and shared globally. A level-1 lock is applied, allowing concurrent operations on a table.</li><li id="EN-US_TOPIC_0000002052655454__li129110510184">In force mode, you can perform forcible sampling even when statistics are locked, in addition to the normal mode functionalities.</li></ul>
<p id="EN-US_TOPIC_0000002052655454__p6469103885312">Syntax: <strong id="EN-US_TOPIC_0000002052655454__b228115610136">ANALYZE tablename; ANALYZE (light|force) tablename;</strong></p>
</td>
<td class="cellrowborder" valign="top" width="15%" headers="mcps1.3.8.2.2.5.1.4 "><p id="EN-US_TOPIC_0000002052655454__p12469153885312">N/A</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row4682145719520"><td class="cellrowborder" valign="top" width="12.32%" headers="mcps1.3.8.2.2.5.1.1 "><p id="EN-US_TOPIC_0000002052655454__p546923885315">Polling sampling</p>
</td>
<td class="cellrowborder" valign="top" width="30.15%" headers="mcps1.3.8.2.2.5.1.2 "><p id="EN-US_TOPIC_0000002052655454__p1546963885312">A background thread collects statistics according to a threshold.</p>
<p id="EN-US_TOPIC_0000002052655454__p1846953817534">Statistics are maintained by polling.</p>
</td>
<td class="cellrowborder" valign="top" width="42.53%" headers="mcps1.3.8.2.2.5.1.3 "><p id="EN-US_TOPIC_0000002052655454__p546912384533">Only the normal mode is supported. Statistics are stored in system catalogs and shared. A level-4 lock is applied, preventing concurrent operations on a table.</p>
<p id="EN-US_TOPIC_0000002052655454__p1334592019190">Related GUC parameters:</p>
<ul id="EN-US_TOPIC_0000002052655454__ul698893601920"><li id="EN-US_TOPIC_0000002052655454__li8988636131917">autovacuum</li><li id="EN-US_TOPIC_0000002052655454__li198893671912">autovacuum_mode</li><li id="EN-US_TOPIC_0000002052655454__li198883614193">autovacuum_analyze_threshold</li><li id="EN-US_TOPIC_0000002052655454__li39891736151912">autovacuum_analyze_scale_factor</li></ul>
</td>
<td class="cellrowborder" valign="top" width="15%" headers="mcps1.3.8.2.2.5.1.4 "><p id="EN-US_TOPIC_0000002052655454__p16470163815311">Asynchronous polling triggering</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row9682957155212"><td class="cellrowborder" valign="top" width="12.32%" headers="mcps1.3.8.2.2.5.1.1 "><p id="EN-US_TOPIC_0000002052655454__p154701138195310">Dynamic sampling</p>
</td>
<td class="cellrowborder" valign="top" width="30.15%" headers="mcps1.3.8.2.2.5.1.2 "><p id="EN-US_TOPIC_0000002052655454__p154702038135316">Depending on the threshold, the query parsing process can take several dozen seconds.</p>
<p id="EN-US_TOPIC_0000002052655454__p347016382538">Statistics are maintained in real time.</p>
</td>
<td class="cellrowborder" valign="top" width="42.53%" headers="mcps1.3.8.2.2.5.1.3 "><ul id="EN-US_TOPIC_0000002052655454__ul14657184916190"><li id="EN-US_TOPIC_0000002052655454__li146573496198">In normal mode, statistics are stored in system catalogs and shared globally. A level-4 lock is applied, preventing concurrent operations on a table.</li><li id="EN-US_TOPIC_0000002052655454__li1564072616209">In light mode, statistics are stored in memory and shared globally. A level-1 lock is applied, allowing concurrent operations on a table.</li></ul>
<p id="EN-US_TOPIC_0000002052655454__p8504132812205">Related GUC parameters:</p>
<ul id="EN-US_TOPIC_0000002052655454__ul1833323318208"><li id="EN-US_TOPIC_0000002052655454__li433343312014">autoanalyze</li><li id="EN-US_TOPIC_0000002052655454__li15333153314203">autoanalyze_mode</li></ul>
</td>
<td class="cellrowborder" valign="top" width="15%" headers="mcps1.3.8.2.2.5.1.4 "><p id="EN-US_TOPIC_0000002052655454__p11470173885316">Real-time triggering upon query</p>
<p id="EN-US_TOPIC_0000002052655454__p64701038145319">In light mode, statistics are kept only in memory, so persistence relies on polling sampling.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row14683157135217"><td class="cellrowborder" valign="top" width="12.32%" headers="mcps1.3.8.2.2.5.1.1 "><p id="EN-US_TOPIC_0000002052655454__p1147011387535">Forcible sampling</p>
</td>
<td class="cellrowborder" valign="top" width="30.15%" headers="mcps1.3.8.2.2.5.1.2 "><p id="EN-US_TOPIC_0000002052655454__p17470153835313">Uses SQL hints to forcefully gather statistics for each query.</p>
</td>
<td class="cellrowborder" valign="top" width="42.53%" headers="mcps1.3.8.2.2.5.1.3 "><p id="EN-US_TOPIC_0000002052655454__p11470738125315">Used in data feature-sensitive scenarios to ensure real-time and up-to-date query statistics.</p>
<p id="EN-US_TOPIC_0000002052655454__p1547018382536">Usage: <strong id="EN-US_TOPIC_0000002052655454__b15211237073">select /*+ lightanalyze (t1 1) */ * from t1;</strong> (<strong id="EN-US_TOPIC_0000002052655454__b1226962010910">1</strong>: forcible sampling; <strong id="EN-US_TOPIC_0000002052655454__b1999517249920">0</strong>: sampling disabled)</p>
</td>
<td class="cellrowborder" valign="top" width="15%" headers="mcps1.3.8.2.2.5.1.4 "><p id="EN-US_TOPIC_0000002052655454__p947043865317">The SQL statement needs to be modified.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row968315755220"><td class="cellrowborder" valign="top" width="12.32%" headers="mcps1.3.8.2.2.5.1.1 "><p id="EN-US_TOPIC_0000002052655454__p54714389532">Collecting partition statistics</p>
</td>
<td class="cellrowborder" valign="top" width="30.15%" headers="mcps1.3.8.2.2.5.1.2 "><p id="EN-US_TOPIC_0000002052655454__p1547163815311">Collects incremental information by partition and combines it globally.</p>
</td>
<td class="cellrowborder" valign="top" width="42.53%" headers="mcps1.3.8.2.2.5.1.3 "><p id="EN-US_TOPIC_0000002052655454__p0471438195311">Used in ultra-large partitioned tables to ensure accurate query cost estimation after partition pruning.</p>
</td>
<td class="cellrowborder" valign="top" width="15%" headers="mcps1.3.8.2.2.5.1.4 "><p id="EN-US_TOPIC_0000002052655454__p0471193813539">This method takes up more storage space but provides greater accuracy.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row1684175785218"><td class="cellrowborder" valign="top" width="12.32%" headers="mcps1.3.8.2.2.5.1.1 "><p id="EN-US_TOPIC_0000002052655454__p12471183813532">Collecting statistics from multiple columns</p>
</td>
<td class="cellrowborder" valign="top" width="30.15%" headers="mcps1.3.8.2.2.5.1.2 "><p id="EN-US_TOPIC_0000002052655454__p247133819535">Gather statistics from multiple columns.</p>
</td>
<td class="cellrowborder" valign="top" width="42.53%" headers="mcps1.3.8.2.2.5.1.3 "><p id="EN-US_TOPIC_0000002052655454__p947133885313">Used to filter multiple columns simultaneously to ensure accurate query cost estimation.</p>
</td>
<td class="cellrowborder" valign="top" width="15%" headers="mcps1.3.8.2.2.5.1.4 "><p id="EN-US_TOPIC_0000002052655454__p16471153895316">You need to select target columns manually and use temporary tables.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row116841057135212"><td class="cellrowborder" valign="top" width="12.32%" headers="mcps1.3.8.2.2.5.1.1 "><p id="EN-US_TOPIC_0000002052655454__p1347123875320">Collecting expression statistics</p>
</td>
<td class="cellrowborder" valign="top" width="30.15%" headers="mcps1.3.8.2.2.5.1.2 "><p id="EN-US_TOPIC_0000002052655454__p447111386532">Collects statistics on a column based on expression functions.</p>
</td>
<td class="cellrowborder" valign="top" width="42.53%" headers="mcps1.3.8.2.2.5.1.3 "><p id="EN-US_TOPIC_0000002052655454__p154711382532">Used in batch expression filtering scenarios to ensure accurate query cost estimation.</p>
</td>
<td class="cellrowborder" valign="top" width="15%" headers="mcps1.3.8.2.2.5.1.4 "><p id="EN-US_TOPIC_0000002052655454__p64711388535">Manual identification is required.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row1054014985320"><td class="cellrowborder" valign="top" width="12.32%" headers="mcps1.3.8.2.2.5.1.1 "><p id="EN-US_TOPIC_0000002052655454__p547223814539">Collecting expression index statistics</p>
</td>
<td class="cellrowborder" valign="top" width="30.15%" headers="mcps1.3.8.2.2.5.1.2 "><p id="EN-US_TOPIC_0000002052655454__p247213885312">Automatically collects statistics for created expression indexes.</p>
</td>
<td class="cellrowborder" valign="top" width="42.53%" headers="mcps1.3.8.2.2.5.1.3 "><p id="EN-US_TOPIC_0000002052655454__p3472738205310">Used in the point query expression filtering scenario to ensure accurate query cost estimation.</p>
</td>
<td class="cellrowborder" valign="top" width="15%" headers="mcps1.3.8.2.2.5.1.4 "><p id="EN-US_TOPIC_0000002052655454__p1447223895312">Manual identification is required.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row1954110919532"><td class="cellrowborder" valign="top" width="12.32%" headers="mcps1.3.8.2.2.5.1.1 "><p id="EN-US_TOPIC_0000002052655454__p94722038105318">Freezing statistics</p>
</td>
<td class="cellrowborder" valign="top" width="30.15%" headers="mcps1.3.8.2.2.5.1.2 "><p id="EN-US_TOPIC_0000002052655454__p164729380534">Freezes table-level statistics to prevent changes.</p>
</td>
<td class="cellrowborder" valign="top" width="42.53%" headers="mcps1.3.8.2.2.5.1.3 "><p id="EN-US_TOPIC_0000002052655454__p9472183885315">Used in scenarios where data features are extremely stable to prevent sampling and query plan changes.</p>
<p id="EN-US_TOPIC_0000002052655454__p1847223815533">Used in scenarios where data features are highly variable to ensure sampling for each query.</p>
<p id="EN-US_TOPIC_0000002052655454__p647233812531">Parameter: table-level attribute <strong id="EN-US_TOPIC_0000002052655454__b666910359190">analyze_mode</strong></p>
</td>
<td class="cellrowborder" valign="top" width="15%" headers="mcps1.3.8.2.2.5.1.4 "><p id="EN-US_TOPIC_0000002052655454__p047233885313">N/A</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row154119910539"><td class="cellrowborder" valign="top" width="12.32%" headers="mcps1.3.8.2.2.5.1.1 "><p id="EN-US_TOPIC_0000002052655454__p12472338125313">Modifying statistics</p>
</td>
<td class="cellrowborder" valign="top" width="30.15%" headers="mcps1.3.8.2.2.5.1.2 "><p id="EN-US_TOPIC_0000002052655454__p4472638135315">Directly modifies statistics after manual calculation.</p>
</td>
<td class="cellrowborder" valign="top" width="42.53%" headers="mcps1.3.8.2.2.5.1.3 "><p id="EN-US_TOPIC_0000002052655454__p1147253895310">Used to maintain a low sampling ratio with manual calibration. Usage:</p>
<p id="EN-US_TOPIC_0000002052655454__p247343805312"><strong id="EN-US_TOPIC_0000002052655454__b141991911142117">select approx_count_distinct(col_name) from table_name;</strong></p>
<p id="EN-US_TOPIC_0000002052655454__p1347373816535"><strong id="EN-US_TOPIC_0000002052655454__b18765161516211">alter table table_name alter column col_name set (n_distinct=xxx);</strong></p>
</td>
<td class="cellrowborder" valign="top" width="15%" headers="mcps1.3.8.2.2.5.1.4 "><p id="EN-US_TOPIC_0000002052655454__p447320389537">N/A</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row11542695530"><td class="cellrowborder" valign="top" width="12.32%" headers="mcps1.3.8.2.2.5.1.1 "><p id="EN-US_TOPIC_0000002052655454__p16473173835311">Copying partition information</p>
</td>
<td class="cellrowborder" valign="top" width="30.15%" headers="mcps1.3.8.2.2.5.1.2 "><p id="EN-US_TOPIC_0000002052655454__p04739383531">Copies statistics from old partitions to new ones.</p>
</td>
<td class="cellrowborder" valign="top" width="42.53%" headers="mcps1.3.8.2.2.5.1.3 "><p id="EN-US_TOPIC_0000002052655454__p19473123865319">Used for partitioned tables with minimal data feature changes to reduce statistics collection overhead.</p>
</td>
<td class="cellrowborder" valign="top" width="15%" headers="mcps1.3.8.2.2.5.1.4 "><p id="EN-US_TOPIC_0000002052655454__p747333825318">N/A</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row25429913533"><td class="cellrowborder" valign="top" width="12.32%" headers="mcps1.3.8.2.2.5.1.1 "><p id="EN-US_TOPIC_0000002052655454__p947353810535">Statistical information inference</p>
</td>
<td class="cellrowborder" valign="top" width="30.15%" headers="mcps1.3.8.2.2.5.1.2 "><p id="EN-US_TOPIC_0000002052655454__p154731838135311">Automatically calculates more accurate statistics based on existing data.</p>
</td>
<td class="cellrowborder" valign="top" width="42.53%" headers="mcps1.3.8.2.2.5.1.3 "><p id="EN-US_TOPIC_0000002052655454__p84731238105319">Controlled by the GUC parameter <strong id="EN-US_TOPIC_0000002052655454__b97521250122216">enable_extrapolation_stats</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="15%" headers="mcps1.3.8.2.2.5.1.4 "><p id="EN-US_TOPIC_0000002052655454__p0473133895313">N/A</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row754310915532"><td class="cellrowborder" valign="top" width="12.32%" headers="mcps1.3.8.2.2.5.1.1 "><p id="EN-US_TOPIC_0000002052655454__p147343875313">Backing up and restoring statistics</p>
</td>
<td class="cellrowborder" valign="top" width="30.15%" headers="mcps1.3.8.2.2.5.1.2 "><p id="EN-US_TOPIC_0000002052655454__p247433825315">Backs up statistics to an SQL statement using the <strong id="EN-US_TOPIC_0000002052655454__b12641152232">EXPLAIN (STAT ON)</strong> command.</p>
</td>
<td class="cellrowborder" valign="top" width="42.53%" headers="mcps1.3.8.2.2.5.1.3 "><p id="EN-US_TOPIC_0000002052655454__p1347423815539">Used for scenario reproduction or statistics restoration.</p>
</td>
<td class="cellrowborder" valign="top" width="15%" headers="mcps1.3.8.2.2.5.1.4 "><p id="EN-US_TOPIC_0000002052655454__p9474113818539">Statistics are exported as SQL statements.</p>
</td>
</tr>
</tbody>
</table>
</div>
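<p>The manual sampling modes and the forcible sampling hint listed in Table 1 can be sketched as follows. The statements follow the syntax given in the table. <strong>tablename</strong> and <strong>t1</strong> are placeholders, and the <strong>SHOW</strong> statements are included only as one possible way to check the related GUC parameters in the current session.</p>
<div class="codecoloring" codetype="Sql"><div class="highlight"><pre>-- Normal mode: statistics are stored in system catalogs.
ANALYZE tablename;

-- Light mode: statistics are kept in memory and a weaker lock is taken.
ANALYZE (light) tablename;

-- Force mode: sample even when the statistics of the table are locked.
ANALYZE (force) tablename;

-- Forcible sampling for a single query through a hint (1 enables sampling).
SELECT /*+ lightanalyze (t1 1) */ * FROM t1;

-- Check the parameters that control dynamic and polling sampling.
SHOW autoanalyze;
SHOW autovacuum_analyze_scale_factor;
</pre></div></div>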
</div>
<div class="section" id="EN-US_TOPIC_0000002052655454__section323816025519"><h4 class="sectiontitle">Scenarios and Strategies</h4><p id="EN-US_TOPIC_0000002052655454__p132387085510">The table below outlines typical data processing scenarios and the corresponding strategies for collecting statistics.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="EN-US_TOPIC_0000002052655454__table471841785515" frame="border" border="1" rules="all"><caption><b>Table 2 </b>Statistics collection strategies</caption><thead align="left"><tr id="EN-US_TOPIC_0000002052655454__row1271871725516"><th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.9.3.2.4.1.1"><p id="EN-US_TOPIC_0000002052655454__p2268155813556"><strong id="EN-US_TOPIC_0000002052655454__b73221831153114">Scenario</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="30%" id="mcps1.3.9.3.2.4.1.2"><p id="EN-US_TOPIC_0000002052655454__p2268165812557"><strong id="EN-US_TOPIC_0000002052655454__b76321435113113">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.9.3.2.4.1.3"><p id="EN-US_TOPIC_0000002052655454__p1226885855515"><strong id="EN-US_TOPIC_0000002052655454__b126872401319">Strategy</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="EN-US_TOPIC_0000002052655454__row071941710555"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.9.3.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p6268358195516">Incremental stream processing</p>
</td>
<td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.9.3.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p126825812556">Data changes arrive as a continuous incremental stream, leaving no suitable window for running <strong id="EN-US_TOPIC_0000002052655454__b49555517151">ANALYZE</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.9.3.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p182683583553">Enable dynamic sampling to automatically collect and share statistics globally.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row167192172557"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.9.3.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p8269165811558">Online batch processing</p>
<p id="EN-US_TOPIC_0000002052655454__p6269115865516">(Data lake)</p>
</td>
<td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.9.3.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p162691158165519">Data processing and querying occur concurrently, requiring stable queries.</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.9.3.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p1726915815520">Enable dynamic sampling, or complete data processing and <strong id="EN-US_TOPIC_0000002052655454__b118201714165">ANALYZE</strong> within a single transaction:</p>
<pre class="screen" id="EN-US_TOPIC_0000002052655454__screen15941554185718">begin;
truncate table or partition;
copy/merge/insert overwrite
ANALYZE (light) tablename;
end;</pre>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row8719191745511"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.9.3.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p2269115825511">Partition parallel processing</p>
</td>
<td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.9.3.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p1626916589559">Concurrent data processing in different partitions</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.9.3.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p1027015587554">Enable dynamic or manual light sampling and collect statistics concurrently for the same table.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row472071715552"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.9.3.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p1270158115512">Flat-wide table scenario</p>
</td>
<td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.9.3.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p42701958155514">Wide table with over 100 columns</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.9.3.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p1056210131581">1. Enable automatic predicate management for dynamic sampling.</p>
<p id="EN-US_TOPIC_0000002052655454__p20128171612589">2. Collect statistics only on the first <em id="EN-US_TOPIC_0000002052655454__i168849529175">N</em> columns.</p>
<p id="EN-US_TOPIC_0000002052655454__p162706584558">3. Set column-level participation in sampling based on common query predicates.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row17201917135514"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.9.3.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p52709588554">Large table scenario</p>
</td>
<td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.9.3.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p8270358195513">Large data volume with changes not reaching the threshold</p>
<p id="EN-US_TOPIC_0000002052655454__p14270125810552">Variable statistics</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.9.3.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p9270558105510">Lower the threshold for triggering dynamic sampling.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row272031710557"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.9.3.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p9270158175512">Feature-sensitive scenario</p>
</td>
<td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.9.3.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p132716586555">Changeable data features causing unstable query plans, requiring forcible collection.</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.9.3.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p1327145817551">1. Lower the threshold for triggering dynamic sampling.</p>
<p id="EN-US_TOPIC_0000002052655454__p19271195875514">2. Use the <strong id="EN-US_TOPIC_0000002052655454__b1239985521910">HINT</strong> mode in SQL statements for light dynamic sampling.</p>
<p id="EN-US_TOPIC_0000002052655454__p8271158175516">3. Clear and freeze statistics, re-collecting them for each query without sharing.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row27211617195511"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.9.3.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p32711158165517">High-concurrency scenario</p>
</td>
<td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.9.3.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p10271155818555">More than 10 concurrent queries run on the same table, each triggering dynamic sampling and consuming resources.</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.9.3.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p19271125811556">1. Disable concurrent dynamic sampling so that the other queries use the existing, possibly outdated, statistics.</p>
<p id="EN-US_TOPIC_0000002052655454__p152711658195511">2. Generate the latest statistics before querying (under development).</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row69907272556"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.9.3.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p1227195811554">Streaming performance sensitivity</p>
</td>
<td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.9.3.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p18271135875511">Stream processing in which queries must respond within seconds or resource usage is high.</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.9.3.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p1027265885519">Disable dynamic sampling at the table or SQL level and use background polling sampling.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row6990727105512"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.9.3.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p82722586557">Batch performance sensitivity</p>
</td>
<td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.9.3.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p027235816555">Batch processing in which queries must respond within seconds or resource usage is high.</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.9.3.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p5272258155518">Manually collect statistics during processing.</p>
</td>
</tr>
</tbody>
</table>
</div>
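<p>As a hedged illustration of the online batch processing strategy in the table above, the transaction template can be filled in as follows. The table names <strong>sales_staging</strong> and <strong>sales_source</strong> are hypothetical, and the <strong>ANALYZE (light)</strong> form is copied from the template rather than verified against a specific cluster version.</p>
<pre class="screen">-- Sketch only: table names are hypothetical placeholders.
BEGIN;
-- Clear the target table before reloading it.
TRUNCATE TABLE sales_staging;
-- Load the new batch (COPY or MERGE could be used here instead).
INSERT INTO sales_staging SELECT * FROM sales_source;
-- Collect statistics in light mode inside the same transaction,
-- using the form shown in the template above.
ANALYZE (light) sales_staging;
END;</pre>
<p>The intent is that the reloaded data and its statistics are committed together, so concurrent queries do not plan against freshly loaded data with stale statistics.</p>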
</div>
<div class="section" id="EN-US_TOPIC_0000002052655454__section19161638105714"><h4 class="sectiontitle">Resource Consumption</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="EN-US_TOPIC_0000002052655454__table151332705815" frame="border" border="1" rules="all"><caption><b>Table 3 </b>Resource consumption</caption><thead align="left"><tr id="EN-US_TOPIC_0000002052655454__row013477115817"><th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.10.2.2.4.1.1"><p id="EN-US_TOPIC_0000002052655454__p1913419715814"><strong id="EN-US_TOPIC_0000002052655454__b4291192853213">Category</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="30%" id="mcps1.3.10.2.2.4.1.2"><p id="EN-US_TOPIC_0000002052655454__p4134207135815"><strong id="EN-US_TOPIC_0000002052655454__b64181232193210">Sub-Category</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.10.2.2.4.1.3"><p id="EN-US_TOPIC_0000002052655454__p613416712587"><strong id="EN-US_TOPIC_0000002052655454__b922933413321">Description</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="EN-US_TOPIC_0000002052655454__row5134578585"><td class="cellrowborder" rowspan="2" valign="top" width="20%" headers="mcps1.3.10.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p19171134275820">CPU</p>
<p id="EN-US_TOPIC_0000002052655454__p112016428588"></p>
</td>
<td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.10.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p1717144219586">Predicate column management</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.10.2.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p3171134211587">Automatically manage predicates and collect statistics only on queried columns.</p>
<p id="EN-US_TOPIC_0000002052655454__p8171124215811">Manually mask non-predicate columns.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row151345717582"><td class="cellrowborder" valign="top" headers="mcps1.3.10.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p917112426584">Ultra-long column statistics</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.10.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p1717124235813">For data types that can be truncated, only the first 1,024 characters are counted.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row191352716587"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.10.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p14171642135811">I/O</p>
</td>
<td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.10.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p817117422585">30,000 samples are collected by default.</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.10.2.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p1817274211585">I/O cost depends on the number of columns, partitions, and small CUs, not on table size.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row013515745818"><td class="cellrowborder" rowspan="4" valign="top" width="20%" headers="mcps1.3.10.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p317264215817">Memory</p>
<p id="EN-US_TOPIC_0000002052655454__p2199184211584"></p>
<p id="EN-US_TOPIC_0000002052655454__p519984218585"></p>
<p id="EN-US_TOPIC_0000002052655454__p19198144217587"></p>
</td>
<td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.10.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p1317224215581">Buffer usage</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.10.2.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p111721242135811">At most one slot in the <strong id="EN-US_TOPIC_0000002052655454__b21157454296">cstore</strong> buffer is occupied.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row12135376587"><td class="cellrowborder" valign="top" headers="mcps1.3.10.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p717264225819">Memory zero copy</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.10.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p1617214215819">Directly calculate statistics from buffer samples without organizing into tuples.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row2135167105812"><td class="cellrowborder" valign="top" headers="mcps1.3.10.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p11172154215819">Memory adaptation</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.10.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p10173124225810">Configure the system to use temporary tables for sampling when memory is insufficient. Use the <strong id="EN-US_TOPIC_0000002052655454__b1931743133017">analyze_stats_mode</strong> parameter to prevent queries from triggering temporary table creation.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row1713610765811"><td class="cellrowborder" valign="top" headers="mcps1.3.10.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p1717319423585">Memory size</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.10.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p201731342145811">Control maximum memory usage during <strong id="EN-US_TOPIC_0000002052655454__b8810956579">ANALYZE</strong> with the <strong id="EN-US_TOPIC_0000002052655454__b1578182045713">maintenance_work_mem</strong> parameter. If the limit is exceeded, data is written to disk or the number of samples is reduced.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row1327318176581"><td class="cellrowborder" rowspan="2" valign="top" width="20%" headers="mcps1.3.10.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p1517364214589">Lock</p>
<p id="EN-US_TOPIC_0000002052655454__p15197342195811"></p>
</td>
<td class="cellrowborder" valign="top" width="30%" headers="mcps1.3.10.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p17173142105817">Level-4 lock</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.10.2.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p1917318425589">(Normal mode) The lock is applied in distributed mode. It conflicts with <strong id="EN-US_TOPIC_0000002052655454__b14201173419">DDL</strong>, <strong id="EN-US_TOPIC_0000002052655454__b1444715311343">VACUUM</strong>, <strong id="EN-US_TOPIC_0000002052655454__b19535059349">ANALYZE</strong>, and <strong id="EN-US_TOPIC_0000002052655454__b17284178173410">REINDEX</strong>, but not with data insertion, deletion, or modification.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row1274717145814"><td class="cellrowborder" valign="top" headers="mcps1.3.10.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p10174194245810">Level-1 lock</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.10.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p1217444245811">(Light mode) Only a local level-1 lock is applied, which conflicts only with DDL statements.</p>
</td>
</tr>
</tbody>
</table>
</div>
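<p>The memory limit described in the table above can be adjusted before a manual collection. The following is a minimal sketch, assuming that <strong>maintenance_work_mem</strong> can be set at the session level in your cluster. The value <strong>512MB</strong> and the table name <strong>t_1</strong> are illustrative only.</p>
<pre class="screen">-- Check the current memory limit for maintenance operations.
SHOW maintenance_work_mem;
-- Raise the limit for this session only (illustrative value).
SET maintenance_work_mem = '512MB';
-- Collect statistics under the new limit.
ANALYZE t_1;
-- Restore the default so that later statements are unaffected.
RESET maintenance_work_mem;</pre>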
</div>
<div class="section" id="EN-US_TOPIC_0000002052655454__section1084446915"><h4 class="sectiontitle">Accuracy and Reliability</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="EN-US_TOPIC_0000002052655454__table8314329111" frame="border" border="1" rules="all"><caption><b>Table 4 </b>Accuracy/Reliability</caption><thead align="left"><tr id="EN-US_TOPIC_0000002052655454__row154143216111"><th align="left" class="cellrowborder" valign="top" width="11.86%" id="mcps1.3.11.2.2.4.1.1"><p id="EN-US_TOPIC_0000002052655454__p1517719227314"><strong id="EN-US_TOPIC_0000002052655454__b6238249153417">Accuracy/Reliability</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="11.95%" id="mcps1.3.11.2.2.4.1.2"><p id="EN-US_TOPIC_0000002052655454__p184163211119"><strong id="EN-US_TOPIC_0000002052655454__b14311195515339">Item</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="76.19%" id="mcps1.3.11.2.2.4.1.3"><p id="EN-US_TOPIC_0000002052655454__p1349329118"><strong id="EN-US_TOPIC_0000002052655454__b2938195115341">Description</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="EN-US_TOPIC_0000002052655454__row1141632419"><td class="cellrowborder" rowspan="7" valign="top" width="11.86%" headers="mcps1.3.11.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p9914111412220">Accuracy</p>
<p id="EN-US_TOPIC_0000002052655454__p1798771413217"></p>
<p id="EN-US_TOPIC_0000002052655454__p119855141320"></p>
<p id="EN-US_TOPIC_0000002052655454__p8983714324"></p>
<p id="EN-US_TOPIC_0000002052655454__p99817143212"></p>
<p id="EN-US_TOPIC_0000002052655454__p10979214723"></p>
<p id="EN-US_TOPIC_0000002052655454__p8978214827"></p>
</td>
<td class="cellrowborder" valign="top" width="11.95%" headers="mcps1.3.11.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p119145141220">Sampling size</p>
</td>
<td class="cellrowborder" valign="top" width="76.19%" headers="mcps1.3.11.2.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p199141143220">Configurable to adapt to table size with the <strong id="EN-US_TOPIC_0000002052655454__b721118544357">default_statistics_target</strong> parameter.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row6573212111"><td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p3914414520">Sampling randomness</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.2 "><ul id="EN-US_TOPIC_0000002052655454__ul339552101311"><li id="EN-US_TOPIC_0000002052655454__li73951021111320">Optimize reservoir and range sampling with the <strong id="EN-US_TOPIC_0000002052655454__b14662517163614">analyze_sample_mode</strong> parameter.</li><li id="EN-US_TOPIC_0000002052655454__li13395102161318">Enhance randomness of random number calculation with the <strong id="EN-US_TOPIC_0000002052655454__b37211355143614">random_function_version</strong> parameter.</li></ul>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row551132418"><td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p691491419220">Global sharing</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p291481411212">Statistics can be shared across sessions and nodes.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row175832219"><td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p29158149213">Modifying count broadcast</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p591512141425">A background thread checks and broadcasts the global modification count in polling mode.</p>
<p id="EN-US_TOPIC_0000002052655454__p169158145214">The job thread can also directly broadcast the modification count by specifying the <strong id="EN-US_TOPIC_0000002052655454__b167274468402">tuple_change_sync_threshold</strong> parameter.</p>
<p id="EN-US_TOPIC_0000002052655454__p1091541416218">Cross-CN modification and query have minimal impact. The modification count is broadcast and synchronized in asynchronous mode.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row5514321617"><td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p119152141124">Adjusting the CU sampling ratio</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p19154141321">Increase CU sampling ratio if the CU filling rate is low, using the <strong id="EN-US_TOPIC_0000002052655454__b1388515844113">cstore_cu_sample_ratio</strong> parameter.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row867322113"><td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p139151314726">Stabilizing distinct values</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p2091551420210">Use the <strong id="EN-US_TOPIC_0000002052655454__b841811426459">n_distinct</strong> parameter to stabilize distinct values after random sampling without increasing the sampling ratio.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row461632916"><td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p891517141029">Statistical information calculation</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p1915191419218">Use the <strong id="EN-US_TOPIC_0000002052655454__b204754944611">enable_extrapolation_stats</strong> parameter to extrapolate more accurate statistics from old statistics when estimates are distorted.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row9931104019114"><td class="cellrowborder" rowspan="3" valign="top" width="11.86%" headers="mcps1.3.11.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p291641419210">Reliability</p>
<p id="EN-US_TOPIC_0000002052655454__p19973614328"></p>
<p id="EN-US_TOPIC_0000002052655454__p997118141928"></p>
</td>
<td class="cellrowborder" valign="top" width="11.95%" headers="mcps1.3.11.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p189168141927">CN fault</p>
</td>
<td class="cellrowborder" valign="top" width="76.19%" headers="mcps1.3.11.2.2.4.1.3 "><p id="EN-US_TOPIC_0000002052655454__p11916121411213">Dynamic sampling is unaffected by faults on other CNs. Statistics are not synchronized in this case, but query quality on the current CN remains unaffected.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row169321340716"><td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p1991681411210">CN restoration</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p59163141023">Forcibly perform dynamic sampling and global synchronization during queries after CN recovery.</p>
</td>
</tr>
<tr id="EN-US_TOPIC_0000002052655454__row209324401310"><td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.1 "><p id="EN-US_TOPIC_0000002052655454__p1191619141221">DN fault</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.11.2.2.4.1.2 "><p id="EN-US_TOPIC_0000002052655454__p69164141129">Dynamic sampling of the logical cluster is unaffected by faults in other logical clusters.</p>
</td>
</tr>
</tbody>
</table>
</div>
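<p>Two of the accuracy controls in the table above can be sketched as follows. The table <strong>t_1</strong>, the column <strong>c1</strong>, and the value <strong>5000</strong> are hypothetical. Setting <strong>n_distinct</strong> through <strong>ALTER TABLE ... ALTER COLUMN ... SET</strong> follows PostgreSQL-style column option syntax and is an assumption here; check the syntax reference for your cluster version before using it.</p>
<pre class="screen">-- Sample 2% of the rows instead of the default 30,000 rows
-- (suggested earlier in this topic for tables over 1,600,000 rows).
SET default_statistics_target = -2;
ANALYZE t_1;

-- Assumption: PostgreSQL-style column option. Pin the number of
-- distinct values for c1 so that it stays stable across random samples.
ALTER TABLE t_1 ALTER COLUMN c1 SET (n_distinct = 5000);
-- Re-collect statistics so that the pinned value takes effect.
ANALYZE t_1;</pre>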
</div>
<div class="section" id="EN-US_TOPIC_0000002052655454__section147425401437"><h4 class="sectiontitle">O&amp;M Monitoring</h4><p id="EN-US_TOPIC_0000002052655454__p71939111243">GaussDB(DWS) shows the running mode and execution stage of <strong id="EN-US_TOPIC_0000002052655454__b1146163044911">ANALYZE</strong> by appending a comment to the <strong id="EN-US_TOPIC_0000002052655454__b1372693414915">ANALYZE</strong> command. This information is presented mainly in the following views:</p>
<ul id="EN-US_TOPIC_0000002052655454__ul128556476119"><li id="EN-US_TOPIC_0000002052655454__li18551947181117"><strong id="EN-US_TOPIC_0000002052655454__b184781156194911">query</strong> column in the <strong id="EN-US_TOPIC_0000002052655454__b14669507505">pgxc_stat_activity</strong> view</li><li id="EN-US_TOPIC_0000002052655454__li3855104771117"><strong id="EN-US_TOPIC_0000002052655454__b1753211835018">wait_status</strong> column in the <strong id="EN-US_TOPIC_0000002052655454__b753216186504">pgxc_thread_wait_status</strong> view</li></ul>
<p id="EN-US_TOPIC_0000002052655454__p2033419311233">The comment appended to the <strong id="EN-US_TOPIC_0000002052655454__b26711022526">ANALYZE</strong> command has the format <strong id="EN-US_TOPIC_0000002052655454__b333710845211">--Action-RunMode-StatsMode-SyncMode</strong>.</p>
<ul id="EN-US_TOPIC_0000002052655454__ul4535015181310"><li id="EN-US_TOPIC_0000002052655454__li453518158138">Values and meanings of <strong id="EN-US_TOPIC_0000002052655454__b5267610153717">Action</strong>:<pre class="screen" id="EN-US_TOPIC_0000002052655454__screen159216316126"> {"begin", "finished", "lock FirstCN", "estimate rows", "statistics", "sample rows", "calc stats"};</pre>
<p id="EN-US_TOPIC_0000002052655454__p71621438141319"><strong id="EN-US_TOPIC_0000002052655454__b158418217134">begin</strong>: indicates the start of the process; <strong id="EN-US_TOPIC_0000002052655454__b14910225181316">finished</strong>: indicates the end of the process; <strong id="EN-US_TOPIC_0000002052655454__b0439144131314">lock FirstCN</strong>: requests a lock from the FirstCN; <strong id="EN-US_TOPIC_0000002052655454__b5529516148">estimate rows</strong>: estimates the number of rows in the first phase; <strong id="EN-US_TOPIC_0000002052655454__b1397041821419">statistics</strong>: executes <strong id="EN-US_TOPIC_0000002052655454__b18471026161416">ANALYZE</strong> in the second phase; <strong id="EN-US_TOPIC_0000002052655454__b1058210376144">sample rows</strong>: collects samples in the second phase; <strong id="EN-US_TOPIC_0000002052655454__b1530824919147">calc stats</strong>: calculates statistics in the second phase.</p>
</li></ul>
<ul id="EN-US_TOPIC_0000002052655454__ul173561921181217"><li id="EN-US_TOPIC_0000002052655454__li1559116134141">Values and meanings of <strong id="EN-US_TOPIC_0000002052655454__b15477732143711">RunMode</strong>:<pre class="screen" id="EN-US_TOPIC_0000002052655454__screen9936171911147"> {"manual", "backend", "normal runtime", "light runtime", "light runtime inxact", "light estimate rows", "light manual"};</pre>
<p id="EN-US_TOPIC_0000002052655454__p13134711151517"><strong id="EN-US_TOPIC_0000002052655454__b95841859151516">manual</strong>: indicates the manual mode; <strong id="EN-US_TOPIC_0000002052655454__b1982891419166">backend</strong>: indicates the background polling mode; <strong id="EN-US_TOPIC_0000002052655454__b055953313164">normal runtime</strong>: indicates normal dynamic sampling; <strong id="EN-US_TOPIC_0000002052655454__b25562819179">light runtime</strong>: indicates light dynamic sampling; <strong id="EN-US_TOPIC_0000002052655454__b15759183520176">light runtime inxact</strong>: indicates light dynamic sampling in a transaction; <strong id="EN-US_TOPIC_0000002052655454__b914385720173">light estimate rows</strong>: indicates the light estimation function only; <strong id="EN-US_TOPIC_0000002052655454__b51302137183">light manual</strong>: indicates the manual light mode.</p>
</li><li id="EN-US_TOPIC_0000002052655454__li18442133611158">Values and meanings of <strong id="EN-US_TOPIC_0000002052655454__b20198239193719">StatsMode</strong>:<pre class="screen" id="EN-US_TOPIC_0000002052655454__screen3596114231510"> {"dynamic", "memory", "smptbl"};</pre>
<p id="EN-US_TOPIC_0000002052655454__p6193191119413"><strong id="EN-US_TOPIC_0000002052655454__b2475753161812">dynamic</strong>: adaptively stores samples in memory or in a temporary table; <strong id="EN-US_TOPIC_0000002052655454__b17427675194">memory</strong>: stores samples only in memory; <strong id="EN-US_TOPIC_0000002052655454__b109038345198">smptbl</strong>: stores samples only in a temporary table.</p>
</li><li id="EN-US_TOPIC_0000002052655454__li16541335121617">Values and meanings of <strong id="EN-US_TOPIC_0000002052655454__b191921250173719">SyncMode</strong>:<pre class="screen" id="EN-US_TOPIC_0000002052655454__screen19324104101620"> {"sync", "nosync"};</pre>
<p id="EN-US_TOPIC_0000002052655454__p8194171110411"><strong id="EN-US_TOPIC_0000002052655454__b161933553197">sync</strong>: Statistics are synchronized to all CNs; <strong id="EN-US_TOPIC_0000002052655454__b178417372019">nosync</strong>: Statistics are not synchronized.</p>
</li></ul>
<p id="EN-US_TOPIC_0000002052655454__p55301341446">Example:</p>
<pre class="screen" id="EN-US_TOPIC_0000002052655454__screen1388219541748">SELECT coorname,datid,datname,pid,usename,application_name,query_id,query
FROM pgxc_stat_activity WHERE query like '%analyze%' and query not like '%application_name%';
coorname | datid | datname | pid | usename | application_name | query_id | query
--------------+-------+----------+-----------------+-----------+------------------+-------------------+-----------------------------------------
coordinator1 | 15676 | postgres | 139919333779200 | test | gsql | 73183493944770822 | analyze t_1;
coordinator2 | 15676 | postgres | 140217336461056 | test | coordinator1 | 73183493944770822 | analyze public.t_1;--push stats-manual-memory-sync
coordinator3 | 15676 | postgres | 139944245847808 | test | coordinator1 | 73183493944770822 | analyze public.t_1;--push stats-manual-memory-sync
(3 rows)</pre>
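<p>The same job can also be followed through its wait status. This is a minimal sketch that reuses the <strong>query_id</strong> from the output above. Only the <strong>wait_status</strong> column is named in this topic; the <strong>node_name</strong> and <strong>query_id</strong> columns of <strong>pgxc_thread_wait_status</strong> are assumptions, so use <strong>SELECT *</strong> if they are not available in your version.</p>
<pre class="screen">-- Assumed columns: node_name, query_id. Documented column: wait_status.
SELECT node_name, query_id, wait_status
FROM pgxc_thread_wait_status
WHERE query_id = 73183493944770822;</pre>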
</div>
<div class="section" id="EN-US_TOPIC_0000002052655454__section1625208618"><h4 class="sectiontitle">Viewing Statistics</h4><ul id="EN-US_TOPIC_0000002052655454__ul7694103810718"><li id="EN-US_TOPIC_0000002052655454__li369412382711">Check the dynamically sampled memory statistics.<ul id="EN-US_TOPIC_0000002052655454__ul109371722121711"><li id="EN-US_TOPIC_0000002052655454__li1593722219175">Retrieve table-level memory statistics.<pre class="screen" id="EN-US_TOPIC_0000002052655454__screen1970611131774">SELECT * FROM pv_runtime_relstats; </pre>
</li><li id="EN-US_TOPIC_0000002052655454__li189371122191717">Retrieve column-level memory statistics.<pre class="screen" id="EN-US_TOPIC_0000002052655454__screen4261193017718">SELECT * FROM pv_runtime_attstats; </pre>
</li></ul>
</li><li id="EN-US_TOPIC_0000002052655454__li148715498714">Check the system catalog statistics.<ul id="EN-US_TOPIC_0000002052655454__ul869633241720"><li id="EN-US_TOPIC_0000002052655454__li1169611323174">Check the table-level system catalog statistics.<pre class="screen" id="EN-US_TOPIC_0000002052655454__screen66066101683">select relname, relpages, reltuples from pg_class; </pre>
</li><li id="EN-US_TOPIC_0000002052655454__li176971632141719">Check the column-level system catalog statistics.<pre class="screen" id="EN-US_TOPIC_0000002052655454__screen1468445013815">SELECT * FROM pg_stats; </pre>
</li></ul>
</li><li id="EN-US_TOPIC_0000002052655454__li11134130295">Check the time when statistics were last collected.<p id="EN-US_TOPIC_0000002052655454__p10366181214915"><a name="EN-US_TOPIC_0000002052655454__li11134130295"></a><a name="li11134130295"></a>Dynamic sampling stores statistics in memory without modifying the timestamp of the system catalog.</p>
<pre class="screen" id="EN-US_TOPIC_0000002052655454__screen125101033898">SELECT * FROM pg_object; </pre>
</li></ul>
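<p>On a cluster with many tables, the catalog queries above return a large result set. The following sketch narrows them to a single table. The schema <strong>public</strong> and the table <strong>t_1</strong> are illustrative, and the join on <strong>object_oid</strong> in the last query assumes that <strong>pg_object</strong> identifies objects by that column.</p>
<pre class="screen">-- Table-level statistics for one table.
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname = 't_1';

-- Column-level statistics for the same table.
SELECT attname, null_frac, n_distinct, most_common_vals
FROM pg_stats
WHERE schemaname = 'public' AND tablename = 't_1';

-- Timestamps recorded for the table (assumes pg_object.object_oid).
SELECT o.*
FROM pg_object o
JOIN pg_class c ON c.oid = o.object_oid
WHERE c.relname = 't_1';</pre>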
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dws_04_0430.html">SQL Tuning</a></div>
</div>
</div>