doc-exports/docs/dli/sqlreference/dli_08_0205.html
Su, Xiaomeng 76a5b1ee83 dli_sqlreference_20240227
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
Co-committed-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
2024-03-27 22:02:33 +00:00


<a name="dli_08_0205"></a>
<h1 class="topictitle1">Exporting Search Results</h1>
<div id="body8662426"><div class="section" id="dli_08_0205__en-us_topic_0156816309_en-us_topic_0093946741_s86887e0a86644c22b61a4c2ccc84025e"><h4 class="sectiontitle">Function</h4><p id="dli_08_0205__en-us_topic_0156816309_en-us_topic_0093946741_a9755a68b9d0d47668ac554151d03e0ce">This statement writes query results directly to a specified directory. The results can be stored in CSV, Parquet, ORC, JSON, or Avro format.</p>
</div>
<div class="section" id="dli_08_0205__en-us_topic_0156816309_en-us_topic_0093946741_sd6725bae42f0429e8bc6dfa2e92b9664"><h4 class="sectiontitle">Syntax</h4><div class="codecoloring" codetype="Sql" id="dli_08_0205__en-us_topic_0156816309_screen16901195231215"><div class="highlight"><table class="highlighttable"><tr><td class="code"><div><pre><span></span><span class="k">INSERT</span><span class="w"> </span><span class="n">OVERWRITE</span><span class="w"> </span><span class="n">DIRECTORY</span><span class="w"> </span><span class="n">path</span>
<span class="w"> </span><span class="k">USING</span><span class="w"> </span><span class="n">file_format</span>
<span class="w"> </span><span class="p">[</span><span class="k">OPTIONS</span><span class="p">(</span><span class="n">key1</span><span class="o">=</span><span class="n">value1</span><span class="p">)]</span>
<span class="w"> </span><span class="n">select_statement</span><span class="p">;</span>
</pre></div></td></tr></table></div>
</div>
</div>
<div class="section" id="dli_08_0205__en-us_topic_0156816309_section10966119185419"><h4 class="sectiontitle">Keywords</h4><ul id="dli_08_0205__en-us_topic_0156816309_ul4673101142316"><li id="dli_08_0205__en-us_topic_0156816309_li56737172317">USING: Specifies the storage format.</li><li id="dli_08_0205__en-us_topic_0156816309_li196747110233">OPTIONS: Specifies a list of key-value properties that control the export, for example, the delimiter used for CSV data. This keyword is optional.</li></ul>
</div>
<div class="section" id="dli_08_0205__en-us_topic_0156816309_section1251695502513"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="dli_08_0205__en-us_topic_0156816309_en-us_topic_0093946741_td8eb9ff5337945feb0b707d81a4acc90" frame="border" border="1" rules="all"><caption><b>Table 1 </b>INSERT OVERWRITE DIRECTORY parameters</caption><thead align="left"><tr id="dli_08_0205__en-us_topic_0156816309_en-us_topic_0093946741_ra973c320b2524745ab259dca3a46809f"><th align="left" class="cellrowborder" valign="top" width="25.2%" id="mcps1.3.4.2.2.3.1.1"><p id="dli_08_0205__en-us_topic_0156816309_en-us_topic_0093946741_a14c009dd6ca34d2caf68bc9c24fcf82b"><strong id="dli_08_0205__b3348538163219">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="74.8%" id="mcps1.3.4.2.2.3.1.2"><p id="dli_08_0205__en-us_topic_0156816309_en-us_topic_0093946741_abff9a2e7c9614f08973e2b8afb9d7abe"><strong id="dli_08_0205__en-us_topic_0093946771_en-us_topic_0053447306_en-us_topic_0039551470_b6335010717346">Description</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="dli_08_0205__en-us_topic_0156816309_en-us_topic_0093946741_r1a16c5faa82047138ab4fae99da6b80a"><td class="cellrowborder" valign="top" width="25.2%" headers="mcps1.3.4.2.2.3.1.1 "><p id="dli_08_0205__en-us_topic_0156816309_en-us_topic_0093946741_aa30066b85f034fd0b4f9e4edfde93b43">path</p>
</td>
<td class="cellrowborder" valign="top" width="74.8%" headers="mcps1.3.4.2.2.3.1.2 "><p id="dli_08_0205__en-us_topic_0156816309_en-us_topic_0093946741_aff3839c9a643497ab38dc650affa62de">The OBS path to which the query result is to be written.</p>
</td>
</tr>
<tr id="dli_08_0205__en-us_topic_0156816309_en-us_topic_0093946741_re196701bd81944f3b77ddae2d89f4878"><td class="cellrowborder" valign="top" width="25.2%" headers="mcps1.3.4.2.2.3.1.1 "><p id="dli_08_0205__en-us_topic_0156816309_en-us_topic_0093946741_a3f6ef3ea3f764b27a3832c7fbf84654f">file_format</p>
</td>
<td class="cellrowborder" valign="top" width="74.8%" headers="mcps1.3.4.2.2.3.1.2 "><p id="dli_08_0205__en-us_topic_0156816309_en-us_topic_0093946741_ae3c5c3ed67574e0482fa3c69c38ce5f8">Format of the file to be written. The value can be CSV, Parquet, ORC, JSON, or Avro.</p>
</td>
</tr>
</tbody>
</table>
</div>
<div class="note" id="dli_08_0205__en-us_topic_0156816309_note1052811920238"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="dli_08_0205__en-us_topic_0156816309_p8528131992315">If <strong id="dli_08_0205__b910915186257">file_format</strong> is set to <strong id="dli_08_0205__b199717469326">csv</strong>, see <a href="dli_08_0076.html#dli_08_0076__dli_08_0076_en-us_topic_0114776170_table1876517231928">Table 3</a> for the OPTIONS parameters.</p>
</div></div>
</div>
<div class="section" id="dli_08_0205__en-us_topic_0156816309_section256972552612"><h4 class="sectiontitle">Precautions</h4><ul id="dli_08_0205__ul1918510575263"><li id="dli_08_0205__li7185657142618">You can configure the <span class="parmname" id="dli_08_0205__parmname18233550121510"><b>spark.sql.shuffle.partitions</b></span> parameter to set the number of files written to the OBS bucket for a non-DLI table. To avoid data skew, you can also append <strong id="dli_08_0205__b5233150181511">distribute by rand()</strong> to the INSERT statement, which increases the number of concurrent jobs. In the following example, N is a placeholder for a positive integer:<pre class="screen" id="dli_08_0205__screen14809114794213">insert into table table_target select * from table_source distribute by cast(rand() * N as int);</pre>
</li><li id="dli_08_0205__en-us_topic_0114776194_en-us_topic_0093946741_li1283316012322">You can use the <strong id="dli_08_0205__b274642313514">OPTIONS('DELIMITER'=',')</strong> configuration item to specify a delimiter. The default value is <span class="parmvalue" id="dli_08_0205__en-us_topic_0114776194_en-us_topic_0093946741_parmvalue3976890011539"><b>,</b></span>.<p id="dli_08_0205__p15161732113516">For CSV data, the following delimiters are supported:</p>
<ul id="dli_08_0205__ul55224545357"><li id="dli_08_0205__li226614819353">Tab character, for example, <strong id="dli_08_0205__b86713400353">'DELIMITER'='\t'</strong>.</li><li id="dli_08_0205__li6833163410550">Any binary character, for example, <strong id="dli_08_0205__b1593944223513">'DELIMITER'='\u0001(^A)'</strong>.</li><li id="dli_08_0205__li15171039125620">Single quotation mark ('). A single quotation mark must be enclosed in double quotation marks (" "). For example, <strong id="dli_08_0205__b4581185483510">'DELIMITER'= "'"</strong>.</li><li id="dli_08_0205__li725165103713"><strong id="dli_08_0205__b5604582355">\001(^A)</strong> and <strong id="dli_08_0205__b146614585356">\017(^Q)</strong> are also supported, for example, <strong id="dli_08_0205__b76665819358">'DELIMITER'='\001(^A)'</strong> and <strong id="dli_08_0205__b146685823514">'DELIMITER'='\017(^Q)'</strong>.</li></ul>
</li></ul>
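<p>The two precautions above can be combined in a single export. The following is a minimal sketch, not a verified DLI job. It reuses the <strong>obs://bucket/dir</strong> path and the <strong>db1.tb1</strong> table from the example in this topic, picks the tab character as the delimiter, and uses an arbitrary value of 10 for N. It also assumes that the DISTRIBUTE BY clause is accepted as part of <strong>select_statement</strong>.</p>
<pre class="screen">-- Sketch only: the path, table name, and the value 10 are illustrative assumptions.
-- Export db1.tb1 as tab-separated CSV and spread the rows over 10 random groups
-- (cast(rand() * 10 as int) yields integers from 0 to 9) to reduce data skew.
INSERT OVERWRITE DIRECTORY 'obs://bucket/dir'
  USING csv
  OPTIONS('DELIMITER'='\t')
  SELECT * FROM db1.tb1
  DISTRIBUTE BY cast(rand() * 10 as int);</pre>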
</div>
<div class="section" id="dli_08_0205__en-us_topic_0156816309_section53871537273"><h4 class="sectiontitle">Example</h4><div class="codecoloring" codetype="Sql" id="dli_08_0205__en-us_topic_0156816309_screen1166013811137"><div class="highlight"><table class="highlighttable"><tr><td class="code"><div><pre><span></span><span class="k">INSERT</span><span class="w"> </span><span class="n">OVERWRITE</span><span class="w"> </span><span class="n">DIRECTORY</span><span class="w"> </span><span class="s1">'obs://bucket/dir'</span>
<span class="w"> </span><span class="k">USING</span><span class="w"> </span><span class="n">csv</span>
<span class="w"> </span><span class="k">OPTIONS</span><span class="p">(</span><span class="s1">'DELIMITER'</span><span class="o">=</span><span class="s1">','</span><span class="p">)</span>
<span class="w"> </span><span class="k">select</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">from</span><span class="w"> </span><span class="n">db1</span><span class="p">.</span><span class="n">tb1</span><span class="p">;</span>
</pre></div></td></tr></table></div>
</div>
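<p>Because OPTIONS is optional in the syntax, an export to a format other than CSV can leave it out. The following sketch writes the same table in Parquet format. The <strong>obs://bucket/dir_parquet</strong> path is a hypothetical name used only for illustration.</p>
<pre class="screen">-- Sketch only: obs://bucket/dir_parquet is an assumed, illustrative path.
-- No OPTIONS clause is given because the clause is optional.
INSERT OVERWRITE DIRECTORY 'obs://bucket/dir_parquet'
  USING parquet
  SELECT * FROM db1.tb1;</pre>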
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dli_08_0221.html">Spark SQL Syntax Reference</a></div>
</div>
</div>