Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-committed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
<a name="dli_05_0044"></a><a name="dli_05_0044"></a>
<h1 class="topictitle1">Using Spark SQL Jobs to Analyze OBS Data</h1>
<div id="body8662426"><p id="dli_05_0044__en-us_topic_0000001242084772_p8060118">DLI allows you to use data stored on OBS. You can create OBS tables on DLI to access and process data in your OBS bucket.</p>
<p id="dli_05_0044__en-us_topic_0000001242084772_p1917715241913">This section describes how to create an OBS table on DLI, import data to the table, and insert and query table data.</p>
<div class="section" id="dli_05_0044__en-us_topic_0000001242084772_section17626143841511"><h4 class="sectiontitle">Prerequisites</h4><ul id="dli_05_0044__en-us_topic_0000001242084772_ul2431135921611"><li id="dli_05_0044__en-us_topic_0000001242084772_li10431105991616">You have created an OBS bucket. For more information about OBS, see the Object Storage Service Console Operation Guide. In this example, the OBS bucket name is <strong id="dli_05_0044__en-us_topic_0000001242084772_b887165410597">dli-test-021</strong>.</li><li id="dli_05_0044__en-us_topic_0000001242084772_li297254531810">You have created a DLI SQL queue. For details, see Creating a Queue.<p id="dli_05_0044__en-us_topic_0000001242084772_p168611617209"><a name="dli_05_0044__en-us_topic_0000001242084772_li297254531810"></a><a name="en-us_topic_0000001242084772_li297254531810"></a><strong id="dli_05_0044__en-us_topic_0000001242084772_b143271717171718">Note</strong>: When you create the DLI queue, set <strong id="dli_05_0044__en-us_topic_0000001242084772_b1859714285184">Type</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b3551644181810">For SQL</strong>.</p>
</li></ul>
</div>
<div class="section" id="dli_05_0044__en-us_topic_0000001242084772_section1465612208496"><h4 class="sectiontitle">Preparations</h4><p id="dli_05_0044__en-us_topic_0000001242084772_p20595135413280"><strong id="dli_05_0044__en-us_topic_0000001242084772_b999316466180">Creating a Database on DLI</strong></p>
<ol id="dli_05_0044__en-us_topic_0000001242084772_ol1736912722914"><li id="dli_05_0044__en-us_topic_0000001242084772_li15369137152913">Log in to the DLI management console and click <strong id="dli_05_0044__en-us_topic_0000001242084772_b157391841151918">SQL Editor</strong>. On the displayed page, set <strong id="dli_05_0044__en-us_topic_0000001242084772_b5741134116193">Engine</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b3742134151920">spark</strong> and <strong id="dli_05_0044__en-us_topic_0000001242084772_b074417418199">Queue</strong> to the created SQL queue.</li><li id="dli_05_0044__en-us_topic_0000001242084772_li125211912297">Enter the following statement in the SQL editing window to create the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1426312312200">testdb</strong> database. <pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen21331426135018">create database testdb;</pre>
</li></ol>
<p id="dli_05_0044__en-us_topic_0000001242084772_p134636210503">All subsequent operations in this section must be performed in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1148717186216">testdb</strong> database.</p>
</div>
<div class="section" id="dli_05_0044__en-us_topic_0000001242084772_section1229781333811"><h4 class="sectiontitle">DataSource and Hive Syntax for Creating an OBS Table on DLI</h4><p id="dli_05_0044__en-us_topic_0000001242084772_p3750174410351">The DataSource syntax and the Hive syntax differ mainly in the table data storage formats they support and in the maximum number of partitions allowed in a table. <a href="#dli_05_0044__en-us_topic_0000001242084772_table8559753103819">Table 1</a> lists the key differences between the two syntaxes when creating OBS tables.</p>
<div class="tablenoborder"><a name="dli_05_0044__en-us_topic_0000001242084772_table8559753103819"></a><a name="en-us_topic_0000001242084772_table8559753103819"></a><table cellpadding="4" cellspacing="0" summary="" id="dli_05_0044__en-us_topic_0000001242084772_table8559753103819" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Syntax differences</caption><thead align="left"><tr id="dli_05_0044__en-us_topic_0000001242084772_row12559195316381"><th align="left" class="cellrowborder" valign="top" width="9.42%" id="mcps1.3.5.3.2.5.1.1"><p id="dli_05_0044__en-us_topic_0000001242084772_p145597533382">Syntax</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="29.45%" id="mcps1.3.5.3.2.5.1.2"><p id="dli_05_0044__en-us_topic_0000001242084772_p4559135310389">Supported Data Formats</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="36.13%" id="mcps1.3.5.3.2.5.1.3"><p id="dli_05_0044__en-us_topic_0000001242084772_p255955313814">Partitioning</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.5.3.2.5.1.4"><p id="dli_05_0044__en-us_topic_0000001242084772_p18559175318385">Number of Partitions</p>
</th>
</tr>
</thead>
<tbody><tr id="dli_05_0044__en-us_topic_0000001242084772_row5559153103810"><td class="cellrowborder" valign="top" width="9.42%" headers="mcps1.3.5.3.2.5.1.1 "><p id="dli_05_0044__en-us_topic_0000001242084772_p755920532387">DataSource</p>
</td>
<td class="cellrowborder" valign="top" width="29.45%" headers="mcps1.3.5.3.2.5.1.2 "><p id="dli_05_0044__en-us_topic_0000001242084772_p45591853183817">ORC, PARQUET, JSON, CSV, and AVRO</p>
</td>
<td class="cellrowborder" valign="top" width="36.13%" headers="mcps1.3.5.3.2.5.1.3 "><p id="dli_05_0044__en-us_topic_0000001242084772_p1559453143815">Specify the partitioning column both in the column list of the CREATE TABLE statement and in the PARTITIONED BY clause. For details, see <a href="#dli_05_0044__en-us_topic_0000001242084772_li1473310571410">Creating a Single-Partition OBS Table Using DataSource Syntax</a>.</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.2.5.1.4 "><p id="dli_05_0044__en-us_topic_0000001242084772_p1729964394818">A maximum of 7,000 partitions can be created in a single table.</p>
</td>
</tr>
<tr id="dli_05_0044__en-us_topic_0000001242084772_row115591053103817"><td class="cellrowborder" valign="top" width="9.42%" headers="mcps1.3.5.3.2.5.1.1 "><p id="dli_05_0044__en-us_topic_0000001242084772_p355918534383">Hive</p>
</td>
<td class="cellrowborder" valign="top" width="29.45%" headers="mcps1.3.5.3.2.5.1.2 "><p id="dli_05_0044__en-us_topic_0000001242084772_p4559115363812">TEXTFILE, AVRO, ORC, SEQUENCEFILE, RCFILE, and PARQUET</p>
</td>
<td class="cellrowborder" valign="top" width="36.13%" headers="mcps1.3.5.3.2.5.1.3 "><p id="dli_05_0044__en-us_topic_0000001242084772_p10559165343810">Do not specify the partitioning column in the column list of the CREATE TABLE statement. Instead, specify its name and data type in the PARTITIONED BY clause. For details, see <a href="#dli_05_0044__en-us_topic_0000001242084772_li68154254017">Creating an OBS Table Using Hive Syntax</a>.</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.2.5.1.4 "><p id="dli_05_0044__en-us_topic_0000001242084772_p12489103354810">A maximum of 100,000 partitions can be created in a single table.</p>
</td>
</tr>
</tbody>
</table>
</div>
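<p>As a minimal sketch of the partitioning difference summarized in Table 1, the two statements below create comparable tables partitioned on a <strong>classNo</strong> column. The table names <strong>example_datasource</strong> and <strong>example_hive</strong> and the OBS directories <strong>obs://dli-test-021/example1</strong> and <strong>obs://dli-test-021/example2</strong> are hypothetical and used only for illustration.</p>
<pre class="screen">-- DataSource syntax: classNo appears in the column list and again in PARTITIONED BY.
CREATE TABLE example_datasource (name STRING, score DOUBLE, classNo INT) USING csv OPTIONS (path "obs://dli-test-021/example1") PARTITIONED BY (classNo);

-- Hive syntax: classNo is declared, with its data type, only in PARTITIONED BY.
CREATE TABLE example_hive (name STRING, score DOUBLE) PARTITIONED BY (classNo INT) STORED AS TEXTFILE LOCATION 'obs://dli-test-021/example2' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';</pre>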
</div>
<div class="section" id="dli_05_0044__en-us_topic_0000001242084772_section56331851172019"><h4 class="sectiontitle">Creating an OBS Table Using the DataSource Syntax</h4><p id="dli_05_0044__en-us_topic_0000001242084772_p16954917537">The following describes how to create an OBS table for CSV files. The methods of creating OBS tables for other file formats are similar.</p>
<ul id="dli_05_0044__en-us_topic_0000001242084772_ul46011549274"><li id="dli_05_0044__en-us_topic_0000001242084772_li7601104172715">Create a non-partitioned OBS table.<ul id="dli_05_0044__en-us_topic_0000001242084772_ul249692943119"><li id="dli_05_0044__en-us_topic_0000001242084772_li7645127163112">Specify an OBS file and create an OBS table for the CSV data.<ol id="dli_05_0044__en-us_topic_0000001242084772_ol166911614105111"><li id="dli_05_0044__en-us_topic_0000001242084772_li395033925113">Create the <strong id="dli_05_0044__en-us_topic_0000001242084772_b5538116152214">test.csv</strong> file containing the following content and upload the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1189103620226">test.csv</strong> file to the <strong id="dli_05_0044__en-us_topic_0000001242084772_b319017406220">root</strong> directory of OBS bucket <strong id="dli_05_0044__en-us_topic_0000001242084772_b92531444132216">dli-test-021</strong>:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen57031715135219">Jordon,88,23
Kim,87,25
Henry,76,26</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li7851224135213">Log in to the DLI management console and choose <strong id="dli_05_0044__en-us_topic_0000001242084772_b17218141012238">SQL Editor</strong> from the navigation pane on the left. In the SQL editing window, set <strong id="dli_05_0044__en-us_topic_0000001242084772_b4523841192313">Engine</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b8996104217236">spark</strong>, <strong id="dli_05_0044__en-us_topic_0000001242084772_b18917744172313">Queue</strong> to the SQL queue you have created, and <strong id="dli_05_0044__en-us_topic_0000001242084772_b1225912588236">Database</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b4541554249">testdb</strong>. Run the following statement to create an OBS table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen1735111212713">CREATE TABLE testcsvdatasource (name STRING, score DOUBLE, classNo INT
) USING csv OPTIONS (path "obs://dli-test-021/test.csv");</pre>
<div class="caution" id="dli_05_0044__en-us_topic_0000001242084772_note2111509374"><span class="cautiontitle"><img src="public_sys-resources/caution_3.0-en-us.png"> </span><div class="cautionbody"><p id="dli_05_0044__en-us_topic_0000001242084772_p12117509374">If you create an OBS table from a specified file, you cannot insert data into the table using DLI. The table data stays synchronized with the content of the OBS file.</p>
</div></div>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li1623173413522">Run the following statement to query data in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b17863218182617">testcsvdatasource</strong> table.<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen73517128277">select * from testcsvdatasource;</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li2060884515527">Open the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1698932372717">test.csv</strong> file on the local PC, add <strong id="dli_05_0044__en-us_topic_0000001242084772_b48913618278">Aarn,98,20</strong> to the file, and replace the original <strong id="dli_05_0044__en-us_topic_0000001242084772_b12989184917275">test.csv</strong> file in the OBS bucket.<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen973254554019">Jordon,88,23
Kim,87,25
Henry,76,26
<strong id="dli_05_0044__en-us_topic_0000001242084772_b17306167202520">Aarn,98,20</strong></pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li1069264875213">In the DLI <strong id="dli_05_0044__en-us_topic_0000001242084772_b124788343281">SQL Editor</strong>, query the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1655704615283">testcsvdatasource</strong> table again. The result now includes the new record <strong id="dli_05_0044__en-us_topic_0000001242084772_b782195220286">Aarn,98,20</strong>.<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen107150012015">select * from testcsvdatasource;</pre>
</li></ol>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li791925710363">Specify an OBS directory and create an OBS table for CSV data.<ul id="dli_05_0044__en-us_topic_0000001242084772_ul74489193616"><li id="dli_05_0044__en-us_topic_0000001242084772_li15426755367">The specified OBS data directory does not contain files you want to import to the table.<ol id="dli_05_0044__en-us_topic_0000001242084772_ol3962111835418"><li id="dli_05_0044__en-us_topic_0000001242084772_li1373662310432">Create the file directory <strong id="dli_05_0044__en-us_topic_0000001242084772_b102891043143013">data</strong> in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1370544613020">root</strong> directory of the OBS bucket <strong id="dli_05_0044__en-us_topic_0000001242084772_b352975017306">dli-test-021</strong>.</li><li id="dli_05_0044__en-us_topic_0000001242084772_li89621418195412">Log in to the DLI management console and click <strong id="dli_05_0044__en-us_topic_0000001242084772_b88485323011">SQL Editor</strong>. On the displayed page, set <strong id="dli_05_0044__en-us_topic_0000001242084772_b178555313308">Engine</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b7861753123010">spark</strong>, <strong id="dli_05_0044__en-us_topic_0000001242084772_b118735383016">Queue</strong> to the created SQL queue, and <strong id="dli_05_0044__en-us_topic_0000001242084772_b167941228103110">Database</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b1337062618319">testdb</strong>. Run the following statement to create OBS table <strong id="dli_05_0044__en-us_topic_0000001242084772_b18399143553111">testcsvdata2source</strong> in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b98071458153119">testdb</strong> database on DLI:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen29693452432">CREATE TABLE testcsvdata2source (name STRING, score DOUBLE, classNo INT) USING csv OPTIONS (path "obs://dli-test-021/data");</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li1231716212540">Run the following statement to insert data into the table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen19566163124512">insert into testcsvdata2source VALUES('Aarn', 98, 20);</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li171231325185414">Run the following statement to query data in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b5527441143210">testcsvdata2source</strong> table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen881417122018">select * from testcsvdata2source;</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li748173310546">Refresh the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1140043103410">obs://dli-test-021/data</strong> directory of the OBS bucket. A CSV data file has been generated in the directory, and it contains the inserted data.</li></ol>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li11855113918377">The specified OBS data directory contains files you want to import to the table.<ol id="dli_05_0044__en-us_topic_0000001242084772_ol10125175414542"><li id="dli_05_0044__en-us_topic_0000001242084772_li18125154155414">Create file directory <strong id="dli_05_0044__en-us_topic_0000001242084772_b3681389351">data2</strong> in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b156823833519">root</strong> directory of the OBS bucket <strong id="dli_05_0044__en-us_topic_0000001242084772_b1369838103514">dli-test-021</strong>. Create the <strong id="dli_05_0044__en-us_topic_0000001242084772_b16831182220366">test.csv</strong> file with the following content and upload the file to the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1997145510373">obs://dli-test-021/data2</strong> directory:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen119122311446">Jordon,88,23
Kim,87,25
Henry,76,26</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li315975675413">Log in to the DLI management console and click <strong id="dli_05_0044__en-us_topic_0000001242084772_b1436416073818">SQL Editor</strong>. On the displayed page, set <strong id="dli_05_0044__en-us_topic_0000001242084772_b12365170143816">Engine</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b1636617073815">spark</strong>, <strong id="dli_05_0044__en-us_topic_0000001242084772_b153678012389">Queue</strong> to the created SQL queue, and <strong id="dli_05_0044__en-us_topic_0000001242084772_b136818023811">Database</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b11369170163815">testdb</strong>. Run the following statement to create OBS table <strong id="dli_05_0044__en-us_topic_0000001242084772_b14838123173820">testcsvdata3source</strong> in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1383915353816">testdb</strong> database on DLI:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen476416107432">CREATE TABLE testcsvdata3source (name STRING, score DOUBLE, classNo INT) USING csv OPTIONS (path "obs://dli-test-021/data2");</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li5271759195416">Run the following statement to insert data into the table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen157641010164313">insert into testcsvdata3source VALUES('Aarn', 98, 20);</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li142634512555">Run the following statement to query data in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b270052711384">testcsvdata3source</strong> table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen1820417398476">select * from testcsvdata3source;</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li12711913115511">Refresh the <strong id="dli_05_0044__en-us_topic_0000001242084772_b3752105210381">obs://dli-test-021/data2</strong> directory of the OBS bucket. A new CSV data file has been generated in the directory, and it contains the inserted data.</li></ol>
</li></ul>
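<p>Once the data is in place, the table can be analyzed like any other DLI table. The following sketch, assuming the <strong>testcsvdata3source</strong> table created above, counts the records and computes the average score for each class. The column aliases <strong>record_count</strong> and <strong>avg_score</strong> are illustrative.</p>
<pre class="screen">-- Aggregate the scores stored under obs://dli-test-021/data2 by class.
select classNo, count(*) as record_count, avg(score) as avg_score
from testcsvdata3source
group by classNo
order by classNo;</pre>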
</li></ul>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li171013127271">Create an OBS partitioned table<ul id="dli_05_0044__en-us_topic_0000001242084772_ul157331757540"><li id="dli_05_0044__en-us_topic_0000001242084772_li1473310571410"><a name="dli_05_0044__en-us_topic_0000001242084772_li1473310571410"></a><a name="en-us_topic_0000001242084772_li1473310571410"></a>Create a single-partition OBS table<ol id="dli_05_0044__en-us_topic_0000001242084772_ol1873275718415"><li id="dli_05_0044__en-us_topic_0000001242084772_li2031575317434">Create file directory <strong id="dli_05_0044__en-us_topic_0000001242084772_b1241152917398">data3</strong> in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1941172993919">root</strong> directory of the OBS bucket <strong id="dli_05_0044__en-us_topic_0000001242084772_b10412829113913">dli-test-021</strong>.</li><li id="dli_05_0044__en-us_topic_0000001242084772_li10732125710419">Log in to the DLI management console and click <strong id="dli_05_0044__en-us_topic_0000001242084772_b6464111214401">SQL Editor</strong>. On the displayed page, set <strong id="dli_05_0044__en-us_topic_0000001242084772_b8465141224016">Engine</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b104666126406">spark</strong>, <strong id="dli_05_0044__en-us_topic_0000001242084772_b1146721224020">Queue</strong> to the created SQL queue, and <strong id="dli_05_0044__en-us_topic_0000001242084772_b18468012144014">Database</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b9470412114013">testdb</strong>. 
Run the following statement to create OBS table <strong id="dli_05_0044__en-us_topic_0000001242084772_b12467654118">testcsvdata4source</strong> using data in the specified OBS directory <strong id="dli_05_0044__en-us_topic_0000001242084772_b876014116415">obs://dli-test-021/data3</strong> and partition the table on the <strong id="dli_05_0044__en-us_topic_0000001242084772_b6981122064117">classNo</strong> column.<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen177327571245">CREATE TABLE testcsvdata4source (name STRING, score DOUBLE, classNo INT) USING csv OPTIONS (path "obs://dli-test-021/data3") PARTITIONED BY (classNo);</pre>
</li></ol><ol start="3" id="dli_05_0044__en-us_topic_0000001242084772_ol373375716413"><li id="dli_05_0044__en-us_topic_0000001242084772_li47323573412">Create the <strong id="dli_05_0044__en-us_topic_0000001242084772_b186143584412">classNo=25</strong> directory in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b16726253134419">obs://dli-test-021/data3</strong> directory of the OBS bucket. Create the <strong id="dli_05_0044__en-us_topic_0000001242084772_b84091433144519">test.csv</strong> file based on the following file content and upload the file to the <strong id="dli_05_0044__en-us_topic_0000001242084772_b6856942134510">obs://dli-test-021/data3/classNo=25</strong> directory of the OBS bucket.<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen16732157147">Jordon,88,25
Kim,87,25
Henry,76,25</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li37321957643">Run the following statement in the SQL editor to add the partition data to OBS table <strong id="dli_05_0044__en-us_topic_0000001242084772_b7393268467">testcsvdata4source</strong>:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen14732957942">ALTER TABLE
testcsvdata4source
ADD
PARTITION (classNo = 25) LOCATION 'obs://dli-test-021/data3/classNo=25';</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li12733175714418">Run the following statement to query data in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b114575159388">classNo=25</strong> partition of the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1238611221385">testcsvdata4source</strong> table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen67330571046">select * from testcsvdata4source where classNo = 25;</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li127331857746">Run the following statements to insert data into the <strong id="dli_05_0044__en-us_topic_0000001242084772_b111811173911">testcsvdata4source</strong> table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen19733857840">insert into testcsvdata4source VALUES('Aarn', 98, 25);
insert into testcsvdata4source VALUES('Adam', 68, 24);</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li13733115712412">Run the following statements to query data in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b167298433916">classNo=25</strong> and <strong id="dli_05_0044__en-us_topic_0000001242084772_b1190210272154">classNo=24</strong> partitions of the <strong id="dli_05_0044__en-us_topic_0000001242084772_b207308410397">testcsvdata4source</strong> table:<div class="caution" id="dli_05_0044__en-us_topic_0000001242084772_note473310573413"><span class="cautiontitle"><img src="public_sys-resources/caution_3.0-en-us.png"> </span><div class="cautionbody"><p id="dli_05_0044__en-us_topic_0000001242084772_p273317571442">When you query a partitioned table, the WHERE clause must contain a condition on at least one partition column. Otherwise, the query fails and the error "DLI.0005: There should be at least one partition pruning predicate on partitioned table" is reported.</p>
</div></div>
<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen1073335713418">select * from testcsvdata4source where classNo = 25;</pre>
<div class="p" id="dli_05_0044__en-us_topic_0000001242084772_p573345713410"><pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen1773319571346">select * from testcsvdata4source where classNo = 24;</pre>
</div>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li4733175714414">In the <strong id="dli_05_0044__en-us_topic_0000001242084772_b13741845124920">obs://dli-test-021/data3</strong> directory of the OBS bucket, click the refresh button. Partition files are generated in the directory for storing the newly inserted table data.<p id="dli_05_0044__en-us_topic_0000001242084772_p41097386"></p>
</li></ol>
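<p>To check which partitions the table currently has, you can list them. This is a minimal sketch that assumes the standard Spark SQL <strong>SHOW PARTITIONS</strong> statement is available on your queue. After the steps above, the <strong>classNo=25</strong> and <strong>classNo=24</strong> partitions would be expected in the result.</p>
<pre class="screen">-- List the partitions registered for the table.
SHOW PARTITIONS testcsvdata4source;</pre>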
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li973311571749">Create an OBS table partitioned on multiple columns.<ol id="dli_05_0044__en-us_topic_0000001242084772_ol473345711411"><li id="dli_05_0044__en-us_topic_0000001242084772_li636612464417">Create file directory <strong id="dli_05_0044__en-us_topic_0000001242084772_b7698619101514">data4</strong> in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b176991719171516">root</strong> directory of the OBS bucket <strong id="dli_05_0044__en-us_topic_0000001242084772_b5699171961514">dli-test-021</strong>.</li><li id="dli_05_0044__en-us_topic_0000001242084772_li1873365716410">Log in to the DLI management console and click <strong id="dli_05_0044__en-us_topic_0000001242084772_b2085011224216">SQL Editor</strong>. On the displayed page, set <strong id="dli_05_0044__en-us_topic_0000001242084772_b48511228219">Engine</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b085117225215">spark</strong>, <strong id="dli_05_0044__en-us_topic_0000001242084772_b1485152212219">Queue</strong> to the created SQL queue, and <strong id="dli_05_0044__en-us_topic_0000001242084772_b7852112213210">Database</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b10852102292114">testdb</strong>. 
Run the following statement to create OBS table <strong id="dli_05_0044__en-us_topic_0000001242084772_b18933172416217">testcsvdata5source</strong> using data in the specified OBS directory <strong id="dli_05_0044__en-us_topic_0000001242084772_b9933122416211">obs://dli-test-021/data4</strong> and partition the table on <strong id="dli_05_0044__en-us_topic_0000001242084772_b79331024132112">classNo</strong> and <strong id="dli_05_0044__en-us_topic_0000001242084772_b141482222218">dt</strong> columns.<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen1873375711415">CREATE TABLE testcsvdata5source (name STRING, score DOUBLE, classNo INT, dt varchar(16)) USING csv OPTIONS (path "obs://dli-test-021/data4") PARTITIONED BY (classNo,dt);</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li07331857643">Run the following statements to insert data into the <strong id="dli_05_0044__en-us_topic_0000001242084772_b12112152482312">testcsvdata5source</strong> table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen1973317575419">insert into testcsvdata5source VALUES('Aarn', 98, 25, '2021-07-27');
insert into testcsvdata5source VALUES('Adam', 68, 25, '2021-07-28');</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li2733057749">Run the following statement to query data in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b010625114234">classNo</strong> partition of the <strong id="dli_05_0044__en-us_topic_0000001242084772_b14736347102319">testcsvdata5source</strong> table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen1673319571841">select * from testcsvdata5source where classNo = 25;</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li187331657942">Run the following statement to query data in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b11637174132412">dt</strong> partition of the <strong id="dli_05_0044__en-us_topic_0000001242084772_b20644144119243">testcsvdata5source</strong> table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen117334571841">select * from testcsvdata5source where dt like '2021-07%';</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li373315573413">Refresh the <strong id="dli_05_0044__en-us_topic_0000001242084772_b144681511132517">obs://dli-test-021/data4</strong> directory of the OBS bucket. The following data files are generated:<ul id="dli_05_0044__en-us_topic_0000001242084772_ul16733857849"><li id="dli_05_0044__en-us_topic_0000001242084772_li1373311578420">File directory 1: <strong id="dli_05_0044__en-us_topic_0000001242084772_b33251735192513">obs://dli-test-021/data4/</strong><em id="dli_05_0044__en-us_topic_0000001242084772_i273325710414">xxxxxx</em><strong id="dli_05_0044__en-us_topic_0000001242084772_b12210240152516">/classNo=25/dt=2021-07-27</strong></li><li id="dli_05_0044__en-us_topic_0000001242084772_li14733185710414">File directory 2: <strong id="dli_05_0044__en-us_topic_0000001242084772_b1873414022610">obs://dli-test-021/data4/</strong><em id="dli_05_0044__en-us_topic_0000001242084772_i1073355713419">xxxxxx</em><strong id="dli_05_0044__en-us_topic_0000001242084772_b85753332612">/classNo=25/dt=2021-07-28</strong></li></ul>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li1773325720413">Create the partition directory <strong id="dli_05_0044__en-us_topic_0000001242084772_b799895011265">classNo=24</strong> in <strong id="dli_05_0044__en-us_topic_0000001242084772_b6589454272">obs://dli-test-021/data4</strong>, and then create the subdirectory <strong id="dli_05_0044__en-us_topic_0000001242084772_b04242205286">dt=2021-07-29</strong> in <strong id="dli_05_0044__en-us_topic_0000001242084772_b121711224192816">classNo=24</strong>. Create the <strong id="dli_05_0044__en-us_topic_0000001242084772_b856712124266">test.csv</strong> file using the following file content and upload the file to the <strong id="dli_05_0044__en-us_topic_0000001242084772_b13574512132614">obs://dli-test-021/data4/classNo=24/dt=2021-07-29</strong> directory.<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen273318573419">Jordon,88,24,2021-07-29
Kim,87,24,2021-07-29
Henry,76,24,2021-07-29</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li87331957240">Run the following statement in the SQL editor to add the partition data to OBS table <strong id="dli_05_0044__en-us_topic_0000001242084772_b19952881290">testcsvdata5source</strong>:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen1173315571647">ALTER TABLE
testcsvdata5source
ADD
PARTITION (classNo = 24,dt='2021-07-29') LOCATION 'obs://dli-test-021/data4/classNo=24/dt=2021-07-29';</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li57335574417">Run the following statement to query data in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b4949183219299">classNo</strong> partition of the <strong id="dli_05_0044__en-us_topic_0000001242084772_b49492032112910">testcsvdata5source</strong> table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen1673345720417">select * from testcsvdata5source where classNo = 24;</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li67331657248">Run the following statement to query all data in July 2021 in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b591852593417">dt</strong> partition:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen1273310577415">select * from testcsvdata5source where dt like '2021-07%';</pre>
</li></ol>
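<p>Conditions on several partition columns can be combined in one query. The following sketch, assuming the data added in the steps above, restricts the query to the <strong>classNo=24</strong>, <strong>dt=2021-07-29</strong> partition:</p>
<pre class="screen">-- Filter on both partition columns of the table.
select * from testcsvdata5source where classNo = 24 and dt = '2021-07-29';</pre>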
</li></ul>
</li></ul>
</div>
<div class="section" id="dli_05_0044__en-us_topic_0000001242084772_section079761717234"><h4 class="sectiontitle">Creating an OBS Table Using Hive Syntax</h4><p id="dli_05_0044__en-us_topic_0000001242084772_p999614201951">The following describes how to create an OBS table for TEXTFILE files. The methods of creating OBS tables for other file formats are similar.</p>
<ul id="dli_05_0044__en-us_topic_0000001242084772_ul4960144841614"><li id="dli_05_0044__en-us_topic_0000001242084772_li159601548141613">Create a non-partitioned OBS table.<ol id="dli_05_0044__en-us_topic_0000001242084772_ol1596591115618"><li id="dli_05_0044__en-us_topic_0000001242084772_li179658117613">Create file directory <strong id="dli_05_0044__en-us_topic_0000001242084772_b299942884019">data5</strong> in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1541029104017">root</strong> directory of the OBS bucket <strong id="dli_05_0044__en-us_topic_0000001242084772_b2542912404">dli-test-021</strong>. Create the <strong id="dli_05_0044__en-us_topic_0000001242084772_b3280113444017">test.txt</strong> file based on the following file content and upload the file to the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1280143413409">obs://dli-test-021/data5</strong> directory:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen861764519243">Jordon,88,23
Kim,87,25
Henry,76,26</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li172113063217">Log in to the DLI management console and click <strong id="dli_05_0044__en-us_topic_0000001242084772_b15887136134112">SQL Editor</strong>. On the displayed page, set <strong id="dli_05_0044__en-us_topic_0000001242084772_b13888186124115">Engine</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b10888560415">spark</strong>, <strong id="dli_05_0044__en-us_topic_0000001242084772_b3888669419">Queue</strong> to the created SQL queue, and <strong id="dli_05_0044__en-us_topic_0000001242084772_b388926174112">Database</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b178893684113">testdb</strong>. Run the following Hive statement to create an OBS table using data in <strong id="dli_05_0044__en-us_topic_0000001242084772_b874440104215">obs://dli-test-021/data5/test.txt</strong> and set the field delimiter to a comma (,):<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen1028182532717">CREATE TABLE hiveobstable (name STRING, score DOUBLE, classNo INT) STORED AS TEXTFILE LOCATION 'obs://dli-test-021/data5' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';</pre>
<div class="note" id="dli_05_0044__en-us_topic_0000001242084772_note6403185115328"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="dli_05_0044__en-us_topic_0000001242084772_p19404165163213"><strong id="dli_05_0044__en-us_topic_0000001242084772_b1559812245455">ROW FORMAT DELIMITED FIELDS TERMINATED BY ','</strong> indicates that the fields in each record are separated by commas (,).</p>
</div></div>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li4660319569">Run the following statement to query data in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b3882137184812">hiveobstable</strong> table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen21369434440">select * from hiveobstable;</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li91949241867">Run the following statements to insert data into the table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen15910206018">insert into hiveobstable VALUES('Aarn', 98, 25);
insert into hiveobstable VALUES('Adam', 68, 25);</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li15531112613615">Run the following statement to query data in the table to verify that the data has been inserted:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen12432131620214">select * from hiveobstable;</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li191831731866">Refresh the <strong id="dli_05_0044__en-us_topic_0000001242084772_b19861051114919">obs://dli-test-021/data5</strong> directory on the OBS console. Two new files containing the inserted data have been generated.</li></ol>
<p id="dli_05_0044__en-us_topic_0000001242084772_p115321814413"><strong id="dli_05_0044__en-us_topic_0000001242084772_b10625613798">Create an OBS Table Containing Complex Data Types</strong></p>
<ol id="dli_05_0044__en-us_topic_0000001242084772_ol4266204717813"><li id="dli_05_0044__en-us_topic_0000001242084772_li1726613476817">Create file directory <strong id="dli_05_0044__en-us_topic_0000001242084772_b58561384535">data6</strong> in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b2085603895313">root</strong> directory of the OBS bucket <strong id="dli_05_0044__en-us_topic_0000001242084772_b1385716384539">dli-test-021</strong>. Create the <strong id="dli_05_0044__en-us_topic_0000001242084772_b316455711538">test.txt</strong> file based on the following file content and upload the file to the <strong id="dli_05_0044__en-us_topic_0000001242084772_b151651157185315">obs://dli-test-021/data6</strong> directory:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen127805111329">Jordon,88-22,23:21
Kim,87-22,25:22
Henry,76-22,26:23</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li111531349682">Log in to the DLI management console and click <strong id="dli_05_0044__en-us_topic_0000001242084772_b470331355410">SQL Editor</strong>. On the displayed page, set <strong id="dli_05_0044__en-us_topic_0000001242084772_b4709913135414">Engine</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b13710201325418">spark</strong>, <strong id="dli_05_0044__en-us_topic_0000001242084772_b1871016139542">Queue</strong> to the created SQL queue, and <strong id="dli_05_0044__en-us_topic_0000001242084772_b1171113136543">Database</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b1071118134548">testdb</strong>. Run the following Hive statement to create an OBS table using data stored in <strong id="dli_05_0044__en-us_topic_0000001242084772_b144721249185414">obs://dli-test-021/data6</strong>:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen167801913321">CREATE TABLE hiveobstable2 (name STRING, hobbies ARRAY&lt;string&gt;, address map&lt;string,string&gt;) STORED AS TEXTFILE LOCATION 'obs://dli-test-021/data6'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '-'
MAP KEYS TERMINATED BY ':';</pre>
<div class="note" id="dli_05_0044__en-us_topic_0000001242084772_note67361132203316"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="dli_05_0044__en-us_topic_0000001242084772_ul667317263346"><li id="dli_05_0044__en-us_topic_0000001242084772_li1673162683417"><strong id="dli_05_0044__en-us_topic_0000001242084772_b9564144055510">ROW FORMAT DELIMITED FIELDS TERMINATED BY ','</strong> indicates that the fields in each record are separated by commas (,).</li><li id="dli_05_0044__en-us_topic_0000001242084772_li18744183023417"><strong id="dli_05_0044__en-us_topic_0000001242084772_b5560105515555">COLLECTION ITEMS TERMINATED BY '-'</strong> indicates that the elements of a collection column, such as the array column <strong id="dli_05_0044__en-us_topic_0000001242084772_b206510375619">hobbies</strong>, are separated by hyphens (-).</li><li id="dli_05_0044__en-us_topic_0000001242084772_li143371347173511"><strong id="dli_05_0044__en-us_topic_0000001242084772_b2770122513560">MAP KEYS TERMINATED BY ':'</strong> indicates that the <strong id="dli_05_0044__en-us_topic_0000001242084772_b16237113113569">address</strong> column is in the key-value format. Each key is separated from its value by a colon (:).</li></ul>
</div></div>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li3349304913">Run the following statement to query data in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1532714179571">hiveobstable2</strong> table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen378012193212">select * from hiveobstable2;</pre>
</li></ol>
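<p>As a minimal sketch of how these delimiters map onto the complex columns, the following statements read individual elements of <strong>hobbies</strong> and <strong>address</strong>. They assume that the Spark SQL built-in functions <strong>size</strong>, <strong>map_keys</strong>, <strong>map_values</strong>, and <strong>explode</strong> are available on your queue. The column aliases are illustrative only. For the first sample row, <strong>hobbies</strong> is expected to hold the elements 88 and 22, and <strong>address</strong> the key 23 with the value 21.</p>
<pre class="screen">-- Read the first array element and the number of elements (array indexes start at 0).
select name, hobbies[0] as first_hobby, size(hobbies) as hobby_count from hiveobstable2;
-- List the keys and the values of the map column.
select name, map_keys(address) as address_keys, map_values(address) as address_values from hiveobstable2;
-- Return one row per array element.
select name, hobby from hiveobstable2 lateral view explode(hobbies) t as hobby;</pre>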
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li68154254017"><a name="dli_05_0044__en-us_topic_0000001242084772_li68154254017"></a><a name="en-us_topic_0000001242084772_li68154254017"></a>Create a partitioned OBS table.<ol id="dli_05_0044__en-us_topic_0000001242084772_ol10958518798"><li id="dli_05_0044__en-us_topic_0000001242084772_li2098125013443">Create file directory <strong id="dli_05_0044__en-us_topic_0000001242084772_b472623615719">data7</strong> in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b147271036155712">root</strong> directory of the OBS bucket <strong id="dli_05_0044__en-us_topic_0000001242084772_b972863613576">dli-test-021</strong>.</li><li id="dli_05_0044__en-us_topic_0000001242084772_li8958141810918">Log in to the DLI management console and click <strong id="dli_05_0044__en-us_topic_0000001242084772_b6466652145816">SQL Editor</strong>. On the displayed page, set <strong id="dli_05_0044__en-us_topic_0000001242084772_b1446765255819">Engine</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b4468152185820">spark</strong>, <strong id="dli_05_0044__en-us_topic_0000001242084772_b1846895214581">Queue</strong> to the created SQL queue, and <strong id="dli_05_0044__en-us_topic_0000001242084772_b1469852135820">Database</strong> to <strong id="dli_05_0044__en-us_topic_0000001242084772_b1247012528583">testdb</strong>. Run the following statement to create an OBS table using data stored in <strong id="dli_05_0044__en-us_topic_0000001242084772_b18110195212014">obs://dli-test-021/data7</strong> and partition the table on the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1711144520597">classNo</strong> column:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen16971112512495">CREATE TABLE IF NOT EXISTS hiveobstable3(name STRING, score DOUBLE) PARTITIONED BY (classNo INT) STORED AS TEXTFILE LOCATION 'obs://dli-test-021/data7' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';</pre>
<div class="caution" id="dli_05_0044__en-us_topic_0000001242084772_note1003155114"><span class="cautiontitle"><img src="public_sys-resources/caution_3.0-en-us.png"> </span><div class="cautionbody"><p id="dli_05_0044__en-us_topic_0000001242084772_p8714746115119">Declare the partition key and its data type only in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b13264384213">PARTITIONED BY</strong> clause. Do not also declare it in the column list that follows the table name in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b15540145917311">CREATE TABLE IF NOT EXISTS</strong> statement. The following is an incorrect example:</p>
<p id="dli_05_0044__en-us_topic_0000001242084772_p1073717309527">CREATE TABLE IF NOT EXISTS hiveobstable3(name STRING, score DOUBLE, classNo INT) PARTITIONED BY (classNo) STORED AS TEXTFILE LOCATION 'obs://dli-test-021/data7';</p>
</div></div>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li95931258917">Run the following statements to insert data into the table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen149901754115312">insert into hiveobstable3 VALUES('Aarn', 98, 25);
insert into hiveobstable3 VALUES('Adam', 68, 25);</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li468115291791">Run the following statement to query data in the table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen71571715205514">select * from hiveobstable3 where classNo = 25;</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li86581033797">Refresh the <strong id="dli_05_0044__en-us_topic_0000001242084772_b16444711257">obs://dli-test-021/data7</strong> directory. A new partition directory <strong id="dli_05_0044__en-us_topic_0000001242084772_b18961258950">classno=25</strong> is generated containing the newly inserted table data.</li><li id="dli_05_0044__en-us_topic_0000001242084772_li26336511103">Create partition directory <strong id="dli_05_0044__en-us_topic_0000001242084772_b1290313371464">classno=24</strong> in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b775718254612">obs://dli-test-021/data7</strong> directory. Create the <strong id="dli_05_0044__en-us_topic_0000001242084772_b19109162071">test.txt</strong> file using the following file content and upload the file to the <strong id="dli_05_0044__en-us_topic_0000001242084772_b236812273">obs://dli-test-021/data7/classno=24</strong> directory:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen6969155911128">Jordon,88,24
Kim,87,24
Henry,76,24</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li1619718371699">Run the following statement in the SQL editor to add the partition data to OBS table <strong id="dli_05_0044__en-us_topic_0000001242084772_b0489172213717">hiveobstable3</strong>:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen1693919528210">ALTER TABLE
hiveobstable3
ADD
PARTITION (classNo = 24) LOCATION 'obs://dli-test-021/data7/classno=24';</pre>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li153321941091">Run the following statement to query data in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b1549019325714">hiveobstable3</strong> table:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen188068231161">select * from hiveobstable3 where classNo = 24;</pre>
</li></ol>
</li></ul>
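<p>To check which partitions are registered for <strong>hiveobstable3</strong> after the preceding steps, a statement such as the following can be used. This is a sketch that assumes the standard Spark SQL <strong>SHOW PARTITIONS</strong> statement is supported on your queue. If the <strong>ALTER TABLE ... ADD PARTITION</strong> statement succeeded, the partition for class 24 should be listed together with the one created by the <strong>INSERT</strong> statements.</p>
<pre class="screen">-- List the partitions currently registered for the table.
SHOW PARTITIONS hiveobstable3;</pre>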
</div>
<div class="section" id="dli_05_0044__en-us_topic_0000001242084772_section116194216574"><h4 class="sectiontitle">FAQs</h4><ul id="dli_05_0044__en-us_topic_0000001242084772_ul1341411210585"><li id="dli_05_0044__en-us_topic_0000001242084772_li13102144310157"><strong id="dli_05_0044__en-us_topic_0000001242084772_b860311143817">Q1</strong>: What should I do if the following error is reported when I query a partitioned OBS table?<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen208721563181">DLI.0005: There should be at least one partition pruning predicate on partitioned table `<em id="dli_05_0044__en-us_topic_0000001242084772_i13872556131814">xxxx</em>`.`<em id="dli_05_0044__en-us_topic_0000001242084772_i657514518328">xxxx</em>`.;</pre>
<p id="dli_05_0044__en-us_topic_0000001242084772_p852510581187"><strong id="dli_05_0044__en-us_topic_0000001242084772_b102866181794">Cause</strong>: The partition key is not specified in the query statement of a partitioned table.</p>
<p id="dli_05_0044__en-us_topic_0000001242084772_p159712271919"><strong id="dli_05_0044__en-us_topic_0000001242084772_b314555414101">Solution</strong>: Ensure that the <strong>WHERE</strong> clause of the query contains a condition on at least one partition key.</p>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li77488011254"><strong id="dli_05_0044__en-us_topic_0000001242084772_b24661926191112">Q2</strong>: What should I do if "DLI.0007: The output path is a file, don't support INSERT...SELECT error" is reported and the execution fails when I insert data into an OBS table created using the DataSource syntax?<div class="p" id="dli_05_0044__en-us_topic_0000001242084772_p1374213172713">The table creation statement is similar to the following:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen1772333092710">CREATE TABLE testcsvdatasource (name string, id int) USING csv OPTIONS (path "<strong id="dli_05_0044__en-us_topic_0000001242084772_b1472343032711">obs://dli-test-021/data/test.csv</strong>");</pre>
</div>
<p id="dli_05_0044__en-us_topic_0000001242084772_p15723430112716"><strong id="dli_05_0044__en-us_topic_0000001242084772_b133115314214">Cause</strong>: Data cannot be inserted into a table whose path in the table creation statement points to a specific file rather than a directory. In the preceding example, the path is the OBS file <strong id="dli_05_0044__en-us_topic_0000001242084772_b19590181573316">obs://dli-test-021/data/test.csv</strong>.</p>
<div class="p" id="dli_05_0044__en-us_topic_0000001242084772_p87231330152717"><strong id="dli_05_0044__en-us_topic_0000001242084772_b1463302610239">Solution</strong>: Replace the OBS file path with the path of the directory that contains the file. You can then insert data using the INSERT statement. The preceding example statement can be modified as follows:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen97776921513">CREATE TABLE testcsvdatasource (name string, id int) USING csv OPTIONS (path "<strong id="dli_05_0044__en-us_topic_0000001242084772_b644211321520">obs://dli-test-021/data</strong>");</pre>
</div>
</li><li id="dli_05_0044__en-us_topic_0000001242084772_li132002542515"><strong id="dli_05_0044__en-us_topic_0000001242084772_b4606309253">Q3</strong>: What should I do if the syntax of a Hive statement used to create a partitioned OBS table is incorrect? For example, the following statement creates an OBS table partitioned on <strong id="dli_05_0044__en-us_topic_0000001242084772_b0341152120289">classNo</strong>:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen1531444182811">CREATE TABLE IF NOT EXISTS testtable(name STRING, score DOUBLE, classNo INT) PARTITIONED BY (classNo) STORED AS TEXTFILE LOCATION 'obs://dli-test-021/data7';</pre>
<p id="dli_05_0044__en-us_topic_0000001242084772_p6388219162820"><strong id="dli_05_0044__en-us_topic_0000001242084772_b10349153212810">Cause</strong>: The partition key is declared in the column list following the table name. It must be declared, together with its data type, only in the <strong id="dli_05_0044__en-us_topic_0000001242084772_b2129843142915">PARTITIONED BY</strong> clause.</p>
<div class="p" id="dli_05_0044__en-us_topic_0000001242084772_p16432102514306"><strong id="dli_05_0044__en-us_topic_0000001242084772_b58916578291">Solution</strong>: Specify the partition key in <strong id="dli_05_0044__en-us_topic_0000001242084772_b16289101113303">PARTITIONED BY</strong>. For example:<pre class="screen" id="dli_05_0044__en-us_topic_0000001242084772_screen139287463117">CREATE TABLE IF NOT EXISTS testtable(name STRING, score DOUBLE) PARTITIONED BY (classNo INT) STORED AS TEXTFILE LOCATION 'obs://dli-test-021/data7';</pre>
</div>
</li></ul>
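<p>As an illustration of Q1, the following sketch uses the <strong>hiveobstable3</strong> table created in this section, which is partitioned on <strong>classNo</strong>. The first statement has no condition on the partition key and is the kind of query that is expected to trigger DLI.0005. The other two statements restrict the partition key and follow the pattern of the queries shown earlier. The additional filter on <strong>name</strong> and the use of <strong>IN</strong> are illustrative assumptions.</p>
<pre class="screen">-- No condition on the partition key: expected to be rejected with DLI.0005.
select * from hiveobstable3;
-- A condition on the partition key, optionally combined with other filters.
select * from hiveobstable3 where classNo = 24 and name = 'Kim';
-- Several partitions can be selected with a single condition on the partition key.
select * from hiveobstable3 where classNo in (24, 25);</pre>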
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dli_09_0120.html">SQL Jobs</a></div>
</div>
</div>