doc-exports/docs/dataartsstudio/umn/dataartsstudio_01_0051.html
chenxiaoxiong f9e2808b7c DataArts UMN 20250810 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: chenxiaoxiong <chenxiaoxiong@huawei.com>
Co-committed-by: chenxiaoxiong <chenxiaoxiong@huawei.com>
2025-09-02 10:44:13 +00:00

<a name="dataartsstudio_01_0051"></a><a name="dataartsstudio_01_0051"></a>
<h1 class="topictitle1">From Hive</h1>
<div id="body8662426"><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p6222841216210">If the source link of a job is a <a href="dataartsstudio_01_0026.html">Hive link</a>, configure the source job parameters based on <a href="#dataartsstudio_01_0051__en-us_topic_0108275424_table31823995163953">Table 1</a>.</p>
<div class="tablenoborder"><a name="dataartsstudio_01_0051__en-us_topic_0108275424_table31823995163953"></a><a name="en-us_topic_0108275424_table31823995163953"></a><table cellpadding="4" cellspacing="0" summary="" id="dataartsstudio_01_0051__en-us_topic_0108275424_table31823995163953" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameter description</caption><thead align="left"><tr id="dataartsstudio_01_0051__en-us_topic_0108275424_row18653487163953"><th align="left" class="cellrowborder" valign="top" width="19.32%" id="mcps1.3.2.2.5.1.1"><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p6625103734115">Category</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="18.16%" id="mcps1.3.2.2.5.1.2"><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p15314298163953">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="41.91%" id="mcps1.3.2.2.5.1.3"><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p32498630163953">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20.61%" id="mcps1.3.2.2.5.1.4"><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p15143370163953">Example Value</p>
</th>
</tr>
</thead>
<tbody><tr id="dataartsstudio_01_0051__en-us_topic_0108275424_row1928353163953"><td class="cellrowborder" rowspan="5" valign="top" width="19.32%" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p3625137104112">Basic parameters</p>
</td>
<td class="cellrowborder" valign="top" width="18.16%" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p28108739161524">Database Name</p>
</td>
<td class="cellrowborder" valign="top" width="41.91%" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p17702619155928">Database name. Click the icon next to the text box. The dialog box for selecting the database is displayed.</p>
</td>
<td class="cellrowborder" valign="top" width="20.61%" headers="mcps1.3.2.2.5.1.4 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p64647157163953">default</p>
</td>
</tr>
<tr id="dataartsstudio_01_0051__en-us_topic_0108275424_row11687830163953"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p23518636161524">Table Name</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p17984995155931">Hive table name. Click the icon next to the text box. The dialog box for selecting the table is displayed.</p>
<p id="dataartsstudio_01_0051__en-us_topic_0108275424_p1210244910548">This parameter can be set to a macro variable of date and time, and a table name can contain multiple macro variables. When a macro variable of date and time is used together with a scheduled job, incremental data can be synchronized periodically.</p>
<div class="note" id="dataartsstudio_01_0051__en-us_topic_0108275424_note116874368587"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_p52974484409">If you have configured a macro variable of date and time and schedule the CDM job through <span id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_text9997118203">DataArts Studio DataArts Factory</span>, the system replaces the macro variable of date and time with (<em id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_i799871152012">Planned start time of the data development job</em> - <em id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_i0998101192020">Offset</em>) rather than (<em id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_i299821132018">Actual start time of the CDM job</em> - <em id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_i129981917200">Offset</em>).</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p50683068163953">TBL_E</p>
</td>
</tr>
<tr id="dataartsstudio_01_0051__en-us_topic_0108275424_row962165734014"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p96355710402">Read Mode</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p1656393534412">Two read modes are available: HDFS and JDBC. By default, the HDFS mode is used. If you do not need to use the WHERE condition to filter data or add new fields on the field mapping page, select the HDFS mode.</p>
<ul id="dataartsstudio_01_0051__en-us_topic_0108275424_ul169421201417"><li id="dataartsstudio_01_0051__en-us_topic_0108275424_li3378113716448">The HDFS mode performs well, but in this mode, you cannot use the WHERE condition to filter data or add new fields on the field mapping page.</li><li id="dataartsstudio_01_0051__en-us_topic_0108275424_li13942122018412">The JDBC mode allows you to use the WHERE condition to filter data or add new fields on the field mapping page.<div class="note" id="dataartsstudio_01_0051__en-us_topic_0108275424_note10801173103210"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p16801133183213">If the migration source is Hive and JDBC is used to read data, CDM does not support concurrency. That is, <strong id="dataartsstudio_01_0051__en-us_topic_0108275424_b63018315255">Concurrent Extractors</strong> can only be set to <strong id="dataartsstudio_01_0051__en-us_topic_0108275424_b757413336254">1</strong>.</p>
</div></div>
</li></ul>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p2631557154010">HDFS</p>
</td>
</tr>
<tr id="dataartsstudio_01_0051__en-us_topic_0108275424_row15868328144312"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p786952834314">Use SQL Statement</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p1586918285431">Whether to use an SQL statement to export data.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p14869152816436">No</p>
</td>
</tr>
<tr id="dataartsstudio_01_0051__en-us_topic_0108275424_row1186522594314"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p1186632594313">SQL Statement</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p5220739135">When <span class="parmname" id="dataartsstudio_01_0051__en-us_topic_0108275424_parmname346133185611"><b>Use SQL Statement</b></span> is set to <span class="parmvalue" id="dataartsstudio_01_0051__en-us_topic_0108275424_parmvalue623353520561"><b>Yes</b></span>, enter an SQL statement here. CDM exports data based on the SQL statement.</p>
<div class="note" id="dataartsstudio_01_0051__en-us_topic_0108275424_note1624112587564"><span class="notetitle"> NOTE: </span><div class="notebody"><ul id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_en-us_topic_0108275424_ul95781925125319"><li id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_en-us_topic_0108275424_li15578132545313">SQL statements can only be used to query data. Joins and nested queries are supported, but multiple query statements are not allowed, for example, <strong id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_b143067633772251">select * from table a; select * from table b</strong>.</li><li id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_en-us_topic_0108275424_li115789258538">WITH statements are not supported.</li><li id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_en-us_topic_0108275424_li15578152525319">Comments, such as <strong id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_b148619239372251">--</strong> and <strong id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_b187717313372251">/*</strong>, are not supported.</li><li id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_en-us_topic_0108275424_li115785251536">Addition, deletion, and modification operations are not supported, including but not limited to the following:<ul id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_en-us_topic_0108275424_ul185084210539"><li id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_en-us_topic_0108275424_li1623929145315">load data</li><li id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_en-us_topic_0108275424_li15832356155310">delete from</li><li id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_en-us_topic_0108275424_li144890175411">alter table</li><li 
id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_en-us_topic_0108275424_li17896205419">create table</li><li id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_en-us_topic_0108275424_li7715510105412">drop table</li><li id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_en-us_topic_0108275424_li548061418543">into outfile</li></ul>
</li><li id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275324_li1517424293819">If the SQL statement is too long, the request fails to be delivered. If you continue to create a job, the system displays an error message indicating that the request is incorrect. In this case, you need to simplify or clear the SQL statement and try again.</li></ul>
</div></div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p6220439735">select id,name from sqoop.user;</p>
</td>
</tr>
<tr id="dataartsstudio_01_0051__en-us_topic_0108275424_row4203135418142"><td class="cellrowborder" rowspan="2" valign="top" width="19.32%" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p1062523764111">Advanced attributes</p>
<p id="dataartsstudio_01_0051__en-us_topic_0108275424_p46256371418"></p>
</td>
<td class="cellrowborder" valign="top" width="18.16%" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p7203154151410">Partition Values</p>
</td>
<td class="cellrowborder" valign="top" width="41.91%" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p1219435334515">This parameter is displayed when you select the HDFS read mode and click <strong id="dataartsstudio_01_0051__en-us_topic_0108275424_b1454933863319">Show Advanced Attributes</strong>.</p>
<p id="dataartsstudio_01_0051__en-us_topic_0108275424_p1020335414148">This parameter specifies the partition values to extract. The attribute name is the partition name. You can configure multiple values (separated by spaces) or a field value range. The time macro function is supported.</p>
<div class="note" id="dataartsstudio_01_0051__en-us_topic_0108275424_note1855834135914"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_p52974484409_1">If you have configured a macro variable of date and time and schedule the CDM job through <span id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_text9997118203_1">DataArts Studio DataArts Factory</span>, the system replaces the macro variable of date and time with (<em id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_i799871152012_1">Planned start time of the data development job</em> - <em id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_i0998101192020_1">Offset</em>) rather than (<em id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_i299821132018_1">Actual start time of the CDM job</em> - <em id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_i129981917200_1">Offset</em>).</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" width="20.61%" headers="mcps1.3.2.2.5.1.4 "><ul id="dataartsstudio_01_0051__en-us_topic_0108275424_ul16185646318"><li id="dataartsstudio_01_0051__en-us_topic_0108275424_li187594277316">Attribute value in the single-value or multi-value filtering scenario:<p id="dataartsstudio_01_0051__en-us_topic_0108275424_p88421435130"><a name="dataartsstudio_01_0051__en-us_topic_0108275424_li187594277316"></a><a name="en-us_topic_0108275424_li187594277316"></a>${dateformat(yyyyMMdd, -1, DAY)} ${dateformat(yyyyMMdd)}</p>
</li><li id="dataartsstudio_01_0051__en-us_topic_0108275424_li1458674019314">Attribute value in the range filtering scenario:<p id="dataartsstudio_01_0051__en-us_topic_0108275424_p22875441038"><a name="dataartsstudio_01_0051__en-us_topic_0108275424_li1458674019314"></a><a name="en-us_topic_0108275424_li1458674019314"></a>${value} &gt;= ${dateformat(yyyyMMdd, -7, DAY)} &amp;&amp; ${value} &lt; ${dateformat(yyyyMMdd)}</p>
</li></ul>
</td>
</tr>
<tr id="dataartsstudio_01_0051__en-us_topic_0108275424_row131351318184611"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p54171024164616">WHERE Clause</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p1213520181469">This parameter is displayed when you select the JDBC read mode and click <strong id="dataartsstudio_01_0051__en-us_topic_0108275424_b1066165863415">Show Advanced Attributes</strong>.</p>
<p id="dataartsstudio_01_0051__en-us_topic_0108275424_p976143054712">This parameter specifies the WHERE clause used to filter the data to be extracted. If this parameter is not set, the entire table is extracted. If the table to be migrated does not contain the fields specified in the WHERE clause, the migration fails.</p>
<p id="dataartsstudio_01_0051__en-us_topic_0108275424_p18740131613716">You can set a date macro variable to extract data generated on a specific date. </p>
<div class="note" id="dataartsstudio_01_0051__en-us_topic_0108275424_note45791757134910"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_p52974484409_2">If you have configured a macro variable of date and time and schedule the CDM job through <span id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_text9997118203_2">DataArts Studio DataArts Factory</span>, the system replaces the macro variable of date and time with (<em id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_i799871152012_2">Planned start time of the data development job</em> - <em id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_i0998101192020_2">Offset</em>) rather than (<em id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_i299821132018_2">Actual start time of the CDM job</em> - <em id="dataartsstudio_01_0051__en-us_topic_0108275424_en-us_topic_0108275319_i129981917200_2">Offset</em>).</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p8135518144617">age &gt; 18 and age &lt;= 60</p>
</td>
</tr>
</tbody>
</table>
</div>
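<p>The date and time macro examples in Table 1 can be pictured with the following minimal Python sketch. It is not CDM's implementation. It assumes that the signed offset is added to a base time, that the base time of a scheduled job is the planned start time, and that only the macro forms and the DAY unit shown in Table 1 need to be handled. The names <b>expand_macros</b> and <b>to_strftime</b> are hypothetical helpers introduced for illustration.</p>

```python
import re
from datetime import datetime, timedelta

# Matches the two macro shapes shown in Table 1:
#   ${dateformat(yyyyMMdd)}  and  ${dateformat(yyyyMMdd, -1, DAY)}
MACRO = re.compile(
    r"\$\{dateformat\(\s*([^,)]+?)\s*(?:,\s*(-?\d+)\s*,\s*(\w+)\s*)?\)\}"
)

# Assumed mapping from the pattern letters used in Table 1 to strftime codes.
PATTERN_CODES = {"yyyy": "%Y", "MM": "%m", "dd": "%d"}


def to_strftime(pattern):
    """Translate a pattern such as yyyyMMdd into a strftime format."""
    return re.sub("yyyy|MM|dd", lambda m: PATTERN_CODES[m.group()], pattern)


def expand_macros(text, base_time):
    """Replace each dateformat macro using base_time plus the signed offset."""

    def replace(match):
        pattern, offset, unit = match.groups()
        moment = base_time
        if offset is not None:
            if unit != "DAY":  # only the unit shown in Table 1 is sketched here
                raise ValueError("unsupported unit: " + unit)
            moment = base_time + timedelta(days=int(offset))
        return moment.strftime(to_strftime(pattern))

    return MACRO.sub(replace, text)


# For a scheduled job, the base time is the planned start time, not the
# moment at which the CDM job actually starts running.
planned_start = datetime(2025, 9, 2, 0, 0)
print(expand_macros("${dateformat(yyyyMMdd, -1, DAY)} ${dateformat(yyyyMMdd)}", planned_start))
# prints: 20250901 20250902
```

<p>Under these assumptions, a job planned for 2 September 2025 extracts the partitions of the previous day and the current day, even if the CDM job actually starts after midnight on the following day.</p>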
<div class="note" id="dataartsstudio_01_0051__en-us_topic_0108275424_note57157885143833"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="dataartsstudio_01_0051__en-us_topic_0108275424_p23221388113028">If the data source is Hive, CDM will automatically partition data using the Hive data partitioning file.</p>
</div></div>
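<p>The restrictions listed for <b>SQL Statement</b> in Table 1 can be checked locally before a job is created. The following Python sketch is one possible pre-check, not the validation that CDM performs. The function name <b>check_sql</b> is hypothetical, and the check is deliberately naive: it scans plain text, so a semicolon or keyword inside a string literal is also reported.</p>

```python
import re

# Operations that Table 1 lists as unsupported (the list there is not exhaustive).
FORBIDDEN = (
    "load data",
    "delete from",
    "alter table",
    "create table",
    "drop table",
    "into outfile",
)


def check_sql(sql):
    """Return the restrictions that the statement appears to violate."""
    problems = []
    text = sql.strip()
    # A single trailing semicolon is tolerated; any other one suggests
    # that more than one statement was entered.
    body = text[:-1] if text.endswith(";") else text
    if ";" in body:
        problems.append("multiple statements")
    if "--" in text or "/*" in text:
        problems.append("comment")
    # Lowercase and collapse whitespace so that keywords match reliably.
    lowered = " ".join(text.lower().split())
    if re.match(r"with\b", lowered):
        problems.append("WITH statement")
    for keyword in FORBIDDEN:
        if keyword in lowered:
            problems.append("forbidden operation: " + keyword)
    return problems


print(check_sql("select id,name from sqoop.user;"))
# prints: []
print(check_sql("select * from table a; select * from table b"))
# prints: ['multiple statements']
```

<p>An empty list only means that none of the listed rules was hit. It does not guarantee that CDM accepts the statement, for example if the statement is too long.</p>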
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dataartsstudio_01_0047.html">Configuring CDM Source Job Parameters</a></div>
</div>
</div>