doc-exports/docs/dataartsstudio/umn/dataartsstudio_01_0049.html
chenxiaoxiong f9e2808b7c DataArts UMN 20250810 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: chenxiaoxiong <chenxiaoxiong@huawei.com>
Co-committed-by: chenxiaoxiong <chenxiaoxiong@huawei.com>
2025-09-02 10:44:13 +00:00

<a name="dataartsstudio_01_0049"></a><a name="dataartsstudio_01_0049"></a>
<h1 class="topictitle1">From HDFS</h1>
<div id="body8662426"><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p371394572713">If the source link of a job is an <a href="dataartsstudio_01_0040.html">HDFS link</a>, that is, if data is exported from MRS HDFS, FusionInsight HDFS, or Apache HDFS, configure the source job parameters based on <a href="#dataartsstudio_01_0049__en-us_topic_0108275442_table5046103815165">Table 1</a>.</p>
<div class="tablenoborder"><a name="dataartsstudio_01_0049__en-us_topic_0108275442_table5046103815165"></a><a name="en-us_topic_0108275442_table5046103815165"></a><table cellpadding="4" cellspacing="0" summary="" id="dataartsstudio_01_0049__en-us_topic_0108275442_table5046103815165" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameter description</caption><thead align="left"><tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row585315215165"><th align="left" class="cellrowborder" valign="top" width="17.349999999999998%" id="mcps1.3.2.2.5.1.1"><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p434331215165">Category</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="24.16%" id="mcps1.3.2.2.5.1.2"><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p1626397215165">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="41.089999999999996%" id="mcps1.3.2.2.5.1.3"><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p4231334915165">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="17.4%" id="mcps1.3.2.2.5.1.4"><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p482921015165">Example Value</p>
</th>
</tr>
</thead>
<tbody><tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row11117123394312"><td class="cellrowborder" rowspan="7" valign="top" width="17.349999999999998%" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p17117163312436">Basic parameters</p>
</td>
<td class="cellrowborder" valign="top" width="24.16%" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p97392215428">Source Link Name</p>
</td>
<td class="cellrowborder" valign="top" width="41.089999999999996%" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p672212524619">Select the source HDFS link from the drop-down list box.</p>
</td>
<td class="cellrowborder" valign="top" width="17.4%" headers="mcps1.3.2.2.5.1.4 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p3722853467">hdfs_to_cdm</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row4012116315165"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p2858877215165">Source Directory/File</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p4100102710298">This parameter is available only when <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_b19815250183218">Pull List File</strong> is set to <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_b20815550183214">No</strong>.</p>
<p id="dataartsstudio_01_0049__en-us_topic_0108275442_p3398923015165">Directory or file path from which data will be extracted.</p>
<p id="dataartsstudio_01_0049__en-us_topic_0108275442_p49679878163953">If you specify a directory, all files in it are migrated, including the files in all nested subdirectories.</p>
<p id="dataartsstudio_01_0049__en-us_topic_0108275442_p1210244910548">This parameter can contain date and time macro variables, and a single path can contain multiple macro variables. When a date and time macro variable is used together with a scheduled job, incremental data can be synchronized periodically.</p>
<div class="note" id="dataartsstudio_01_0049__en-us_topic_0108275442_note45791757134910"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_p52974484409">If you configure a date and time macro variable and schedule the CDM job through <span id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_text9997118203">DataArts Studio DataArts Factory</span>, the system replaces the macro variable with (<em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i799871152012">Planned start time of the data development job</em> adjusted by <em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i0998101192020">Offset</em>) rather than (<em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i299821132018">Actual start time of the CDM job</em> adjusted by <em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i129981917200">Offset</em>).</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p166427315165">/user/cdm/</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row1497845915165"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p529563715165">File Format</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><div class="p" id="dataartsstudio_01_0049__en-us_topic_0108275442_p27142413114140">File format used for data transfer. The options are as follows:<ul id="dataartsstudio_01_0049__en-us_topic_0108275442_ue5dec7869b79475f8f1e727e91bfc65e"><li id="dataartsstudio_01_0049__en-us_topic_0108275442_l8159128e835e4b91b083e9a9b999ddd7"><strong id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_b842352706102048_1">CSV</strong>: Source files are converted to CSV format and then migrated to tables.</li><li id="dataartsstudio_01_0049__en-us_topic_0108275442_lf9cb991576024a549bebf5c01e84041f"><strong id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_b842352706111710">Binary</strong>: Files are transferred as they are, even if they are not in binary format. Use this option for file copy.</li><li id="dataartsstudio_01_0049__en-us_topic_0108275442_lfb1f3264830e4b00a7f00af47b142818"><strong id="dataartsstudio_01_0049__en-us_topic_0108275442_b842352706102048_1">Parquet</strong>: Source files are converted to Parquet format and then migrated to tables.</li></ul>
</div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p3753014815165">CSV</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row1755248172813"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p172732585211">Pull List File</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p1784616256013">This parameter is displayed only when <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmname15237155563211"><b>File Format</b></span> is set to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmvalue1723755523211"><b>Binary</b></span>.</p>
<div class="p" id="dataartsstudio_01_0049__en-us_topic_0108275442_p133836141792">If the pull list file function is enabled, CDM reads the content of a file (such as a .txt file) in an OBS bucket as the list of files to be migrated. Each entry in the list file must be the absolute path of a file to be migrated, not the path of a directory. The following is example content:<pre class="screen" id="dataartsstudio_01_0049__en-us_topic_0108275442_screen173973616018">/mrs/job-properties/application_1634891604621_0014/job.properties
/mrs/job-properties/application_1634891604621_0029/job.properties</pre>
</div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p5625141693519">Yes</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row725185362810"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p583413915521">OBS Link of List File</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p1779718175122">This parameter is available only when <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_b1939676133319">Pull List File</strong> is set to <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_b1139626203310">Yes</strong>. You can select the OBS link where the list file is located.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p517163773612">OBS_test_link</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row228975117284"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p1491784155215">OBS Bucket of entries files</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p15767548553">This parameter is available only when <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_b18893171119330">Pull List File</strong> is set to <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_b5893121113338">Yes</strong>. It indicates the name of the OBS bucket where the list file is located.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p11773718362">01</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row17330245162813"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p096517320540">Path/Directory of entries files</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p1350591217157">This parameter is available only when <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_b12854161910332">Pull List File</strong> is set to <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_b19855201923311">Yes</strong>. It indicates the absolute path or directory of the list file in the OBS bucket.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p4176375366">/0521/Lists.txt</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row3482998115165"><td class="cellrowborder" rowspan="16" valign="top" width="17.349999999999998%" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p4617080315165">Advanced attributes</p>
</td>
<td class="cellrowborder" valign="top" width="24.16%" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p265625015165">Line Separator</p>
</td>
<td class="cellrowborder" valign="top" width="41.089999999999996%" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p1382968715165">Line feed character in a file. By default, the system automatically identifies <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmvalue1628141519144315"><b>\n</b></span>, <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmvalue946796237144315"><b>\r</b></span>, and <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmvalue936618015144315"><b>\r\n</b></span>. This parameter is displayed only when <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmname1101376798112032"><b>File Format</b></span> is set to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmvalue86583942112056"><b>CSV</b></span>.</p>
</td>
<td class="cellrowborder" valign="top" width="17.4%" headers="mcps1.3.2.2.5.1.4 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p4854631115165">\n</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row3426361615165"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p2388955115165">Field Delimiter</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p5600551115165">Character used to separate fields in the file. To set the <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_b842352706113647">Tab</strong> key as the delimiter, set this parameter to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmvalue1470440010114039"><b>\t</b></span>. This parameter is displayed only when <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmname1357494714213"><b>File Format</b></span> is set to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmvalue1257416476212"><b>CSV</b></span>.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p4015252715165">,</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row06816265114"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p116811622519">Use First Row as Header</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p1468115213518">This parameter is displayed only when <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmname1938690813"><b>File Format</b></span> is set to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmvalue1483588303"><b>CSV</b></span>. When you migrate a CSV file to a table, CDM writes all data to the table by default. If you set this parameter to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmvalue1968432313314"><b>Yes</b></span>, CDM uses the first row of the CSV file as the header row and does not write that row to the destination table.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p368113211517">No</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row11788951152710"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p378875162719">Encoding Type</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p33988295142526">Encoding type, for example, <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmvalue35309864411287"><b>UTF-8</b></span> or <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmvalue73612015911287"><b>GBK</b></span>. You can set the encoding type for text files only. This parameter does not take effect when <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmname25778748142537"><b>File Format</b></span> is set to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmvalue13285685144950"><b>Binary</b></span>.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p35014356164716">GBK</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row6041688715658"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p6192965315658">Start Job by Marker File</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p5024596315658">Whether to start the job by a marker file. The job starts only if a marker file for starting the job exists in the source path. If no marker file exists, the job is suspended for the period specified by <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmname3785430292"><b>Suspension Period</b></span>.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p4339120215658">ok.txt</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row1425517309456"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p18685103014317">Filter Type</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p368512307312">Only paths or files that meet the filtering conditions are transferred. The options are <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_b1123610303127">None</strong>, <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_b924293071215">Wildcard</strong>, and <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_b9242530131215">Regex</strong>. </p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p968513303313">-</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row994336215165"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p64217335145116">Directory Filter</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p34221671145116">If you set <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmname9651163720211"><b>Filter Type</b></span> to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmvalue1199413443210"><b>Wildcard</b></span> or <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_b17182166142119">Regex</strong>, enter a wildcard pattern or regular expression to filter paths. Only the paths that meet the filtering condition are migrated. You can configure multiple paths separated by commas (,).</p>
<div class="note" id="dataartsstudio_01_0049__en-us_topic_0108275442_note363856144214"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_p52974484409_1">If you configure a date and time macro variable and schedule the CDM job through <span id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_text9997118203_1">DataArts Studio DataArts Factory</span>, the system replaces the macro variable with (<em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i799871152012_1">Planned start time of the data development job</em> adjusted by <em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i0998101192020_1">Offset</em>) rather than (<em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i299821132018_1">Actual start time of the CDM job</em> adjusted by <em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i129981917200_1">Offset</em>).</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p20491984145116">*input</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row851377714512"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p40488609145116">File Filter</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p58351869145116">If you set <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmname15716259342"><b>Filter Type</b></span> to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmvalue561362193810"><b>Wildcard</b></span> or <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmvalue1759949153411"><b>Regex</b></span>, enter a wildcard pattern or regular expression to search for files in the specified path. Only the files that meet the search criteria are migrated. You can configure multiple files separated by commas (,).</p>
<div class="note" id="dataartsstudio_01_0049__en-us_topic_0108275442_note937081116422"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_p52974484409_2">If you configure a date and time macro variable and schedule the CDM job through <span id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_text9997118203_2">DataArts Studio DataArts Factory</span>, the system replaces the macro variable with (<em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i799871152012_2">Planned start time of the data development job</em> adjusted by <em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i0998101192020_2">Offset</em>) rather than (<em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i299821132018_2">Actual start time of the CDM job</em> adjusted by <em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i129981917200_2">Offset</em>).</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p28880974145116">*.csv</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row121055161243"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p101063161414">Time Filter</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p151064164412">If you select <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_b113651633184010">Yes</strong>, files are transferred based on their modification time.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p5106201616413">Yes</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row8615115514527"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p13831194733913">Minimum Timestamp</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p18311447203911">If you set <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmname8132151819611"><b>Time Filter</b></span> to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmvalue113211184612"><b>Yes</b></span> and specify a point in time for this parameter, only the files modified at or after the specified time are transferred. The time format must be <em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i16132918961">yyyy-MM-dd HH:mm:ss</em>.</p>
<p id="dataartsstudio_01_0049__en-us_topic_0108275442_p12144111817144">This parameter can be set to a date and time macro variable. For example, <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_b97861314103">${timestamp(dateformat(yyyy-MM-dd HH:mm:ss,-90,DAY))}</strong> indicates that only files modified within the last 90 days are migrated.</p>
<div class="note" id="dataartsstudio_01_0049__en-us_topic_0108275442_note6622111464219"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_p52974484409_3">If you configure a date and time macro variable and schedule the CDM job through <span id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_text9997118203_3">DataArts Studio DataArts Factory</span>, the system replaces the macro variable with (<em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i799871152012_3">Planned start time of the data development job</em> adjusted by <em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i0998101192020_3">Offset</em>) rather than (<em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i299821132018_3">Actual start time of the CDM job</em> adjusted by <em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i129981917200_3">Offset</em>).</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p1683154713395">2019-07-01 00:00:00</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row1949013110145"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p1149121141416">Maximum Timestamp</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p34921611171416">If you set <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmname32123244113"><b>Time Filter</b></span> to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmvalue161911324415"><b>Yes</b></span> and specify a point in time for this parameter, only the files modified before the specified time are transferred. The time format must be <em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i4192321411">yyyy-MM-dd HH:mm:ss</em>.</p>
<p id="dataartsstudio_01_0049__en-us_topic_0108275442_p13254806156">This parameter can be set to a date and time macro variable. For example, <strong id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_b1521616538112">${timestamp(dateformat(yyyy-MM-dd HH:mm:ss))}</strong> indicates that only the files whose modification time is earlier than the current time are migrated.</p>
<div class="note" id="dataartsstudio_01_0049__en-us_topic_0108275442_note107252176423"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_p52974484409_4">If you configure a date and time macro variable and schedule the CDM job through <span id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_text9997118203_4">DataArts Studio DataArts Factory</span>, the system replaces the macro variable with (<em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i799871152012_4">Planned start time of the data development job</em> adjusted by <em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i0998101192020_4">Offset</em>) rather than (<em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i299821132018_4">Actual start time of the CDM job</em> adjusted by <em id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_i129981917200_4">Offset</em>).</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p8442749101515">2019-07-30 00:00:00</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row719361213207"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p1719419128201">Create Snapshot</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p17617269485">If you set this parameter to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmvalue6280638398"><b>Yes</b></span>, CDM creates a snapshot for the source directory to be migrated (the snapshot cannot be created for a single file) before it reads files from HDFS. Then CDM migrates the data in the snapshot.</p>
<p id="dataartsstudio_01_0049__en-us_topic_0108275442_p20194131216207">Only the HDFS administrator can create a snapshot. After the CDM job is completed, the snapshot is deleted.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p419411125208">No</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row142015472296"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p1273915511715">Encryption</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p87391251172">This parameter is displayed only when <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmname8968316161717"><b>File Format</b></span> is set to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmvalue12969131641710"><b>Binary</b></span>.</p>
<div class="p" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275442_p943615213200">If the source data is encrypted, CDM can decrypt it before exporting it. Select whether to decrypt the source data and which decryption algorithm to use. The options are as follows:<ul id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275442_en-us_topic_0108275319_ul44410501486"><li id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275442_en-us_topic_0108275319_li193971654134816"><strong id="dataartsstudio_01_0049__en-us_topic_0108275442_b72820180611287">NONE</strong>: Export data without decrypting it.</li><li id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275442_en-us_topic_0108275319_li4983122011199"><strong id="dataartsstudio_01_0049__en-us_topic_0108275442_b145474047611287">AES-256-GCM</strong>: Decrypt the source data using the AES 256-bit algorithm. Currently, only the AES-256-GCM (NoPadding) algorithm is supported. The same algorithm is used for encryption at the migration destination and for decryption at the migration source.</li></ul>
</div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p1173985171713">AES-256-GCM</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row8357848162917"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p2560191135113">DEK</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275442_p13560410513">This parameter is displayed only when <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmname358922962013"><b>Encryption</b></span> is set to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmvalue1958922962018"><b>AES-256-GCM</b></span>. The key consists of 64 hexadecimal characters and must be the same as the <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmname135891029132013"><b>DEK</b></span> configured during encryption. If the encryption and decryption keys are inconsistent, the system does not report an exception, but the decrypted data is incorrect.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p10560111155111">DD0AE00DFECD78BF051BCFDA25BD4E320DB0A7AC75A1F3FC3D3C56A457DCDC1B</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row926318497295"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p24773525117">IV</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275442_p24771354512">This parameter is displayed only when <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmname48104322411287"><b>Encryption</b></span> is set to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmvalue144493273311287"><b>AES-256-GCM</b></span>. The initialization vector consists of 32 hexadecimal characters and must be the same as the <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_parmname201495676911287"><b>IV</b></span> configured during encryption. If the encryption and decryption initialization vectors are inconsistent, the system does not report an exception, but the decrypted data is incorrect.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p647775195111">5C91687BA886EDCD12ACBC3FF19A3C3F</p>
</td>
</tr>
<tr id="dataartsstudio_01_0049__en-us_topic_0108275442_row665715116462"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p1638142316101">MD5 File Extension</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p10638223191015">This parameter is displayed only when <span class="parmname" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmname16121112171713"><b>File Format</b></span> is set to <span class="parmvalue" id="dataartsstudio_01_0049__en-us_topic_0108275442_en-us_topic_0108275319_parmvalue1812612217176"><b>Binary</b></span>.</p>
<p id="dataartsstudio_01_0049__en-us_topic_0108275442_p205522046409">This parameter is used to check whether the files extracted by CDM are consistent with source files. </p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0049__en-us_topic_0108275442_p7638723101010">.md5</p>
</td>
</tr>
</tbody>
</table>
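<p>The <b>Pull List File</b> row in Table 1 requires every entry of the list file to be the absolute path of a file, not a directory. The following is a minimal sketch of how such a list file could be prepared and checked before it is uploaded to OBS. The helper <b>build_list_file</b> is hypothetical and not part of CDM. The one-path-per-line layout follows the example content in the table, and treating a trailing slash as a sign of a directory is an assumption made for illustration.</p>

```python
from pathlib import PurePosixPath


def build_list_file(paths):
    """Return list-file content with one absolute file path per line.

    Hypothetical helper for illustration only; it is not a CDM API.
    """
    lines = []
    for raw in paths:
        path = raw.strip()
        if not path:
            continue  # ignore blank entries
        if not PurePosixPath(path).is_absolute():
            raise ValueError(f"not an absolute path: {path!r}")
        if path.endswith("/"):
            # Assumption: a trailing slash marks a directory, which is not allowed.
            raise ValueError(f"looks like a directory, not a file: {path!r}")
        lines.append(path)
    return "\n".join(lines) + "\n"


# The two example paths from Table 1.
content = build_list_file([
    "/mrs/job-properties/application_1634891604621_0014/job.properties",
    "/mrs/job-properties/application_1634891604621_0029/job.properties",
])
```

<p>The resulting text can be saved as, for example, a .txt file and uploaded to the OBS bucket referenced by the list file parameters.</p>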
</div>
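<p>The <b>Minimum Timestamp</b> and <b>Maximum Timestamp</b> rows in Table 1 describe a time window: files modified at or after the minimum and before the maximum are transferred. The following sketch illustrates that window only; it is not CDM's implementation. The function names, the treatment of an unset bound as unbounded, and the sample planned start time are assumptions made for illustration.</p>

```python
from datetime import datetime, timedelta

# Python equivalent of the yyyy-MM-dd HH:mm:ss format required by Table 1.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse(text):
    """Parse a timestamp written in the yyyy-MM-dd HH:mm:ss format."""
    return datetime.strptime(text, TIME_FORMAT)


def in_time_window(modified, minimum=None, maximum=None):
    """Return True if minimum <= modified < maximum.

    Assumption: a bound that is None is treated as unbounded.
    """
    if minimum is not None and modified < minimum:
        return False
    if maximum is not None and modified >= maximum:
        return False
    return True


# Example values from Table 1.
minimum = parse("2019-07-01 00:00:00")
maximum = parse("2019-07-30 00:00:00")

# A rolling lower bound in the spirit of dateformat(..., -90, DAY).
# planned_start is a hypothetical planned start time of a scheduled job.
planned_start = parse("2019-07-30 02:00:00")
rolling_minimum = planned_start + timedelta(days=-90)
```

<p>With the example values, a file modified at exactly 2019-07-01 00:00:00 falls inside the window, and a file modified at exactly 2019-07-30 00:00:00 falls outside it.</p>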
</div>
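<p>According to the <b>DEK</b> and <b>IV</b> rows in Table 1, a mismatched key or initialization vector does not raise an exception but produces incorrect decrypted data, so checking the format of both values before configuring the job can help. The following is a minimal sketch under that motivation. The helper names are hypothetical and not part of CDM, and generating values with the <b>secrets</b> module is only one possible way to obtain random hexadecimal strings of the required length.</p>

```python
import re
import secrets

HEX_RE = re.compile(r"[0-9A-Fa-f]+")


def is_hex_of_length(value, length):
    """Return True if value is exactly `length` hexadecimal characters."""
    return len(value) == length and HEX_RE.fullmatch(value) is not None


def check_key_material(dek, iv):
    """Raise ValueError if the DEK or IV does not match the lengths in Table 1."""
    if not is_hex_of_length(dek, 64):
        raise ValueError("DEK must consist of 64 hexadecimal characters")
    if not is_hex_of_length(iv, 32):
        raise ValueError("IV must consist of 32 hexadecimal characters")


# Example values from Table 1. They are public, so do not use them for real data.
dek = "DD0AE00DFECD78BF051BCFDA25BD4E320DB0A7AC75A1F3FC3D3C56A457DCDC1B"
iv = "5C91687BA886EDCD12ACBC3FF19A3C3F"
check_key_material(dek, iv)

# Fresh random values of the required lengths (32 bytes and 16 bytes).
new_dek = secrets.token_hex(32).upper()
new_iv = secrets.token_hex(16).upper()
check_key_material(new_dek, new_iv)
```

<p>A format check like this cannot detect a value that is well formed but different from the one used during encryption; the values must still be copied from the encrypting job.</p>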
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dataartsstudio_01_0047.html">Configuring CDM Source Job Parameters</a></div>
</div>
</div>