Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: chenxiaoxiong <chenxiaoxiong@huawei.com> Co-committed-by: chenxiaoxiong <chenxiaoxiong@huawei.com>
<a name="dataartsstudio_01_0050"></a><a name="dataartsstudio_01_0050"></a>
<h1 class="topictitle1">From HBase/CloudTable</h1>
<div id="body8662426"><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p516813286351">If the source link of a job is an <a href="dataartsstudio_01_0039.html">HBase</a> or <a href="dataartsstudio_01_0027.html">CloudTable</a> link, that is, if data is exported from MRS HBase, FusionInsight HBase, CloudTable, or Apache HBase, configure the source job parameters based on <a href="#dataartsstudio_01_0050__en-us_topic_0108275276_table5046103815165">Table 1</a>.</p>
<div class="note" id="dataartsstudio_01_0050__en-us_topic_0108275276_note18976163112245"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ol id="dataartsstudio_01_0050__en-us_topic_0108275276_ol12963193284516"><li id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275404_li469815359134">When you migrate data from CloudTable or HBase, CDM reads the first row of the table as a sample to build the field list. If the first row does not contain all fields of the table, you need to add the missing fields manually.</li><li id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275404_li156983351133">Because HBase is schema-less, CDM cannot obtain the data types. If the data is stored in binary format, CDM cannot parse it.</li></ol><ol start="3" id="dataartsstudio_01_0050__en-us_topic_0108275276_ol196211215464"><li id="dataartsstudio_01_0050__en-us_topic_0108275276_li116210284615">Because HBase and CloudTable are schema-less storage systems, CDM requires that numeric fields in the source be stored in regular decimal format rather than in binary format when data is exported. For example, the value 100 needs to be stored as <span class="parmvalue" id="dataartsstudio_01_0050__en-us_topic_0108275276_parmvalue18659236162419"><b>100</b></span> rather than <span class="parmvalue" id="dataartsstudio_01_0050__en-us_topic_0108275276_parmvalue19660113615246"><b>01100100</b></span>.</li></ol>
</div></div>
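<p>The difference between decimal and binary storage described in the note can be illustrated with a minimal Python sketch. It is not CDM code: <code>parse_numeric</code> is a hypothetical helper that stands in for any reader expecting decimal text, and the single-byte encoding is only one possible binary layout.</p>

```python
value = 100

# Decimal text: the three characters "1", "0", "0" encoded as bytes.
as_text = str(value).encode("utf-8")

# One possible binary layout (assumption): a single byte with bit pattern 01100100.
as_binary = bytes([value])


def parse_numeric(cell: bytes) -> int:
    """Hypothetical reader that expects decimal digits in the cell."""
    return int(cell.decode("utf-8"))


print(as_text)                      # the bytes of the text "100"
print(format(as_binary[0], "08b"))  # bit pattern of the single binary byte
print(parse_numeric(as_text))       # decimal text parses cleanly

try:
    parse_numeric(as_binary)
except ValueError as err:
    # The binary byte is not a decimal digit, so a text-based reader fails.
    print("cannot parse binary cell:", err)
```

<p>In this sketch the text form round-trips to the integer, while the binary form cannot be read as a number without knowing its encoding in advance, which a schema-less store does not record.</p>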
<div class="tablenoborder"><a name="dataartsstudio_01_0050__en-us_topic_0108275276_table5046103815165"></a><a name="en-us_topic_0108275276_table5046103815165"></a><table cellpadding="4" cellspacing="0" summary="" id="dataartsstudio_01_0050__en-us_topic_0108275276_table5046103815165" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameter description</caption><thead align="left"><tr id="dataartsstudio_01_0050__en-us_topic_0108275276_row585315215165"><th align="left" class="cellrowborder" valign="top" width="18.509999999999998%" id="mcps1.3.3.2.5.1.1"><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p10837171103918">Category</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="18.86%" id="mcps1.3.3.2.5.1.2"><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p1626397215165">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="42.230000000000004%" id="mcps1.3.3.2.5.1.3"><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p4231334915165">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20.4%" id="mcps1.3.3.2.5.1.4"><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p482921015165">Example Value</p>
</th>
</tr>
</thead>
<tbody><tr id="dataartsstudio_01_0050__en-us_topic_0108275276_row4012116315165"><td class="cellrowborder" rowspan="2" valign="top" width="18.509999999999998%" headers="mcps1.3.3.2.5.1.1 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p683715153917">Basic parameters</p>
<p id="dataartsstudio_01_0050__en-us_topic_0108275276_p1983719163918"></p>
</td>
<td class="cellrowborder" valign="top" width="18.86%" headers="mcps1.3.3.2.5.1.2 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p2858877215165">Table Name</p>
</td>
<td class="cellrowborder" valign="top" width="42.230000000000004%" headers="mcps1.3.3.2.5.1.3 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p3398923015165">Name of the HBase table that data will be exported from</p>
<p id="dataartsstudio_01_0050__en-us_topic_0108275276_p1210244910548">This parameter can be set to a macro variable of date and time, and a table name can contain multiple macro variables. When a macro variable of date and time is used with a scheduled job, incremental data can be synchronized periodically.</p>
<div class="note" id="dataartsstudio_01_0050__en-us_topic_0108275276_note1391794455811"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_p52974484409">If you have configured a macro variable of date and time and schedule a CDM job through <span id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_text9997118203">DataArts Studio DataArts Factory</span>, the system replaces the macro variable of date and time with (<em id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_i799871152012">Planned start time of the data development job</em> – <em id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_i0998101192020">Offset</em>) rather than (<em id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_i299821132018">Actual start time of the CDM job</em> – <em id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_i129981917200">Offset</em>).</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" width="20.4%" headers="mcps1.3.3.2.5.1.4 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p166427315165">TBL_2</p>
</td>
</tr>
<tr id="dataartsstudio_01_0050__en-us_topic_0108275276_row2085795312515"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.5.1.1 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p9857953135118">Column Families</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.5.1.2 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p168572535513">(Optional) Column families to which the exported data belongs</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.5.1.3 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p11857175335116">CF1&amp;CF2</p>
</td>
</tr>
<tr id="dataartsstudio_01_0050__en-us_topic_0108275276_row68131455163415"><td class="cellrowborder" rowspan="4" valign="top" width="18.509999999999998%" headers="mcps1.3.3.2.5.1.1 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p3837512391">Advanced attributes</p>
</td>
<td class="cellrowborder" valign="top" width="18.86%" headers="mcps1.3.3.2.5.1.2 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p198131455173412">Split Rowkey</p>
</td>
<td class="cellrowborder" valign="top" width="42.230000000000004%" headers="mcps1.3.3.2.5.1.3 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p14813655143413">(Optional) Whether to split a rowkey. The default value is <span class="parmvalue" id="dataartsstudio_01_0050__en-us_topic_0108275276_parmvalue164898924418270"><b>No</b></span>.</p>
</td>
<td class="cellrowborder" valign="top" width="20.4%" headers="mcps1.3.3.2.5.1.4 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p881355511346">Yes</p>
</td>
</tr>
<tr id="dataartsstudio_01_0050__en-us_topic_0108275276_row1431725743415"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.5.1.1 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p3317457133416">Rowkey Delimiter</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.5.1.2 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p1631810574340">(Optional) Delimiter used to split a rowkey. If this parameter is left empty, the rowkey will not be split.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.5.1.3 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p15318175714341">|</p>
</td>
</tr>
<tr id="dataartsstudio_01_0050__en-us_topic_0108275276_row667043895219"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.5.1.1 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p1967023835217">Start Time</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.5.1.2 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p091005834018">(Optional) <span id="dataartsstudio_01_0050__en-us_topic_0108275276_ph5654161619427">Start time (inclusive) for extracting data. The format is <em id="dataartsstudio_01_0050__en-us_topic_0108275276_i38211611191413">yyyy-MM-dd HH:mm:ss</em>. Only data generated at or after the specified time is extracted.</span></p>
<p id="dataartsstudio_01_0050__en-us_topic_0108275276_p116711580415">This parameter can be set to a macro variable of date and time. When the macro variable of date and time works with a scheduled job, the incremental data can be synchronized periodically. </p>
<div class="note" id="dataartsstudio_01_0050__en-us_topic_0108275276_note45791757134910"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_p52974484409_1">If you have configured a macro variable of date and time and schedule a CDM job through <span id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_text9997118203_1">DataArts Studio DataArts Factory</span>, the system replaces the macro variable of date and time with (<em id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_i799871152012_1">Planned start time of the data development job</em> – <em id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_i0998101192020_1">Offset</em>) rather than (<em id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_i299821132018_1">Actual start time of the CDM job</em> – <em id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_i129981917200_1">Offset</em>).</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.5.1.3 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p17670173812526">2019-01-01 20:00:00</p>
</td>
</tr>
<tr id="dataartsstudio_01_0050__en-us_topic_0108275276_row6700144113522"><td class="cellrowborder" valign="top" headers="mcps1.3.3.2.5.1.1 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p1970004165215">End Time</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.5.1.2 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p2700041145210">(Optional) <span id="dataartsstudio_01_0050__en-us_topic_0108275276_ph5201142410429">End time (exclusive) for extracting data. The format is <em id="dataartsstudio_01_0050__en-us_topic_0108275276_i1268214522145">yyyy-MM-dd HH:mm:ss</em>. Only data generated before the specified time is extracted.</span></p>
<p id="dataartsstudio_01_0050__en-us_topic_0108275276_p92671211101710">This parameter can be set to a macro variable of date and time. </p>
<div class="note" id="dataartsstudio_01_0050__en-us_topic_0108275276_note11381953104418"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_p52974484409_2">If you have configured a macro variable of date and time and schedule a CDM job through <span id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_text9997118203_2">DataArts Studio DataArts Factory</span>, the system replaces the macro variable of date and time with (<em id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_i799871152012_2">Planned start time of the data development job</em> – <em id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_i0998101192020_2">Offset</em>) rather than (<em id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_i299821132018_2">Actual start time of the CDM job</em> – <em id="dataartsstudio_01_0050__en-us_topic_0108275276_en-us_topic_0108275319_i129981917200_2">Offset</em>).</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.3.2.5.1.3 "><p id="dataartsstudio_01_0050__en-us_topic_0108275276_p17007416523">2019-02-01 20:00:00</p>
</td>
</tr>
</tbody>
</table>
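<p>As a rough illustration of the Split Rowkey and Rowkey Delimiter parameters in Table 1, the following Python sketch splits a rowkey on a delimiter and leaves it whole when the delimiter is empty. The function <code>split_rowkey</code> and the sample rowkey are hypothetical; how CDM maps the resulting parts to fields is not described in this topic.</p>

```python
from typing import List


def split_rowkey(rowkey: str, delimiter: str) -> List[str]:
    """Split a rowkey into parts; an empty delimiter means no split."""
    if not delimiter:
        return [rowkey]
    return rowkey.split(delimiter)


# Hypothetical rowkey made of three parts joined by "|".
sample = "2019|shop01|0001"

print(split_rowkey(sample, "|"))  # delimiter set: three separate parts
print(split_rowkey(sample, ""))   # delimiter empty: rowkey kept whole
```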
</div>
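<p>The Start Time and End Time semantics in Table 1, together with the replacement rule in the notes (planned start time minus offset), can be sketched in Python as follows. This is an illustration under stated assumptions, not CDM's implementation: <code>resolve_time</code> and <code>in_window</code> are hypothetical helpers, the daily schedule is made up, and the actual macro syntax is not shown.</p>

```python
from datetime import datetime, timedelta

# Python equivalent of the yyyy-MM-dd HH:mm:ss format used by Start Time and End Time.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def resolve_time(planned_start: datetime, offset: timedelta) -> str:
    """Hypothetical stand-in for macro replacement: planned start time minus offset."""
    return (planned_start - offset).strftime(TIME_FORMAT)


def in_window(generated_at: str, start_time: str, end_time: str) -> bool:
    """Start Time is inclusive, End Time is exclusive."""
    ts = datetime.strptime(generated_at, TIME_FORMAT)
    start = datetime.strptime(start_time, TIME_FORMAT)
    end = datetime.strptime(end_time, TIME_FORMAT)
    return start <= ts < end


# Assumed schedule: a daily job planned for 2019-02-01 20:00:00.
planned = datetime(2019, 2, 1, 20, 0, 0)
start_time = resolve_time(planned, timedelta(days=1))
end_time = resolve_time(planned, timedelta(0))

print(start_time, "to", end_time)
print(in_window("2019-01-31 20:00:00", start_time, end_time))  # at Start Time: included
print(in_window("2019-02-01 20:00:00", start_time, end_time))  # at End Time: excluded
```

<p>Because the window is derived from the planned start time rather than the actual one, consecutive runs in this sketch produce adjacent windows even if a job starts late, and the half-open interval keeps a record that falls exactly on a boundary from being extracted twice.</p>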
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dataartsstudio_01_0047.html">Configuring CDM Source Job Parameters</a></div>
</div>
</div>