doc-exports/docs/dataartsstudio/umn/dataartsstudio_01_0059.html
chenxiaoxiong f9e2808b7c DataArts UMN 20250810 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: chenxiaoxiong <chenxiaoxiong@huawei.com>
Co-committed-by: chenxiaoxiong <chenxiaoxiong@huawei.com>
2025-09-02 10:44:13 +00:00

<a name="dataartsstudio_01_0059"></a><a name="dataartsstudio_01_0059"></a>
<h1 class="topictitle1">From Elasticsearch or CSS</h1>
<div id="body8662426"><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p32842430161426">If the source link of a job is a link described in <a href="dataartsstudio_01_1380.html">Elasticsearch Link Parameters</a> or <a href="dataartsstudio_01_0035.html">CSS Link Parameters</a>, configure the source job parameters based on <a href="#dataartsstudio_01_0059__en-us_topic_0108275408_table5046103815165">Table 1</a>.</p>
<div class="tablenoborder"><a name="dataartsstudio_01_0059__en-us_topic_0108275408_table5046103815165"></a><a name="en-us_topic_0108275408_table5046103815165"></a><table cellpadding="4" cellspacing="0" summary="" id="dataartsstudio_01_0059__en-us_topic_0108275408_table5046103815165" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Job parameters when Elasticsearch or CSS is the source</caption><thead align="left"><tr id="dataartsstudio_01_0059__en-us_topic_0108275408_row585315215165"><th align="left" class="cellrowborder" valign="top" width="13.15%" id="mcps1.3.2.2.5.1.1"><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p10804134914373">Category</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="14.44%" id="mcps1.3.2.2.5.1.2"><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p1626397215165">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="55.08%" id="mcps1.3.2.2.5.1.3"><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p4231334915165">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="17.330000000000002%" id="mcps1.3.2.2.5.1.4"><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p482921015165">Example Value</p>
</th>
</tr>
</thead>
<tbody><tr id="dataartsstudio_01_0059__en-us_topic_0108275408_row4012116315165"><td class="cellrowborder" rowspan="2" valign="top" width="13.15%" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p2804194916375">Basic parameters</p>
</td>
<td class="cellrowborder" valign="top" width="14.44%" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p2858877215165">Index</p>
</td>
<td class="cellrowborder" valign="top" width="55.08%" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p15492661577">Elasticsearch index, which is similar to a database name in a relational database. The index name can contain only lowercase letters.</p>
</td>
<td class="cellrowborder" valign="top" width="17.330000000000002%" headers="mcps1.3.2.2.5.1.4 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p166427315165">index</p>
</td>
</tr>
<tr id="dataartsstudio_01_0059__en-us_topic_0108275408_row1497845915165"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p529563715165">Type</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p527763715824">Elasticsearch type, which is similar to a table name in a relational database. The type name can contain only lowercase letters.</p>
<div class="note" id="dataartsstudio_01_0059__en-us_topic_0108275408_note379492611815"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p47952026161812">Elasticsearch 7.x and later versions do not support custom types. Instead, only the <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b44102315505">_doc</strong> type can be used. In this case, this parameter does not take effect even if it is set.</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p3753014815165">_doc</p>
</td>
</tr>
<tr id="dataartsstudio_01_0059__en-us_topic_0108275408_row15286142463917"><td class="cellrowborder" rowspan="5" valign="top" width="13.15%" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p780414499378">Advanced attributes</p>
</td>
<td class="cellrowborder" valign="top" width="14.44%" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p142861324163913">Split Nested Field</p>
</td>
<td class="cellrowborder" valign="top" width="55.08%" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p9286172411394">(Optional) Whether to split the JSON content of the nested fields. For example, <span class="uicontrol" id="dataartsstudio_01_0059__en-us_topic_0108275408_uicontrol1089212913485"><b>a:{ b:{ c:1, d:{ e:2, f:3 } } }</b></span> can be split into <span class="uicontrol" id="dataartsstudio_01_0059__en-us_topic_0108275408_uicontrol6892592485"><b>a.b.c</b></span>, <span class="uicontrol" id="dataartsstudio_01_0059__en-us_topic_0108275408_uicontrol19892996483"><b>a.b.d.e</b></span>, and <span class="uicontrol" id="dataartsstudio_01_0059__en-us_topic_0108275408_uicontrol28924914813"><b>a.b.d.f</b></span>.</p>
</td>
<td class="cellrowborder" valign="top" width="17.330000000000002%" headers="mcps1.3.2.2.5.1.4 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p728616244391">No</p>
</td>
</tr>
<tr id="dataartsstudio_01_0059__en-us_topic_0108275408_row4983279398"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p169822715398">Filter Conditions</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><div class="p" id="dataartsstudio_01_0059__en-us_topic_0108275408_p1375811413509">(Optional) CDM migrates only the data that meets the filter conditions.<ul id="dataartsstudio_01_0059__en-us_topic_0108275408_ul195507460502"><li id="dataartsstudio_01_0059__en-us_topic_0108275408_li185501469506">Currently, only the query string (q syntax) of Elasticsearch can be used to filter source data. The q syntax is used as follows:<ul id="dataartsstudio_01_0059__en-us_topic_0108275408_ul0206183811545"><li id="dataartsstudio_01_0059__en-us_topic_0108275408_li1069253812819">For exact matching, use the <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b106920386811"><em id="dataartsstudio_01_0059__en-us_topic_0108275408_i1869217381689">column</em>:<em id="dataartsstudio_01_0059__en-us_topic_0108275408_i16921138987">data</em></strong> format. <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b218517418451"><em id="dataartsstudio_01_0059__en-us_topic_0108275408_i266113944510">column</em></strong> indicates the field name, and <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b4552453144516"><em id="dataartsstudio_01_0059__en-us_topic_0108275408_i98661352184518">data</em></strong> indicates the value to match, for example, <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b560464611">last_name:Smith</strong>.<p id="dataartsstudio_01_0059__en-us_topic_0108275408_p1652817287102">If <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b84313315548"><em id="dataartsstudio_01_0059__en-us_topic_0108275408_i101886218548">data</em></strong> is a string containing spaces, it must be enclosed in double quotation marks. If <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b59531838105314"><em id="dataartsstudio_01_0059__en-us_topic_0108275408_i148141436145315">column</em></strong> is not specified, <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b1946341620546"><em id="dataartsstudio_01_0059__en-us_topic_0108275408_i11156131611547">data</em></strong> is matched against all fields.</p>
</li><li id="dataartsstudio_01_0059__en-us_topic_0108275408_li885413832917">Multiple query conditions can be combined with connection words. The format is <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b1186083819291"><em id="dataartsstudio_01_0059__en-us_topic_0108275408_i58606386294">column</em><em id="dataartsstudio_01_0059__en-us_topic_0108275408_i88601838142912">1</em>:<em id="dataartsstudio_01_0059__en-us_topic_0108275408_i158605382292">data</em></strong><em id="dataartsstudio_01_0059__en-us_topic_0108275408_i10860338192914"><strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b1286019386295">1 </strong></em><strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b128601538112915">AND</strong> <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b3860113802920"><em id="dataartsstudio_01_0059__en-us_topic_0108275408_i78608388293">column2</em>:<em id="dataartsstudio_01_0059__en-us_topic_0108275408_i19860193817297">data2</em></strong>. The connection words can be <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b17842105220572">AND</strong>, <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b17986195412573">OR</strong>, or <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b17845195720579">NOT</strong>. They must be in uppercase, and there must be a space before and after each connection word.<p id="dataartsstudio_01_0059__en-us_topic_0108275408_p65831139152916">Example: <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b145380188338">first_name:Alec AND last_name:John</strong></p>
</li><li id="dataartsstudio_01_0059__en-us_topic_0108275408_li12314536162917">In range matching, you can directly use a condition expression to filter data. The expression is in <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b1131623672916"><em id="dataartsstudio_01_0059__en-us_topic_0108275408_i4316103611293">column</em>:&gt;<em id="dataartsstudio_01_0059__en-us_topic_0108275408_i19316133619297">data</em></strong> format. The operator can be <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b171618810014">&gt;</strong>, <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b68101911609">&gt;=</strong>, <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b11795713303">&lt;</strong>, or <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b1075081619014">&lt;=</strong>.<p id="dataartsstudio_01_0059__en-us_topic_0108275408_p6811233163817">An example is <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b7376441807">time:&gt;=1636905600000 AND time:&lt;1637078400000</strong>. The expression can also be used together with date and time macro variables, for example, <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b154530308212">createTime:&gt;=${timestamp(dateformat(yyyyMMdd,-1,DAY))} AND createTime:&lt;${timestamp(dateformat(yyyyMMdd))}</strong>.</p>
</li><li id="dataartsstudio_01_0059__en-us_topic_0108275408_li7995191751512">In range matching, you can also use the range syntax to filter data. The format is <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b182012442813"><em id="dataartsstudio_01_0059__en-us_topic_0108275408_i14201324142816">column</em>:{<em id="dataartsstudio_01_0059__en-us_topic_0108275408_i0203249284">data</em><em id="dataartsstudio_01_0059__en-us_topic_0108275408_i18201224142810">1</em> TO <em id="dataartsstudio_01_0059__en-us_topic_0108275408_i19201242287">data2</em></strong><strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b3647710131718">}</strong>. <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b2601564368">{</strong> and <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b66015560363">}</strong> indicate that the boundary value is excluded. <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b16105653618">[</strong> and <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b26115618363">]</strong> indicate that the boundary value is included. <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b1961145610366">TO</strong> must be in uppercase, and there must be a space before and after it. <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b16611556193610">*</strong> indicates that the range is unbounded on that side.<p id="dataartsstudio_01_0059__en-us_topic_0108275408_p2734852152918">For example, <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b1145415590519">time:{1636992000000 TO *]</strong> selects all data whose value is greater than 1636992000000 in the <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b36272331712">time</strong> field. The range syntax can also be used together with date and time macro variables, for example, <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b138691339275">createTime:[${timestamp(dateformat(yyyyMMdd,-1,DAY))} TO ${timestamp(dateformat(yyyyMMdd))}}</strong>.</p>
</li></ul>
</li><li id="dataartsstudio_01_0059__en-us_topic_0108275408_li1023905521910">Source data cannot be filtered using the query domain-specific language (DSL) of Elasticsearch.</li></ul>
</div>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p398627133918">last_name:Smith</p>
</td>
</tr>
<tr id="dataartsstudio_01_0059__en-us_topic_0108275408_row1969518119321"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p413919913217">Extract Meta-field</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p4695171163214">Whether to extract index meta-fields, such as _index, _type, _id, and _score.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p1269612110329">Yes</p>
</td>
</tr>
<tr id="dataartsstudio_01_0059__en-us_topic_0108275408_row927614372145"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p1339161171519">Page size</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p13277337131417">Elasticsearch page size, that is, the number of documents returned in each page of query results.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p2278163761411">1000</p>
</td>
</tr>
<tr id="dataartsstudio_01_0059__en-us_topic_0108275408_row4236655111415"><td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.1 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p1523625516143">ScrollId Time Out</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.2 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p723655551416">During an Elasticsearch scroll query, a <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b61618485353">scroll_id</strong> is recorded. When the query times out or is complete, the recorded <strong id="dataartsstudio_01_0059__en-us_topic_0108275408_b79098393369">scroll_id</strong> is cleared. You can set this parameter to specify the timeout duration.</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.2.2.5.1.3 "><p id="dataartsstudio_01_0059__en-us_topic_0108275408_p1023685515143">5</p>
</td>
</tr>
</tbody>
</table>
</div>
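<p>The effect of <strong>Split Nested Field</strong> in Table 1 can be pictured with the following minimal Python sketch. It is not the CDM implementation. The function name <strong>split_nested</strong> is hypothetical and only illustrates how nested JSON content maps to dotted field names such as <strong>a.b.c</strong>.</p>
<pre class="screen">
```python
def split_nested(record, prefix=""):
    """Flatten nested dicts into dotted field names (illustration only)."""
    flat = {}
    for key, value in record.items():
        # Build the dotted path, for example "a" then "a.b" then "a.b.c".
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into the nested object and merge its flattened fields.
            flat.update(split_nested(value, path))
        else:
            flat[path] = value
    return flat


# Same document as the example in the parameter description.
doc = {"a": {"b": {"c": 1, "d": {"e": 2, "f": 3}}}}
print(split_nested(doc))
```
</pre>
<p>For the sample document, the sketch yields the fields <strong>a.b.c</strong>, <strong>a.b.d.e</strong>, and <strong>a.b.d.f</strong>, which matches the split described for this parameter.</p>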
</div>
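<p>The q syntax rules listed for <strong>Filter Conditions</strong> in Table 1 can be assembled programmatically. The following Python sketch is only an illustration under stated assumptions: the helper names <strong>term</strong>, <strong>value_range</strong>, and <strong>combine</strong> and the field <strong>city</strong> are hypothetical and are not part of CDM or Elasticsearch. The sketch only builds strings that you could paste into the <strong>Filter Conditions</strong> field.</p>
<pre class="screen">
```python
def term(column, data):
    """Exact match in column:data format; quote data that contains spaces."""
    text = str(data)
    if " " in text:
        text = f'"{text}"'
    return f"{column}:{text}"


def value_range(column, low, high, include_low=True, include_high=False):
    """Range syntax: [ ] include the boundary value, { } exclude it.

    Pass "*" as low or high for a range that is unbounded on that side.
    """
    left = "[" if include_low else "{"
    right = "]" if include_high else "}"
    # TO must be uppercase with a space before and after it.
    return f"{column}:{left}{low} TO {high}{right}"


def combine(connector, *conditions):
    """Join conditions with an uppercase connection word and single spaces."""
    if connector not in ("AND", "OR", "NOT"):
        raise ValueError("connector must be AND, OR, or NOT in uppercase")
    return f" {connector} ".join(conditions)


print(term("last_name", "Smith"))
print(term("city", "New York"))  # hypothetical field with a spaced value
print(combine("AND", term("first_name", "Alec"), term("last_name", "John")))
print(value_range("time", 1636992000000, "*", include_low=False, include_high=True))
```
</pre>
<p>Keeping the quoting and spacing rules in small helpers makes it harder to produce a filter with a lowercase connection word or a missing space, which the q syntax does not accept.</p>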
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dataartsstudio_01_0047.html">Configuring CDM Source Job Parameters</a></div>
</div>
</div>