Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: chenxiaoxiong <chenxiaoxiong@huawei.com> Co-committed-by: chenxiaoxiong <chenxiaoxiong@huawei.com>
289 lines
61 KiB
HTML
<a name="dataartsstudio_01_0092"></a>
|
|
|
|
<h1 class="topictitle1">Migrating Data from MySQL to MRS Hive</h1>
|
|
<div id="body8662426"><p id="dataartsstudio_01_0092__en-us_topic_0111325168_p66608011462">MRS provides enterprise-level big data clusters on the cloud. It contains the HDFS, Hive, and Spark components and is suited to large-scale enterprise data analysis.</p>
|
|
<p id="dataartsstudio_01_0092__en-us_topic_0111325168_p17289542509">Hive supports SQL to help users perform extraction, transformation, and loading (ETL) operations on large-scale data sets. Queries on large-scale data sets take a long time. In many scenarios, you can create Hive partitions to reduce the amount of data scanned by each query, which significantly improves query performance.</p>
|
|
<p id="dataartsstudio_01_0092__en-us_topic_0111325168_p928917411503">Hive partitions are implemented as HDFS subdirectories. The name of each subdirectory contains the column name and value of the partition. A table with many partitions therefore has many HDFS subdirectories, and loading external data into each partition of the Hive table is difficult without tools. With CDM, you can easily load data from external data sources (relational databases, object storage services, and file system services) to Hive partition tables.</p>
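<p>As a rough illustration of the directory layout and of why manual loading is tedious, the following HiveQL sketch uses a hypothetical table <strong>demo_logs</strong> and hypothetical HDFS file paths. Neither is part of this scenario. Without a tool, each partition needs its own load statement:</p>
<div class="codecoloring" codetype="Sql"><pre>-- Hypothetical table, partitioned by a single integer date column.
CREATE TABLE demo_logs (id INT, msg STRING) PARTITIONED BY (dt INT);

-- Each partition must be loaded separately (the file paths are assumptions).
LOAD DATA INPATH '/tmp/demo_logs/20180511.txt' INTO TABLE demo_logs PARTITION (dt=20180511);
LOAD DATA INPATH '/tmp/demo_logs/20180512.txt' INTO TABLE demo_logs PARTITION (dt=20180512);

-- Each partition is stored in its own subdirectory of the table directory,
-- named after the partition column and value, for example .../demo_logs/dt=20180511.
SHOW PARTITIONS demo_logs;

-- A filter on the partition column lets Hive read only the matching subdirectory.
SELECT count(*) FROM demo_logs WHERE dt = 20180511;
</pre></div>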
|
|
<p id="dataartsstudio_01_0092__en-us_topic_0111325168_p11937920913">This section describes how to migrate data from a MySQL database to an MRS Hive partition table.</p>
|
|
<div class="section" id="dataartsstudio_01_0092__en-us_topic_0111325168_section848194854517"><h4 class="sectiontitle">Scenario</h4><p id="dataartsstudio_01_0092__en-us_topic_0111325168_p1923117034619">Suppose that the MySQL database contains a <strong id="dataartsstudio_01_0092__b3708132191312">trip_data</strong> table that stores cycling records, including the start time, end time, start station, end station, and rider ID of each ride. For details about the fields in the <strong id="dataartsstudio_01_0092__b970819241318">trip_data</strong> table, see <a href="#dataartsstudio_01_0092__en-us_topic_0111325168_fig5406153795610">Figure 1</a>.</p>
|
|
<div class="fignone" id="dataartsstudio_01_0092__en-us_topic_0111325168_fig5406153795610"><a name="dataartsstudio_01_0092__en-us_topic_0111325168_fig5406153795610"></a><a name="en-us_topic_0111325168_fig5406153795610"></a><span class="figcap"><b>Figure 1 </b>MySQL table fields</span><br><span><img id="dataartsstudio_01_0092__en-us_topic_0111325168_image19881151119589" src="en-us_image_0000002234085224.png"></span></div>
|
|
<p id="dataartsstudio_01_0092__en-us_topic_0111325168_p5917174621820">The following describes how to use CDM to import the <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b3891741111810">trip_data</strong> table in the MySQL database to the MRS Hive partition table. The procedure is as follows:</p>
|
|
<ol id="dataartsstudio_01_0092__en-us_topic_0111325168_ol873184981913"><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li137318493195"><a href="#dataartsstudio_01_0092__en-us_topic_0111325168_section143383811272">Creating a Hive Partition Table on MRS Hive</a></li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li460161173519"><a href="#dataartsstudio_01_0092__en-us_topic_0111325168_section563314494359">Creating a CDM Cluster and Binding an EIP to the Cluster</a></li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li277115413438"><a href="#dataartsstudio_01_0092__en-us_topic_0111325168_section459563891734">Creating a MySQL Link</a></li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li12787439195215"><a href="#dataartsstudio_01_0092__en-us_topic_0111325168_section209397834812">Creating a Hive Link</a></li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li030310465524"><a href="#dataartsstudio_01_0092__en-us_topic_0111325168_section1821596484">Creating a Migration Job</a></li></ol>
|
|
</div>
|
|
<div class="section" id="dataartsstudio_01_0092__en-us_topic_0111325168_section425442671733"><h4 class="sectiontitle">Prerequisites</h4><ul id="dataartsstudio_01_0092__en-us_topic_0111325168_ul45433458114323"><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li53278713114323">MRS is available.</li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li2723271211445">You have obtained the IP address, port, database name, username, and password for connecting to the MySQL database. In addition, the user must have the read and write permissions on the MySQL database.</li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li178721321413">You have uploaded the MySQL database driver on the <span class="menucascade" id="dataartsstudio_01_0092__en-us_topic_0111325168_en-us_topic_0286032703_menucascade177204413614"><b><span class="uicontrol" id="dataartsstudio_01_0092__en-us_topic_0111325168_en-us_topic_0286032703_uicontrol17264483610">Job Management</span></b> > <b><span class="uicontrol" id="dataartsstudio_01_0092__en-us_topic_0111325168_en-us_topic_0286032703_uicontrol1072134453617">Links</span></b> > <b><span class="uicontrol" id="dataartsstudio_01_0092__en-us_topic_0111325168_en-us_topic_0286032703_uicontrol12728448364">Driver Management</span></b></span> page.</li></ul>
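<p>The permission prerequisite could be met with MySQL statements along the following lines. This is only a sketch: the account name <strong>cdm_user</strong>, the host pattern, and the password are placeholders, the database name <strong>sqoop</strong> is the example value used later in Table 1, and the mapping of "read and write permissions" to these specific privileges is an assumption.</p>
<div class="codecoloring" codetype="Sql"><pre>-- Run as a MySQL administrator. All names and the password are placeholders.
CREATE USER 'cdm_user'@'%' IDENTIFIED BY 'choose-a-strong-password';

-- Read (SELECT) and write (INSERT, UPDATE) privileges on the example database.
GRANT SELECT, INSERT, UPDATE ON sqoop.* TO 'cdm_user'@'%';

-- Confirm what the account is allowed to do.
SHOW GRANTS FOR 'cdm_user'@'%';
</pre></div>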
|
|
</div>
|
|
<div class="section" id="dataartsstudio_01_0092__en-us_topic_0111325168_section143383811272"><a name="dataartsstudio_01_0092__en-us_topic_0111325168_section143383811272"></a><a name="en-us_topic_0111325168_section143383811272"></a><h4 class="sectiontitle">Creating a Hive Partition Table on MRS Hive</h4><div class="p" id="dataartsstudio_01_0092__en-us_topic_0111325168_p1748723119286">On MRS Hive, run the following SQL statement to create a Hive partition table named <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b07221049142315">trip_data</strong> with three new fields <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b770820436267">y</strong>, <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b1652214692617">ym</strong>, and <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b15444450142612">ymd</strong> used as partition fields. The SQL statement is as follows:<div class="codecoloring" codetype="Sql" id="dataartsstudio_01_0092__en-us_topic_0111325168_screen2431123263012"><div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal">1</span></pre></div></td><td class="code"><div><pre><span></span><span class="k">create</span><span class="w"> </span><span class="k">table</span><span class="w"> </span><span class="n">trip_data</span><span class="p">(</span><span class="n">TripID</span><span class="w"> </span><span class="nb">int</span><span class="p">,</span><span class="n">Duration</span><span class="w"> </span><span class="nb">int</span><span class="p">,</span><span class="n">StartDate</span><span class="w"> </span><span class="k">timestamp</span><span class="p">,</span><span class="n">StartStation</span><span class="w"> </span><span class="nb">varchar</span><span class="p">(</span><span class="mi">64</span><span class="p">),</span><span class="n">StartTerminal</span><span class="w"> </span><span class="nb">int</span><span class="p">,</span><span 
class="n">EndDate</span><span class="w"> </span><span class="k">timestamp</span><span class="p">,</span><span class="n">EndStation</span><span class="w"> </span><span class="nb">varchar</span><span class="p">(</span><span class="mi">64</span><span class="p">),</span><span class="n">EndTerminal</span><span class="w"> </span><span class="nb">int</span><span class="p">,</span><span class="n">Bike</span><span class="w"> </span><span class="nb">int</span><span class="p">,</span><span class="n">SubscriberType</span><span class="w"> </span><span class="nb">varchar</span><span class="p">(</span><span class="mi">32</span><span class="p">),</span><span class="n">ZipCodev</span><span class="w"> </span><span class="nb">varchar</span><span class="p">(</span><span class="mi">10</span><span class="p">))</span><span class="n">partitioned</span><span class="w"> </span><span class="k">by</span><span class="w"> </span><span class="p">(</span><span class="n">y</span><span class="w"> </span><span class="nb">int</span><span class="p">,</span><span class="n">ym</span><span class="w"> </span><span class="nb">int</span><span class="p">,</span><span class="n">ymd</span><span class="w"> </span><span class="nb">int</span><span class="p">);</span>
|
|
</pre></div></td></tr></table></div>
|
|
</div>
|
|
<div class="note" id="dataartsstudio_01_0092__en-us_topic_0111325168_note176471452182510"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="dataartsstudio_01_0092__en-us_topic_0111325168_p14647252102517">The <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b174244201275">trip_data</strong> partition table has three partition fields, all derived from the start time of a ride: the year (y), the year and month (ym), and the year, month, and day (ymd). For example, if the start time of a ride is <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b1190455310292">2018/5/11 9:40</strong>, the record is saved in the <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b1362185042920">trip_data/2018/201805/20180511</strong> partition. When records in the <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b1621671817366">trip_data</strong> table are summarized by time, only the matching partitions need to be scanned, which greatly improves query performance.</p>
|
|
</div></div>
|
|
</div>
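<p>The following HiveQL sketch illustrates how the three partition values relate to a start time and how a query can restrict itself to one partition. It is independent of how the migration job fills the partition fields and assumes a Hive version that provides the built-in <strong>date_format</strong> function.</p>
<div class="codecoloring" codetype="Sql"><pre>-- Partition values for a ride that starts at 2018/5/11 9:40.
SELECT year(t.ts)                                 AS y,
       cast(date_format(t.ts, 'yyyyMM')   AS int) AS ym,
       cast(date_format(t.ts, 'yyyyMMdd') AS int) AS ymd
FROM (SELECT cast('2018-05-11 09:40:00' AS timestamp) AS ts) t;
-- y = 2018, ym = 201805, ymd = 20180511

-- Summarize the rides of a single day. Because the filter uses the
-- partition fields, only the matching partition is scanned.
SELECT count(*) AS rides, avg(Duration) AS avg_duration
FROM trip_data
WHERE y = 2018 AND ym = 201805 AND ymd = 20180511;
</pre></div>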
|
|
</div>
|
|
<div class="section" id="dataartsstudio_01_0092__en-us_topic_0111325168_section563314494359"><a name="dataartsstudio_01_0092__en-us_topic_0111325168_section563314494359"></a><a name="en-us_topic_0111325168_section563314494359"></a><h4 class="sectiontitle">Creating a CDM Cluster and Binding an EIP to the Cluster</h4><ol id="dataartsstudio_01_0092__en-us_topic_0111325168_ol686765753513"><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li19019556299"><span id="dataartsstudio_01_0092__en-us_topic_0111325168_p192201826193011">Create a CDM cluster. The key configurations are as follows:</span><p><ul id="dataartsstudio_01_0092__en-us_topic_0111325168_u4fa55d1ed2cb44c8bd08d0cb52a91fdb"><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li603163259221">Select the CDM cluster flavor based on the amount of data to be migrated. Generally, cdm.medium meets the requirements of most migration scenarios.</li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_l965568d35ef54f2f840ec6af49e68d19">The CDM and MRS clusters must be in the same VPC, subnet, and security group.</li></ul>
|
|
</p></li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li1844914916273"><span>After the CDM cluster is created, on the <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b20761154618342">Cluster Management</strong> page, click <span class="uicontrol" id="dataartsstudio_01_0092__en-us_topic_0111325168_uicontrol14761846163410"><b>Bind EIP</b></span> in the <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b7761194617342">Operation</strong> column to bind an EIP to the cluster. The CDM cluster uses the EIP to access MySQL.</span><p><div class="note" id="dataartsstudio_01_0092__en-us_topic_0111325168_note24642513383"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="dataartsstudio_01_0092__en-us_topic_0108275471_p1693084519332">If SSL encryption is configured for the access channel of a local data source, CDM cannot connect to the data source using the EIP.</p>
|
|
</div></div>
|
|
</p></li></ol>
|
|
</div>
|
|
<div class="section" id="dataartsstudio_01_0092__en-us_topic_0111325168_section459563891734"><a name="dataartsstudio_01_0092__en-us_topic_0111325168_section459563891734"></a><a name="en-us_topic_0111325168_section459563891734"></a><h4 class="sectiontitle">Creating a MySQL Link</h4><ol id="dataartsstudio_01_0092__en-us_topic_0111325168_ol1175033817329"><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li16641390115738"><span>On the <span class="uicontrol" id="dataartsstudio_01_0092__uicontrol388595215151"><b>Cluster Management</b></span> page, locate a cluster and click <span class="uicontrol" id="dataartsstudio_01_0092__uicontrol688518528151"><b>Job Management</b></span> in the <strong id="dataartsstudio_01_0092__b88861552101514">Operation</strong> column. On the displayed page, click the <strong id="dataartsstudio_01_0092__b38861152151518">Links</strong> tab and then <strong id="dataartsstudio_01_0092__b888605231519">Create Link</strong>.</span></li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li55925676152157"><span>Select <span class="uicontrol" id="dataartsstudio_01_0092__uicontrol14592130121615"><b>RDS for MySQL</b></span> and click <span class="uicontrol" id="dataartsstudio_01_0092__uicontrol17592150171618"><b>Next</b></span> to set the link parameters.</span><p><div class="fignone" id="dataartsstudio_01_0092__en-us_topic_0108275298_fig17438459415"><span class="figcap"><b>Figure 2 </b>Creating a MySQL Link</span><br><span><img id="dataartsstudio_01_0092__en-us_topic_0108275298_image121465475116" src="en-us_image_0000002234245040.png" title="Click to enlarge" class="imgResize"></span></div>
|
|
<p id="dataartsstudio_01_0092__en-us_topic_0108275298_p30757446152044">Click <strong id="dataartsstudio_01_0092__b858514114166">Show Advanced Attributes</strong> to view more optional parameters. For details, see <a href="dataartsstudio_01_1211.html">RDS for MySQL/MySQL Database Link Parameters</a>. Retain the default values for the optional parameters and configure the mandatory parameters described in <a href="#dataartsstudio_01_0092__en-us_topic_0108275298_table5321744015490">Table 1</a>.</p>
|
|
|
|
<div class="tablenoborder"><a name="dataartsstudio_01_0092__en-us_topic_0108275298_table5321744015490"></a><a name="en-us_topic_0108275298_table5321744015490"></a><table cellpadding="4" cellspacing="0" summary="" id="dataartsstudio_01_0092__en-us_topic_0108275298_table5321744015490" frame="border" border="1" rules="all"><caption><b>Table 1 </b>MySQL link parameters</caption><thead align="left"><tr id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_row185605615490"><th align="left" class="cellrowborder" valign="top" width="21.39%" id="mcps1.3.9.2.2.2.3.2.4.1.1"><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p3088488815490">Parameter</p>
|
|
</th>
|
|
<th align="left" class="cellrowborder" valign="top" width="46.01%" id="mcps1.3.9.2.2.2.3.2.4.1.2"><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p1864797615490">Description</p>
|
|
</th>
|
|
<th align="left" class="cellrowborder" valign="top" width="32.6%" id="mcps1.3.9.2.2.2.3.2.4.1.3"><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p12195902165556">Example Value</p>
|
|
</th>
|
|
</tr>
|
|
</thead>
|
|
<tbody><tr id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_row6448267615421"><td class="cellrowborder" valign="top" width="21.39%" headers="mcps1.3.9.2.2.2.3.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p5571423915421">Name</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="46.01%" headers="mcps1.3.9.2.2.2.3.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p1655951515421">Enter a unique link name.</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="32.6%" headers="mcps1.3.9.2.2.2.3.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p6625233515421">mysqllink</p>
|
|
</td>
|
|
</tr>
|
|
<tr id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_row23645714155554"><td class="cellrowborder" valign="top" width="21.39%" headers="mcps1.3.9.2.2.2.3.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p36254680155554">Database Server</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="46.01%" headers="mcps1.3.9.2.2.2.3.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p57055815164650">IP address or domain name of the MySQL database</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="32.6%" headers="mcps1.3.9.2.2.2.3.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p54006514165556">N/A</p>
|
|
</td>
|
|
</tr>
|
|
<tr id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_row35721234155558"><td class="cellrowborder" valign="top" width="21.39%" headers="mcps1.3.9.2.2.2.3.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p7738819155558">Port</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="46.01%" headers="mcps1.3.9.2.2.2.3.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p44462215165646">MySQL database port</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="32.6%" headers="mcps1.3.9.2.2.2.3.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p44954710165556">3306</p>
|
|
</td>
|
|
</tr>
|
|
<tr id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_row58054787162632"><td class="cellrowborder" valign="top" width="21.39%" headers="mcps1.3.9.2.2.2.3.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p4817321162632">Database Name</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="46.01%" headers="mcps1.3.9.2.2.2.3.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p23569444165647">Name of the MySQL database</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="32.6%" headers="mcps1.3.9.2.2.2.3.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p22858665165556">sqoop</p>
|
|
</td>
|
|
</tr>
|
|
<tr id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_row121116115490"><td class="cellrowborder" valign="top" width="21.39%" headers="mcps1.3.9.2.2.2.3.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p3099525315490">Username</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="46.01%" headers="mcps1.3.9.2.2.2.3.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p2758753215490">User who has the read, write, and delete permissions on the MySQL database</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="32.6%" headers="mcps1.3.9.2.2.2.3.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p14053644165556">admin</p>
|
|
</td>
|
|
</tr>
|
|
<tr id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_row4576104015490"><td class="cellrowborder" valign="top" width="21.39%" headers="mcps1.3.9.2.2.2.3.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p1565673415490">Password</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="46.01%" headers="mcps1.3.9.2.2.2.3.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p6023590815490">Password of the user</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="32.6%" headers="mcps1.3.9.2.2.2.3.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p44559445165556">N/A</p>
|
|
</td>
|
|
</tr>
|
|
<tr id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_row1695382013335"><td class="cellrowborder" valign="top" width="21.39%" headers="mcps1.3.9.2.2.2.3.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_p1395410209337">Use Local API</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="46.01%" headers="mcps1.3.9.2.2.2.3.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_p16954220143310">Whether to use the local API of the database for acceleration. (The system attempts to enable the <strong id="dataartsstudio_01_0092__b65036418072051">local_infile</strong> system variable of the MySQL database.)</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="32.6%" headers="mcps1.3.9.2.2.2.3.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_p195462013337">Yes</p>
|
|
</td>
|
|
</tr>
|
|
<tr id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_row117692617437"><td class="cellrowborder" valign="top" width="21.39%" headers="mcps1.3.9.2.2.2.3.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p18773153334318">Use Agent</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="46.01%" headers="mcps1.3.9.2.2.2.3.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p877373317439">Whether to extract data from the data source through an agent</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="32.6%" headers="mcps1.3.9.2.2.2.3.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_en-us_topic_0284710796_en-us_topic_0111325168_en-us_topic_0108275298_p1977311335439">No</p>
|
|
</td>
|
|
</tr>
|
|
<tr id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_row189746446585"><td class="cellrowborder" valign="top" width="21.39%" headers="mcps1.3.9.2.2.2.3.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_p179755443589">local_infile Character Set</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="46.01%" headers="mcps1.3.9.2.2.2.3.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_p497514412583">Encoding format used when local_infile is used to import data to MySQL</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="32.6%" headers="mcps1.3.9.2.2.2.3.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_p9975244135817">utf8</p>
|
|
</td>
|
|
</tr>
|
|
<tr id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_row29922517620"><td class="cellrowborder" valign="top" width="21.39%" headers="mcps1.3.9.2.2.2.3.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_p199921356617">Driver Version</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="46.01%" headers="mcps1.3.9.2.2.2.3.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_p189932517616">Before connecting CDM to a relational database, you need to upload the JDK 8 .jar driver of the relational database. Download the MySQL driver 5.1.48 from <a href="https://downloads.mysql.com/archives/c-j/" target="_blank" rel="noopener noreferrer">https://downloads.mysql.com/archives/c-j/</a>, obtain <strong id="dataartsstudio_01_0092__b19411746193210">mysql-connector-java-5.1.48.jar</strong>, and upload it.</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="32.6%" headers="mcps1.3.9.2.2.2.3.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108275298_en-us_topic_0000001147041354_p899375564">N/A</p>
|
|
</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
</div>
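<p>Before saving the link, you may want to check the MySQL side with a few statements such as the following. This is an optional sketch, not part of the CDM procedure: it assumes you are connected to the database from Table 1 as the link user, and changing <strong>local_infile</strong> globally requires administrator privileges.</p>
<div class="codecoloring" codetype="Sql"><pre>-- Check whether the local_infile system variable is enabled.
SHOW GLOBAL VARIABLES LIKE 'local_infile';

-- If the value is OFF, an administrator can enable it.
SET GLOBAL local_infile = 1;

-- Review the permissions of the account used in the link.
SHOW GRANTS FOR CURRENT_USER();

-- Confirm that the source table is readable.
SELECT count(*) FROM trip_data;
</pre></div>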
|
|
</p></li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li1697949616149"><span>Click <span class="uicontrol" id="dataartsstudio_01_0092__en-us_topic_0111325168_uicontrol5119298426"><b>Save</b></span>. The <span class="uicontrol" id="dataartsstudio_01_0092__en-us_topic_0111325168_uicontrol31182964212"><b>Link Management</b></span> page is displayed.</span><p><div class="note" id="dataartsstudio_01_0092__en-us_topic_0111325168_note1896191114270"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="dataartsstudio_01_0092__en-us_topic_0111325168_p296110116274">If an error occurs when you save the link, the security settings of the MySQL database may be incorrect. In this case, allow the EIP of the CDM cluster to access the MySQL database.</p>
|
|
</div></div>
|
|
</p></li></ol>
|
|
</div>
|
|
<div class="section" id="dataartsstudio_01_0092__en-us_topic_0111325168_section209397834812"><a name="dataartsstudio_01_0092__en-us_topic_0111325168_section209397834812"></a><a name="en-us_topic_0111325168_section209397834812"></a><h4 class="sectiontitle">Creating a Hive Link</h4><ol id="dataartsstudio_01_0092__en-us_topic_0111325168_ol153469406486"><li id="dataartsstudio_01_0092__en-us_topic_0111325168_en-us_topic_0108275437_li135396271350"><span>Click <span class="uicontrol" id="dataartsstudio_01_0092__uicontrol147351318123312"><b>Job Management</b></span> in the <strong id="dataartsstudio_01_0092__b1373581813313">Operation</strong> column of the CDM cluster. On the displayed page, click the <strong id="dataartsstudio_01_0092__b373541883314">Links</strong> tab and then <strong id="dataartsstudio_01_0092__b127361184334">Create Link</strong>. The <strong id="dataartsstudio_01_0092__b473611181336">Select Connector</strong> page is displayed.</span><p><div class="fignone" id="dataartsstudio_01_0092__en-us_topic_0111325168_en-us_topic_0108275437_en-us_topic_0108275298_fig13640155194015"><span class="figcap"><b>Figure 3 </b>Selecting a connector type</span><br><span><img id="dataartsstudio_01_0092__en-us_topic_0108275477_image53893842012" src="en-us_image_0000002234235252.png" title="Click to enlarge" class="imgResize"></span></div>
|
|
</p></li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li71891144134815"><span>Select <strong id="dataartsstudio_01_0092__b1296182933319">MRS Hive</strong> and click <strong id="dataartsstudio_01_0092__b496122983315">Next</strong> to configure parameters for the MRS Hive link.</span><p><p id="dataartsstudio_01_0092__en-us_topic_0111325168_p13261143681414"><a href="#dataartsstudio_01_0092__en-us_topic_0111325168_table6441152003419">Table 2</a> lists the parameters. Configure these parameters based on your actual situation.</p>
|
|
|
|
<div class="tablenoborder"><a name="dataartsstudio_01_0092__en-us_topic_0111325168_table6441152003419"></a><a name="en-us_topic_0111325168_table6441152003419"></a><table cellpadding="4" cellspacing="0" summary="" id="dataartsstudio_01_0092__en-us_topic_0111325168_table6441152003419" frame="border" border="1" rules="all"><caption><b>Table 2 </b>MRS Hive link parameters</caption><thead align="left"><tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row918111835015"><th align="left" class="cellrowborder" valign="top" width="16.72%" id="mcps1.3.10.2.2.2.2.2.4.1.1"><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p111816835015">Parameter</p>
|
|
</th>
|
|
<th align="left" class="cellrowborder" valign="top" width="62.1%" id="mcps1.3.10.2.2.2.2.2.4.1.2"><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p171813845018">Description</p>
|
|
</th>
|
|
<th align="left" class="cellrowborder" valign="top" width="21.18%" id="mcps1.3.10.2.2.2.2.2.4.1.3"><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p8181198115015">Example Value</p>
|
|
</th>
|
|
</tr>
|
|
</thead>
|
|
<tbody><tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row151817815505"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p218198205013">Name</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p1369564463813">Link name, which should be defined based on the data source type, so it is easier to remember what the link is for</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p10181138135017">hivelink</p>
|
|
</td>
|
|
</tr>
|
|
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row16181208165013"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p21818818506">Manager IP</p>
|
|
</td>
|
|
<td class="cellrowborder" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><div class="p" id="dataartsstudio_01_0092__en-us_topic_0108618545_p218148165010">Floating IP address of MRS Manager. Click <span class="uicontrol" id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0108275286_uicontrol578864992012"><b>Select</b></span> next to the <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0108275286_b2078834920203">Manager IP</strong> text box to select an MRS cluster. CDM automatically fills in the authentication information.<div class="note" id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0108275286_note1340014473413"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0108275286_en-us_topic_0182566327_p116249795810"><span id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0108275286_en-us_topic_0182566327_text7355181825814">DataArts Studio</span> does not support MRS clusters whose Kerberos encryption type is <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0108275286_en-us_topic_0182566327_b143479402717">aes256-sha2,aes128-sha2</strong>, and only supports MRS clusters whose Kerberos encryption type is <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0108275286_en-us_topic_0182566327_b19532103219813">aes256-sha1,aes128-sha1</strong>.</p>
</div></div>
</div>
</td>
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p718110815509">127.0.0.1</p>
</td>
</tr>
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row7181168105018"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p141816885017">Authentication Method</p>
</td>
<td class="cellrowborder" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><div class="p" id="dataartsstudio_01_0092__en-us_topic_0108618545_p11181148185017">Authentication method used for accessing MRS<ul id="dataartsstudio_01_0092__en-us_topic_0108618545_ul12623191718453"><li id="dataartsstudio_01_0092__en-us_topic_0108618545_li1362321718457"><strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b84235270611123">SIMPLE</strong>: Select this for non-security mode.</li><li id="dataartsstudio_01_0092__en-us_topic_0108618545_li762371724519"><strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b842352706183921">KERBEROS</strong>: Select this for security mode.</li></ul>
</div>
</td>
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p818111819508">SIMPLE</p>
</td>
</tr>
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row6181138195013"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p51811188500">HIVE Version</p>
</td>
<td class="cellrowborder" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p111811682501">Set this to the Hive version on the server.</p>
</td>
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p191811987501">HIVE_3_X</p>
</td>
</tr>
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row818110813502"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p1818118185015">Username</p>
</td>
<td class="cellrowborder" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p1018178155019">If <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0108275286_b7848458134816">Authentication Method</strong> is set to <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0108275286_b12431727490">KERBEROS</strong>, you must provide the username and password used for logging in to MRS Manager. If you need to create a snapshot when exporting a directory from HDFS, the user configured here must have the administrator permission on HDFS.</p>
<div class="p" id="dataartsstudio_01_0092__en-us_topic_0108618545_p129641211269">To create a data connection for an MRS security cluster, do not use user <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b554524219105248">admin</strong>. The <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b123106598105248">admin</strong> user is the default management page user and cannot be used as the authentication user of the security cluster. You can create an MRS user and set <span class="parmname" id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0187412481_parmname3468191262313"><b>Username</b></span> and <span class="parmname" id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0187412481_parmname64683124231"><b>Password</b></span> to the username and password of the created MRS user when creating an MRS data connection.<div class="note" id="dataartsstudio_01_0092__en-us_topic_0108618545_note15451659151217"><span class="notetitle"> NOTE: </span><div class="notebody"><ul id="dataartsstudio_01_0092__en-us_topic_0108618545_ul17715141011134"><li id="dataartsstudio_01_0092__en-us_topic_0108618545_li8715121031318">If the CDM cluster version is 2.9.0 or later and the MRS cluster version is 3.1.0 or later, the created user must have the permissions of the <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b8979935645">Manager_viewer</strong> role to create links on CDM. 
To perform operations on databases, tables, and columns of an MRS component, you must also grant the user the database, table, and column permissions of that component by following the instructions in the MRS documentation.</li><li id="dataartsstudio_01_0092__en-us_topic_0108618545_li5415103511136">If the CDM cluster version is earlier than 2.9.0 or the MRS cluster version is earlier than 3.1.0, the created user must have the permissions of <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b1088016562338">Manager_administrator</strong> or <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b1288695693319">System_administrator</strong> to create links on CDM.</li><li id="dataartsstudio_01_0092__en-us_topic_0108618545_li1733135017192">A user with only the <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b6765133911341">Manager_tenant</strong> or <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b1377216393342">Manager_auditor</strong> permission cannot create links.</li></ul>
</div></div>
</div>
</td>
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p1318119865013">cdm</p>
</td>
</tr>
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row8181188155013"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p12181583505">Password</p>
</td>
<td class="cellrowborder" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p14181480502">Password used for logging in to MRS Manager</p>
</td>
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p6181689507">-</p>
</td>
</tr>
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row18166181464511"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p61671114124511">Enable ldap</p>
</td>
<td class="cellrowborder" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p2245953101014">This parameter is available when <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b1485464294">Proxy connection</strong> is selected for <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b189146182920">Connection Type</strong>.</p>
<p id="dataartsstudio_01_0092__en-us_topic_0108618545_p171436423112">If LDAP authentication is enabled for an external LDAP server connected to MRS Hive, the LDAP username and password are required for authenticating the connection to MRS Hive. In this case, this option must be enabled. Otherwise, the connection will fail.</p>
</td>
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p20167141444513">No</p>
</td>
</tr>
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row167980166457"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p2798416154516">ldapUsername</p>
</td>
<td class="cellrowborder" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p228323461719">This parameter is mandatory when <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b2055819378149">Enable ldap</strong> is enabled.</p>
<p id="dataartsstudio_01_0092__en-us_topic_0108618545_p720114494318">Enter the username configured when LDAP authentication was enabled for MRS Hive.</p>
</td>
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p879841613453">-</p>
</td>
</tr>
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row203131218453"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p173191212454">ldapPassword</p>
</td>
<td class="cellrowborder" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p737215376176">This parameter is mandatory when <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b7346406146">Enable ldap</strong> is enabled.</p>
<p id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0108618545_p1431412184516">Enter the password configured when LDAP authentication was enabled for MRS Hive.</p>
</td>
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p1531412164511">-</p>
</td>
</tr>
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row191817855019"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p718111875014">OBS storage support</p>
</td>
<td class="cellrowborder" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p15181783504">Whether to support OBS storage. Enable this option only if the server supports OBS storage. When it is enabled, you can store a Hive table in OBS when you create the table.</p>
</td>
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p141818819501">No</p>
</td>
</tr>
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row182193532384"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p4220155343814">AK</p>
</td>
<td class="cellrowborder" rowspan="2" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p595472512338">This parameter is mandatory when <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b193391985111223">OBS storage support</strong> is enabled. The account corresponding to the AK/SK pair must have the OBS Buckets Viewer permission. Otherwise, OBS cannot be accessed and the "403 AccessDenied" error is reported.</p>
<p id="dataartsstudio_01_0092__en-us_topic_0108618545_p14128105533915">You need to create an access key for the current account and obtain an AK/SK pair.</p>
<ol type="a" id="dataartsstudio_01_0092__en-us_topic_0108618545_ol1361418377715"><li id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_en-us_topic_0183643042_li1535103025819">Log in to the management console, move the cursor to the username in the upper right corner, and select <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_b1586023610364">My Credentials</strong> from the drop-down list.</li><li id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_en-us_topic_0183643042_li173533018584">On the <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_b149721128172418">My Credentials</strong> page, choose <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_b15803656132414">Access Keys</strong>, and click <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_b2452152020258">Create Access Key</strong>. See <a href="#dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_en-us_topic_0183643042_fig1552229194615">Figure 4</a>.<div class="fignone" id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_en-us_topic_0183643042_fig1552229194615"><a name="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_en-us_topic_0183643042_fig1552229194615"></a><a name="en-us_topic_0108618545_en-us_topic_0000001129241845_en-us_topic_0183643042_fig1552229194615"></a><span class="figcap"><b>Figure 4 </b>Clicking Create Access Key</span><br><span><img id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_image20389043111611" src="en-us_image_0000002269194761.png" title="Click to enlarge" class="imgResize"></span></div>
</li><li id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_en-us_topic_0183643042_li1535530185815">Click <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_b8340319122819">OK</strong> and save the access key file as prompted. The access key file will be saved to your browser's configured download location. Open the <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_b1940852512813">credentials.csv</strong> file to view <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_b191574128718">Access Key Id</strong> and <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_b175769149718">Secret Access Key</strong>.<div class="note" id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_note4554158201"><span class="notetitle"> NOTE: </span><div class="notebody"><ul id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_ul25541081906"><li id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_li175541819018">Each user can have a maximum of two access keys.</li><li id="dataartsstudio_01_0092__en-us_topic_0108618545_en-us_topic_0000001129241845_li3554138305">To ensure access key security, the access key file is downloaded automatically only when the key is first generated and cannot be obtained from the management console later. Store it securely.</li></ul>
</div></div>
</li></ol>
</td>
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p10220115363818">-</p>
</td>
</tr>
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row17883455193810"><td class="cellrowborder" valign="top" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p1888415555381">SK</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p6884105518385">-</p>
</td>
</tr>
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row218148115017"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p1718178205019">Run Mode</p>
</td>
<td class="cellrowborder" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><div class="p" id="dataartsstudio_01_0092__en-us_topic_0108618545_p918188135016">This parameter is used only when the Hive version is <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b1731014251454">HIVE_3_X</strong>. Possible values are:<ul id="dataartsstudio_01_0092__en-us_topic_0108618545_ul111811818502"><li id="dataartsstudio_01_0092__en-us_topic_0108618545_li188444317283"><strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b1095020874814">EMBEDDED</strong>: The link instance runs with CDM. This mode delivers better performance.</li><li id="dataartsstudio_01_0092__en-us_topic_0108618545_li118451236288"><strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b1353409697105248">STANDALONE</strong>: The link instance runs in an independent process. If CDM needs to connect to multiple Hadoop data sources (MRS, Hadoop, or CloudTable) that use both Kerberos and Simple authentication modes, select <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b9202184312114">STANDALONE</strong>.<div class="note" id="dataartsstudio_01_0092__en-us_topic_0108618545_note169619581157"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p1184511312281">The <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b1798836155817">STANDALONE</strong> mode is used to resolve version conflicts. If the connector versions of the source and destination ends of the same link are different, a JAR file conflict occurs. In this case, run the source or destination end in the STANDALONE process to prevent the migration from failing because of the conflict.</p>
</div></div>
</li></ul>
</div>
</td>
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p161817865014">EMBEDDED</p>
</td>
</tr>
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row8331112111425"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p10332821204214">Check Hive JDBC Connectivity</p>
</td>
<td class="cellrowborder" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p5332102164218">Whether to check the Hive JDBC connectivity</p>
</td>
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p106001747104212">No</p>
</td>
</tr>
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row1574651145712"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p720517447382">Use Cluster Config</p>
</td>
<td class="cellrowborder" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p15999193719436">You can use the cluster configuration to simplify parameter settings for the Hadoop connection.</p>
</td>
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p1820544419380">No</p>
</td>
</tr>
<tr id="dataartsstudio_01_0092__en-us_topic_0108618545_row25741451205711"><td class="cellrowborder" valign="top" width="16.72%" headers="mcps1.3.10.2.2.2.2.2.4.1.1 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p71575599434">Cluster Config Name</p>
</td>
<td class="cellrowborder" valign="top" width="62.1%" headers="mcps1.3.10.2.2.2.2.2.4.1.2 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p16157155904317">This parameter is valid only when <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b762945800105248">Use Cluster Config</strong> is set to <strong id="dataartsstudio_01_0092__en-us_topic_0108618545_b1557791163105248">Yes</strong>. Select a cluster configuration that has been created.</p>
<p id="dataartsstudio_01_0092__en-us_topic_0108618545_p192441948154613">For details about how to configure a cluster, see "DataArts Migration" > "Managing Links" > "Managing Cluster Configurations" in <em id="dataartsstudio_01_0092__en-us_topic_0108618545_i19630121210434">User Guide</em>.</p>
</td>
<td class="cellrowborder" valign="top" width="21.18%" headers="mcps1.3.10.2.2.2.2.2.4.1.3 "><p id="dataartsstudio_01_0092__en-us_topic_0108618545_p1115715591436">hive_01</p>
</td>
</tr>
</tbody>
</table>
</div>
</p></li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li5441819104918"><span>Click <span class="uicontrol" id="dataartsstudio_01_0092__en-us_topic_0111325168_uicontrol53609716594"><b>Save</b></span>. The <span class="uicontrol" id="dataartsstudio_01_0092__en-us_topic_0111325168_uicontrol8360197155916"><b>Link Management</b></span> page is displayed.</span></li></ol>
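<p>The dependencies between the link parameters in the table above can be summarized in code. The following Python sketch is illustrative only: the dictionary keys and the <strong>check_link</strong> and <strong>required_roles</strong> helpers are hypothetical names chosen for this example, not part of any CDM API. The values are the sample values from the table.</p>

```python
# Hypothetical key names; values are the sample values from the parameter table.
SAMPLE_LINK = {
    "name": "hivelink",
    "manager_ip": "127.0.0.1",
    "authentication_method": "SIMPLE",   # SIMPLE or KERBEROS
    "hive_version": "HIVE_3_X",
    "username": "cdm",
    "password": None,
    "enable_ldap": False,
    "ldap_username": None,
    "ldap_password": None,
    "obs_storage_support": False,
    "ak": None,
    "sk": None,
    "run_mode": "EMBEDDED",              # used only for HIVE_3_X
    "check_hive_jdbc_connectivity": False,
    "use_cluster_config": False,
    "cluster_config_name": "hive_01",    # valid only if use_cluster_config is True
}


def check_link(cfg):
    """Return a list of problems in cfg, following the rules stated in the table."""
    problems = []
    if cfg.get("authentication_method") == "KERBEROS":
        if not cfg.get("username") or not cfg.get("password"):
            problems.append("KERBEROS requires a username and password")
        if cfg.get("username") == "admin":
            problems.append("user admin cannot authenticate to a security cluster")
    if cfg.get("enable_ldap"):
        for key in ("ldap_username", "ldap_password"):
            if not cfg.get(key):
                problems.append(key + " is mandatory when LDAP is enabled")
    if cfg.get("obs_storage_support"):
        for key in ("ak", "sk"):
            if not cfg.get(key):
                problems.append(key + " is mandatory when OBS storage support is enabled")
    if cfg.get("use_cluster_config") and not cfg.get("cluster_config_name"):
        problems.append("select a cluster configuration")
    return problems


def required_roles(cdm_version, mrs_version):
    """Return the roles (any one is enough) an MRS user needs to create links.

    Versions are tuples such as (2, 9, 0), so they compare numerically.
    """
    if cdm_version >= (2, 9, 0) and mrs_version >= (3, 1, 0):
        return ["Manager_viewer"]
    return ["Manager_administrator", "System_administrator"]
```

<p>With the sample values, <strong>check_link(SAMPLE_LINK)</strong> returns an empty list. Switching <strong>authentication_method</strong> to <strong>KERBEROS</strong> without setting a password makes it report the missing credentials.</p>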
</div>
<div class="section" id="dataartsstudio_01_0092__en-us_topic_0111325168_section1821596484"><a name="dataartsstudio_01_0092__en-us_topic_0111325168_section1821596484"></a><a name="en-us_topic_0111325168_section1821596484"></a><h4 class="sectiontitle">Creating a Migration Job</h4><ol id="dataartsstudio_01_0092__en-us_topic_0111325168_ol0612144304910"><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li1759724344919"><span>Click the <strong id="dataartsstudio_01_0092__b1544974794111">Table/File Migration</strong> tab and then click <strong id="dataartsstudio_01_0092__b4450134744117">Create Job</strong>.</span><p><div class="note" id="dataartsstudio_01_0092__en-us_topic_0111325168_note13597443114916"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="dataartsstudio_01_0092__en-us_topic_0111325168_p559784374918">Set <span class="parmname" id="dataartsstudio_01_0092__en-us_topic_0111325168_parmname20939740105920"><b>Clear Data Before Import</b></span> to <span class="parmvalue" id="dataartsstudio_01_0092__en-us_topic_0111325168_parmvalue1593994010595"><b>Yes</b></span> so that existing data in the Hive table is cleared before the import.</p>
</div></div>
</p></li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li5612164304917"><span>After configuring the parameters, click <span class="uicontrol" id="dataartsstudio_01_0092__uicontrol139778644216"><b>Next</b></span> to go to the <strong id="dataartsstudio_01_0092__b99778618422">Map Field</strong> page shown in <a href="#dataartsstudio_01_0092__en-us_topic_0111325168_fig1461204384916">Figure 5</a>.</span><p><p id="dataartsstudio_01_0092__en-us_topic_0111325168_p17597164315496">Map the fields of the MySQL table to those of the Hive table. The Hive table has three fields that the MySQL table lacks: <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b05812105116">y</strong>, <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b195811510411">ym</strong>, and <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b95811210118">ymd</strong>, which are the Hive partition fields. Because no source field can be mapped directly to these partition fields, you need to configure expressions that derive their values from the <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b13468205995012">StartDate</strong> field in the source table.</p>
<div class="fignone" id="dataartsstudio_01_0092__en-us_topic_0111325168_fig1461204384916"><a name="dataartsstudio_01_0092__en-us_topic_0111325168_fig1461204384916"></a><a name="en-us_topic_0111325168_fig1461204384916"></a><span class="figcap"><b>Figure 5 </b>Hive field mapping</span><br><span><img id="dataartsstudio_01_0092__en-us_topic_0111325168_image11158168144417" src="en-us_image_0000002269204489.png" title="Click to enlarge" class="imgResize"></span></div>
</p></li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li106121143184910"><span>Click <span><img id="dataartsstudio_01_0092__en-us_topic_0111325168_image12934218195720" src="en-us_image_0000002269204497.png"></span> to display the <strong id="dataartsstudio_01_0092__b6858183382110">Converter List</strong> dialog box, and then choose <span class="menucascade" id="dataartsstudio_01_0092__menucascade58594336216"><b><span class="uicontrol" id="dataartsstudio_01_0092__uicontrol1085963317210">Create Converter</span></b> > <b><span class="uicontrol" id="dataartsstudio_01_0092__uicontrol1085963362110">Expression conversion</span></b></span>. See <a href="#dataartsstudio_01_0092__en-us_topic_0111325168_fig261294344916">Figure 6</a>.</span><p><p id="dataartsstudio_01_0092__en-us_topic_0111325168_p1261254311492">The expressions for the <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b97944531281">y</strong>, <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b147942531684">ym</strong>, and <strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b879419531688">ymd</strong> fields are as follows:</p>
<p id="dataartsstudio_01_0092__en-us_topic_0111325168_p1361244313490"><strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b1661234315492">DateUtils.format(DateUtils.parseDate(row[2],"yyyy-MM-dd HH:mm:ss.SSS"),"yyyy")</strong></p>
<p id="dataartsstudio_01_0092__en-us_topic_0111325168_p196125438498"><strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b861284319492">DateUtils.format(DateUtils.parseDate(row[2],"yyyy-MM-dd HH:mm:ss.SSS"),"yyyyMM")</strong></p>
<p id="dataartsstudio_01_0092__en-us_topic_0111325168_p2061214320491"><strong id="dataartsstudio_01_0092__en-us_topic_0111325168_b1612174314496">DateUtils.format(DateUtils.parseDate(row[2],"yyyy-MM-dd HH:mm:ss.SSS"),"yyyyMMdd")</strong></p>
<div class="fignone" id="dataartsstudio_01_0092__en-us_topic_0111325168_fig261294344916"><a name="dataartsstudio_01_0092__en-us_topic_0111325168_fig261294344916"></a><a name="en-us_topic_0111325168_fig261294344916"></a><span class="figcap"><b>Figure 6 </b>Configuring the expression</span><br><span><img id="dataartsstudio_01_0092__en-us_topic_0111325168_image63886406413" src="en-us_image_0000002234245052.png" title="Click to enlarge" class="imgResize"></span></div>
<div class="note" id="dataartsstudio_01_0092__en-us_topic_0111325168_note361204317492"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="dataartsstudio_01_0092__en-us_topic_0111325168_p1361214318496">CDM expressions can convert fields that contain common character strings, dates, and numeric values.</p>
</div></div>
</p></li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li47753211373"><span>Click <span class="uicontrol" id="dataartsstudio_01_0092__en-us_topic_0108275315_uicontrol6471103355516"><b>Next</b></span> and set task parameters. Generally, retain the default values of all parameters.</span><p><div class="p" id="dataartsstudio_01_0092__en-us_topic_0108275315_p574914560220">In this step, you can configure the following optional functions:<ul id="dataartsstudio_01_0092__en-us_topic_0108275315_ul65067533203334"><li id="dataartsstudio_01_0092__en-us_topic_0108275315_li37800576203334"><strong id="dataartsstudio_01_0092__en-us_topic_0108275315_b95381451163716">Retry If Failed</strong>: Determine whether to automatically retry the job if it fails. Retain the default value <span class="parmvalue" id="dataartsstudio_01_0092__en-us_topic_0108275315_parmvalue137031560212"><b>Never</b></span>.</li><li id="dataartsstudio_01_0092__en-us_topic_0108275315_li158210387356"><strong id="dataartsstudio_01_0092__en-us_topic_0108275315_b117553917428">Group</strong>: Select the group to which the job belongs. The default group is <span class="parmvalue" id="dataartsstudio_01_0092__en-us_topic_0108275315_parmvalue1229163714324"><b>DEFAULT</b></span>. On the <span class="wintitle" id="dataartsstudio_01_0092__en-us_topic_0108275315_wintitle161211277446"><b>Job Management</b></span> page, jobs can be displayed, started, or exported by group.</li><li id="dataartsstudio_01_0092__en-us_topic_0108275315_li33236331203339"><strong id="dataartsstudio_01_0092__en-us_topic_0108275315_b447277982112428">Schedule Execution</strong>: Determine whether to automatically execute the job at a scheduled time. 
Retain the default value <span class="parmvalue" id="dataartsstudio_01_0092__en-us_topic_0108275315_parmvalue12389152254610"><b>No</b></span> in this example.</li><li id="dataartsstudio_01_0092__en-us_topic_0108275315_li65386286258"><strong id="dataartsstudio_01_0092__en-us_topic_0108275315_b0990135545110">Concurrent Extractors</strong>: Enter the number of concurrent extractors. An appropriate value improves migration efficiency. Retain the default value <span class="parmvalue" id="dataartsstudio_01_0092__en-us_topic_0108275315_parmvalue1939217251214"><b>1</b></span>.</li><li id="dataartsstudio_01_0092__en-us_topic_0108275315_li1396265172616"><strong id="dataartsstudio_01_0092__en-us_topic_0108275315_b115901914176">Write Dirty Data</strong>: Enable this parameter if data that fails to be processed or is filtered out during job execution needs to be written to OBS for later viewing. Before enabling it, create an OBS link on the CDM console. Retain the default value <span class="parmvalue" id="dataartsstudio_01_0092__en-us_topic_0108275315_parmvalue1890595274511"><b>No</b></span> so that dirty data is not recorded.</li></ul>
<div class="fignone" id="dataartsstudio_01_0092__en-us_topic_0108275315_fig81141229205912"><span class="figcap"><b>Figure 7 </b>Configuring the task</span><br><span><img id="dataartsstudio_01_0092__en-us_topic_0108275315_image1793431113212" src="en-us_image_0000002269114701.png" title="Click to enlarge" class="imgResize"></span></div>
</div>
</p></li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li1577510223710"><span>Click <span class="uicontrol" id="dataartsstudio_01_0092__en-us_topic_0108275326_uicontrol1362446142110"><b>Save and Run</b></span>. The <strong id="dataartsstudio_01_0092__en-us_topic_0108275326_b867281052113">Job Management</strong> page is displayed, on which you can view the job execution progress and result.</span></li><li id="dataartsstudio_01_0092__en-us_topic_0111325168_li1189115352456"><span>After the job is successfully executed, in the <strong id="dataartsstudio_01_0092__en-us_topic_0108275326_b78588166211">Operation</strong> column of the job, click <span class="uicontrol" id="dataartsstudio_01_0092__en-us_topic_0108275326_uicontrol6858616102110"><b>Historical Record</b></span> to view the job's historical execution records and read/write statistics.</span><p><p id="dataartsstudio_01_0092__en-us_topic_0108275326_p1891735144518">On the <strong id="dataartsstudio_01_0092__en-us_topic_0108275326_b1470242882110">Historical Record</strong> page, click <span class="uicontrol" id="dataartsstudio_01_0092__en-us_topic_0108275326_uicontrol147021628182113"><b>Log</b></span> to view the job logs.</p>
</p></li></ol>
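<p>To see what the three converter expressions in the field mapping step produce, the following Python sketch mirrors them outside CDM. It is an approximation under stated assumptions: CDM evaluates its own <strong>DateUtils</strong> expressions, whereas this sketch uses Python's <strong>datetime</strong> module, and the <strong>partition_fields</strong> helper and the sample row values are hypothetical. Only the use of <strong>row[2]</strong> as the <strong>StartDate</strong> value and the three output patterns are taken from the expressions above.</p>

```python
from datetime import datetime

# Python equivalent of the "yyyy-MM-dd HH:mm:ss.SSS" pattern used in the expressions.
SOURCE_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def partition_fields(start_date):
    """Derive the y, ym, and ymd partition values from a StartDate string."""
    parsed = datetime.strptime(start_date, SOURCE_FORMAT)
    return {
        "y": parsed.strftime("%Y"),        # pattern "yyyy"
        "ym": parsed.strftime("%Y%m"),     # pattern "yyyyMM"
        "ymd": parsed.strftime("%Y%m%d"),  # pattern "yyyyMMdd"
    }


# Hypothetical source row; only index 2 (StartDate) matters for the expressions.
row = ["placeholder-0", "placeholder-1", "2016-07-01 08:15:30.000"]
fields = partition_fields(row[2])
print(fields)  # {'y': '2016', 'ym': '201607', 'ymd': '20160701'}
```

<p>Each distinct combination of the three values corresponds to one Hive partition, which is why every migrated row needs all three fields populated.</p>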
</div>
</div>