Yang, Tong 48706b7552 MRS COMP-LTS 320-lts.1 version
Reviewed-by: Kacur, Michal <michal.kacur@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2024-04-12 12:51:10 +00:00


<a name="mrs_01_24741"></a><a name="mrs_01_24741"></a>
<h1 class="topictitle1">Importing and Exporting Table/Partition Data in Hive</h1>
<div id="body0000001533213830"><div class="section" id="mrs_01_24741__section1163955145014"><h4 class="sectiontitle">Scenario</h4><p id="mrs_01_24741__p141941165417">In big data application scenarios, data tables in Hive often need to be migrated to another cluster. You can run the Hive <strong id="mrs_01_24741__b16991079501">import</strong> and <strong id="mrs_01_24741__b11154119195020">export</strong> commands to migrate table data: run the <strong id="mrs_01_24741__b12398911195014">export</strong> command to export a Hive table from the source cluster to the HDFS of the target cluster, and then run the <strong id="mrs_01_24741__b14854214155013">import</strong> command in the target cluster to import the exported data into the corresponding Hive table.</p>
<div class="note" id="mrs_01_24741__note796619512285"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="mrs_01_24741__p1996625102820">This section applies to MRS 3.2.0 or later.</p>
<p id="mrs_01_24741__p513214241502">The Hive table import and export function does not support importing or exporting encrypted tables, HBase external tables, transactional tables, Hudi tables, views, or materialized views.</p>
</div></div>
</div>
<div class="section" id="mrs_01_24741__section1857166142714"><h4 class="sectiontitle">Prerequisites</h4><ul id="mrs_01_24741__ul104783116711"><li id="mrs_01_24741__li12981441131015">If Hive table or partition data is imported or exported across clusters and Kerberos authentication is enabled for both the source and destination clusters, configure cross-cluster mutual trust.</li><li id="mrs_01_24741__li13244162814167">If you want to run the <strong id="mrs_01_24741__b0255193617508">import</strong> or <strong id="mrs_01_24741__b6557338155018">export</strong> command on tables or partitions created by other users, grant those users the corresponding table permissions.<ul id="mrs_01_24741__ul796272916161"><li id="mrs_01_24741__li19318103771612">If Ranger authentication is not enabled for the cluster, log in to FusionInsight Manager and grant the <strong id="mrs_01_24741__b1331211685210">Select</strong> permission on the table to the role to which the user belongs. For details, see <a href="mrs_01_0950.html">Configuring Permissions for Hive Tables, Columns, or Databases</a>.</li><li id="mrs_01_24741__li1868128141818">If Ranger authentication is enabled for the cluster, grant users the permission to import and export tables. For details, see <a href="mrs_01_1858.html">Adding a Ranger Access Permission Policy for Hive</a>.</li></ul>
</li><li id="mrs_01_24741__li1825913371462">Enable the inter-cluster copy function in the source cluster and destination cluster.</li><li id="mrs_01_24741__li167313544453">Configure the HDFS service address parameter for the source cluster to access the destination cluster.<p id="mrs_01_24741__p1140993818319"><a name="mrs_01_24741__li167313544453"></a><a name="li167313544453"></a>Log in to FusionInsight Manager of the source cluster, click <strong id="mrs_01_24741__b161942891512">Cluster</strong>, choose <strong id="mrs_01_24741__b538621011510">Services</strong> &gt; <strong id="mrs_01_24741__b116172013141516">Hive</strong>, and click <strong id="mrs_01_24741__b1637613355158">Configuration</strong>. On the displayed page, search for <strong id="mrs_01_24741__b1062955221515">hdfs.site.customized.configs</strong>, add custom parameter <strong id="mrs_01_24741__b15160101716168">dfs.namenode.rpc-address.haclusterX</strong>, and set its value to <em id="mrs_01_24741__i13639727173">Service IP address of the active NameNode instance node in the destination cluster</em>:<em id="mrs_01_24741__i0674191841720">RPC port</em>. Add custom parameter <strong id="mrs_01_24741__b627665651716">dfs.namenode.rpc-address.haclusterX1</strong> and set its value to <em id="mrs_01_24741__i24381633181814">Service IP address of the standby NameNode instance node in the destination cluster</em>:<em id="mrs_01_24741__i756913612180">RPC port</em>. The RPC port of NameNode is <strong id="mrs_01_24741__b13245133017191">25000</strong> by default. After saving the configuration, roll-restart the Hive service.</p>
</li></ul>
</div>
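For example, if the active and standby NameNodes of the destination cluster were reachable at the placeholder addresses 192.168.0.11 and 192.168.0.12 with the default RPC port, the two custom parameters added under <strong>hdfs.site.customized.configs</strong> would look like the following sketch (the IP addresses are illustrative only; substitute the actual service IP addresses of your destination cluster):

```text
dfs.namenode.rpc-address.haclusterX  = 192.168.0.11:25000
dfs.namenode.rpc-address.haclusterX1 = 192.168.0.12:25000
```

With these parameters in place, the nameservice <strong>haclusterX</strong> can be used in HDFS paths such as <strong>hdfs://haclusterX/tmp/export</strong> to address the destination cluster from the source cluster.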
<div class="section" id="mrs_01_24741__section9108828569"><h4 class="sectiontitle">Procedure</h4><ol id="mrs_01_24741__ol19188206135911"><li id="mrs_01_24741__li1793444135814"><a name="mrs_01_24741__li1793444135814"></a><a name="li1793444135814"></a><span>Log in to the node where the client is installed in the source cluster as the Hive client installation user.</span></li><li id="mrs_01_24741__li44323875912"><span>Run the following command to switch to the client installation directory, for example, <strong id="mrs_01_24741__b185665917212"><span id="mrs_01_24741__ph255205922111">/opt/client</span></strong>:</span><p><p id="mrs_01_24741__p545358125919"><strong id="mrs_01_24741__b24534815591">cd <span id="mrs_01_24741__ph65691133174114">/opt/client</span></strong></p>
</p></li><li id="mrs_01_24741__li545317816598"><span>Run the following command to configure environment variables:</span><p><p id="mrs_01_24741__p194531985598"><strong id="mrs_01_24741__b1945312815912">source bigdata_env</strong></p>
</p></li><li id="mrs_01_24741__li610182915598"><a name="mrs_01_24741__li610182915598"></a><a name="li610182915598"></a><span>If Kerberos authentication is enabled for the cluster, run the following command to authenticate the user. Otherwise, skip this step.</span><p><p id="mrs_01_24741__p10378152213593"><strong id="mrs_01_24741__b0993133172210">kinit</strong> <i><span class="varname" id="mrs_01_24741__varname1299318338224">Hive service user</span></i></p>
</p></li><li id="mrs_01_24741__li333142945916"><span>Run the following command to log in to the Hive client in the source cluster:</span><p><p id="mrs_01_24741__p173315293591"><strong id="mrs_01_24741__b1133202915591">beeline</strong></p>
</p></li><li id="mrs_01_24741__li638086151111"><span>Run the following command to create the <strong id="mrs_01_24741__b335692940105146">export_test</strong> table:</span><p><p id="mrs_01_24741__p8931427121320"><strong id="mrs_01_24741__b108341943153015">create table </strong><em id="mrs_01_24741__i1983612435307">export_test(id int)</em><strong id="mrs_01_24741__b103191755330">;</strong></p>
</p></li><li id="mrs_01_24741__li722562251412"><span>Run the following command to insert data to the <strong id="mrs_01_24741__b1464231922314">export_test</strong> table:</span><p><p id="mrs_01_24741__p6845133617145"><strong id="mrs_01_24741__b18748554309">insert into </strong><em id="mrs_01_24741__i387785515304">export_test values(123)</em><strong id="mrs_01_24741__b178740552301">;</strong></p>
</p></li><li id="mrs_01_24741__li89938161394"><a name="mrs_01_24741__li89938161394"></a><a name="li89938161394"></a><span>Repeat <a href="#mrs_01_24741__li1793444135814">1</a> to <a href="#mrs_01_24741__li610182915598">4</a> in the destination cluster and run the following command to create an HDFS path for storing the exported <strong id="mrs_01_24741__b849591913262">export_test</strong> table:</span><p><p id="mrs_01_24741__p92183496102"><strong id="mrs_01_24741__b187620290157">hdfs dfs -mkdir </strong><em id="mrs_01_24741__i9667102918153">/tmp/export</em></p>
</p></li><li id="mrs_01_24741__li121504204424"><span>Run the following command to log in to the Hive client:</span><p><p id="mrs_01_24741__p815020201428"><strong id="mrs_01_24741__b115052094219">beeline</strong></p>
</p></li><li id="mrs_01_24741__li4339197161515"><span>Import and export the <strong id="mrs_01_24741__b2095932612720">export_test</strong> table.</span><p><p id="mrs_01_24741__p1926491821611">The Hive <strong id="mrs_01_24741__b181869416287">import</strong> and <strong id="mrs_01_24741__b121631375288">export</strong> commands can be used to migrate table data in the following modes. Select a proper data migration mode as required.</p>
<ul id="mrs_01_24741__ul3173151041918"><li id="mrs_01_24741__li1017391010198">Mode 1: Table export and import<ol type="a" id="mrs_01_24741__ol5227455171415"><li id="mrs_01_24741__li422755591415"><a name="mrs_01_24741__li422755591415"></a><a name="li422755591415"></a>Run the following command in the source cluster to export the metadata and service data of the <strong id="mrs_01_24741__b5433930173015">export_test</strong> table to the directory created in <a href="#mrs_01_24741__li89938161394">8</a>:<p id="mrs_01_24741__p13410639182418"><strong id="mrs_01_24741__b99591025113616">export table </strong><em id="mrs_01_24741__i16523142683611">export_test</em><strong id="mrs_01_24741__b17071337203919"> to</strong><strong id="mrs_01_24741__b19404141123912"> 'hdfs</strong><strong id="mrs_01_24741__b670773720396">://hacluster</strong><em id="mrs_01_24741__i1525285216132">X</em><em id="mrs_01_24741__i3817950141315">/tmp/export</em><strong id="mrs_01_24741__b1666014681713">';</strong></p>
</li><li id="mrs_01_24741__li108404465167">Run the following command in the destination cluster to import the table data exported in <a href="#mrs_01_24741__li422755591415">10.a</a> to the <strong id="mrs_01_24741__b1442513518317">export_test</strong> table:<p id="mrs_01_24741__p1417311012199"><strong id="mrs_01_24741__b162999412582">import from '</strong><em id="mrs_01_24741__i171996552114">/tmp/export</em><strong id="mrs_01_24741__b63001148582">';</strong></p>
</li></ol>
</li></ul>
<ul id="mrs_01_24741__ul127521825102112"><li id="mrs_01_24741__li67524251214">Mode 2: Renaming a table during the import<ol type="a" id="mrs_01_24741__ol19711113414211"><li id="mrs_01_24741__li207111134162118"><a name="mrs_01_24741__li207111134162118"></a><a name="li207111134162118"></a>Run the following command in the source cluster to export the metadata and service data of the <strong id="mrs_01_24741__b149971312613">export_test</strong> table to the directory created in <a href="#mrs_01_24741__li89938161394">8</a>:<p id="mrs_01_24741__p1375212256215"><strong id="mrs_01_24741__b11131113122413">export table</strong> <em id="mrs_01_24741__i1664363415361">export_test</em><strong id="mrs_01_24741__b594310332363"> </strong><strong id="mrs_01_24741__b1512312732415">to</strong> <strong id="mrs_01_24741__b1642015242">'hdfs</strong><strong id="mrs_01_24741__b10422124421317">://hacluster</strong><em id="mrs_01_24741__i10102184541318">X</em><em id="mrs_01_24741__i84931938566">/tmp/export</em><strong id="mrs_01_24741__b86142017248">';</strong></p>
</li><li id="mrs_01_24741__li168051118162611">Run the following command in the destination cluster to import the table data exported in <a href="#mrs_01_24741__li207111134162118">10.a</a> to the <strong id="mrs_01_24741__b12486538193218">import_test</strong> table:<p id="mrs_01_24741__p87521925112113"><strong id="mrs_01_24741__b1084445365">import table </strong><em id="mrs_01_24741__i58771544143616">import_test</em><strong id="mrs_01_24741__b14854414368"> from '</strong><em id="mrs_01_24741__i16746164217306">/tmp/export</em><strong id="mrs_01_24741__b114143313303">';</strong></p>
</li></ol>
</li></ul>
<ul id="mrs_01_24741__ul19173210121912"><li id="mrs_01_24741__li75939451759">Mode 3: Partition export and import<ol type="a" id="mrs_01_24741__ol8743173413413"><li id="mrs_01_24741__li77435347346"><a name="mrs_01_24741__li77435347346"></a><a name="li77435347346"></a>Run the following commands in the source cluster to export the <strong id="mrs_01_24741__b1110615366353">pt1</strong> and <strong id="mrs_01_24741__b85063910356">pt2</strong> partitions of the <strong id="mrs_01_24741__b1822685523514">export_test</strong> table to the directory created in <a href="#mrs_01_24741__li89938161394">8</a>:<p id="mrs_01_24741__p16173810141919"><strong id="mrs_01_24741__b1472181353613">export table </strong><em id="mrs_01_24741__i119691635143719">export_test</em> <strong id="mrs_01_24741__b1539064012377">partition</strong><strong id="mrs_01_24741__b035334413371"> (</strong><em id="mrs_01_24741__i31778020397">pt1</em><strong id="mrs_01_24741__b846311315398">=</strong><strong id="mrs_01_24741__b8791201003918">"</strong><em id="mrs_01_24741__i19384181113911">in</em><strong id="mrs_01_24741__b179151083913">"</strong>, <em id="mrs_01_24741__i35632150396">pt2</em><strong id="mrs_01_24741__b15164161973913">=</strong><strong id="mrs_01_24741__b16291727103914">"</strong><em id="mrs_01_24741__i19786627193915">ka</em><strong id="mrs_01_24741__b2291327103916">"</strong><strong id="mrs_01_24741__b1555304823713">)</strong> <strong id="mrs_01_24741__b1639434103820">to</strong> <strong id="mrs_01_24741__b1249010179384">'hdfs</strong><strong id="mrs_01_24741__b20827174295614">://hacluster</strong><em id="mrs_01_24741__i9811142185610"><strong id="mrs_01_24741__b231658124417">X</strong>/tmp/export</em><strong id="mrs_01_24741__b18490181712386">'</strong><strong id="mrs_01_24741__b199393174111">;</strong></p>
</li><li id="mrs_01_24741__li2060917813415">Run the following command in the destination cluster to import the table data exported in <a href="#mrs_01_24741__li77435347346">10.a</a> to the <strong id="mrs_01_24741__b163881016133711">export_test</strong> table:<p id="mrs_01_24741__p131187235195"><strong id="mrs_01_24741__b1480815844613">import from '</strong><em id="mrs_01_24741__i253918912463">/tmp/export</em><strong id="mrs_01_24741__b14808138154616">';</strong></p>
</li></ol>
</li><li id="mrs_01_24741__li4649103111910">Mode 4: Exporting table data to a partition<ol type="a" id="mrs_01_24741__ol1678591410476"><li id="mrs_01_24741__li19785214114715"><a name="mrs_01_24741__li19785214114715"></a><a name="li19785214114715"></a>Run the following command in the source cluster to export the metadata and service data of the <strong id="mrs_01_24741__b1438163315383">export_test</strong> table to the directory created in <a href="#mrs_01_24741__li89938161394">8</a>:<p id="mrs_01_24741__p2174171031912"><strong id="mrs_01_24741__b5662155694920">export table </strong><em id="mrs_01_24741__i16471113165010">export_test</em><strong id="mrs_01_24741__b380642125017"> </strong><strong id="mrs_01_24741__b20227141245014">to 'hdfs</strong><strong id="mrs_01_24741__b101023527596">://hacluster</strong><em id="mrs_01_24741__i0100552115914">X/tmp/export</em><strong id="mrs_01_24741__b12227141217501">';</strong></p>
</li><li id="mrs_01_24741__li18841151912502">Run the following command in the destination cluster to import the table data exported in <a href="#mrs_01_24741__li19785214114715">10.a</a> to the <strong id="mrs_01_24741__b1059171012391">pt1</strong> and <strong id="mrs_01_24741__b978301143915">pt2</strong> partitions of the <strong id="mrs_01_24741__b15108416193912">import_test</strong> table:<p id="mrs_01_24741__p12174161011191"><strong id="mrs_01_24741__b103681310115315">import table </strong><em id="mrs_01_24741__i9123911155316">import_test</em><strong id="mrs_01_24741__b728651575314"> partition (</strong><em id="mrs_01_24741__i989481517535">pt1</em><strong id="mrs_01_24741__b10901131812532">="</strong><em id="mrs_01_24741__i14596719185310">us</em><strong id="mrs_01_24741__b2094732216539">", </strong><em id="mrs_01_24741__i1082011237534">pt2</em><strong id="mrs_01_24741__b24171127185310">="</strong><em id="mrs_01_24741__i1410992815539">tn</em><strong id="mrs_01_24741__b12434153265318">") from '</strong><em id="mrs_01_24741__i89611336533">/tmp/export</em><strong id="mrs_01_24741__b3434153210538">';</strong></p>
</li></ol>
</li><li id="mrs_01_24741__li20174910141916">Mode 5: Specifying the table location during the import<ol type="a" id="mrs_01_24741__ol56351056115518"><li id="mrs_01_24741__li11635456135510"><a name="mrs_01_24741__li11635456135510"></a><a name="li11635456135510"></a>Run the following command in the source cluster to export the metadata and service data of the <strong id="mrs_01_24741__b19143165316398">export_test</strong> table to the directory created in <a href="#mrs_01_24741__li89938161394">8</a>:<p id="mrs_01_24741__p9174101017195"><strong id="mrs_01_24741__b7876192555613">export table </strong><em id="mrs_01_24741__i122749316560">export_test</em><strong id="mrs_01_24741__b1193753495612"> to 'hdfs</strong><strong id="mrs_01_24741__b14288100201">://hacluster</strong><em id="mrs_01_24741__i42711209016">X/tmp/export</em><strong id="mrs_01_24741__b1793773419565">';</strong></p>
</li><li id="mrs_01_24741__li11997154365615">Run the following command in the destination cluster to import the table data exported in <a href="#mrs_01_24741__li11635456135510">10.a</a> to the <strong id="mrs_01_24741__b6381131819404">import_test</strong> table and specify its location as <strong id="mrs_01_24741__b12802508418">/tmp/export</strong>:<p id="mrs_01_24741__p10174161013191"><strong id="mrs_01_24741__b815533985916">import table</strong><em id="mrs_01_24741__i131811408597"> import_test </em><strong id="mrs_01_24741__b161551839165919">from '</strong><em id="mrs_01_24741__i134247351303">/tmp/export</em><strong id="mrs_01_24741__b19457153115010">' location</strong><strong id="mrs_01_24741__b0763168533"> '</strong><em id="mrs_01_24741__i1076768933">/tmp/export</em><strong id="mrs_01_24741__b99292455020">';</strong></p>
</li></ol>
</li><li id="mrs_01_24741__li16174181071916">Mode 6: Exporting data to an external table<ol type="a" id="mrs_01_24741__ol76328136"><li id="mrs_01_24741__li437611737"><a name="mrs_01_24741__li437611737"></a><a name="li437611737"></a>Run the following command in the source cluster to export the metadata and service data of the <strong id="mrs_01_24741__b11267152516417">export_test</strong> table to the directory created in <a href="#mrs_01_24741__li89938161394">8</a>:<p id="mrs_01_24741__p143719112039"><strong id="mrs_01_24741__b6372117314">export table </strong><em id="mrs_01_24741__i18371411538">export_test</em><strong id="mrs_01_24741__b193761115315"> to 'hdfs</strong><strong id="mrs_01_24741__b96291111305">://hacluster</strong><em id="mrs_01_24741__i1516219167015">X/</em><em id="mrs_01_24741__i161719110014">tmp/export</em><strong id="mrs_01_24741__b83717111636">';</strong></p>
</li><li id="mrs_01_24741__li1318516176319">Run the following command in the destination cluster to import the table data exported in <a href="#mrs_01_24741__li437611737">10.a</a> to external table <strong id="mrs_01_24741__b20362174218476">import_test</strong>:<p id="mrs_01_24741__p9174151071916"><strong id="mrs_01_24741__b14615522453">import external table </strong><em id="mrs_01_24741__i19360172316518">import_test</em><strong id="mrs_01_24741__b399329853"> from '</strong><em id="mrs_01_24741__i9773202918520">/tmp/export</em><strong id="mrs_01_24741__b109913291358">';</strong></p>
</li></ol>
</li></ul>
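Condensing the steps above, a minimal end-to-end run of Mode 1 might look like the following sketch. It assumes the prerequisites in this section are met, that <strong>haclusterX</strong> resolves to the destination cluster through the custom <strong>dfs.namenode.rpc-address.haclusterX</strong> parameters, and that each block is run on the cluster indicated in the comments:

```text
-- Destination cluster, HDFS shell: create an empty directory to receive the export
--   hdfs dfs -mkdir /tmp/export

-- Source cluster, beeline:
create table export_test(id int);
insert into export_test values(123);
export table export_test to 'hdfs://haclusterX/tmp/export';

-- Destination cluster, beeline: import under the same table name
import from '/tmp/export';
select * from export_test;
```

The other modes differ only in the import statement: add <strong>table</strong> <em>name</em> to rename, <strong>partition (...)</strong> to target partitions, <strong>location</strong> to relocate the table data, or <strong>external table</strong> to create an external table.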
<div class="note" id="mrs_01_24741__note553531154210"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="mrs_01_24741__p16785331141815">Before exporting table or partition data, ensure that the HDFS path for storage has been created and is empty. Otherwise, the export fails.</p>
<p id="mrs_01_24741__p171170514817">When partitions are exported or imported, the table must be a partitioned table, and data for multiple values of the same partition field cannot be exported at a time.</p>
<div class="p" id="mrs_01_24741__p194011725151817">During the data import:<ul id="mrs_01_24741__ul57256220171"><li id="mrs_01_24741__li921992118174">If the <strong id="mrs_01_24741__b1091818392427">import from '</strong><em id="mrs_01_24741__i16165142511198">/tmp/export</em><strong id="mrs_01_24741__b1491893913421">';</strong> statement is used to import a table, no table name is specified, and the imported data is saved to the path of a table with the same name as the source table. Pay attention to the following points:<ul id="mrs_01_24741__ul7247125310433"><li id="mrs_01_24741__li99181390428">If the destination cluster does not contain a table with the same name as the source table, such a table is created during the import.</li><li id="mrs_01_24741__li2247145317433">Otherwise, the HDFS directory of the table must be empty, or the import fails.</li></ul>
</li><li id="mrs_01_24741__li1897182516215">If the <strong id="mrs_01_24741__b109381112224">import external table </strong><em id="mrs_01_24741__i24818242210">import_test</em><strong id="mrs_01_24741__b393910702213"> from '</strong><em id="mrs_01_24741__i8456412202217">/tmp/export</em><strong id="mrs_01_24741__b59391975221">';</strong> statement is used to import a table, the exported table is imported to the specified table. Pay attention to the following points:<ul id="mrs_01_24741__ul29783774417"><li id="mrs_01_24741__li16971537134417">If the destination cluster does not contain a table with the specified name, such a table is created during the import.</li><li id="mrs_01_24741__li39733784413">Otherwise, the HDFS directory of the table must be empty, or the import fails.</li></ul>
</li></ul>
</div>
<p id="mrs_01_24741__p9682101603213"><strong id="mrs_01_24741__b155121546144513">hacluster<em id="mrs_01_24741__i967464912451">X</em></strong> in the preceding commands is the <strong id="mrs_01_24741__b484285764518">haclusterX</strong> specified in the custom parameter <strong id="mrs_01_24741__b2272122174618">dfs.namenode.rpc-address.haclusterX</strong>.</p>
</div></div>
</p></li></ol>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_24744.html">Data Import and Export in Hive</a></div>
</div>
</div>