doc-exports/docs/dws/umn/dws_01_0055.html
Lu, Huayi c5fcb46315 DWS UMN 801 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Lu, Huayi <luhuayi@huawei.com>
Co-committed-by: Lu, Huayi <luhuayi@huawei.com>
2022-12-13 12:47:57 +00:00

<a name="EN-US_TOPIC_0000001180440149"></a><a name="EN-US_TOPIC_0000001180440149"></a>
<h1 class="topictitle1">Importing Data from MRS to GaussDB(DWS)</h1>
<div id="body1499656133705"><div class="section" id="EN-US_TOPIC_0000001180440149__section34418118183527"><h4 class="sectiontitle">Importing Data from MRS to a Data Warehouse Cluster</h4><p id="EN-US_TOPIC_0000001180440149__p8060118">MapReduce Service (MRS) is a big data cluster service based on the open-source Hadoop ecosystem. It stores and analyzes massive volumes of data to meet your data storage and processing requirements. For details about MRS, see the <i><cite id="EN-US_TOPIC_0000001180440149__cite1716351711455">MapReduce Service User Guide</cite></i>.</p>
<p id="EN-US_TOPIC_0000001180440149__p56178650111411">You can use Hive or Spark in an MRS analysis cluster to store massive volumes of service data. Hive and Spark data files are stored in HDFS. When a GaussDB(DWS) data warehouse cluster and an MRS cluster are on the same network, you can connect the data warehouse cluster to the MRS cluster, read data from the HDFS files, and write the data to GaussDB(DWS).</p>
</div>
<div class="section" id="EN-US_TOPIC_0000001180440149__section4774472184623"><h4 class="sectiontitle">Import Process</h4><p id="EN-US_TOPIC_0000001180440149__p098116386811">Perform the following operations to import data from MRS to a data warehouse cluster:</p>
<ol id="EN-US_TOPIC_0000001180440149__ol2946194915157"><li id="EN-US_TOPIC_0000001180440149__li139452049181517">In the data warehouse cluster, create an MRS data source connection as described in <a href="dws_01_0059.html">Creating an MRS Data Source Connection</a>.<div class="note" id="EN-US_TOPIC_0000001180440149__note4268181341618"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="EN-US_TOPIC_0000001180440149__ul1618919385454"><li id="EN-US_TOPIC_0000001180440149__li20189203819458">Multiple MRS data sources can exist on the same network, but a GaussDB(DWS) cluster can connect to only one MRS cluster at a time.</li></ul>
</div></div>
</li><li id="EN-US_TOPIC_0000001180440149__li125762663515">Create an HDFS foreign table to query data in the MRS cluster through the API of a foreign server.<p id="EN-US_TOPIC_0000001180440149__p16734543193517"><a name="EN-US_TOPIC_0000001180440149__li125762663515"></a><a name="li125762663515"></a>For details, see "Data Import &gt; Importing Data from MRS to a Cluster" in the <i><cite id="EN-US_TOPIC_0000001180440149__cite3601147125410">Data Warehouse Service (DWS) Developer Guide</cite></i>.</p>
</li><li id="EN-US_TOPIC_0000001180440149__li1665122602810">(Optional) When the HDFS configuration of the MRS cluster changes, update the MRS data source configuration on GaussDB(DWS). For details, see <a href="dws_01_0156.html">Updating the MRS Data Source Configuration</a>.</li></ol>
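<p>The foreign table step above can be sketched in code. The following Python snippet is a minimal illustration under stated assumptions, not the documented procedure: it only builds the SQL text for an HDFS foreign table and for copying its rows into a local table, and it does not connect to any cluster. The function name build_import_statements, the server name, the table names, the columns, and the HDFS folder are all hypothetical. The format and foldername options and the DISTRIBUTE BY ROUNDROBIN clause are assumptions about the foreign table syntax and should be checked against the Developer Guide before use.</p>
<pre class="screen">
```python
def build_import_statements(server, foreign_table, target_table, columns,
                            folder, file_format="orc"):
    """Build SQL text for an HDFS foreign table and a local import.

    All arguments are illustrative placeholders. Values are inserted
    as-is (no escaping), so pass trusted identifiers only.
    """
    # One "name type" line per column, indented for readability.
    column_defs = ",\n".join(
        f"    {name} {sql_type}" for name, sql_type in columns
    )

    # Assumed shape of the foreign table statement: it points at a
    # folder in HDFS through the foreign server created for the
    # MRS data source connection.
    create_sql = (
        f"CREATE FOREIGN TABLE {foreign_table} (\n"
        f"{column_defs}\n"
        f") SERVER {server}\n"
        f"OPTIONS (\n"
        f"    format '{file_format}',\n"
        f"    foldername '{folder}'\n"
        f")\n"
        f"DISTRIBUTE BY ROUNDROBIN;"
    )

    # Reading from the foreign table and writing to a local table
    # performs the actual import into the data warehouse cluster.
    insert_sql = f"INSERT INTO {target_table} SELECT * FROM {foreign_table};"
    return create_sql, insert_sql


# Hypothetical names and paths, for illustration only.
create_sql, insert_sql = build_import_statements(
    server="hdfs_server_example",
    foreign_table="foreign_product_info",
    target_table="product_info",
    columns=[("product_id", "integer"), ("product_name", "varchar(200)")],
    folder="/user/hive/warehouse/demo.db/product_info_orc/",
)
print(create_sql)
print(insert_sql)
```
</pre>
<p>The generated statements would be run on the data warehouse cluster with a SQL client after the MRS data source connection exists. The local target table is assumed to exist already with matching columns.</p>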
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dws_01_0057.html">MRS Data Sources</a></div>
</div>
</div>