Yang, Tong 6182f91ba8 MRS component operation guide_normal 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-09 14:55:21 +00:00


<a name="mrs_01_1787"></a><a name="mrs_01_1787"></a>
<h1 class="topictitle1">Differences Among Connectors Used During the Process of Importing Data from the Oracle Database to HDFS</h1>
<div id="body1596098832002"><div class="section" id="mrs_01_1787__s5b951d8acb7f45caae4315f023e89a8e"><h4 class="sectiontitle">Question</h4><p id="mrs_01_1787__aab2e09c327764f31a343226000f2e9f3">Loader provides three connectors for importing data from an Oracle database to HDFS: generic-jdbc-connector, oracle-connector, and oracle-partition-connector. Which one should I select, and what are the differences between them?</p>
</div>
<div class="section" id="mrs_01_1787__sc4ef7263443644f196ece71006034c02"><h4 class="sectiontitle">Answers</h4><ul id="mrs_01_1787__u677860e6c1524b06b80a3bc6be2f4759"><li id="mrs_01_1787__en-us_topic_0116216775_li15776810551">generic-jdbc-connector<p id="mrs_01_1787__a34f9761973a840e39c1fe2b059bae131"><a name="mrs_01_1787__en-us_topic_0116216775_li15776810551"></a><a name="en-us_topic_0116216775_li15776810551"></a>Reads data from the Oracle database in JDBC mode. It is applicable to databases that support JDBC.</p>
<p id="mrs_01_1787__a9ace651986fb49e38a5ed6c631e192f3">In this mode, the data loading performance of Loader depends on how data is distributed in the partition column. When the partition column is skewed (it contains only one or a few distinct values), a few Map tasks process most of the data. As a result, the index becomes invalid, causing a sharp decline in SQL query performance.</p>
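<p>The effect of skew can be illustrated with a minimal sketch. It assumes that the value range of the partition column (minimum to maximum) is cut into equal-width ranges, one per Map task. This is an illustrative assumption and not necessarily the exact splitting logic of Loader. The function names and sample data are hypothetical.</p>

```python
def map_index(value, lo, hi, num_maps):
    """Return the Map task that owns `value` under equal-width range splitting.

    Assumption for illustration: the integer range [lo, hi] of the partition
    column is divided into num_maps ranges of equal width.
    """
    span = hi - lo + 1
    return (value - lo) * num_maps // span


def rows_per_map(values, num_maps):
    """Count how many rows each Map task would read."""
    lo, hi = min(values), max(values)
    counts = [0] * num_maps
    for value in values:
        counts[map_index(value, lo, hi, num_maps)] += 1
    return counts


# Evenly distributed partition column: 1000 rows with values 1..1000.
uniform = list(range(1, 1001))

# Skewed partition column: 990 rows share the value 1, 10 rows are spread out.
skewed = [1] * 990 + list(range(100, 1100, 100))

print(rows_per_map(uniform, 4))  # [250, 250, 250, 250]
print(rows_per_map(skewed, 4))   # [992, 3, 2, 3]
```

<p>With the skewed column, one Map task reads almost all rows while the others are nearly idle, so adding more Map tasks does not shorten the job.</p>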
<p id="mrs_01_1787__ae36bcd7da75c448f95c452da9f25572b"><strong id="mrs_01_1787__b157102810107">generic-jdbc-connector</strong> supports view import and export, whereas oracle-partition-connector and oracle-connector do not. Therefore, only generic-jdbc-connector can be used to import views.</p>
</li><li id="mrs_01_1787__l7e4f550f4e2443358f634c80e8b8cc56">Both <strong id="mrs_01_1787__b8479123614107">oracle-partition-connector</strong> and <strong id="mrs_01_1787__b46666382104">oracle-connector</strong><p id="mrs_01_1787__a069501b0d34646eeb9af1cc23719bf79">can use Oracle ROWIDs for partitioning. oracle-partition-connector is self-developed, whereas oracle-connector is the open-source edition. The two connectors deliver similar performance.</p>
<p id="mrs_01_1787__en-us_topic_0116216775_p722328510573"><strong id="mrs_01_1787__b1115035912016">oracle-connector</strong> requires more system table permissions. The following lists the system tables on which <strong id="mrs_01_1787__b128972522115">oracle-connector</strong> and <strong id="mrs_01_1787__b103509922111">oracle-partition-connector</strong> require read permissions.</p>
<ul id="mrs_01_1787__u5847c0a196ad4cdf9fa9d8aa193d2124"><li id="mrs_01_1787__le2afd03a94a6422a8163cbc2a461db8c"><strong id="mrs_01_1787__b1391811189213">oracle-connector</strong>: dba_tab_partitions, dba_constraints, dba_tables, dba_segments, v$version, dba_objects, v$instance, SYS_CONTEXT function, dba_extents, and dba_tab_subpartitions</li><li id="mrs_01_1787__lddbd2044d8ac487daa35344f003fa44e"><strong id="mrs_01_1787__b9283103692115">oracle-partition-connector</strong>: dba_objects and dba_extents</li></ul>
<p id="mrs_01_1787__af027ba1dba274bc8838a23568065ed64">Compared with <strong id="mrs_01_1787__b52052175226">generic-jdbc-connector</strong>, <strong id="mrs_01_1787__b1648019197224">oracle-partition-connector</strong> and <strong id="mrs_01_1787__b598862112221">oracle-connector</strong> have the following advantages:</p>
<ol id="mrs_01_1787__o91d89ecd3bd34877ae05a8f445fe3712"><li id="mrs_01_1787__led490505b9b8434f99ffd3414ee27c2e">Load balancing: The number and range of data segments are determined by the storage structure (data blocks) of the source table rather than by the data in the source table. In terms of granularity, a partition can be as small as a single data block.</li><li id="mrs_01_1787__l43c24ca8396342a49dadc6024ff36d65">Stable performance: Invalid index faults caused by data skew and bind variable peeking are completely eliminated.</li><li id="mrs_01_1787__l548607b9b1564ceaa457598f0912a812">Fast query speed: Querying by data segment is faster than querying by index.</li><li id="mrs_01_1787__l5a16aca640f8492f8bd2b0e7526ac828">Excellent horizontal scalability: The number of generated segments grows with the data volume, so increasing the number of concurrent tasks delivers ideal performance. Conversely, decreasing the number of concurrent tasks saves resources.</li><li id="mrs_01_1787__l55ccd9e4bf8a49b79eb71fe1a020e97d">Simplified data segmentation logic: Problems such as precision loss, type compatibility, and bind variables are avoided.</li><li id="mrs_01_1787__l92b5bd28e32f467e83108f9a540f40a4">Enhanced usability: Users do not need to create partition columns or tables for Loader.</li></ol>
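<p>The load balancing point can be sketched as follows. The example assumes that extent information (for instance, rows read from DBA_EXTENTS with their BLOCKS counts) is grouped into segments of roughly equal block totals. The greedy grouping, the function name, and the sample extents are hypothetical and only illustrate the idea. They do not describe the actual implementation of either connector.</p>

```python
def group_extents(extents, num_segments):
    """Group extents into segments with roughly equal block totals.

    Greedy sketch (an assumption, not the connector's algorithm): take the
    largest extents first and add each one to the currently lightest segment.
    """
    segments = [{"blocks": 0, "extents": []} for _ in range(num_segments)]
    for extent in sorted(extents, key=lambda e: e["blocks"], reverse=True):
        lightest = min(segments, key=lambda s: s["blocks"])
        lightest["extents"].append(extent)
        lightest["blocks"] += extent["blocks"]
    return segments


# Hypothetical extent list; keys mirror the FILE_ID, BLOCK_ID, and BLOCKS
# columns of DBA_EXTENTS. The values are made up for the example.
sample_extents = [
    {"file_id": 4, "block_id": 128, "blocks": 128},
    {"file_id": 4, "block_id": 256, "blocks": 128},
    {"file_id": 4, "block_id": 384, "blocks": 64},
    {"file_id": 4, "block_id": 448, "blocks": 64},
    {"file_id": 5, "block_id": 128, "blocks": 32},
    {"file_id": 5, "block_id": 160, "blocks": 32},
    {"file_id": 5, "block_id": 192, "blocks": 32},
    {"file_id": 5, "block_id": 224, "blocks": 32},
]

segments = group_extents(sample_extents, 4)
print([segment["blocks"] for segment in segments])  # [128, 128, 128, 128]
```

<p>Because the grouping depends only on how many blocks each extent holds, the segment sizes stay balanced regardless of how the column values inside those blocks are distributed.</p>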
</li></ul>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1785.html">Common Issues About Loader</a></div>
</div>
</div>