Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Su, Xiaomeng <suxiaomeng1@huawei.com> Co-committed-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
<a name="dli_08_15036"></a><a name="dli_08_15036"></a>
<h1 class="topictitle1">Dimension Table</h1>
<div id="body0000001719011362"><div class="section" id="dli_08_15036__section102271660319"><h4 class="sectiontitle">Function</h4><p id="dli_08_15036__p28331612932">Create a Doris dimension table and join it with source streams to generate wide tables.</p>
</div>
<div class="section" id="dli_08_15036__section5611103820312"><h4 class="sectiontitle">Prerequisites</h4><ul id="dli_08_15036__ul66761330172912"><li id="dli_08_15036__li13676330102913">An enhanced datasource connection has been created for DLI to connect to Doris, so that jobs can run on the dedicated queue of DLI and you can set the security group rules as required.
</li><li id="dli_08_15036__li1380793232118"><strong id="dli_08_15036__b82011610720">If MRS Doris is used, IP addresses of all hosts in the MRS cluster have been added to host information of the enhanced datasource connection.</strong><p id="dli_08_15036__p1679341219"></p>
<p id="dli_08_15036__p1290940152113">For details, see "Modifying Host Information" in <em id="dli_08_15036__i146821342365">Data Lake Insight User Guide</em>.</p>
</li><li id="dli_08_15036__en-us_topic_0000001549411578_li516015511901"><span id="dli_08_15036__ph1606103117232">Kerberos authentication is disabled for the cluster (the cluster is in normal mode).</span><p id="dli_08_15036__en-us_topic_0000001549411578_p63621633512">After connecting to Doris as user <strong id="dli_08_15036__en-us_topic_0000001549411578_b9740174014351">admin</strong>, create a role with administrator permissions, and bind the role to the user.</p>
</li></ul>
</div>
<div class="section" id="dli_08_15036__section1465831815212"><h4 class="sectiontitle">Caveats</h4><ul id="dli_08_15036__ul322397111719"><li id="dli_08_15036__li13608118132418">When you create a Flink OpenSource SQL job, set <strong id="dli_08_15036__dli_08_15029_b163001353185217">Flink Version</strong> to <strong id="dli_08_15036__dli_08_15029_b1430115539523">1.15</strong> in the <strong id="dli_08_15036__dli_08_15029_b1030175315523">Running Parameters</strong> tab. Select <strong id="dli_08_15036__dli_08_15029_b430135325212">Save Job Log</strong>, and specify the OBS bucket for saving job logs.</li><li id="dli_08_15036__li980192610493">Storing authentication credentials such as usernames and passwords in code or plaintext poses significant security risks. You are advised to use DEW to manage credentials instead. Storing credentials in encrypted form in configuration files or environment variables and decrypting them only when needed keeps them secure. For details, see .</li><li id="dli_08_15036__li106441816124218"><span id="dli_08_15036__ph1892124262310">Kerberos authentication is disabled for the cluster (the cluster is in normal mode).</span></li><li id="dli_08_15036__li1257525562010">Doris table names are case sensitive.</li><li id="dli_08_15036__li1863019478419">When Doris of CloudTable is used, set the port number in the <strong id="dli_08_15036__b197991628794">fenodes</strong> field to <strong id="dli_08_15036__b9800928397">8030</strong>, for example, <em id="dli_08_15036__i118015281791">xx</em><strong id="dli_08_15036__b188018281295">:8030</strong>.
In addition, enable ports <strong id="dli_08_15036__b1895524012712">8030</strong>, <strong id="dli_08_15036__b0955840182711">8040</strong>, and <strong id="dli_08_15036__b15955840112713">9030</strong> in the security group.</li><li id="dli_08_15036__li1024325917145">After HTTPS is enabled, add the following configuration parameters to the <strong id="dli_08_15036__b15451020164314">with</strong> clause for creating a table:<ul id="dli_08_15036__ul19464113016433"><li id="dli_08_15036__li346443094319"><strong id="dli_08_15036__b17846193319435">'doris.enable.https' = 'true'</strong></li><li id="dli_08_15036__li546473044311"><strong id="dli_08_15036__b18851103324310">'doris.ignore.https.ca' = 'true'</strong></li></ul>
</li></ul>
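<p>To illustrate the HTTPS caveat, the following sketch shows where the two options sit in the <strong>with</strong> clause. The table name <em>dorisHttpsDim</em>, the column list, and all connection values are placeholders assumed for this sketch, not values from a real cluster.</p>
<pre class="screen">-- Hypothetical dimension table; replace every placeholder with your own values.
CREATE TABLE dorisHttpsDim (
  `user_id` STRING NOT NULL,
  `city` STRING
) WITH (
  'connector' = 'doris',
  'fenodes' = 'FE_IP:PORT',               -- use the HTTPS port of the Doris FE here
  'table.identifier' = 'database.table',
  'username' = 'dorisUsername',
  'password' = 'dorisPassword',
  'doris.enable.https' = 'true',          -- needed only after HTTPS is enabled
  'doris.ignore.https.ca' = 'true'
);</pre>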
</div>
<div class="section" id="dli_08_15036__section661775715318"><h4 class="sectiontitle">Syntax</h4><pre class="screen" id="dli_08_15036__screen1214311454414">create table dorisSource (
  attr_name attr_type
  (',' attr_name attr_type)*
)
with (
  'connector' = 'doris',
  'fenodes' = 'FE_IP:PORT,FE_IP:PORT,FE_IP:PORT',
  'table.identifier' = 'database.table',
  'username' = 'dorisUsername',
  'password' = 'dorisPassword'
);</pre>
</div>
<div class="section" id="dli_08_15036__section4712115614410"><h4 class="sectiontitle">Parameter Description</h4><p id="dli_08_15036__p1464110771618"><strong id="dli_08_15036__b10584832299">Shared configuration</strong></p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="dli_08_15036__table1608573160" frame="border" border="1" rules="all"><thead align="left"><tr id="dli_08_15036__row4641157131610"><th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.5.3.1.5.1.1"><p id="dli_08_15036__p1265919113553">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.5.3.1.5.1.2"><p id="dli_08_15036__p1265917110552">Default Value</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.5.3.1.5.1.3"><p id="dli_08_15036__p5659511155517">Mandatory</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="25%" id="mcps1.3.5.3.1.5.1.4"><p id="dli_08_15036__p11659171113559">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="dli_08_15036__row764167161613"><td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.1 "><p id="dli_08_15036__p7641878169">fenodes</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.2 "><p id="dli_08_15036__p664113718160">--</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.3 "><p id="dli_08_15036__p46414761617">Y</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.4 "><p id="dli_08_15036__p14641973161">IP address and port number of the Doris FE. Separate multiple FE instances with commas (,). To obtain the port number, log in to MRS Manager, choose <strong id="dli_08_15036__b67409914617">Cluster</strong> > <strong id="dli_08_15036__b4740139463">Services</strong> > <strong id="dli_08_15036__b574015913612">Doris</strong> > <strong id="dli_08_15036__b7740597617">Configurations</strong>, and search for <strong id="dli_08_15036__b67401392066">http</strong>. Search for <strong id="dli_08_15036__b1043018401594">https</strong> instead if HTTPS is enabled.</p>
</td>
</tr>
<tr id="dli_08_15036__row10641579165"><td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.1 "><p id="dli_08_15036__p3641147101618">table.identifier</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.2 "><p id="dli_08_15036__p36416711611">--</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.3 "><p id="dli_08_15036__p1564167191617">Y</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.4 "><p id="dli_08_15036__p16641874161">Doris table name, for example, <strong id="dli_08_15036__b46896421999">db.tbl</strong>.</p>
</td>
</tr>
<tr id="dli_08_15036__row186411672167"><td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.1 "><p id="dli_08_15036__p36411761610">username</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.2 "><p id="dli_08_15036__p156414714169">--</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.3 "><p id="dli_08_15036__p1664137121619">Y</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.4 "><p id="dli_08_15036__p064112715166">User name for accessing Doris.</p>
</td>
</tr>
<tr id="dli_08_15036__row126411779163"><td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.1 "><p id="dli_08_15036__p4641475164">password</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.2 "><p id="dli_08_15036__p564217781619">--</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.3 "><p id="dli_08_15036__p1064287181610">Y</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.4 "><p id="dli_08_15036__p16423714163">Password for accessing Doris.</p>
</td>
</tr>
<tr id="dli_08_15036__row132336394599"><td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.1 "><p id="dli_08_15036__p12595101816215">lookup.cache.max-rows</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.2 "><p id="dli_08_15036__p1497115101621">-1</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.3 "><p id="dli_08_15036__p597113101026">N</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.4 "><p id="dli_08_15036__p7630142112816">Maximum number of rows held in the lookup cache. When this value is exceeded, the oldest row is deleted.</p>
<p id="dli_08_15036__p999818418212">To enable caching, both the <strong id="dli_08_15036__b13154114762416">lookup.cache.max-rows</strong> and <strong id="dli_08_15036__b1111735382417">lookup.cache.ttl</strong> options must be specified.</p>
</td>
</tr>
<tr id="dli_08_15036__row136428713164"><td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.1 "><p id="dli_08_15036__p114603219415">lookup.cache.ttl</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.2 "><p id="dli_08_15036__p6971210429">10s</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.3 "><p id="dli_08_15036__p189707109219">N</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.4 "><p id="dli_08_15036__p922319111956">Time to live (TTL) of each row in the lookup cache.</p>
</td>
</tr>
<tr id="dli_08_15036__row36424710167"><td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.1 "><p id="dli_08_15036__p154622016512">lookup.max-retries</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.2 "><p id="dli_08_15036__p19701610627">3</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.3 "><p id="dli_08_15036__p172141949173019">N</p>
</td>
<td class="cellrowborder" valign="top" width="25%" headers="mcps1.3.5.3.1.5.1.4 "><p id="dli_08_15036__p5469294518">Maximum number of retry attempts when a database lookup fails.</p>
</td>
</tr>
</tbody>
</table>
</div>
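<p>The optional lookup parameters in the table can be combined as in the following sketch. The table name <em>dorisCachedDim</em>, the columns, and the values <strong>1000</strong>, <strong>5 min</strong>, and <strong>5</strong> are illustrative assumptions rather than recommended settings. Both cache options are set because caching requires the two to be specified together.</p>
<pre class="screen">-- Hypothetical dimension table with lookup caching and retries configured.
CREATE TABLE dorisCachedDim (
  `user_id` STRING NOT NULL,
  `city` STRING
) WITH (
  'connector' = 'doris',
  'fenodes' = 'FE_IP:PORT',
  'table.identifier' = 'database.table',
  'username' = 'dorisUsername',
  'password' = 'dorisPassword',
  'lookup.cache.max-rows' = '1000',   -- assumed value: cache at most 1000 rows
  'lookup.cache.ttl' = '5 min',       -- assumed value: cached rows expire after 5 minutes
  'lookup.max-retries' = '5'          -- assumed value: retry a failed lookup up to 5 times
);</pre>
<p>A longer TTL reduces the number of lookups sent to Doris but lets the job see older dimension data, so the values are usually chosen according to how often the Doris table changes.</p>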
</div>
<div class="section" id="dli_08_15036__section966893013488"><h4 class="sectiontitle">Example</h4><p id="dli_08_15036__p12968610155911">This example reads data from Kafka, joins it with a Doris dimension table, and writes the resulting wide table to the Print connector.</p>
<ol id="dli_08_15036__ol15968191035919"><li id="dli_08_15036__li18968610105918">Create an enhanced datasource connection in the VPC and subnet where Doris is located, and bind the connection to the required Flink elastic resource pool. Add MRS host information for the enhanced datasource connection.</li><li id="dli_08_15036__li10969111011595">Set Doris and Kafka security groups and add inbound rules to allow access from the Flink queue. Test the connectivity using the Doris and Kafka addresses. If the test succeeds, the connection has been bound to the queue.</li><li id="dli_08_15036__li139694107597">Create a Doris table in the <strong>demo</strong> database (the database referenced by <strong>table.identifier</strong> in the job script) and insert 10 data records. The statements are as follows:<pre class="screen" id="dli_08_15036__screen2053221932510">CREATE TABLE IF NOT EXISTS dorisdemo
(
  `user_id` varchar(10) NOT NULL,
  `city` varchar(10),
  `age` int,
  `gender` int
)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 10;

INSERT INTO dorisdemo VALUES ('user1', 'city1', 20, 1);
INSERT INTO dorisdemo VALUES ('user2', 'city2', 21, 0);
INSERT INTO dorisdemo VALUES ('user3', 'city3', 22, 1);
INSERT INTO dorisdemo VALUES ('user4', 'city4', 23, 0);
INSERT INTO dorisdemo VALUES ('user5', 'city5', 24, 1);
INSERT INTO dorisdemo VALUES ('user6', 'city6', 25, 0);
INSERT INTO dorisdemo VALUES ('user7', 'city7', 26, 1);
INSERT INTO dorisdemo VALUES ('user8', 'city8', 27, 0);
INSERT INTO dorisdemo VALUES ('user9', 'city9', 28, 1);
INSERT INTO dorisdemo VALUES ('user10', 'city10', 29, 0);</pre>
</li><li id="dli_08_15036__li4465165816144">Create a Flink OpenSource SQL job. Enter the following job script and submit the job. The job reads data from Kafka, joins it with the Doris dimension table to produce a wide table, and writes the result to Print.<pre class="screen" id="dli_08_15036__screen58321536173120">CREATE TABLE ordersSource (
  user_id string,
  user_name string,
  proctime as Proctime()
) WITH (
  'connector' = 'kafka',
  'topic' = 'kafka-topic',
  'properties.bootstrap.servers' = 'kafkaIp:port,kafkaIp:port,kafkaIp:port',
  'properties.group.id' = 'GroupId',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'json'
);

CREATE TABLE dorisDemo (
  `user_id` String NOT NULL,
  `city` String,
  `age` int,
  `gender` int
) with (
  'connector' = 'doris',
  'fenodes' = '<em id="dli_08_15036__i10354250113514">IP address of the FE instance</em>:<em id="dli_08_15036__i4354135012350">Port number</em>',
  'table.identifier' = 'demo.dorisdemo',
  'username' = 'dorisUsername',
  'password' = 'dorisPassword',
  'lookup.cache.ttl' = '10 m',
  'lookup.cache.max-rows' = '100'
);

CREATE TABLE print (
  user_id string,
  user_name string,
  `city` String,
  `age` int,
  `gender` int
) WITH (
  'connector' = 'print'
);

insert into print
select
  orders.user_id,
  orders.user_name,
  dim.city,
  dim.age,
  dim.gender
from ordersSource orders
left join dorisDemo for system_time as of orders.proctime as dim on orders.user_id = dim.user_id;</pre>
</li><li id="dli_08_15036__li55501154153">Write two data records to the Kafka data source.<pre class="screen" id="dli_08_15036__screen370220338333">{"user_id": "user1", "user_name": "name1"}
{"user_id": "user2", "user_name": "name2"}</pre>
</li><li id="dli_08_15036__li75506518158">View the data in the Print result table.<pre class="screen" id="dli_08_15036__screen321591810294">+I[user1, name1, city1, 20, 1]
+I[user2, name2, city2, 21, 0]</pre>
</li></ol>
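<p>The job script uses a left join, so under standard Flink lookup-join semantics a Kafka record whose <strong>user_id</strong> has no match in Doris is still written to Print, with empty dimension columns. If such records should be dropped instead, an inner join can be used. The following is a sketch that assumes the <strong>ordersSource</strong>, <strong>dorisDemo</strong>, and <strong>print</strong> tables from the job script are already defined in the same job.</p>
<pre class="screen">-- Variant of the final statement: keep only records that match a Doris row.
INSERT INTO print
SELECT
  orders.user_id,
  orders.user_name,
  dim.city,
  dim.age,
  dim.gender
FROM ordersSource AS orders
JOIN dorisDemo FOR SYSTEM_TIME AS OF orders.proctime AS dim
  ON orders.user_id = dim.user_id;</pre>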
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dli_08_15032.html">Doris</a></div>
</div>
</div>