Yang, Tong 6182f91ba8 MRS component operation guide_normal 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-09 14:55:21 +00:00


<a name="mrs_01_2398"></a><a name="mrs_01_2398"></a>
<h1 class="topictitle1">Creating a ClickHouse Table</h1>
<div id="body0000001080812946"><p id="mrs_01_2398__p208931159813">ClickHouse implements the replicated table mechanism based on the ReplicatedMergeTree engine and ZooKeeper. When creating a table, you can specify an engine to determine whether the table is highly available. Shards and replicas of each table are independent of each other.</p>
<p id="mrs_01_2398__p175761923215">ClickHouse also implements the distributed table mechanism based on the Distributed engine. A distributed table is a view over the local tables on all shards, which makes distributed queries easy to run. ClickHouse supports data sharding, one of the features of distributed storage: reads and writes are executed in parallel across shards to improve efficiency.</p>
<p id="mrs_01_2398__p554421512177">On clusters that use the Kunpeng CPU architecture, the ClickHouse HDFS and Kafka table engines are not supported.</p>
<div class="section" id="mrs_01_2398__section1386435625"><a name="mrs_01_2398__section1386435625"></a><a name="section1386435625"></a><h4 class="sectiontitle">Viewing cluster and Other Environment Parameters of ClickHouse</h4><ol id="mrs_01_2398__ol1957054154819"><li id="mrs_01_2398__li538035105918"><span>Use the ClickHouse client to connect to the ClickHouse server by referring to <a href="mrs_01_2345.html">Using ClickHouse from Scratch</a>.</span></li><li id="mrs_01_2398__li5153155032517"><a name="mrs_01_2398__li5153155032517"></a><a name="li5153155032517"></a><span>Query the cluster identifier and other information about the environment parameters.</span><p><div class="p" id="mrs_01_2398__p9611951122514"><strong id="mrs_01_2398__b129808844916">select cluster,shard_num,replica_num,host_name from system.clusters;</strong><pre class="screen" id="mrs_01_2398__screen8405017103510">SELECT
cluster,
shard_num,
replica_num,
host_name
FROM system.clusters
┌─cluster───────────┬─shard_num─┬─replica_num─┬─host_name────────────┐
│ default_cluster_1 │         1 │           1 │ node-master1dOnG     │
│ default_cluster_1 │         1 │           2 │ node-group-1tXED0001 │
│ default_cluster_1 │         2 │           1 │ node-master2OXQS     │
│ default_cluster_1 │         2 │           2 │ node-group-1tXED0002 │
│ default_cluster_1 │         3 │           1 │ node-master3QsRI     │
│ default_cluster_1 │         3 │           2 │ node-group-1tXED0003 │
└───────────────────┴───────────┴─────────────┴──────────────────────┘
6 rows in set. Elapsed: 0.001 sec. </pre>
</div>
</p></li><li id="mrs_01_2398__li1218412042615"><span>Query the shard and replica identifiers.</span><p><div class="p" id="mrs_01_2398__p830342012262"><strong id="mrs_01_2398__b1421463415524">select * from system.macros</strong>;<pre class="screen" id="mrs_01_2398__screen2099122833617">SELECT *
FROM system.macros
┌─macro───┬─substitution─────┐
│ id      │ 76               │
│ replica │ node-master3QsRI │
│ shard   │ 3                │
└─────────┴──────────────────┘
3 rows in set. Elapsed: 0.001 sec. </pre>
</div>
</p></li></ol>
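<p>As a sketch of a follow-up check (assuming the cluster name <strong>default_cluster_1</strong> obtained above), aggregating over <strong>system.clusters</strong> shows the shard and instance counts at a glance:</p>
<pre class="screen">SELECT
    cluster,
    count(DISTINCT shard_num) AS shards,
    count() AS instances
FROM system.clusters
WHERE cluster = 'default_cluster_1'
GROUP BY cluster;</pre>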
</div>
<div class="section" id="mrs_01_2398__section1564103819477"><a name="mrs_01_2398__section1564103819477"></a><a name="section1564103819477"></a><h4 class="sectiontitle">Creating a Local Replicated Table and a Distributed Table</h4><ol id="mrs_01_2398__ol153625504550"><li id="mrs_01_2398__li113629503556"><span>Log in to the ClickHouse node using the client, for example, <strong id="mrs_01_2398__b16274593274">clickhouse client --host </strong><em id="mrs_01_2398__i82561265918">node-master3QsRI</em> <strong id="mrs_01_2398__b7247132155916">--multiline --port 9440 --secure</strong><strong id="mrs_01_2398__b6701179145212">;</strong></span><p><div class="note" id="mrs_01_2398__note166161431173220"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="mrs_01_2398__p20616531163220"><em id="mrs_01_2398__i2208181819519">node-master3QsRI</em> is the value of <strong id="mrs_01_2398__b1950112016617">host_name</strong> obtained in <a href="#mrs_01_2398__li5153155032517">2</a> in <a href="#mrs_01_2398__section1386435625">Viewing cluster and Other Environment Parameters of ClickHouse</a>.</p>
</div></div>
</p></li><li id="mrs_01_2398__li89698281356"><a name="mrs_01_2398__li89698281356"></a><a name="li89698281356"></a><span>Create a replicated table using the ReplicatedMergeTree engine.</span><p><p id="mrs_01_2398__p940814301254">For details about the syntax, see <a href="https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/replication/#creating-replicated-tables" target="_blank" rel="noopener noreferrer">https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/replication/#creating-replicated-tables</a>.</p>
<p id="mrs_01_2398__p4377122294412">For example, run the following commands to create a ReplicatedMergeTree table named <strong id="mrs_01_2398__b2505161211116">test</strong> in the <strong id="mrs_01_2398__b21361051509">default</strong> database on cluster <strong id="mrs_01_2398__b41241019101110">default_cluster_1</strong>:</p>
<p id="mrs_01_2398__p5771818194417"><strong id="mrs_01_2398__b27731814417">CREATE TABLE </strong><em id="mrs_01_2398__i477121818445">default.test </em><strong id="mrs_01_2398__b877131814413">ON CLUSTER </strong><em id="mrs_01_2398__i17771518114414">default_cluster_1</em></p>
<p id="mrs_01_2398__p1177111894416"><strong id="mrs_01_2398__b277141894417">(</strong></p>
<p id="mrs_01_2398__p277171884419"><strong id="mrs_01_2398__b2779184447">`EventDate` DateTime,</strong></p>
<p id="mrs_01_2398__p87713185442"><strong id="mrs_01_2398__b1177201834415">`id` UInt64</strong></p>
<p id="mrs_01_2398__p1477191811448"><strong id="mrs_01_2398__b17716188441">)</strong></p>
<p id="mrs_01_2398__p37711854412"><strong id="mrs_01_2398__b16771718104414">ENGINE = ReplicatedMergeTree('</strong><em id="mrs_01_2398__i11728104617217">/clickhouse/tables/{shard}/default/test</em><strong id="mrs_01_2398__b474819331917">', '</strong><em id="mrs_01_2398__i18771718114419">{replica}</em>'<strong id="mrs_01_2398__b162534142816">)</strong></p>
<p id="mrs_01_2398__p117719186449"><strong id="mrs_01_2398__b15778186443">PARTITION BY toYYYYMM(EventDate)</strong></p>
<p id="mrs_01_2398__p157851815448"><strong id="mrs_01_2398__b97881884419">ORDER BY id;</strong></p>
<p id="mrs_01_2398__p1471456273">The parameters are described as follows:</p>
<ul id="mrs_01_2398__ul5826184392617"><li id="mrs_01_2398__li78492215273">The <strong id="mrs_01_2398__b8497737174210">ON CLUSTER</strong> syntax indicates distributed DDL: executing the statement once creates the same local table on every instance in the cluster.</li><li id="mrs_01_2398__li5849926277"><strong id="mrs_01_2398__b7665135114157">default_cluster_1</strong> is the cluster identifier obtained in <a href="#mrs_01_2398__li5153155032517">2</a> in <a href="#mrs_01_2398__section1386435625">Viewing cluster and Other Environment Parameters of ClickHouse</a>.<div class="caution" id="mrs_01_2398__note819385317274"><span class="cautiontitle"><img src="public_sys-resources/caution_3.0-en-us.png"> </span><div class="cautionbody"><div class="p" id="mrs_01_2398__p1014412157281">The <strong id="mrs_01_2398__b1913054182814">ReplicatedMergeTree</strong> engine takes the following two parameters:<ul id="mrs_01_2398__ul181301416284"><li id="mrs_01_2398__li267514545410">Storage path of the table data in ZooKeeper<p id="mrs_01_2398__p18351555104120"><a name="mrs_01_2398__li267514545410"></a><a name="li267514545410"></a>The path must be in the <strong id="mrs_01_2398__b3452122911466">/clickhouse</strong> directory. Otherwise, data insertion may fail due to insufficient ZooKeeper quota.</p>
<p id="mrs_01_2398__p1627114314303">To avoid data conflicts between different tables in ZooKeeper, the directory must use the following format:</p>
<p id="mrs_01_2398__p1116223613"><em id="mrs_01_2398__i161303412288">/clickhouse/tables/{shard}</em><strong id="mrs_01_2398__b5332112813481">/</strong><em id="mrs_01_2398__i4130114142816">default/test</em>, in which <strong id="mrs_01_2398__b197871457154816">/clickhouse/tables/{shard}</strong> is fixed, <em id="mrs_01_2398__i274714326496">default</em> indicates the database name, and <em id="mrs_01_2398__i3204619185016">test</em> indicates the name of the created table.</p>
</li><li id="mrs_01_2398__li12131845288">Replica name: Generally, <strong id="mrs_01_2398__b55925919513">{replica}</strong> is used.</li></ul>
</div>
</div></div>
</li></ul>
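<div class="p">Concretely, with the macro values obtained earlier (<strong>{shard}</strong> = <strong>3</strong>, <strong>{replica}</strong> = <strong>node-master3QsRI</strong>), the engine arguments on that node expand to:<pre class="screen">ENGINE = ReplicatedMergeTree('/clickhouse/tables/3/default/test', 'node-master3QsRI')</pre>
</div>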
<pre class="screen" id="mrs_01_2398__screen799161105419">CREATE TABLE default.test ON CLUSTER default_cluster_1
(
`EventDate` DateTime,
`id` UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/default/test', '{replica}')
PARTITION BY toYYYYMM(EventDate)
ORDER BY id
┌─host─────────────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ node-group-1tXED0002 │ 9000 │      0 │       │                   5 │                3 │
│ node-group-1tXED0003 │ 9000 │      0 │       │                   4 │                3 │
│ node-master1dOnG     │ 9000 │      0 │       │                   3 │                3 │
└──────────────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
┌─host─────────────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ node-master3QsRI     │ 9000 │      0 │       │                   2 │                0 │
│ node-group-1tXED0001 │ 9000 │      0 │       │                   1 │                0 │
│ node-master2OXQS     │ 9000 │      0 │       │                   0 │                0 │
└──────────────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
6 rows in set. Elapsed: 0.189 sec. </pre>
</p></li><li id="mrs_01_2398__li16616143173215"><a name="mrs_01_2398__li16616143173215"></a><a name="li16616143173215"></a><span>Create a distributed table using the Distributed engine.</span><p><p id="mrs_01_2398__p1199141904510">For example, run the following commands to create a distributed table named <strong id="mrs_01_2398__b1551401212">test_all</strong> in the <strong id="mrs_01_2398__b831291615112">default</strong> database on cluster <strong id="mrs_01_2398__b175515014113">default_cluster_1</strong>:</p>
<p id="mrs_01_2398__p5770017134511"><strong id="mrs_01_2398__b9770917104519">CREATE TABLE </strong><em id="mrs_01_2398__i12770181716452">default.test_all </em><strong id="mrs_01_2398__b18770191710456">ON CLUSTER </strong><em id="mrs_01_2398__i107701017194513">default_cluster_1</em></p>
<p id="mrs_01_2398__p77701017194512"><strong id="mrs_01_2398__b57701917174519">(</strong></p>
<p id="mrs_01_2398__p147709172459"><strong id="mrs_01_2398__b47702017154513">`EventDate` DateTime,</strong></p>
<p id="mrs_01_2398__p3770171715451"><strong id="mrs_01_2398__b677021717455">`id` UInt64</strong></p>
<p id="mrs_01_2398__p1877081704511"><strong id="mrs_01_2398__b1877071716455">)</strong></p>
<p id="mrs_01_2398__p157701617174511"><strong id="mrs_01_2398__b9770917154515">ENGINE = Distributed(</strong><em id="mrs_01_2398__i147705178458">default_cluster_1, default, test, rand()</em><strong id="mrs_01_2398__b1577031710451">)</strong><strong id="mrs_01_2398__b14536102818523">;</strong></p>
<pre class="screen" id="mrs_01_2398__screen7673659135417">CREATE TABLE default.test_all ON CLUSTER default_cluster_1
(
`EventDate` DateTime,
`id` UInt64
)
ENGINE = Distributed(default_cluster_1, default, test, rand())
┌─host─────────────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ node-group-1tXED0002 │ 9000 │      0 │       │                   5 │                0 │
│ node-master3QsRI     │ 9000 │      0 │       │                   4 │                0 │
│ node-group-1tXED0003 │ 9000 │      0 │       │                   3 │                0 │
│ node-group-1tXED0001 │ 9000 │      0 │       │                   2 │                0 │
│ node-master1dOnG     │ 9000 │      0 │       │                   1 │                0 │
│ node-master2OXQS     │ 9000 │      0 │       │                   0 │                0 │
└──────────────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
6 rows in set. Elapsed: 0.115 sec.
</pre>
<div class="note" id="mrs_01_2398__note10770417184513"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="mrs_01_2398__p37704175456"><strong id="mrs_01_2398__b19313815142514">Distributed</strong> requires the following parameters:</p>
<ul id="mrs_01_2398__ul377081711453"><li id="mrs_01_2398__li1577041711458"><strong id="mrs_01_2398__b08700363255">default_cluster_1</strong> is the cluster identifier obtained in <a href="#mrs_01_2398__li5153155032517">2</a> in <a href="#mrs_01_2398__section1386435625">Viewing cluster and Other Environment Parameters of ClickHouse</a>.</li><li id="mrs_01_2398__li6770151713451"><strong id="mrs_01_2398__b0109921122618">default</strong> indicates the name of the database where the local table is located.</li><li id="mrs_01_2398__li15770101714518"><strong id="mrs_01_2398__b84824346263">test</strong> indicates the name of the local table. In this example, it is the name of the table created in <a href="#mrs_01_2398__li89698281356">2</a>.</li><li id="mrs_01_2398__li477011720455">(Optional) Sharding key<p id="mrs_01_2398__p11770317124511"><a name="mrs_01_2398__li477011720455"></a><a name="li477011720455"></a>Together with the weights configured in the <strong id="mrs_01_2398__b3913192002818">config.xml</strong> file, this key determines how rows written to the distributed table are routed, that is, which physical table each row is written to. It can be a column of the table (for example, <strong id="mrs_01_2398__b13921957183213">site_id</strong>) or the result of a function call, such as <strong id="mrs_01_2398__b1440111239354">rand()</strong> in the preceding SQL statement. Note that the data must be evenly distributed over this key. Another common choice is the hash of a high-cardinality column, for example, <strong id="mrs_01_2398__b1927815414384">intHash64(user_id)</strong>.</p>
</li></ul>
</div></div>
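<div class="p">If the table has a high-cardinality column, its hash is a common sharding key. A sketch, assuming a hypothetical local table <strong>test_user</strong> in the <strong>default</strong> database that contains a <strong>user_id</strong> column (all other parameters as above):<pre class="screen">CREATE TABLE default.test_user_all ON CLUSTER default_cluster_1
(
    `user_id` UInt64,
    `EventDate` DateTime
)
ENGINE = Distributed(default_cluster_1, default, test_user, intHash64(user_id));</pre>
</div>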
</p></li></ol>
</div>
<div class="section" id="mrs_01_2398__section16955425614"><h4 class="sectiontitle">ClickHouse Table Data Operations</h4><ol id="mrs_01_2398__ol97995531379"><li id="mrs_01_2398__li5566814132614"><span>Log in to the ClickHouse node using the client. Example:</span><p><div class="p" id="mrs_01_2398__p1671565910268"><strong id="mrs_01_2398__b831563420131">clickhouse client --host </strong><em id="mrs_01_2398__i1231519349137">node-master3QsRI</em> <strong id="mrs_01_2398__b3315173431319">--multiline --port 9440 --secure</strong><strong id="mrs_01_2398__b1431633465220">;</strong><div class="note" id="mrs_01_2398__note16315734181320"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="mrs_01_2398__p9316734201319"><em id="mrs_01_2398__i456835194613">node-master3QsRI</em> is the value of <strong id="mrs_01_2398__b1357163584617">host_name</strong> obtained in <a href="#mrs_01_2398__li5153155032517">2</a> in <a href="#mrs_01_2398__section1386435625">Viewing cluster and Other Environment Parameters of ClickHouse</a>.</p>
</div></div>
</div>
</p></li><li id="mrs_01_2398__li77990531075"><a name="mrs_01_2398__li77990531075"></a><a name="li77990531075"></a><span>After creating a table by referring to <a href="#mrs_01_2398__section1564103819477">Creating a Local Replicated Table and a Distributed Table</a>, you can insert data into the local table.</span><p><p id="mrs_01_2398__p38631290916">For example, run the following command to insert data into the local table <strong id="mrs_01_2398__b87201245184713">test</strong>:</p>
<p id="mrs_01_2398__p21561929895"><strong id="mrs_01_2398__b51591856591">insert into test values(toDateTime(now()), rand());</strong></p>
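<p>Alternatively, data can be written through the distributed table; the sharding key (<strong>rand()</strong> in this example) then determines the shard each row is routed to:</p>
<p><strong>insert into test_all values(toDateTime(now()), rand());</strong></p>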
</p></li><li id="mrs_01_2398__li174567159811"><span>Query the local table information.</span><p><p id="mrs_01_2398__p1772132341110">For example, run the following command to query the data inserted into table <strong id="mrs_01_2398__b6166201254914">test</strong> in <a href="#mrs_01_2398__li77990531075">2</a>:</p>
<p id="mrs_01_2398__p377213237119"><strong id="mrs_01_2398__b15101244201216">select * from test;</strong></p>
<pre class="screen" id="mrs_01_2398__screen1772023201111">SELECT *
FROM test
┌───────────EventDate─┬─────────id─┐
│ 2020-11-05 21:10:42 │ 1596238076 │
└─────────────────────┴────────────┘
1 rows in set. Elapsed: 0.002 sec.
</pre>
</p></li><li id="mrs_01_2398__li207721423131112"><span>Query the distributed table.</span><p><p id="mrs_01_2398__p14792616141518">For example, the distributed table <strong id="mrs_01_2398__b653573115112">test_all</strong> is created based on table <strong id="mrs_01_2398__b15741878512">test</strong> in <a href="#mrs_01_2398__li16616143173215">3</a>. Therefore, the same data in table <strong id="mrs_01_2398__b7519103865313">test</strong> can also be queried in table <strong id="mrs_01_2398__b19525919125219">test_all</strong>.</p>
<p id="mrs_01_2398__p754612358162"><strong id="mrs_01_2398__b201111843111613">select * from test_all;</strong></p>
<pre class="screen" id="mrs_01_2398__screen452114464163">SELECT *
FROM test_all
┌───────────EventDate─┬─────────id─┐
│ 2020-11-05 21:10:42 │ 1596238076 │
└─────────────────────┴────────────┘
1 rows in set. Elapsed: 0.004 sec. </pre>
</p></li><li id="mrs_01_2398__li157912091411"><span>Switch to a shard node with the same <strong id="mrs_01_2398__b1194280115912">shard_num</strong> value and query the current table. The same table data can be queried there.</span><p><p id="mrs_01_2398__p18538125103010">For example, run the <strong id="mrs_01_2398__b19445159123111">exit;</strong> command to exit the original node.</p>
<p id="mrs_01_2398__p4237337143110">Run the following command to switch to the <strong id="mrs_01_2398__b9816193220311">node-group-1tXED0003</strong> node:</p>
<p id="mrs_01_2398__p1630945819213"><strong id="mrs_01_2398__b930915812118">clickhouse client --host </strong><em id="mrs_01_2398__i2309105817213">node-group-1tXED0003</em> <strong id="mrs_01_2398__b1230910588214">--multiline --port 9440 --secure</strong><strong id="mrs_01_2398__b13852455298">;</strong></p>
<div class="note" id="mrs_01_2398__note14309058182114"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="mrs_01_2398__p1330911588214">As obtained in <a href="#mrs_01_2398__li5153155032517">2</a>, the <strong id="mrs_01_2398__b250416131514">shard_num</strong> values of <strong id="mrs_01_2398__b129831918916">node-group-1tXED0003</strong> and <strong id="mrs_01_2398__b1425362916115">node-master3QsRI</strong> are the same.</p>
</div></div>
<p id="mrs_01_2398__p9309185822115"><strong id="mrs_01_2398__b43092058152111">show tables</strong><strong id="mrs_01_2398__b187051847142918">;</strong></p>
<pre class="screen" id="mrs_01_2398__screen12309195882118">SHOW TABLES
┌─name─────┐
│ test     │
│ test_all │
└──────────┘
</pre>
</p></li><li id="mrs_01_2398__li53091658152116"><span>Query the local table data. For example, run the following command to query data in table <strong id="mrs_01_2398__b18482714416">test</strong> on the <strong id="mrs_01_2398__b119424107511">node-group-1tXED0003</strong> node:</span><p><div class="p" id="mrs_01_2398__p465969132410"><strong id="mrs_01_2398__b108661612112419">select * from test;</strong><pre class="screen" id="mrs_01_2398__screen14156163642318">SELECT *
FROM test
┌───────────EventDate─┬─────────id─┐
│ 2020-11-05 21:10:42 │ 1596238076 │
└─────────────────────┴────────────┘
1 rows in set. Elapsed: 0.005 sec.
</pre>
</div>
</p></li><li id="mrs_01_2398__li178971023182316"><span>Switch to a shard node with a different <strong id="mrs_01_2398__b1379919419515">shard_num</strong> value and query the data of the created table.</span><p><p id="mrs_01_2398__p1193083411258">For example, run the following command to exit the <strong id="mrs_01_2398__b9137163716713">node-group-1tXED0003</strong> node:</p>
<p id="mrs_01_2398__p19842618155313"><strong id="mrs_01_2398__b1495722213538">exit;</strong></p>
<p id="mrs_01_2398__p718418386333">Switch to the <strong id="mrs_01_2398__b6833841180">node-group-1tXED0001</strong> node. As obtained in <a href="#mrs_01_2398__li5153155032517">2</a>, the <strong id="mrs_01_2398__b10711221183">shard_num</strong> values of <strong id="mrs_01_2398__b6813227816">node-group-1tXED0001</strong> and <strong id="mrs_01_2398__b19882214814">node-master3QsRI</strong> are different.</p>
<p id="mrs_01_2398__p10534113552616"><strong id="mrs_01_2398__b68591651102611">clickhouse client --host </strong><em id="mrs_01_2398__i295120122712">node-group-1tXED0001</em> <strong id="mrs_01_2398__b20859135102613">--multiline --port 9440 --secure;</strong></p>
<p id="mrs_01_2398__p52839372718">Query the local table <strong id="mrs_01_2398__b26441816134615">test</strong>. Because <strong id="mrs_01_2398__b114313531595">test</strong> is a local table, its data cannot be queried from a node on a different shard.</p>
<p id="mrs_01_2398__p1537613351279"><strong id="mrs_01_2398__b4973620122818">select * from test</strong><strong id="mrs_01_2398__b1162817447150">;</strong></p>
<pre class="screen" id="mrs_01_2398__screen19270194052717">SELECT *
FROM test
Ok.</pre>
<p id="mrs_01_2398__p4466175110158">Query data in the distributed table <strong id="mrs_01_2398__b11258641111013">test_all</strong>. The data can be queried properly.</p>
<p id="mrs_01_2398__p15309556151711"><strong id="mrs_01_2398__b131010562174">select * from test_all</strong><strong id="mrs_01_2398__b15310456201717">;</strong></p>
<pre class="screen" id="mrs_01_2398__screen237745241719">SELECT *
FROM test_all
┌───────────EventDate─┬─────────id─┐
│ 2020-11-05 21:12:19 │ 3686805070 │
└─────────────────────┴────────────┘
1 rows in set. Elapsed: 0.002 sec.
</pre>
</p></li></ol>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_2344.html">Using ClickHouse</a></div>
</div>
</div>