This Hive syntax is used to create a DLI table. The main differences between the DataSource and the Hive syntax lie in the supported data formats and the number of supported partitions. For details, see syntax and precautions.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | CREATE TABLE [IF NOT EXISTS] [db_name.]table_name [(col_name1 col_type1 [COMMENT col_comment1], ...)] [COMMENT table_comment] [PARTITIONED BY (col_name2 col_type2, [COMMENT col_comment2], ...)] [ROW FORMAT row_format] STORED AS file_format [TBLPROPERTIES (key = value)] [AS select_statement]; row_format: : SERDE serde_cls [WITH SERDEPROPERTIES (key1=val1, key2=val2, ...)] | DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char] [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char] [NULL DEFINED AS char] |
Parameter |
Mandatory |
Description |
---|---|---|
db_name |
No |
Database name The value can contain letters, numbers, and underscores (_), but it cannot contain only numbers or start with a number or underscore (_). |
table_name |
Yes |
Table name in the database The value can contain letters, numbers, and underscores (_), but it cannot contain only numbers or start with a number or underscore (_). The matching rule is ^(?!_)(?![0-9]+$)[A-Za-z0-9_$]*$. If special characters are required, use single quotation marks ('') to enclose them. |
col_name |
Yes |
Column name The column field can contain letters, numbers, and underscores (_), but it cannot contain only numbers and must contain at least one letter. The column name is case insensitive. |
col_type |
Yes |
Data type of a column field, which is primitive. |
col_comment |
No |
Column field description, which can only be string constants. |
row_format |
Yes |
Row data format |
file_format |
Yes |
DLI table storage format, which can be TEXTFILE, AVRO, ORC, SEQUENCEFILE, RCFILE, or PARQUET. |
table_comment |
No |
Table description, which can only be string constants. |
key = value |
No |
Set table properties and values. If the table storage format is Parquet, you can use TBLPROPERTIES(parquet.compression = 'zstd') to set the table compression format to zstd. |
select_statement |
No |
The CREATE TABLE AS statement is used to insert the SELECT query result of the source table or a data record to a newly created DLI table. |
Example description: Create a DLI non-partitioned table named table1 and use the STORED AS keyword to set the storage format of the table to orc.
You can save DLI tables in the textfile, avro, orc, sequencefile, rcfile, or parquet format.
1 2 3 4 5 | CREATE TABLE IF NOT EXISTS table1 ( col_1 STRING, col_2 INT ) STORED AS orc; |
Example description: Create a partitioned table named student, which is partitioned using facultyNo and classNo.
In practice, you can select a proper partitioning field and add it to the end of the PARTITIONED BY keyword.
1 2 3 4 5 6 7 8 9 | CREATE TABLE IF NOT EXISTS student( id int, name STRING ) STORED AS avro PARTITIONED BY ( facultyNo INT, classNo INT ); |
Example description: Based on the DLI table table1 created in Example 1: Creating a DLI Non-Partitioned Table, use the CTAS syntax to copy data from table1 to table1_ctas.
When using CTAS to create a table, you can ignore the syntax used to create the table being copied. This means that regardless of the syntax used to create table1, you can use the DataSource syntax to create table1_ctas.
In this example, the storage format of table1 is orc, and the storage format of table1_ctas may be parquet. This means that the storage format of the table created by CTAS may be different from that of the original table.
Use the SELECT statement following the AS keyword to select required data and insert the data to table1_ctas.
The SELECT syntax is as follows: SELECT <Column name > FROM <Table name > WHERE <Related filter criteria>.
1 2 3 4 5 | CREATE TABLE IF NOT EXISTS table1_ctas STORED AS sequencefile AS SELECT * FROM table1; |
1 2 3 4 5 6 | CREATE TABLE IF NOT EXISTS table1_ctas USING parquet AS SELECT col_1 FROM table1 WHERE col_1 = 'Ann'; |
Example description: Create a DLI non-partitioned table named table2. You can customize the native data types of column fields based on service requirements.
For details, see "Data Types" > "Primitive Data Types".
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | CREATE TABLE IF NOT EXISTS table2 ( col_01 STRING, col_02 CHAR (2), col_03 VARCHAR (32), col_04 TIMESTAMP, col_05 DATE, col_06 INT, col_07 SMALLINT, col_08 BIGINT, col_09 TINYINT, col_10 FLOAT, col_11 DOUBLE, col_12 DECIMAL (10, 3), col_13 BOOLEAN ) STORED AS parquet; |
Example description: Create a DLI partitioned table named table3 and partition the table based on col_3. Set dli.multi.version.enable, comment, orc.compress, and auto.purge in TBLPROPERTIES.
1 2 3 4 5 6 7 8 9 10 11 12 | CREATE TABLE IF NOT EXISTs table3 ( col_1 STRING, col_2 STRING ) PARTITIONED BY (col_3 DATE) STORED AS rcfile TBLPROPERTIES ( dli.multi.version.enable = true, comment = 'Created by dli', orc.compress = 'ZLIB', auto.purge = true ); |
Example description: In this example, create a non-partitioned table named table4 in the textfile format and set ROW FORMAT (the ROW FORMAT function is available only for textfile tables).
1 2 3 4 5 6 7 8 9 10 11 | CREATE TABLE IF NOT EXISTS table4 ( col_1 STRING, col_2 INT ) STORED AS TEXTFILE ROW FORMAT DELIMITED FIELDS TERMINATED BY '/' COLLECTION ITEMS TERMINATED BY '$' MAP KEYS TERMINATED BY '#' LINES TERMINATED BY '\n' NULL DEFINED AS 'NULL'; |