Creating a DLI Table Using the DataSource Syntax

Function

This DataSource syntax can be used to create a DLI table. The main differences between the DataSource and the Hive syntax lie in the supported data formats and the number of supported partitions. For details, see syntax and precautions.

Precautions

Syntax

1
2
3
4
5
6
7
CREATE TABLE [IF NOT EXISTS] [db_name.]table_name 
  [(col_name1 col_type1 [COMMENT col_comment1], ...)]
  USING file_format 
  [OPTIONS (key1=val1, key2=val2, ...)] 
  [PARTITIONED BY (col_name1, col_name2, ...)]
  [COMMENT table_comment]
  [AS select_statement];

Keywords

Parameters

Table 1 Parameters

Parameter

Mandatory

Description

db_name

No

Database name

The value can contain letters, numbers, and underscores (_), but it cannot contain only numbers or start with a number or underscore (_).

table_name

Yes

Table name in the database

The value can contain letters, numbers, and underscores (_), but it cannot contain only numbers or start with a number or underscore (_). The matching rule is ^(?!_)(?![0-9]+$)[A-Za-z0-9_$]*$.

Special characters must be enclosed in single quotation marks ('').

The table name is case insensitive.

col_name

Yes

Column names with data types separated by commas (,)

The column name can contain letters, numbers, and underscores (_), but it cannot contain only numbers and must contain at least one letter.

The column name is case insensitive.

col_type

Yes

Data type of a column field, which is primitive.

col_comment

No

Column field description, which can only be string constants.

file_format

Yes

Data storage format of DLI tables. The value can be parquet or orc.

table_comment

No

Table description, which can only be string constants.

select_statement

No

The CREATE TABLE AS statement is used to insert the SELECT query result of the source table or a data record to a newly created DLI table.

Table 2 OPTIONS parameters

Parameter

Mandatory

Description

Default Value

multiLevelDirEnable

No

Whether to iteratively query data in subdirectories. When this parameter is set to true, all files in the table path, including files in subdirectories, are iteratively read when a table is queried.

false

compression

No

Compression format. Generally, you need to set this parameter to zstd for parquet files.

-

Example 1: Creating a DLI Non-Partitioned Table

Example description: Create a DLI non-partitioned table named table1 and use the USING keyword to set the storage format of the table to orc.

You can save DLI tables in the parquet format.

1
2
3
4
CREATE TABLE IF NOT EXISTS table1 (
    col_1 STRING,
    col_2 INT)
USING orc;

Example 2: Creating a DLI Partitioned Table

Example description: Create a partitioned table named student, which is partitioned using facultyNo and classNo.

In practice, you can select a proper partitioning field and add it to the end of the PARTITIONED BY keyword.

1
2
3
4
5
6
7
CREATE TABLE IF NOT EXISTS student (
    Name        STRING,
    facultyNo   INT,
    classNo     INT
)
USING orc
PARTITIONED BY (facultyNo, classNo);

Example 3: Using CTAS to Create a DLI Table Using All or Part of the Data in the Source Table

Example description: Based on the DLI table table1 created in Example 1: Creating a DLI Non-Partitioned Table, use the CTAS syntax to copy data from table1 to table1_ctas.

When using CTAS to create a table, you can ignore the syntax used to create the table being copied. This means that regardless of the syntax used to create table1, you can use the DataSource syntax to create table1_ctas.

In addition, in this example, the storage format of table1 is orc, and the storage format of table1_ctas may be orc or parquet. This means that the storage format of the table created by CTAS may be different from that of the original table.

Use the SELECT statement following the AS keyword to select required data and insert the data to table1_ctas.

The SELECT syntax is as follows: SELECT <Column name > FROM <Table name > WHERE <Related filter criteria>.

Example 4: Creating a DLI Non-Partitioned Table and Customizing the Data Type of a Column Field

Example description: Create a DLI non-partitioned table named table2. You can customize the native data types of column fields based on service requirements.

For details, see "Data Types" > "Primitive Data Types".

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
CREATE TABLE IF NOT EXISTS table2 (
    col_01  STRING,
    col_02  CHAR (2),
    col_03  VARCHAR (32),
    col_04  TIMESTAMP,
    col_05  DATE,
    col_06  INT,
    col_07  SMALLINT,
    col_08  BIGINT,
    col_09  TINYINT,
    col_10  FLOAT,
    col_11  DOUBLE,
    col_12  DECIMAL (10, 3),
    col_13  BOOLEAN
)
USING parquet;

Example 5: Creating a DLI Partitioned Table and Customizing OPTIONS Parameters

Example description: When creating a DLI table, you can customize property names and values. For details about OPTIONS parameters, see Table 2.

In this example, a DLI partitioned table named table3 is created and partitioned based on col_2. Set pmultiLevelDirEnable and compression in OPTIONS.