Creates a new empty table in the current database.
This table is owned by the user who executes the command. However, if the system administrator creates a table in a schema that has the same name as a common user, the table is owned by that user (not the system administrator).
CREATE [ [ GLOBAL | LOCAL | VOLATILE ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXISTS ] table_name
    { ({ column_name data_type [ compress_mode ] [ COLLATE collation ] [ column_constraint [ ... ] ]
        | table_constraint
        | LIKE source_table [ like_option [...] ] }
        [, ... ])|
        LIKE source_table [ like_option [...] ] }
    [ WITH ( {storage_parameter = value} [, ... ] ) ]
    [ ON COMMIT { PRESERVE ROWS | DELETE ROWS } ]
    [ COMPRESS | NOCOMPRESS ]
    [ DISTRIBUTE BY { REPLICATION | ROUNDROBIN | { HASH ( column_name [,...] ) } } ]
    [ TO { GROUP groupname | NODE ( nodename [, ... ] ) } ]
    [ COMMENT [=] 'text' ];
column_constraint is:
[ CONSTRAINT constraint_name ]
{ NOT NULL |
  NULL |
  CHECK ( expression ) |
  DEFAULT default_expr |
  ON UPDATE on_update_expr |
  COMMENT 'text' |
  UNIQUE [ NULLS [NOT] DISTINCT | NULLS IGNORE ] index_parameters |
  PRIMARY KEY index_parameters |
  REFERENCES reftable [ ( refcolumn ) ] }
[ DEFERRABLE | NOT DEFERRABLE | INITIALLY DEFERRED | INITIALLY IMMEDIATE ]
compress_mode is:
{ DELTA | PREFIX | DICTIONARY | NUMSTR | NOCOMPRESS }
table_constraint is:
[ CONSTRAINT constraint_name ]
{ CHECK ( expression ) |
  UNIQUE [ NULLS [NOT] DISTINCT | NULLS IGNORE ] ( column_name [, ... ] ) index_parameters |
  PRIMARY KEY ( column_name [, ... ] ) index_parameters |
  PARTIAL CLUSTER KEY ( column_name [, ... ] ) }
[ DEFERRABLE | NOT DEFERRABLE | INITIALLY DEFERRED | INITIALLY IMMEDIATE ]
like_option is:
{ INCLUDING | EXCLUDING } { DEFAULTS | CONSTRAINTS | INDEXES | STORAGE | COMMENTS | PARTITION | RELOPTIONS | DISTRIBUTION | DROPCOLUMNS | ALL }
index_parameters is:
[ WITH ( {storage_parameter = value} [, ... ] ) ]
GaussDB(DWS) is compatible with the PostgreSQL ecosystem. Row storage and its B-tree index are similar to those of PostgreSQL. Column storage and its index are self-developed. When creating a table, it is crucial to choose the right storage method, distribution column, partition key, and index. This ensures efficient data access during SQL execution, reducing I/O consumption. The following figure illustrates the process from SQL statement initiation to data acquisition, helping you understand the function of each technical method for performance optimization.

The following table lists the existing optimization methods of GaussDB(DWS).
| No. | Method | Usage | Example SQL | Modifiable After Creation |
|---|---|---|---|---|
| 1 | String | | - | Yes (The existing data can be rewritten.) |
| 2 | Numeric | Specifying precision for the numeric type is essential for improving performance. It is not advisable to use the numeric type without specifying precision. | - | Yes (The existing data can be rewritten.) |
| 3 | Partition by Column | | | No (You need to create a new table to make modifications.) |
| 4 | secondary_part_column | | | No (You need to create a new table to make modifications.) |
| 5 | Distribute by Column | This requires user-defined settings and is suitable for join fields that require frequent GROUP BY or multi-table joins. It reduces data shuffling through local joins and is ideal for equality queries. | | No (You need to create a new table to make modifications.) |
| 6 | Bitmap column | Define the bitmap index (cardinality ≤ 32) or bloom filter (cardinality > 32) based on the repeated values in the CU. This method is applicable to equivalent queries of varchar or text type columns. It is advised to create indexes on columns involved in the WHERE condition. | | Yes (Modification does not rewrite existing data. Only the new data is affected.) |
| 7 | min-max index | | | Yes (The PCK columns can be modified. Modification does not rewrite existing data and only the new data is affected.) |
| 8 | Primary key (B-tree index) | | | Yes (The index can be modified and re-created.) |
| 9 | GIN index | | | Yes (The index can be modified and re-created.) |
| 10 | Orientation=column/row | This method specifies whether a table is stored in rows or columns. Row-store tables cannot be compressed and are best suited for point queries and frequent updates. Column-store tables can be compressed and are ideal for analysis purposes. | - | No (You need to create a new table to make modifications.) |
If the UNLOGGED keyword is specified, the created table is an unlogged table. Data written to unlogged tables is not written to the write-ahead log, which makes them considerably faster than ordinary tables. However, an unlogged table is automatically truncated after a crash or unclean shutdown, incurring data loss risks. The contents of an unlogged table are also not replicated to standby servers. Any indexes created on an unlogged table are automatically unlogged as well.
Usage scenario: Unlogged tables do not guarantee data safety. Back up the data before using unlogged tables, for example, before a system upgrade.
Troubleshooting: If data is missing in the indexes of unlogged tables due to some unexpected operations such as an unclean shutdown, you should re-create the indexes with errors.
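The following is a minimal sketch of the unlogged-table workflow described above. The table, column, and index names are hypothetical and chosen only for illustration.

```sql
-- Hypothetical staging table: its contents can be reloaded from the source,
-- so losing them after a crash is acceptable.
CREATE UNLOGGED TABLE staging_events
(
    event_id    BIGINT NOT NULL,
    payload     VARCHAR(200),
    loaded_at   TIMESTAMP
)
DISTRIBUTE BY HASH (event_id);

-- Indexes on an unlogged table are unlogged as well.
CREATE INDEX staging_events_loaded_idx ON staging_events (loaded_at);

-- After an unclean shutdown, rebuild the indexes of the table.
REINDEX TABLE staging_events;
```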
Specify the keywords GLOBAL, LOCAL, and VOLATILE before TEMP or TEMPORARY to create temporary tables with different attributes. Global temporary tables are supported only by 8.2.1.220 and later cluster versions.
If enable_global_temp_tabl is set to on, a global temporary table GLOBAL is created.
If enable_global_temp_tabl is set to off, a local temporary table LOCAL is created. You can also specify keyword LOCAL to reach the same effect.
If TEMP or TEMPORARY is specified, the created table is a temporary table. Temporary tables are automatically dropped at the end of a session, or optionally at the end of the current transaction. Therefore, even if CNs other than the one connected to the current session are faulty, you can still create and use temporary tables in the current session. Because temporary tables are created only in the current session, a DDL statement that involves operations on temporary tables will generate a DDL error. Therefore, you are not advised to perform operations on temporary tables in DDL statements. TEMP is equivalent to TEMPORARY.
If IF NOT EXISTS is specified, a table will be created if there is no table using the specified name. If there is already a table using the specified name, no error will be reported. A message will be displayed indicating that the table already exists, and the database will skip table creation.
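As an illustration of the IF NOT EXISTS behavior, the sketch below runs the same statement twice. The table name audit_log and its columns are hypothetical.

```sql
-- First run: the table does not exist yet, so it is created.
CREATE TABLE IF NOT EXISTS audit_log
(
    log_id      BIGINT NOT NULL,
    message     VARCHAR(200)
)
DISTRIBUTE BY HASH (log_id);

-- Second run: the table already exists, so no error is raised.
-- A message is displayed and table creation is skipped.
CREATE TABLE IF NOT EXISTS audit_log
(
    log_id      BIGINT NOT NULL,
    message     VARCHAR(200)
)
DISTRIBUTE BY HASH (log_id);
```

This makes deployment scripts safe to rerun, but note that the existing table is left unchanged even if its definition differs from the one in the statement.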
Specifies the name of the table to be created.
The table name can contain a maximum of 63 characters, including letters, digits, underscores (_), dollar signs ($), and number signs (#). It must start with a letter or underscore (_).
A table name enclosed in double quotation marks can contain spaces and special characters. However, you are not advised to use these characters in a table name because they may make it difficult to reference and use. In addition, they may be processed differently under different database compatibility modes.
Specifies the name of a column to be created in the new table.
The column name can contain a maximum of 63 characters, including letters, digits, underscores (_), dollar signs ($), and number signs (#). It must start with a letter or underscore (_).
Specifies the data type of the column.
In a database compatible with Teradata or MySQL syntax, if the data type of a column is set to DATE, the DATE type is returned. Otherwise, the TIMESTAMP type is returned.
Specifies the compression option of the table. It is available only for row-store tables. The option specifies the algorithm preferentially used by table columns.
Value range: DELTA, PREFIX, DICTIONARY, NUMSTR, NOCOMPRESS
Assigns a collation to the column (which must be of a collatable data type). If no collation is specified, the default collation is used.
Specifies a table from which the new table automatically copies all column names, their data types, and their not-null constraints.
The new table and the source table are decoupled after creation is complete. Changes to the source table will not be applied to the new table, and it is not possible to include data of the new table in scans of the source table.
Columns and constraints copied by LIKE are not merged with columns and constraints that have the same name. If the same name is specified explicitly or in another LIKE clause, an error is reported.
PERIOD and TTL in the WITH clause are partition-related parameters. LIKE INCLUDING RELOPTIONS does not copy them to the new table. To copy them, use INCLUDING PARTITION.
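A short sketch of selective copying with LIKE follows. The table names orders_src and orders_copy are hypothetical; the options used are taken from the like_option list in the syntax above.

```sql
-- Hypothetical source table.
CREATE TABLE orders_src
(
    order_id    BIGINT NOT NULL,
    status      CHAR(1) DEFAULT 'N',
    amount      DECIMAL(15,2)
)
DISTRIBUTE BY HASH (order_id);

-- Copy column names, data types, and not-null constraints,
-- plus the default expressions and the distribution information.
CREATE TABLE orders_copy
(
    LIKE orders_src INCLUDING DEFAULTS INCLUDING DISTRIBUTION
);
```

After creation the two tables are independent: later changes to orders_src are not applied to orders_copy.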
Specifies an optional storage parameter for a table or an index.
When you define a column as NUMERIC of arbitrary precision, you are advised to specify precision p and scale s. If precision and scale are not specified, values are displayed as they were entered.
The description of parameters is as follows:
The fillfactor of a table is a percentage between 10 and 100. 100 (complete packing) is the default value. When a smaller fillfactor is specified, INSERT operations pack table pages only to the indicated percentage. The remaining space on each page is reserved for updating rows on that page. This gives UPDATE a chance to place the updated copy of a row on the same page, which is more efficient than placing it on a different page. For a table whose records are never updated, setting the fillfactor to 100 (complete packing) is the appropriate choice, but in heavily updated tables smaller fillfactors are appropriate. The parameter has no meaning for column-store tables.
Value range: 10 to 100
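For a heavily updated row-store table, a lower fillfactor could be set as sketched below. The table name and the value 80 are illustrative assumptions, not recommendations from the source.

```sql
-- Hypothetical row-store table that is updated frequently.
-- INSERT fills each page to 80% and leaves the rest for updated row versions.
CREATE TABLE account_balance
(
    account_id  BIGINT NOT NULL,
    balance     DECIMAL(15,2),
    updated_at  TIMESTAMP
)
WITH (ORIENTATION = ROW, FILLFACTOR = 80)
DISTRIBUTE BY HASH (account_id);
```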
Specifies the storage mode (row-store, column-store) for table data. This parameter cannot be modified once it is set.
Valid value:
ROW applies to OLTP service, which has many interactive transactions. An interaction involves many columns in the table. Using ROW can improve the efficiency.
COLUMN applies to the data warehouse service, which has a large amount of aggregation computing, and involves a few column operations.
Default value: ROW (row-store)
Specifies the compression level of the table data. It determines the compression ratio and time. Generally, a higher compression level yields a higher compression ratio but takes longer, and a lower level yields a lower ratio in less time. The actual compression ratio depends on the distribution characteristics of the loaded table data.
Valid value:
The valid values for column-store tables are YES/NO and LOW/MIDDLE/HIGH, and the default is LOW. When this parameter is set to YES, the compression level is LOW by default.
GaussDB(DWS) provides the following compression algorithms:
| COMPRESSION | NUMERIC | STRING | INT |
|---|---|---|---|
| LOW | Delta compression + RLE compression | LZ4 compression | Delta compression (RLE is optional.) |
| MIDDLE | Delta compression + RLE compression + LZ4 compression | dict compression or LZ4 compression | Delta compression or LZ4 compression (RLE is optional) |
| HIGH | Delta compression + RLE compression + zlib compression | dict compression or zlib compression | Delta compression or zlib compression (RLE is optional) |
Specifies the compression level of the table data. It determines the compression ratio and time. This divides a compression level into sublevels, providing you with more choices for compression rate and duration. As the value becomes greater, the compression rate becomes higher and duration longer at the same compression level. The parameter is only valid for column-store tables.
Value range: 0–3.
Default value: 0
Schedules the partition deletion tasks in a partitioned table. By default, no partition deletion task is created.
Value range: 1 hour–100 years
Schedules the partition creation tasks in a partitioned table. If TTL has been configured, PERIOD cannot be greater than TTL.
Value range: 1 hour–100 years
Default value: 1 day
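The sketch below shows how PERIOD and TTL might be combined on a time-partitioned table. The table name, the partition boundaries, and the literal format of the interval values ('1 day', '7 days') are assumptions made for illustration; check them against the partitioned-table syntax of your cluster version.

```sql
-- Hypothetical log table: a new partition is scheduled every day (PERIOD)
-- and partitions older than seven days are scheduled for deletion (TTL).
-- PERIOD must not be greater than TTL.
CREATE TABLE cpu_log
(
    id          INTEGER,
    ip          TEXT,
    log_time    TIMESTAMP
)
WITH (PERIOD = '1 day', TTL = '7 days')
DISTRIBUTE BY HASH (id)
PARTITION BY RANGE (log_time)
(
    PARTITION p1 VALUES LESS THAN ('2024-01-02 00:00:00'),
    PARTITION p2 VALUES LESS THAN ('2024-01-03 00:00:00')
);
```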
Specifies the maximum number of rows in a storage unit during data loading. The parameter is only valid for column-store tables.
Value range: 10000 to 60000
Default value: 60,000
Specifies the number of records to be partial cluster stored during data loading process. The parameter is only valid for column-store tables.
Value range: 600000 to 2147483647
Default value: 4,200,000
You can use auto-increment and decrement partitions for INT4, INT8, VARCHAR, and TEXT columns, which are commonly used for storing time-related data. The time_format option is applicable only when the partition key is INT4, INT8, VARCHAR, or TEXT and a period is specified. This is supported only by clusters of version 9.1.0.200 or later.
VARCHAR/TEXT:
INT4/INT8:
Specifies whether to enable delta tables in column-store tables. The parameter is only valid for column-store tables. If COLVERSION is set to 3.0, enable_delta cannot be turned on because this parameter is not supported by V3 tables.
Using column-store tables with delta tables is not recommended. This may cause disk bloat and performance deterioration due to delayed merge.
Default value: off
Specifies whether an H-Store table will be created (based on column-store tables). The parameter is only valid for column-store tables. This parameter is supported by version 8.2.0.100 or later clusters. If COLVERSION is set to 3.0, enable_delta cannot be turned on because this parameter is not supported by V3 tables.
Default value: off
If this parameter is enabled, the following GUC parameters must be set to ensure that H-Store tables are cleared.
Set autovacuum to on, autovacuum_max_workers to 6, and autovacuum_max_workers_hstore to 3.
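A minimal sketch of creating an H-Store table follows, using the enable_hstore table-level parameter named in this section. The table and column names are hypothetical, and the GUC parameters listed above are assumed to have been configured at the cluster level already.

```sql
-- Hypothetical column-store table created as an H-Store table.
CREATE TABLE sensor_readings
(
    sensor_id   BIGINT NOT NULL,
    read_time   TIMESTAMP,
    reading     DECIMAL(10,2)
)
WITH (ORIENTATION = COLUMN, enable_hstore = on)
DISTRIBUTE BY HASH (sensor_id);
```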
Specifies whether fine-grained DR will be enabled for column-store tables. This parameter only takes effect on column-store tables whose COLVERSION is 2.0 and cannot be set to true if enable_hstore is true. This parameter is supported by version 8.2.0.100 or later clusters.
This parameter has been discarded in version 8.2.1 and is reserved for compatibility with earlier versions. This parameter is invalid in the current version.
Specifies whether the fine-grained DR table will be set as a primary or secondary table. This parameter can be true only when the enable_disaster_cstore parameter has been set to true.
Valid value:
Specifies the upper limit of to-be-imported rows for triggering the data import to a delta table when data is to be imported to a column-store table. This parameter takes effect only if the enable_delta table parameter is set to on. The parameter is only valid for column-store tables.
Value range: 0 to 60000
Default value: 6000
Specifies the version of the column-store format. You can switch between different storage formats.
Valid value:
1.0: Each column in a column-store table is stored in a separate file. The file name is relfilenode.C1.0, relfilenode.C2.0, relfilenode.C3.0, or similar.
2.0: All columns of a column-store table are combined and stored in a file. The file is named relfilenode.C1.0.
3.0: Each column of a column-store table is stored in a file. The file is stored in the OBS file system and named C1_fileid.0.
Default value: The default value for the storage-compute coupled version is 2.0, while for the storage-compute decoupling version, it is 3.0.
The value of COLVERSION can only be set to 2.0 for OBS multi-temperature tables.
Specifies the mode of table-level auto-analyze.
Valid value:
Default value: all
Specifies whether to enable the incremental analyze mode for partitioned tables. This parameter is valid only for partitioned tables and cannot be set for replicated tables. This is supported only by clusters of version 9.1.0.100 or later.
The default value is false.
Indicates whether to skip the hint bits operation when the full-page writes (FPW) log needs to be written during sequential scanning.
If SKIP_FPI_HINT is set to true and the checkpoint operation is performed on a table, no Xlog will be generated when the table is sequentially scanned. This applies to intermediate tables that are queried less frequently, reducing the size of Xlogs and improving query performance.
It is similar to ON COMMIT { PRESERVE ROWS | DELETE ROWS }. The two parameters cannot be specified at the same time. This parameter is used only for global temporary tables.
Default value: true
Determines whether to enable CU rewriting logic for column-store tables using AUTOVACUUM. This parameter is supported only by clusters of version 8.2.1.100 or later.
There is a low probability that an error is reported when lightweight UPDATE and AUTOVACUUM are executed concurrently. You can set the table-level parameter to off to avoid this problem.
Default value: true
Specifies the name of a level-2 partition column in a column-store table. Only one column can be specified as the level-2 partition column. This parameter applies only to H-Store column-store tables. This parameter is supported only by clusters of version 8.3.0 or later.
Specifies the number of level-2 partitions in a column-store table. This parameter applies only to H-Store column-store tables. This parameter is supported only by clusters of version 8.3.0 or later.
Value range: 1 to 32
Default value: 8
If the enable_hstore_opt table-level parameter is enabled, the enable_hstore table-level parameter is also enabled by default. This parameter is supported only in cluster 8.3.0 and later versions. This parameter supports V2 .
Default value: false
The bitmap index is only applicable to the new H-Store tables (hstore_opt tables). To generate the bitmap index, enable the enable_hstore_opt table-level parameter and set bitmap_columns to the specified columns. This parameter is supported only by clusters of version 8.3.0 or later.
Determines whether to create a turbo table (column-store tables). The parameter is only valid for column-store tables.
Default value: off
Turbo tables store numeric and varchar data more efficiently by using integers, which speeds up the processing of these types.
Specifies the cache mode of tables or partitioned tables (disks). If one of the following values is specified in the cache policy, hot cache is used. Otherwise, cold cache is used. Hot cache occupies more space than cold cache and uses more complex replacement policies.
Valid value:
The default value is ALL.
ON COMMIT determines what to do when you commit a temporary table creation operation. Global temporary tables support only the PRESERVE ROWS option.
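As a hedged illustration of ON COMMIT, the sketch below uses a local temporary table with DELETE ROWS. It assumes the PostgreSQL-style behavior in which the rows are removed when the transaction commits; the table name is hypothetical.

```sql
-- Hypothetical scratch table whose rows live only for one transaction.
CREATE TEMPORARY TABLE tmp_order_ids
(
    order_id    BIGINT
)
ON COMMIT DELETE ROWS;

START TRANSACTION;
INSERT INTO tmp_order_ids VALUES (1), (2), (3);
-- The rows are visible here, inside the transaction.
SELECT count(*) FROM tmp_order_ids;
COMMIT;

-- Assumed behavior: the rows were deleted at commit, the table itself remains.
SELECT count(*) FROM tmp_order_ids;
```

With PRESERVE ROWS, the only option supported by global temporary tables, the rows would remain until the end of the session instead.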
If you specify COMPRESS in the CREATE TABLE statement, the compression feature is triggered in the case of a bulk INSERT operation. If this feature is enabled, a scan is performed for all tuple data within the page to generate a dictionary and then the tuple data is compressed and stored. If NOCOMPRESS is specified, the table is not compressed.
Default value: NOCOMPRESS, tuple data is not compressed before storage.
Specifies how the table is distributed or replicated between DNs.
Valid value:
When you create a table, the choices of distribution keys and partition keys have a major impact on SQL query performance. Therefore, choose proper distribution columns and partition keys based on the following strategies.
Connect to the database and run the following statement to check the number of tuples on each DN. Replace tablename with the actual name of the table to be analyzed.
SELECT a.count,b.node_name FROM (SELECT count(*) AS count,xc_node_id FROM tablename GROUP BY xc_node_id) a, pgxc_node b WHERE a.xc_node_id=b.node_id ORDER BY a.count DESC;
If the number of tuples varies greatly (severalfold or tenfold) across DNs, data skew occurs. Change the data distribution key based on the following principles:
The column value of the distribution column should be discrete so that data can be evenly distributed on each DN. For example, you are advised to select the primary key of a table as the distribution column, and the ID card number as the distribution column in a personnel information table.
With the above principles met, you can select join conditions as distribution keys so that join tasks can be pushed down to DNs, reducing the amount of data transferred between the DNs.
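The two principles above can be sketched as follows. Both hypothetical tables are hash-distributed on the join column customer_id, which is assumed to have well-spread values, so matching rows are placed on the same DN and the join can run locally.

```sql
-- Hypothetical tables distributed on the same join key.
CREATE TABLE sales_customer
(
    customer_id BIGINT NOT NULL,
    name        VARCHAR(25)
)
DISTRIBUTE BY HASH (customer_id);

CREATE TABLE sales_order
(
    order_id    BIGINT NOT NULL,
    customer_id BIGINT NOT NULL,
    amount      DECIMAL(15,2)
)
DISTRIBUTE BY HASH (customer_id);

-- The join condition matches the distribution keys of both tables,
-- so the join can be pushed down to the DNs without redistributing rows.
SELECT c.name, sum(o.amount) AS total_amount
FROM sales_customer c
JOIN sales_order o ON o.customer_id = c.customer_id
GROUP BY c.name;
```

If a few customers own most of the orders, this choice can itself cause skew, so it is worth rerunning the tuple-count query shown earlier after loading data.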
In range partitioning, the table is partitioned into ranges defined by a key column or set of columns, with no overlap between the ranges of values assigned to different partitions. Each range has a dedicated partition for data storage.
Choose partition keys so that query results are stored in the same partition or in as few partitions as possible (partition pruning). This yields consecutive I/O and improves query performance.
In actual services, time is often used to filter query objects. Therefore, you can use time as a partition key, and adjust the key values based on the total data volume and the data volume of a single query.
TO GROUP specifies the Node Group in which the table is created. Currently, it cannot be used for HDFS tables. TO NODE is used for internal scale-out tools.
In logical cluster mode, if TO GROUP is not specified, the table is created in the node group associated with the logical cluster user by default. If the user, such as the administrator or a common user, does not manage the logical cluster, by default the table is created in the first logical cluster, which is the logical cluster with the smallest OID in pgxc_group.
If the node group specified by TO GROUP is a replication table node group, the table is created on all CNs and DNs, but the replication table data is distributed only on the DNs in the replication table node group.
The Storage-compute decoupling 3.0 supports read-only logical clusters. If a user is not bound to any read-only logical clusters but sets TO GROUP to a logical cluster in a table creation statement, an error will be reported during table creation. If a user bound to a read-only logical cluster creates a table, the table will be created in the logical cluster specified by the GUC parameter default_storage_nodegroup. If default_storage_nodegroup is set to installation, tables will be created in the first logical cluster.
The COMMENT clause can specify table comments during table creation.
Specifies a name for a column or table constraint. The optional constraint clauses specify constraints that new or updated rows must satisfy for an insert or update operation to succeed.
There are two ways to define constraints:
Indicates that the column is not allowed to contain NULL values.
The column is allowed to contain NULL values. This is the default setting.
This clause is only provided for compatibility with non-standard SQL databases. You are advised not to use this clause.
Specifies an expression producing a Boolean result which new or updated rows must satisfy for an insert or update operation to succeed. Expressions evaluating to TRUE or UNKNOWN succeed. If any row of an insert or update operation produces a FALSE result, an error exception is raised and the insert or update does not alter the database.
A check constraint specified as a column constraint should reference only the column's values, while an expression appearing in a table constraint can reference multiple columns.
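The difference between a column-level and a table-level CHECK constraint, and the treatment of UNKNOWN results, can be sketched as follows. The table and constraint names are hypothetical.

```sql
CREATE TABLE product_price
(
    product_id  BIGINT NOT NULL,
    -- Column constraint: references only this column's value.
    list_price  DECIMAL(15,2) CHECK (list_price > 0),
    sale_price  DECIMAL(15,2),
    -- Table constraint: may reference several columns.
    CONSTRAINT price_order_chk CHECK (sale_price <= list_price)
)
DISTRIBUTE BY HASH (product_id);

-- Accepted: sale_price is NULL, so price_order_chk evaluates to UNKNOWN,
-- which succeeds just like TRUE.
INSERT INTO product_price VALUES (1, 100.00, NULL);

-- Rejected: price_order_chk evaluates to FALSE.
INSERT INTO product_price VALUES (2, 100.00, 120.00);
```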
<>NULL and !=NULL are invalid in an expression. Change them to IS NOT NULL.
Assigns a default data value to a column. The value can be any variable-free expression (subqueries and cross-references to other columns in the current table are not allowed). The data type of the default expression must match the data type of the column.
The default expression will be used in any insert operation that does not specify a value for the column. If there is no default value for a column, then the default value is NULL.
The ON UPDATE clause specifies a timestamp function for a column. Ensure that the data type of the column for which the ON UPDATE clause specifies a timestamp function is timestamp or timestamptz.
When an SQL statement containing the UPDATE operation is executed, this column is automatically updated to the time specified by the timestamp function.
The on_update_expr function supports only CURRENT_TIMESTAMP, CURRENT_TIME, CURRENT_DATE, LOCALTIME, LOCALTIMESTAMP.
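A minimal sketch of the ON UPDATE clause follows, assuming the cluster supports it as described above. The table and column names are hypothetical.

```sql
CREATE TABLE user_profile
(
    user_id     BIGINT NOT NULL,
    nickname    VARCHAR(40),
    -- Set on insert by DEFAULT, refreshed on every UPDATE by ON UPDATE.
    updated_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
)
DISTRIBUTE BY HASH (user_id);

INSERT INTO user_profile (user_id, nickname) VALUES (1, 'alice');

-- updated_at is not mentioned in the statement but is updated automatically.
UPDATE user_profile SET nickname = 'alice_2' WHERE user_id = 1;
```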
The COMMENT clause can specify a comment for a column.
UNIQUE [ NULLS [ NOT ] DISTINCT | NULLS IGNORE ] ( column_name [, ... ] ) index_parameters
Specifies that a group of one or more columns of a table can contain only unique values.
The [ NULLS [ NOT ] DISTINCT | NULLS IGNORE ] field is used to specify how to process null values in the index column of the Unique index.
Default value: This parameter is left empty by default. NULL values can be inserted repeatedly.
When the inserted data is compared with the original data in the table, the NULL value can be processed in any of the following ways:
The following table lists the behaviors of the three processing modes.
Constraint |
All Index Columns Are NULL |
Some Index Columns Are NULL. |
|---|---|---|
NULLS DISTINCT |
Can be inserted repeatedly. |
Can be inserted repeatedly. |
NULLS NOT DISTINCT |
Cannot be inserted repeatedly. |
Cannot be inserted if the non-null values are equal. Can be inserted if the non-null values are not equal. |
NULLS IGNORE |
Can be inserted repeatedly. |
Cannot be inserted if the non-null values are equal. Can be inserted if the non-null values are not equal. |
If DISTRIBUTE BY REPLICATION is not specified, the set of columns with the unique constraint must contain the distribution columns.
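The behaviors in the table above can be tried with a sketch like the following, which uses NULLS NOT DISTINCT on a two-column unique constraint. The table and constraint names are hypothetical, and the comments restate the expected outcome from the table rather than observed output.

```sql
-- The unique constraint contains the distribution column rack_id.
CREATE TABLE device_slot
(
    rack_id     INT,
    slot_no     INT,
    CONSTRAINT device_slot_uq UNIQUE NULLS NOT DISTINCT (rack_id, slot_no)
)
DISTRIBUTE BY HASH (rack_id);

INSERT INTO device_slot VALUES (1, NULL);

-- Some index columns are NULL and the non-null values are equal:
-- expected to be rejected under NULLS NOT DISTINCT.
INSERT INTO device_slot VALUES (1, NULL);

-- The non-null values differ: expected to be accepted.
INSERT INTO device_slot VALUES (2, NULL);
```

Under the default behavior (NULLS DISTINCT), the second INSERT would be accepted as well.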
PRIMARY KEY ( column_name [, ... ] ) index_parameters
The primary key constraint specifies that a column or columns of a table can contain only unique (non-duplicate) and non-null values.
Only one primary key can be specified for a table.
If DISTRIBUTE BY REPLICATION is not specified, the column set with a primary key constraint must contain the distribution columns.
The foreign key constraint requires that a group of one or more columns in the new table contain only values that match the referenced column values in the referenced table. The referenced column should be a unique column or the primary key of the referenced table.
GaussDB(DWS) does not check foreign key constraints. When using foreign key constraints, you need to use the check_foreign_key_constraint function to check whether the data in the foreign key table meets the foreign key constraints.
Controls whether the constraint can be deferred. A constraint that is not deferrable will be checked immediately after every command. Checking of constraints that are deferrable can be postponed until the end of the transaction using the SET CONSTRAINTS command. NOT DEFERRABLE is the default value. Currently, only UNIQUE and PRIMARY KEY constraints of row-store tables accept this clause. All the other constraints are not deferrable.
Specifies a partial cluster key for storage. When importing data to a column-store table, you can perform local data sorting by specified columns (single or multiple).
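A partial cluster key could be declared as in the sketch below. The table and column names are hypothetical; click_time is chosen on the assumption that queries usually filter on it.

```sql
-- Hypothetical column-store table: data imported in one batch is sorted
-- locally by click_time, so each storage unit covers a narrow time range.
CREATE TABLE web_clicks
(
    click_time  TIMESTAMP,
    user_id     BIGINT,
    url         VARCHAR(200),
    PARTIAL CLUSTER KEY (click_time)
)
WITH (ORIENTATION = COLUMN)
DISTRIBUTE BY HASH (user_id);
```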
If a constraint is deferrable, this clause specifies the default time to check the constraint.
The constraint check time can be altered using the SET CONSTRAINTS command.
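The sketch below shows one way a deferrable unique constraint and SET CONSTRAINTS might be used together on a row-store table. All names and values are hypothetical. The unique constraint includes the distribution column flight_id, and only the non-distribution column seat_no is updated.

```sql
CREATE TABLE seat_assignment
(
    flight_id   INT NOT NULL,
    seat_no     INT NOT NULL,
    passenger   VARCHAR(40),
    CONSTRAINT seat_uq UNIQUE (flight_id, seat_no) DEFERRABLE
)
DISTRIBUTE BY HASH (flight_id);

INSERT INTO seat_assignment VALUES (1, 1, 'A'), (1, 2, 'B');

-- Swap the two seats. The first UPDATE temporarily creates a duplicate,
-- which is tolerated because the check is postponed to the end of the transaction.
START TRANSACTION;
SET CONSTRAINTS seat_uq DEFERRED;
UPDATE seat_assignment SET seat_no = 2 WHERE flight_id = 1 AND passenger = 'A';
UPDATE seat_assignment SET seat_no = 1 WHERE flight_id = 1 AND passenger = 'B';
COMMIT;
```

Without DEFERRABLE, the first UPDATE would be checked immediately and fail.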
Create a V3 table with storage and compute decoupled (supported only in the storage-compute decoupling 3.0 version).
CREATE TABLE public.t1
(
    id integer not null,
    data integer,
    age integer
)
WITH (ORIENTATION =COLUMN, COLVERSION =3.0)
DISTRIBUTE BY ROUNDROBIN;
Specify the cache policy when creating a table (supported only in clusters of the storage-compute decoupling 3.0 version).
CREATE TABLE Sports
(
    N_NATIONKEY  INT NOT NULL ,
    N_NAME       CHAR(25) NOT NULL ,
    N_REGIONKEY  INT NOT NULL ,
    N_COMMENT    VARCHAR(152)
)
WITH (orientation = column, colversion = 3.0, cache_policy = 'HPL: Balls, Basketball')
tablespace cu_obs_tbs
DISTRIBUTE BY ROUNDROBIN
partition by list(N_NAME)
(
    partition Balls values ('Basketball', 'football', 'badminton'),
    partition Athletics values ('High jump', 'long jump', 'javelin'),
    partition Water_Sports values ('Surfing', 'diving', 'swimming'),
    partition Shooting values ('air guns', 'Rifles', 'archery'),
    partition rest values (DEFAULT)
);
Define a primary key column constraint for the table (the column values must be unique and not null):
CREATE TABLE CUSTOMER
(
    C_CUSTKEY     BIGINT NOT NULL CONSTRAINT C_CUSTKEY_pk PRIMARY KEY ,
    C_NAME        VARCHAR(25) ,
    C_ADDRESS     VARCHAR(40) ,
    C_NATIONKEY   INT ,
    C_PHONE       CHAR(15) ,
    C_ACCTBAL     DECIMAL(15,2)
)
DISTRIBUTE BY HASH(C_CUSTKEY);
Define a primary key table constraint for the table. You can define a primary key table constraint on one or more columns of a table:
CREATE TABLE CUSTOMER
(
    C_CUSTKEY     BIGINT ,
    C_NAME        VARCHAR(25) ,
    C_ADDRESS     VARCHAR(40) ,
    C_NATIONKEY   INT ,
    C_PHONE       CHAR(15) ,
    C_ACCTBAL     DECIMAL(15,2) ,
    CONSTRAINT C_CUSTKEY_KEY PRIMARY KEY(C_CUSTKEY,C_NAME)
)
DISTRIBUTE BY HASH(C_CUSTKEY,C_NAME);
Define the CHECK column constraint:
CREATE TABLE CUSTOMER
(
    C_CUSTKEY     BIGINT NOT NULL CONSTRAINT C_CUSTKEY_pk PRIMARY KEY ,
    C_NAME        VARCHAR(25) ,
    C_ADDRESS     VARCHAR(40) ,
    C_NATIONKEY   INT NOT NULL CHECK (C_NATIONKEY > 0)
)
DISTRIBUTE BY HASH(C_CUSTKEY);
Define the CHECK table constraint:
CREATE TABLE CUSTOMER
(
C_CUSTKEY BIGINT NOT NULL CONSTRAINT C_CUSTKEY_pk PRIMARY KEY ,
C_NAME VARCHAR(25) ,
C_ADDRESS VARCHAR(40) ,
C_NATIONKEY INT ,
CONSTRAINT C_CUSTKEY_KEY2 CHECK(C_CUSTKEY > 0 AND C_NAME <> '')
)
DISTRIBUTE BY HASH(C_CUSTKEY);
Create a column-store table and specify the storage format and compression mode:
CREATE TABLE customer_address
(
    ca_address_sk       INTEGER NOT NULL ,
    ca_address_id       CHARACTER(16) NOT NULL ,
    ca_street_number    CHARACTER(10) ,
    ca_street_name      CHARACTER varying(60) ,
    ca_street_type      CHARACTER(15) ,
    ca_suite_number     CHARACTER(10)
)
WITH (ORIENTATION = COLUMN, COMPRESSION=HIGH,COLVERSION=2.0)
DISTRIBUTE BY HASH (ca_address_sk);
Use DEFAULT to declare a default value for column W_STATE:
CREATE TABLE warehouse_t
(
    W_WAREHOUSE_SK      INTEGER NOT NULL,
    W_WAREHOUSE_ID      CHAR(16) NOT NULL,
    W_WAREHOUSE_NAME    VARCHAR(20) UNIQUE DEFERRABLE,
    W_WAREHOUSE_SQ_FT   INTEGER ,
    W_COUNTY            VARCHAR(30) ,
    W_STATE             CHAR(2) DEFAULT 'GA',
    W_ZIP               CHAR(10)
);
Create the CUSTOMER_bk table in LIKE mode:
CREATE TABLE CUSTOMER_bk (LIKE CUSTOMER INCLUDING ALL);