ORC and Parquet are the mainstream file formats in the big data field. You can export data from Hive to ORC or Parquet files and then query and analyze those files in GaussDB(DWS) through a read-only foreign table. For this to work, the data types in the ORC or Parquet files must be mapped to data types that GaussDB(DWS) supports. Table 1 lists this mapping. In the other direction, GaussDB(DWS) exports data through a write-only foreign table and stores it in the ORC or Parquet format. Reading those files from Hive likewise requires matching data types. Table 2 shows that mapping.

Table 1 Data type mapping for read-only foreign tables (ORC and Parquet)

| Type | Type Supported by GaussDB(DWS) Foreign Tables | Hive Table Type |
|---|---|---|
| 1-byte integer | TINYINT (not recommended) | TINYINT |
| | SMALLINT (recommended) | TINYINT |
| 2-byte integer | SMALLINT | SMALLINT |
| 4-byte integer | INTEGER | INT |
| 8-byte integer | BIGINT | BIGINT |
| Single-precision floating point number | FLOAT4 (REAL) | FLOAT |
| Double-precision floating point number | FLOAT8 (DOUBLE PRECISION) | DOUBLE |
| Scientific data type | DECIMAL[p (,s)] (The maximum precision can reach up to 38.) | DECIMAL (The maximum precision can reach up to 38.) (HIVE 0.11) |
| Date type | DATE | DATE |
| Time type | TIMESTAMP | TIMESTAMP |
| Boolean type | BOOLEAN | BOOLEAN |
| CHAR type | CHAR(n) | CHAR(n) |
| VARCHAR type | VARCHAR(n) | VARCHAR(n) |
| String (large text object) | TEXT (CLOB) | STRING |
| Binary type | BYTEA | BINARY |

Only 9.1.0.100 and later versions support binary type.
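
When a Hive table has many columns, the read-only mapping can be applied mechanically. The following is a minimal sketch, not part of GaussDB(DWS) or Hive: the dictionary `HIVE_TO_DWS_READ` and the function `dws_read_type` are hypothetical names introduced here, and the entries are transcribed from Table 1 (with SMALLINT chosen for Hive TINYINT, as the table recommends).

```python
import re

# Hive type -> GaussDB(DWS) read-only foreign table type, transcribed from Table 1.
# The names in this block are illustrative helpers, not a product API.
HIVE_TO_DWS_READ = {
    "TINYINT": "SMALLINT",   # SMALLINT is recommended over TINYINT
    "SMALLINT": "SMALLINT",
    "INT": "INTEGER",
    "BIGINT": "BIGINT",
    "FLOAT": "FLOAT4",
    "DOUBLE": "FLOAT8",
    "DECIMAL": "DECIMAL",
    "DATE": "DATE",
    "TIMESTAMP": "TIMESTAMP",
    "BOOLEAN": "BOOLEAN",
    "CHAR": "CHAR",
    "VARCHAR": "VARCHAR",
    "STRING": "TEXT",
    "BINARY": "BYTEA",       # binary type needs version 9.1.0.100 or later
}

# A base type name, optionally followed by a parenthesized argument list.
_TYPE_RE = re.compile(r"^\s*([A-Za-z]+)\s*(\(.*\))?\s*$")


def dws_read_type(hive_type: str) -> str:
    """Return the foreign table column type for a Hive column type."""
    match = _TYPE_RE.match(hive_type)
    if not match:
        raise ValueError(f"unrecognized Hive type: {hive_type!r}")
    base = match.group(1).upper()
    args = match.group(2) or ""
    if base not in HIVE_TO_DWS_READ:
        raise ValueError(f"no mapping in Table 1 for Hive type: {hive_type!r}")
    # Keep length or precision arguments such as (20) or (18,2), without spaces.
    return HIVE_TO_DWS_READ[base] + args.replace(" ", "")
```

For example, `dws_read_type("VARCHAR (20)")` yields `VARCHAR(20)` and `dws_read_type("string")` yields `TEXT`. Types that Table 1 does not list raise a `ValueError` instead of being guessed.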

Table 2 Data type mapping for write-only foreign tables (ORC and Parquet)

| Type | Type Supported by GaussDB(DWS) Internal Tables (Data Source Table) | Type Supported by GaussDB(DWS) Write-only Foreign Tables | Hive Table Type |
|---|---|---|---|
| 1-byte integer | TINYINT | TINYINT (not recommended) | SMALLINT |
| | | SMALLINT (recommended) | SMALLINT |
| 2-byte integer | SMALLINT | SMALLINT | SMALLINT |
| 4-byte integer | INTEGER, BINARY_INTEGER | INTEGER | INT |
| 8-byte integer | BIGINT | BIGINT | BIGINT |
| Single-precision floating point number | FLOAT4, REAL | FLOAT4, REAL | FLOAT |
| Double-precision floating point number | DOUBLE PRECISION, FLOAT8, BINARY_DOUBLE | DOUBLE PRECISION, FLOAT8, BINARY_DOUBLE | DOUBLE |
| Scientific data type | DECIMAL, NUMERIC | DECIMAL[p (,s)] (The maximum precision can reach up to 38.) | precision ≤ 38: DECIMAL; precision > 38: STRING |
| Date type | DATE | TIMESTAMP[(p)] [WITHOUT TIME ZONE] | TIMESTAMP |
| Time type | TIME [(p)] [WITHOUT TIME ZONE], TIME [(p)] [WITH TIME ZONE] | TEXT | STRING |
| | TIMESTAMP[(p)] [WITHOUT TIME ZONE], TIMESTAMP[(p)] [WITH TIME ZONE], SMALLDATETIME | TIMESTAMP[(p)] [WITHOUT TIME ZONE] | TIMESTAMP |
| | INTERVAL DAY (l) TO SECOND (p), INTERVAL [FIELDS] [(p)] | VARCHAR(n) | VARCHAR(n) |
| Boolean type | BOOLEAN | BOOLEAN | BOOLEAN |
| CHAR type | CHAR(n), CHARACTER(n), NCHAR(n) | CHAR(n), CHARACTER(n), NCHAR(n) | n ≤ 255: CHAR(n); n > 255: STRING |
| VARCHAR type | VARCHAR(n), CHARACTER VARYING(n), VARCHAR2(n) | VARCHAR(n) | n ≤ 65535: VARCHAR(n); n > 65535: STRING |
| | NVARCHAR2(n) | TEXT | STRING |
| String (large text object) | TEXT, CLOB | TEXT, CLOB | STRING |
| Binary type | BYTEA | BYTEA | BINARY |
| Monetary type | MONEY | NUMERIC | BIGINT |
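
Three rows in the write-only mapping depend on a size threshold: DECIMAL precision (38), CHAR length (255), and VARCHAR length (65535). The sketch below encodes only those three rules as stated in the Hive column of Table 2. The function name `hive_type_for_export` and its parameters are hypothetical, and it assumes `size` is the declared precision (for DECIMAL) or length n (for CHAR and VARCHAR) of the exported column.

```python
# Thresholds taken from the Hive Table Type column of Table 2.
MAX_DECIMAL_PRECISION = 38
MAX_CHAR_LENGTH = 255
MAX_VARCHAR_LENGTH = 65535


def hive_type_for_export(dws_type: str, size: int) -> str:
    """Return the Hive column type for a size-dependent exported column.

    dws_type: "DECIMAL", "CHAR", or "VARCHAR" (case-insensitive).
    size: declared precision for DECIMAL, declared length n otherwise.
    """
    if size <= 0:
        raise ValueError("size must be a positive integer")
    kind = dws_type.strip().upper()
    if kind == "DECIMAL":
        # Precision above the limit falls back to STRING on the Hive side.
        return "DECIMAL" if size <= MAX_DECIMAL_PRECISION else "STRING"
    if kind == "CHAR":
        return f"CHAR({size})" if size <= MAX_CHAR_LENGTH else "STRING"
    if kind == "VARCHAR":
        return f"VARCHAR({size})" if size <= MAX_VARCHAR_LENGTH else "STRING"
    raise ValueError(f"not a size-dependent type in Table 2: {dws_type!r}")
```

A check like this can be run over the column list before creating the Hive table, so that oversized CHAR, VARCHAR, or DECIMAL columns are declared as STRING up front.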