ZSTD_JNI is a native implementation of the ZSTD compression algorithm. Compared with ZSTD, ZSTD_JNI offers higher read/write efficiency and a higher compression ratio, and lets you specify the compression level as well as the compression mode for data columns in specific formats.
Currently, only ORC tables can be compressed using ZSTD_JNI. By contrast, ZSTD can compress tables in all storage formats. Therefore, you are advised to use ZSTD_JNI only when you have high requirements on data compression.
This section applies only to MRS 3.2.0 or later.
cd /opt/client
source bigdata_env
beeline
create table tab_1(...) stored as orc TBLPROPERTIES("orc.compress"="ZSTD_JNI");
create table tab_1(...) stored as orc TBLPROPERTIES("orc.compress"="ZSTD_JNI", 'orc.global.compress.level'='3');
The following example code shows how to use ZSTD_JNI to compress data in the JSON, Base64, timestamp, and UUID formats.
create table test_orc_zstd_jni(f1 int, f2 string, f3 string, f4 string, f5 string) stored as orc
TBLPROPERTIES('orc.compress'='ZSTD_JNI', 'orc.column.compress'='[{"type":"cjson","columns":"f2"},{"type":"base64","columns":"f3"},{"type":"gorilla","columns":{"format": "yyyy-MM-dd HH:mm:ss.SSS", "columns": "f4"}},{"type":"uuid","columns":"f5"}]');
You can then insert data in the corresponding formats, as your service requires, to compress the data further.
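As a minimal sketch of such an insert, the statement below writes one row into test_orc_zstd_jni with a value of the matching format in each column. The sample values are hypothetical and chosen only for illustration. The sketch assumes that f4 values should follow the yyyy-MM-dd HH:mm:ss.SSS pattern declared in the table properties.

```sql
-- Hypothetical sample row; every value is illustrative only.
INSERT INTO test_orc_zstd_jni VALUES (
  1,                                        -- f1: plain int, no column-specific mode
  '{"name":"test","id":1}',                 -- f2: JSON text (cjson)
  'SGVsbG8gV29ybGQ=',                       -- f3: Base64 text
  '2023-01-01 12:00:00.000',                -- f4: timestamp string in yyyy-MM-dd HH:mm:ss.SSS (gorilla)
  '123e4567-e89b-12d3-a456-426614174000'    -- f5: UUID text
);

-- Check that the compression properties are recorded on the table.
DESCRIBE FORMATTED test_orc_zstd_jni;
```

The DESCRIBE FORMATTED output lists the table parameters, so you can check there that orc.compress and orc.column.compress carry the values you set.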