Reviewed-by: Kacur, Michal <michal.kacur@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
CLEAN_FILE
Function
Cleans invalid data files from the Hudi table directory.
Syntax
call clean_file(table => '[table_name]', mode=>'[op_type]', backup_path=>'[backup_path]', start_instant_time=>'[start_time]', end_instant_time=>'[end_time]');
Parameter Description
| Parameter | Description |
|---|---|
| table_name | Mandatory. Name of the Hudi table from which invalid data files are to be deleted. |
| op_type | Optional. Running mode of the command. The default value is dry_run. Options: dry_run (displays the invalid data files to be cleaned), repair (displays and cleans the invalid data files), undo (restores deleted data files), and query (displays the backup directories of data files that have been cleaned). |
| backup_path | Mandatory when the running mode is undo; not available in other modes. Backup directory of the data files to be restored. |
| start_time | Optional. Start of the time range in which the invalid data files were generated. Available only when the running mode is dry_run or repair. By default, the start time is not limited. |
| end_time | Optional. End of the time range in which the invalid data files were generated. Available only when the running mode is dry_run or repair. By default, the end time is not limited. |
Example
call clean_file(table => 'h1', mode=>'repair');
call clean_file(table => 'h1', mode=>'dry_run');
call clean_file(table => 'h1', mode=>'query');
call clean_file(table => 'h1', mode=>'undo', backup_path=>'/tmp/hudi/h1/.hoodie/.cleanbackup/hoodie_repair_backup_20220222222222');
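The mode-specific parameter rules described in the parameter table can be checked on the client before a statement is submitted. The following is a minimal sketch in Python, not part of Hudi: the helper name build_clean_file_call is hypothetical, values are inserted into the statement verbatim without escaping, and the instant time format in the usage note (yyyyMMddHHmmss) is an assumption based on the backup directory name shown above.

```python
from typing import Optional

# Running modes listed in the parameter table.
VALID_MODES = ("dry_run", "repair", "undo", "query")


def build_clean_file_call(
    table: str,
    mode: str = "dry_run",
    backup_path: Optional[str] = None,
    start_instant_time: Optional[str] = None,
    end_instant_time: Optional[str] = None,
) -> str:
    """Hypothetical helper: build a clean_file statement and reject
    parameter combinations that the parameter table rules out."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if mode == "undo" and not backup_path:
        raise ValueError("backup_path is required when mode is undo")
    if mode != "undo" and backup_path:
        raise ValueError("backup_path applies only when mode is undo")
    if mode not in ("dry_run", "repair") and (start_instant_time or end_instant_time):
        raise ValueError("time bounds apply only when mode is dry_run or repair")

    # Values are inserted as-is (no quote escaping); this is a sketch only.
    args = [f"table => '{table}'", f"mode=>'{mode}'"]
    if backup_path:
        args.append(f"backup_path=>'{backup_path}'")
    if start_instant_time:
        args.append(f"start_instant_time=>'{start_instant_time}'")
    if end_instant_time:
        args.append(f"end_instant_time=>'{end_instant_time}'")
    return f"call clean_file({', '.join(args)});"
```

For example, build_clean_file_call("h1", mode="repair") returns the first statement shown above, and build_clean_file_call("h1", start_instant_time="20220201000000", end_instant_time="20220222222222") would produce a dry_run statement limited to an assumed time range. Running dry_run over a range before repair lets you review the files that would be cleaned.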
Precautions
The command cleans only invalid Parquet files.
System Response
You can view command execution results in the driver log or on the client.