By default, components in clusters of MRS 3.2.0-LTS.1 or later are protected against accidental data deletion. Hadoop big data systems that use OBS can rely on the native HDFS trash (recycle bin) mechanism.
File data deleted by a component user is not removed immediately. It is moved to the recycle bin of the OBS file system instead. This section describes how to set a lifecycle rule on the recycle bin directory so that this data is cleared periodically.
You need to configure lifecycle rules for the recycle bin directories of the preset users in the MRS cluster and for the recycle bin directories of any new users who need accidental deletion prevention. If a low-privilege agency is used, or if MRS users have only been granted access to specific OBS file system directories as described in Configuring Fine-Grained Permissions for MRS Multi-User Access to OBS, you must also grant operation permissions on the recycle bin directory.
| Cluster Version | Directory Type | Component | Directory | How to Create |
|---|---|---|---|---|
| Versions earlier than MRS 3.3.0-LTS | Recycle bin directories that must be configured by default for each component in an MRS cluster | Hive | | If the .Trash folder does not exist, create it on the cluster client as user omm. Run the following command: hdfs dfs -mkdir -p obs://<Name of the OBS parallel file system where the table is stored>/<Folder path> |
| | | Spark | | |
| | | HetuEngine | | |
| | | HBase | | |
| | Recycle bin directories of users who need accidental deletion prevention | Hive/Spark/HetuEngine | user/<New service user>/.Trash | |
| MRS 3.3.0-LTS or later | Default recycle bin directories configured for each component in an MRS cluster | Hive/Spark/HetuEngine | /user/.Trash | |
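The create command in the table can be sketched as a small shell helper. This is a minimal sketch, not an official MRS tool: the function name `trash_uri`, the file system name `mrs-demo-pfs`, and the user `testuser` are hypothetical, and the path layout assumes the `user/<New service user>/.Trash` pattern listed above. The script only prints the command, so it can be reviewed before being run on a cluster client as user omm.

```shell
#!/bin/sh
# Hypothetical helper: build the OBS URI of a user's recycle bin directory.
#   $1 = name of the OBS parallel file system (placeholder in the example)
#   $2 = service user name
trash_uri() {
  printf 'obs://%s/user/%s/.Trash\n' "$1" "$2"
}

# Dry run: print the command instead of executing it.
# "mrs-demo-pfs" and "testuser" are placeholder values, not real resources.
echo "hdfs dfs -mkdir -p $(trash_uri mrs-demo-pfs testuser)"
```

Printing rather than executing keeps the sketch safe to try on a machine without a cluster client. Once the output looks right, the printed line can be run as is.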
For example, if a new user in the cluster has the following permissions, you need to create a rule in the parallel file system that clears the user's recycle bin directory:
| Name | Description | Example Value |
|---|---|---|
| Status | Whether to enable the lifecycle rule. | Enable |
| Rule Name | Name that identifies the lifecycle rule and distinguishes it from other lifecycle configurations. | rule-test |
| Prefix | Prefix of the objects to which the lifecycle rule applies. Objects whose names start with this prefix are managed by the rule. The prefix cannot start with a slash (/), contain consecutive slashes (//), or contain any of the following special characters: `\:*?"<>\|` If this parameter is not specified, the rule takes effect for the entire file system. NOTE: To prevent other service data from being deleted by mistake, do not apply a lifecycle rule to the entire file system or to high-level directories. The recycle bin directory of MRS components generally has the format user/<Username>/.Trash. If the folder does not exist, create it. | user/omm/.Trash |
| Delete Files After (Days) | Objects within the scope of the rule expire and are automatically deleted by OBS once the number of days since their last update reaches this value. | 30 |
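The Prefix constraints in the table can be checked locally before a rule is created on the console. The following is a minimal sketch, assuming only the rules stated above. The function name `valid_prefix` is hypothetical. Rejecting an empty prefix is a safety choice of this sketch, not an OBS restriction, because an empty prefix would apply the rule to the entire file system.

```shell
#!/bin/sh
# Hypothetical check of a lifecycle rule prefix against the constraints
# listed in the table. Returns 0 if the prefix is acceptable, 1 otherwise.
valid_prefix() {
  case "$1" in
    '')   return 1 ;;  # empty: rule would cover the whole file system (sketch's own choice)
    /*)   return 1 ;;  # must not start with a slash
    *//*) return 1 ;;  # must not contain consecutive slashes
    *[\\:\*\?\"\<\>\|]*) return 1 ;;  # must not contain \ : * ? " < > |
  esac
  return 0
}

# Try a few candidate prefixes; only the first follows the rules above.
for p in 'user/omm/.Trash' '/user/omm/.Trash' 'user//omm' 'user/o*m/.Trash'; do
  if valid_prefix "$p"; then
    echo "ok:       $p"
  else
    echo "rejected: $p"
  fi
done
```

The check covers syntax only. It does not confirm that the directory exists or that the prefix is narrow enough to avoid other service data.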
You can click Edit in the Operation column of a lifecycle rule to edit it. You can also click Disable or Enable to disable or enable it.