Every hour, the system counts the subfiles and subdirectories in a specified directory and checks whether the count has reached the alarm percentage of the threshold. The threshold is the maximum number of subfiles and subdirectories that an HDFS directory can hold, and the alarm percentage is 90% by default. If the count exceeds that percentage of the threshold, an alarm is triggered.
When the number of subfiles and subdirectories in the directory that triggered the alarm falls below the percentage of the threshold, the alarm is automatically cleared. When the monitoring switch is disabled, the alarms for all directories are cleared. If a directory is removed from the monitoring list, the alarms for that directory are cleared.
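The hourly check described above amounts to a percentage comparison. The following is a minimal shell sketch of that comparison, not the system's actual implementation. The function name `entry_alarm_state` is hypothetical, and the limit of 1048576 used in the sample calls is an assumption: it is the stock default of the HDFS parameter `dfs.namenode.fs-limits.max-directory-items`, which may be set differently on a given cluster.

```shell
# Hypothetical helper: report whether a directory's entry count exceeds
# the alarm percentage of the maximum entry limit.
# Args: <entry_count> <max_entries> [alarm_percent, default 90]
entry_alarm_state() {
  eas_count=$1
  eas_max=$2
  eas_percent=${3:-90}
  # Integer-only comparison:
  #   count / max > percent / 100   is the same as   count * 100 > max * percent
  # Assumption: the alarm fires only when the count is strictly above the percentage.
  if [ $((eas_count * 100)) -gt $((eas_max * eas_percent)) ]; then
    echo "ALARM"
  else
    echo "OK"
  fi
}

# Sample calls with an assumed limit of 1048576 entries.
entry_alarm_state 950000 1048576   # 95000000 > 94371840, prints ALARM
entry_alarm_state 900000 1048576   # 90000000 < 94371840, prints OK
```

Multiplying both sides instead of dividing keeps the check in integer arithmetic, which plain shell handles without external tools.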
Alarm ID | Alarm Severity | Automatically Cleared |
---|---|---|
14020 | Major | Yes |

Name | Meaning |
---|---|
Source | Specifies the cluster for which the alarm is generated. |
ServiceName | Specifies the service for which the alarm is generated. |
RoleName | Specifies the role for which the alarm is generated. |
NameServiceName | Specifies the NameService service for which the alarm is generated. |
Directory | Specifies the directory for which the alarm is generated. |
Trigger Condition | Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated. |
If the number of entries in the monitored directory exceeds 90% of the threshold, an alarm is triggered, but entries can still be added to the directory. Once the number of entries reaches the maximum threshold, attempts to add entries to the directory fail.
The number of entries in the monitored directory exceeds 90% of the threshold.
Check whether unnecessary files exist in the system.
If the cluster is in security mode, security authentication is required.
Run the kinit hdfs command and enter the password as prompted. Obtain the password from the administrator.
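Run the following command to list the entries in the directory for which the alarm is generated:
hdfs dfs -ls <directory with the alarm>
If some files or directories are no longer required, run the following command to delete them:
hdfs dfs -rm -r -f <file or directory path>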
Deleting a file or folder is a high-risk operation. Ensure that the file or folder is no longer required before performing this operation.
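To see how close a directory is to the threshold before and after cleanup, the output of the listing command can be counted. The sketch below is an illustration only: the helper `count_ls_entries` is hypothetical, the sample listing is made up, and it assumes the listing begins with the usual "Found N items" summary line.

```shell
# Hypothetical helper: count the direct entries in `hdfs dfs -ls` output
# read from stdin. The "Found N items" summary line and blank lines are
# not entries, so they are skipped.
count_ls_entries() {
  awk '!/^Found [0-9]+ items$/ && NF > 0 { n++ } END { print n + 0 }'
}

# Made-up sample listing standing in for real `hdfs dfs -ls` output.
sample='Found 2 items
drwxr-xr-x   - hdfs hadoop          0 2024-01-01 10:00 /tmp/example/a
-rw-r--r--   3 hdfs hadoop       1024 2024-01-01 10:05 /tmp/example/b.txt'

printf '%s\n' "$sample" | count_ls_entries

# On a cluster, the helper could be fed the real listing instead:
#   hdfs dfs -ls <directory with the alarm> | count_ls_entries
```

Only direct children are counted here, which matches the per-directory entry limit described above. A recursive listing would overstate the figure.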
Check whether the threshold is correctly configured.
Collect fault information.
After the fault is rectified, the system automatically clears this alarm.
None