The system checks the number of blocks to be supplemented (under-replicated blocks) every 30 seconds and compares it with the threshold, which has a default value. This alarm is generated when the number of blocks to be supplemented exceeds the threshold.
You can change the threshold specified by Blocks Under Replicated (NameNode) by choosing O&M > Alarm > Thresholds > Name of the desired cluster > HDFS > File and Block.
If Trigger Count is set to 1 and the number of blocks to be supplemented is less than or equal to the threshold, this alarm is cleared. If Trigger Count is greater than 1 and the number of blocks to be supplemented is less than or equal to 90% of the threshold, this alarm is cleared.
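The clearing rule above can be sketched as a small shell function. This is only an illustration of the rule as described here, not the product's implementation; the function name `should_clear` and the sample values are hypothetical.

```shell
# Hypothetical helper: succeeds (exit status 0) when the alarm would be cleared.
# Arguments: Trigger Count, current number of blocks to be supplemented, threshold.
should_clear() {
    trigger_count=$1
    blocks=$2
    threshold=$3
    if [ "$trigger_count" -eq 1 ]; then
        # Trigger Count = 1: cleared at or below the threshold itself.
        [ "$blocks" -le "$threshold" ]
    else
        # Trigger Count > 1: cleared at or below 90% of the threshold.
        # Integer form of: blocks <= 0.9 * threshold
        [ $((blocks * 10)) -le $((threshold * 9)) ]
    fi
}

# Example values only (the threshold 1000 is an assumption for illustration).
if should_clear 3 850 1000; then
    echo "alarm cleared"
else
    echo "alarm still active"
fi
# prints "alarm cleared"
```

With a Trigger Count greater than 1, a value between 90% and 100% of the threshold keeps the alarm active, which avoids repeated raise and clear cycles when the count hovers near the threshold.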
Alarm ID | Alarm Severity | Auto Clear |
---|---|---|
14028 | Minor | Yes |
Name | Meaning |
---|---|
Source | Specifies the cluster for which the alarm is generated. |
ServiceName | Specifies the service for which the alarm is generated. |
RoleName | Specifies the role for which the alarm is generated. |
HostName | Specifies the host for which the alarm is generated. |
NameServiceName | Specifies the NameService for which the alarm is generated. |
Trigger Condition | Specifies the threshold for triggering the alarm. |
Data stored in HDFS is lost. HDFS may enter safe mode and become unable to provide write services. Lost block data cannot be restored.
cat fsck.log | grep "Under-replicated"
cat fsck.log | grep "Under replicated" | grep "/tmp/hadoop-yarn/staging/" | wc -l
/tmp/hadoop-yarn/staging/ is the default directory. If the directory is modified, obtain it from the configuration item yarn.app.mapreduce.am.staging-dir in the mapred-site.xml file.
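The per-directory count above can also be wrapped in a small function so that the directory is easy to swap when the staging directory has been changed. This is a minimal sketch; the function name `count_under_replicated_in` is hypothetical, and it assumes fsck.log was produced by redirecting the output of `hdfs fsck /`.

```shell
# Hypothetical helper: count per-file "Under replicated" entries in an fsck log
# whose line contains the given directory (matched as a fixed string).
count_under_replicated_in() {
    log=$1
    dir=$2
    # grep -c exits non-zero when the count is 0, so "|| true" keeps the
    # function usable under "set -e" while still printing 0.
    grep "Under replicated" "$log" | grep -cF -- "$dir" || true
}

# Example usage (commented out because it needs an existing fsck.log):
# grep "Under-replicated" fsck.log
# count_under_replicated_in fsck.log /tmp/hadoop-yarn/staging/
```

Note that the summary line uses the hyphenated form "Under-replicated", while the per-file entries use "Under replicated", so the two patterns select different lines.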
hdfs dfs -setrep -w Number of file replicas /tmp/hadoop-yarn/staging/
To obtain the default number of file replicas:
Log in to FusionInsight Manager, choose Cluster > Services > HDFS > Configurations > All Configurations, and search for the dfs.replication parameter. The value of this parameter is the default number of file replicas.
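As a rough sketch, the setrep command can be assembled from a validated replica count before it is run. The helper name `setrep_command` is hypothetical. The commented example reads `dfs.replication` with `hdfs getconf`, which is an assumption: the client configuration may differ from the value shown in FusionInsight Manager, so treat the Manager value as authoritative.

```shell
# Hypothetical helper: print the setrep command for a replica count and directory,
# rejecting anything that is not a positive integer.
setrep_command() {
    replicas=$1
    dir=${2:-/tmp/hadoop-yarn/staging/}
    case "$replicas" in
        ''|*[!0-9]*|0)
            echo "replica count must be a positive integer" >&2
            return 1
            ;;
    esac
    echo "hdfs dfs -setrep -w $replicas $dir"
}

# Example usage (commented out because it needs an HDFS client):
# replicas=$(hdfs getconf -confKey dfs.replication)
# setrep_command "$replicas"
```

Printing the command first makes it possible to review the replica count and target directory before changing replication on a live cluster.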
Check whether the alarm is cleared 5 minutes later.
Collect the fault information.
This alarm is automatically cleared after the fault is rectified.
None