HDFS Data

Establishing a Data Transmission Channel

Backing Up HDFS Data

Data backup scenarios are classified by the regions of the source and destination clusters and by the network connectivity between them, as follows:

Backing Up HDFS Metadata

The HDFS metadata to be exported includes the permissions and owner/group information of files and folders. Run the following command on the HDFS client to export the metadata:

$HADOOP_HOME/bin/hdfs dfs -ls -R <migrating_path> > /tmp/hdfs_meta.txt

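The parameters in the preceding command are described as follows:

$HADOOP_HOME: installation directory of the Hadoop client
<migrating_path>: HDFS directory whose data is to be migrated
/tmp/hdfs_meta.txt: local file in which the exported metadata is saved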

If the source cluster can communicate with the destination cluster and you run the hadoop distcp command as a super administrator to copy data, you can add the -p parameter so that DistCp preserves the metadata of each file in the destination cluster while copying the data. In this case, skip this step.
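The DistCp case above can be sketched as a small shell function. This is only an illustration: the function name distcp_preserve and the DRY_RUN switch are hypothetical helpers, not part of Hadoop, and the example uses -pugp, a narrower form of the -p option that preserves only the user, group, and permission of each file. The source and destination URIs must be supplied by the caller.

```shell
# Copy a path between clusters while preserving owner, group, and permissions.
# distcp_preserve and DRY_RUN are hypothetical names used for illustration.
# Set DRY_RUN=1 to print the command instead of running it.
distcp_preserve() {
  src_uri=$1
  dst_uri=$2
  # -pugp: preserve user (u), group (g), and permission (p) of each copied file.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "hadoop distcp -pugp $src_uri $dst_uri"
  else
    hadoop distcp -pugp "$src_uri" "$dst_uri"
  fi
}
```

Running the function with DRY_RUN=1 first makes it possible to check the generated command before any data is copied.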

HDFS File Property Restoration

Based on the exported permission information, run the following HDFS commands on the backend of the destination cluster to restore the permissions and the owner/group information of the files:

$HADOOP_HOME/bin/hdfs dfs -chmod <MODE> <path>
$HADOOP_HOME/bin/hdfs dfs -chown <OWNER> <path>
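When many files are involved, the chmod and chown commands can be generated from the exported listing instead of being typed by hand. The following is a minimal sketch, not a supported tool. It assumes that the listing has the usual eight-column layout of hdfs dfs -ls -R (mode, replication, owner, group, size, date, time, path). The function name generate_restore_commands is hypothetical. The function only prints commands and does not run them. It sets the owner and group together using the OWNER:GROUP form of -chown, and it does not restore ACLs.

```shell
# Read `hdfs dfs -ls -R` output on stdin and print one chmod and one chown
# command per entry on stdout. Nothing is executed here.
generate_restore_commands() {
  awk '
    # Convert one rwx triplet (for example "r-x") to its octal digit.
    function digit(t,    n) {
      n = 0
      if (substr(t, 1, 1) == "r") n += 4
      if (substr(t, 2, 1) == "w") n += 2
      if (substr(t, 3, 1) ~ /[xt]/) n += 1
      return n
    }
    # Process only entry lines, which start with a mode string such as drwxr-xr-x.
    NF >= 8 && length($1) >= 10 && $1 ~ /^[-d][-rwxtT]+[+]?$/ {
      sym = substr($1, 2, 9)
      # A trailing t or T marks the sticky bit, which adds a leading 1.
      sticky = (substr(sym, 9, 1) ~ /[tT]/) ? "1" : ""
      mode = sticky digit(substr(sym, 1, 3)) digit(substr(sym, 4, 3)) digit(substr(sym, 7, 3))
      owner = $3
      group = $4
      # Strip the first seven columns so that spaces inside the path survive.
      path = $0
      for (k = 1; k <= 7; k++) sub(/^ *[^ ]+ +/, "", path)
      # Single-quote the path for the shell, escaping embedded single quotes.
      q = "\047"
      n = split(path, parts, q)
      quoted = q parts[1]
      for (j = 2; j <= n; j++) quoted = quoted q "\\" q q parts[j]
      quoted = quoted q
      printf "$HADOOP_HOME/bin/hdfs dfs -chmod %s %s\n", mode, quoted
      printf "$HADOOP_HOME/bin/hdfs dfs -chown %s:%s %s\n", owner, group, quoted
    }
  '
}
```

A possible way to use it is to run generate_restore_commands < /tmp/hdfs_meta.txt > /tmp/restore_meta.sh, review the generated file, and then run it with sh on a client of the destination cluster where HADOOP_HOME is set. The output path /tmp/restore_meta.sh is only an example.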