Regular Expressions for Separating Semi-structured Text

During table/file migration, CDM uses delimiters to separate fields in CSV files. However, delimiters cannot reliably separate complex semi-structured data, because the field values may themselves contain the delimiter. In this case, a regular expression can be used to separate the fields.
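The difference can be illustrated with a minimal sketch. It uses Python's re module purely for illustration; the log line, the pattern, and the variable names are hypothetical, and the regular expression dialect that CDM itself accepts may differ.

```python
import re

# Hypothetical log line: both the timestamp and the message contain commas.
line = "2018-01-11 08:50:59,001 INFO [org.example.Worker] - Retry failed, giving up, code=3"

# Splitting on the comma delimiter cuts the timestamp and the message apart.
naive_fields = line.split(",")

# One capture group per field; the last group takes the rest of the line.
pattern = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "  # timestamp
    r"(\w+) "                                          # log level
    r"\[(.*?)\] "                                      # logger name
    r"- (.*)$"                                         # message
)

match = pattern.match(line)
regex_fields = list(match.groups()) if match else []

print(naive_fields)
print(regex_fields)
```

The delimiter split produces fragments that do not correspond to the logical fields, whereas each capture group in the pattern yields exactly one field: timestamp, level, logger name, and message.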

The regular expression is configured in Source Job Configuration. The migration source must be an object storage service or a file system, and File Format must be set to CSV.

Figure 1 Setting regular expression parameters

During the migration of CSV files, CDM can use a regular expression to separate fields and write the parsed results to the migration destination. For details about regular expression syntax, see the relevant regular expression documentation. This section describes the regular expressions for the following log files:

Log4J Log

Log4J Audit Log

Tomcat Log

Django Log

Apache Server Log