This section describes how to develop and configure a job.
For details about how to develop a batch processing job in single-task mode, see sections Developing an SQL Script, Configuring job parameters, Data Table, Testing and Saving the Job, and Downloading or Dumping a Script Execution Result.
| Parameter | Description |
|---|---|
| Owner | The owner configured during job creation is matched automatically. You can change this value. |
| Executor | User that executes the job. This parameter is available when Scheduling Identities is set to Yes. If you enter an executor, the job is executed by that user. If you leave it unspecified, the job is executed by the user who submitted the job for startup. NOTE: You can configure execution users only after you apply for the whitelist membership. To enable it, contact customer service or technical support. |
| Job Agency | This parameter is available when Scheduling Identities is set to Yes. After an agency is configured, the job interacts with other services as that agency during execution. |
| Priority | The priority configured during job creation is matched automatically. You can change this value. |
| Execution Timeout | Timeout of the job instance. If this parameter is set to 0 or is not set, it does not take effect. If the notification function is enabled for the job and the execution time of the job instance exceeds the preset value, the system sends the specified notification, and the job keeps running. |
| Exclude Waiting Time from Instance Timeout Duration | Whether to exclude the waiting time from the instance execution timeout duration. If you select this option, the time an instance waits before it starts running is excluded from the timeout duration. If you do not select this option, that waiting time is included in the timeout duration. You can modify the default setting in Default Configuration > Exclude Waiting Time from Instance Timeout Duration. |
| Custom Parameter | Set the name and value of the parameter. |
| Job Tag | Configure job tags to manage jobs by category. Click Add to add a tag to the job. You can also select a tag configured in Managing Job Tags. |
| Job Description | Description of the job. |
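The interaction between Execution Timeout and Exclude Waiting Time from Instance Timeout Duration can be sketched as follows. This is only an illustration of the rule described in the table: the function name, its arguments, and the use of minutes as the unit are assumptions, not part of the product.

```python
from datetime import datetime, timedelta


def timed_out(submitted_at: datetime, started_at: datetime, now: datetime,
              timeout_minutes: int, exclude_waiting: bool) -> bool:
    """Hypothetical helper: has the instance exceeded its execution timeout?

    The minute unit and all names here are assumptions for illustration.
    """
    # A timeout of 0 (or unset, modelled here as 0) disables the check.
    if not timeout_minutes:
        return False
    # With the option selected, the clock starts when the instance starts running;
    # otherwise it starts when the instance began waiting.
    clock_start = started_at if exclude_waiting else submitted_at
    return now - clock_start > timedelta(minutes=timeout_minutes)
```

For an instance that waited 20 minutes and has been running for 25 minutes, a 30-minute timeout is exceeded only when the waiting time is included.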
| Property | Description |
|---|---|
| DLI SQL properties | |
| DLI Data Directory | Select the DLI data directory. |
| Database Name | Select a database. If you select the default DLI data directory dli, select a DLI database and tables. If you select a metadata catalog that has been created in LakeFormation associated with DLI, select a LakeFormation database and tables. |
| Queue Name | The queue set in the SQL script is selected by default. You can select a different one. You can create a resource queue using either of the following methods: |
| Record Dirty Data | Click |
| DLI Environmental Variable | |
| DWS SQL properties | |
| Data Connection | Select a data connection. |
| Database | Select a database. |
| Dirty Data Table | Name of the dirty data table defined in the SQL script. The dirty data attributes cannot be edited. They are automatically recommended based on the SQL script content. |
| Matching Rule | Enter a Java regular expression used to match the DWS SQL result. For example, if the expression is `(?<=\()(-*\d+?)(?=,)` and the SQL result is `(1,"error message")`, the matched result is "1". |
| Failure Matching Value | If the matched content equals the set value, the node fails to be executed. |
| RDS SQL properties | |
| Data Connection | Select a data connection. |
| Database | Select a database. |
| Spark SQL properties | |
| MRS Job Name | MRS job name. The system automatically sets this parameter based on the job name. If the MRS job name is not set and the direct connection mode is selected, the node name can contain only letters, digits, hyphens (-), and underscores (_). A maximum of 64 characters are allowed, and Chinese characters are not allowed. NOTE: If you select an MRS API data connection, you cannot set the job name. |
| Data Connection | Select a data connection. |
| MRS Resource Queue | Select a created MRS resource queue. This parameter is mandatory if Whether MRS Resource Queue Is Mandatory is set to Yes. NOTE: Select a queue you configured in the queue permissions of DataArts Security. If you set multiple resource queues for this node, the resource queue you select here has the highest priority. |
| Database | Select a database. If you select an MRS API connection, you cannot select a database. |
| Program Parameter | Set program parameters. For example, set Parameter to --queue and Value to default_cr to specify a queue of the MRS cluster. To view the job details, go to the MRS console, click the name of the MRS cluster and then the Jobs tab, locate the job, click More in the Operation column, and select View Details. NOTE: Configure optimization parameters such as threads, memory, and vCPUs for the job to optimize resource usage and improve job execution performance. This configuration is unavailable if a Spark proxy connection is used. Spark SQL jobs with a single operator and using a connection of the MRS API type support program parameters. |
| Hive SQL properties | |
| MRS Job Name | MRS job name. The system automatically sets this parameter based on the job name. If the MRS job name is not set and the direct connection mode is selected, the node name can contain only letters, digits, hyphens (-), and underscores (_). A maximum of 64 characters are allowed, and Chinese characters are not allowed. |
| Data Connection | Select a data connection. |
| Database | Select a database. |
| MRS Resource Queue | Select a created MRS resource queue. This parameter is mandatory if Whether MRS Resource Queue Is Mandatory is set to Yes. |
| Program Parameter | Set program parameters. For example, set Parameter to --hiveconf and Value to mapreduce.job.queuename=default_cr to specify a queue of the MRS cluster. To view the job details, go to the MRS console, click the name of the MRS cluster and then the Jobs tab, locate the job, click More in the Operation column, and select View Details. NOTE: Configure optimization parameters such as threads, memory, and vCPUs for the job to optimize resource usage and improve job execution performance. This configuration is unavailable if a Hive proxy connection is used. Hive SQL jobs with a single operator and using a connection of the MRS API type support program parameters. |
| Doris SQL properties | |
| Data Connection | Select a data connection. |
| Database | Select a database. |
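The Matching Rule and Failure Matching Value properties for DWS SQL can be tried out locally before you save them. The sketch below uses Python's `re` module rather than Java; for the lookbehind and lookahead constructs in the documented example the two engines should behave the same, but that equivalence and the helper name `node_fails` are assumptions, and how the service actually applies the expression (for example, first match versus full match) is not stated in this section.

```python
import re

# Pattern and sample result are taken from the Matching Rule description.
MATCHING_RULE = r'(?<=\()(-*\d+?)(?=,)'


def node_fails(sql_result: str, failure_value: str) -> bool:
    """Hypothetical check: True if the matched content equals the failure value."""
    match = re.search(MATCHING_RULE, sql_result)
    return match is not None and match.group(0) == failure_value


# The documented example: the matched content is "1".
print(node_fails('(1,"error message")', "1"))
```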
| Parameter | Mandatory | Description |
|---|---|---|
| Node Status Polling Interval (s) | Yes | How often the system checks whether the node execution is complete. The value ranges from 1 to 60 seconds. |
| Max. Node Execution Duration | Yes | Execution timeout interval for the node. If retry is configured and the execution is not complete within the timeout interval, the node will be executed again. |
| Retry upon Failure | Yes | Whether to re-execute a node if it fails to be executed. |
| Policy for Handling Subsequent Nodes If the Current Node Fails | Yes | Policy for handling subsequent nodes if the current node fails. |
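The way the polling interval, the maximum execution duration, and retry fit together can be pictured with a small loop. This is a rough sketch of the behaviour described in the table, not the scheduler's implementation: `run_with_polling`, its callbacks, and the `max_retries` argument are hypothetical names introduced for illustration.

```python
import time
from typing import Callable


def run_with_polling(start: Callable[[], None], is_done: Callable[[], bool],
                     poll_interval_s: float, max_duration_s: float,
                     max_retries: int,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Hypothetical sketch: True if the node completes within some attempt."""
    for _attempt in range(1 + max_retries):  # first run plus configured retries
        start()
        deadline = clock() + max_duration_s
        while clock() < deadline:
            if is_done():          # status check at the polling interval
                return True
            sleep(poll_interval_s)
        # Timeout reached without completion: fall through and run again.
    return False
```

The `sleep` and `clock` arguments exist only so the loop can be exercised with a fake clock instead of real waiting.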
select 1; select * from a where b="dsfa\;"; --example 1\;example 2.
To view the functions supported by this type of data connection, click System Functions on the right of the editor. You can double-click a function to insert it into the editor.
Enter script parameters in the SQL statement, click Parameter Setup in the right pane of the editor, and then click Update from Script. You can also directly configure parameters and constants for the job script.
In the following script example, str1 indicates the parameter name. It can contain only letters, digits, hyphens (-), underscores (_), greater-than signs (>), and less-than signs (<), and can contain a maximum of 16 characters. The parameter name must be unique.
select ${str1} from data;
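The naming rule and the `${...}` reference format can be checked locally with a few lines of Python. This is a minimal sketch: the helper names are hypothetical, "letters" is assumed to mean ASCII letters, and the substitution shown is plain text replacement, which may differ from how the service resolves parameters.

```python
import re

# Allowed: letters, digits, hyphens, underscores, '>' and '<'; at most 16 characters.
# Assumption: "letters" means ASCII letters.
PARAM_NAME = re.compile(r'[A-Za-z0-9_<>-]{1,16}')


def is_valid_param_name(name: str) -> bool:
    """Hypothetical validator for the naming rule described above."""
    return PARAM_NAME.fullmatch(name) is not None


def render(sql: str, params: dict) -> str:
    """Hypothetical helper: replace each ${name} placeholder with its value."""
    return re.sub(r'\$\{([^}]+)\}', lambda m: str(params[m.group(1)]), sql)


print(render("select ${str1} from data;", {"str1": "name"}))
# prints: select name from data;
```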
Click Data Tables on the right of the editor to display all the tables in the current database or schema. You can select tables and columns and click Generate SQL Statement in the lower right corner to generate an SQL statement, which you need to manually format.
Click Parameter Setup on the right of the editor and set the parameters described in Table 4.
| Module | Description |
|---|---|
| Variables | |
| Add | Click Add and enter the variable parameter name and parameter value in the text boxes. After the parameter is configured, it is referenced in the format of ${Parameter name} in the job. |
| Edit Parameter Expression | Click |
| Modifying a Job | Change the parameter name or value in the corresponding text boxes. |
| Mask | If the parameter value is a key, click |
| Delete | Click |
| Constant Parameter | |
| Add | Click Add and enter the constant parameter name and parameter value in the text boxes. After the parameter is configured, it is referenced in the format of ${Parameter name} in the job. |
| Edit Parameter Expression | Click |
| Modifying a Job | Modify the parameter name and parameter value in text boxes and save the modifications. |
| Delete | Click |
| Workspace Environment Variables | View the variables and constants that have been configured in the workspace. |
Click the Parameter Preview tab and configure the parameters listed in Table 5.
| Module | Description |
|---|---|
| Current Time | This parameter is displayed only when Scheduling Type is set to Run once. The default value is the current time. |
| Event Triggering Time | This parameter is displayed only when Scheduling Type is set to Event-based. The default value is the time when an event is triggered. |
| Scheduling Period | This parameter is displayed only when Scheduling Type is set to Run periodically. The default value is the scheduling period. |
| Start Time | This parameter is displayed only when Scheduling Type is set to Run periodically. The value is the configured job execution time. |
| Start Time | This parameter is displayed only when Scheduling Type is set to Run periodically. The value is the time when the periodic job scheduling starts. |
| Subsequent Instances | Number of job instances scheduled. |
In Parameter Preview, if a job parameter has a syntax error, the system displays a message.
If a parameter depends on the data generated during job execution, such data cannot be simulated and displayed in Parameter Preview.
You can view the tables of Hive SQL, Spark SQL, DLI SQL, Doris SQL, RDS SQL, and DWS SQL single-task batch processing jobs. On the Data Tables slide-out panel, select a table name to view the column names, field types, and descriptions in the table.

After configuring the job, perform the following operations:
to execute the job.
You can view the run logs of the job by clicking View Log.
to save the job configuration. After the job is saved, a version is automatically generated and displayed in Versions. The version can be rolled back. If you save a job multiple times within a minute, only one version is recorded. If the intermediate data is important, click Save new version to save it as an additional version.
| Parameter | Mandatory | Description |
|---|---|---|
| Data Format | Yes | Format of the data to be exported. CSV and JSON formats are supported. |
| Resource Queue | No | DLI queue where the export operation is to be performed. Set this parameter when a DLI or SQL script is created. |
| Compression Format | No | Format of compression. Set this parameter when a DLI or SQL script is created. |
| Storage Path | Yes | OBS path where the result file is stored. After selecting an OBS path, enter a custom folder name. The system creates the folder automatically for storing the result file. You can also go to the Download Center page to set the default OBS path, which is then filled in automatically for Storage Path in the Dump Result dialog box. |
| Cover Type | No | If a folder that has the same name as your custom folder exists in the storage path, select a cover type. Set this parameter when a DLI or SQL script is created. |
| Export Column Name | No | Yes: Column names will be exported. No: Column names will not be exported. |
| Character Set | No | |
| Quotation Character | No | This parameter can be set only when Data Format is csv. Quotation characters mark the beginning and end of text fields in the exported job results so that fields can be told apart. Only one character can be set. The default value is a double quotation mark ("). This is mainly used to handle data that contains spaces, special characters, or characters that are the same as the delimiter. For examples, see Example of Using Quotation Characters and Escape Characters. |
| Escape Character | No | This parameter can be set only when Data Format is csv. If special characters, such as quotation marks, need to be included in the exported results, they can be represented using an escape character. Only one character can be set. The default value is a backslash (\). Common scenarios for using escape characters are: For examples, see Example of Using Quotation Characters and Escape Characters. |
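The effect of the Quotation Character and Escape Character settings can be previewed locally with Python's standard `csv` module. This is only an approximation of the export behaviour under stated assumptions: the helper name `export_csv` is hypothetical, and the service's own writer may quote or escape fields differently.

```python
import csv
import io


def export_csv(rows, quotechar='"', escapechar='\\'):
    """Hypothetical helper: write rows as CSV using the given quotation and escape characters."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        quotechar=quotechar,        # wraps fields that contain the delimiter or a line break
        escapechar=escapechar,      # prefixes quotation characters inside a field
        doublequote=False,          # escape embedded quotes instead of doubling them
        quoting=csv.QUOTE_MINIMAL,  # quote only the fields that need it
        lineterminator='\n',
    )
    writer.writerows(rows)
    return buf.getvalue()


# A field containing the delimiter is wrapped in quotation characters.
print(export_csv([["1", "a,b"]]))
```

Reading the output back with `csv.reader` and the same `quotechar`, `escapechar`, and `doublequote=False` settings should return the original rows, which is a quick way to confirm that a chosen pair of characters is safe for your data.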
| SQL Type | Maximum Number of Results That You Can View Online | Maximum Number/Size of Results That Can Be Downloaded | Maximum Number/Size of Results That Can Be Dumped |
|---|---|---|---|
| DLI | 1,000 | 1,000 records, less than 3 MB | Unlimited |
| Hive | 1,000 | 1,000 records, less than 3 MB | 10,000 records or 3 MB |
| GaussDB(DWS) | 1,000 | 1,000 records, less than 3 MB | 10,000 records or 3 MB |
| Spark | 1,000 | 1,000 records, less than 3 MB | 10,000 records or 3 MB |
| RDS | 1,000 | 1,000 records, less than 3 MB | Not supported |
| Doris | 1,000 | 1,000 records, less than 3 MB | 1,000 records or 3 MB |
You can leave Quotation Character and Escape Character empty.

If you leave them empty, the downloaded .csv file contains two rows in Excel.

If you specify both of them, for example, enter double quotation marks ("), the downloaded file is as follows.
