You can submit your own programs to MRS, run them, and obtain the results. This section describes how to submit a Spark job on the MRS console.
You have uploaded the program packages and data files required for running jobs to OBS or HDFS.
In the Basic Information area on the Dashboard page, click Synchronize on the right side of IAM User Sync to synchronize IAM users. For details, see Synchronizing IAM Users to MRS.
Parameter | Description |
---|---|
Name | Job name. It contains 1 to 64 characters. Only letters, digits, hyphens (-), and underscores (_) are allowed. NOTE: You are advised to set different names for different jobs. |
Program Path | Path of the program package to be executed. The following requirements must be met: |
Program Parameter | (Optional) Optimization parameters for the job, such as threads, memory, and vCPUs, used to optimize resource usage and improve job execution performance. Table 2 describes the common parameters of a running program. |
Parameters | (Optional) Key parameters for program execution. They are defined by the user's program; MRS is only responsible for loading them. Separate multiple parameters with spaces. The value contains a maximum of 150,000 characters. It cannot contain the special characters ;\|&><'$, but can be left blank. CAUTION: If you enter a parameter that contains sensitive information (such as a login password), the parameter may be exposed in the job details and in logs. Exercise caution when performing this operation. |
Service Parameter | (Optional) Service parameters to modify for the job. The modification applies only to the current job. To make the modification take effect permanently for the cluster, follow the instructions in Configuring Service Parameters. Multiple parameters can be added. Table 3 lists the common service configuration parameters. NOTE: To run a long-term job, such as Spark Streaming, that accesses OBS, use Service Parameter to import the AK/SK for accessing OBS. |
Command Reference | Command submitted to the background for execution when a job is submitted. |
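The Name and Parameters rules above can be checked locally before a job is created. The following is a minimal sketch in POSIX shell, not part of MRS: the function names is_valid_job_name and is_valid_parameters are hypothetical, and "letters" is assumed to mean ASCII letters.

```shell
# Return 0 if the job name follows the Name rule:
# 1 to 64 characters, only letters, digits, hyphens (-), and underscores (_).
# Assumption: "letters" means ASCII letters (run with LC_ALL=C to enforce this).
is_valid_job_name() {
    name=$1
    # Reject empty names and names longer than 64 characters.
    [ -n "$name" ] || return 1
    [ "${#name}" -le 64 ] || return 1
    # Reject any character outside the allowed set.
    case $name in
        *[!A-Za-z0-9_-]*) return 1 ;;
    esac
    return 0
}

# Return 0 if the parameter string follows the Parameters rule:
# at most 150,000 characters and none of the characters ; | & > < ' $
# An empty string is allowed.
is_valid_parameters() {
    params=$1
    [ "${#params}" -le 150000 ] || return 1
    case $params in
        *";"*|*"|"*|*"&"*|*">"*|*"<"*|*"'"*|*'$'*) return 1 ;;
    esac
    return 0
}
```

For example, is_valid_job_name "spark_pi-01" succeeds, while is_valid_job_name "spark pi" fails because the name contains a space.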
Parameter | Description | Example Value |
---|---|---|
--conf | Adds task configuration items. | spark.executor.memory=2G |
--driver-memory | Sets the driver memory. | 2G |
--num-executors | Sets the number of executors to be started. | 5 |
--executor-cores | Sets the number of cores per executor. | 2 |
--class | Sets the main class of a task. | org.apache.spark.examples.SparkPi |
--files | Uploads files to a task. The files can be custom configuration files or data files from OBS or HDFS. | - |
--jars | Uploads additional dependency packages to a task. | - |
--executor-memory | Sets the executor memory. | 2G |
--conf spark.yarn.maxAppAttempts | Controls the number of AM retries. | If this parameter is set to 0, retry is not allowed. If this parameter is set to 1, one retry is allowed. |
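To see how the program parameters in the preceding table fit together, the following sketch assembles a spark-submit command line from the example values and prints it without running it. The function name build_submit_command is hypothetical, and the order of the options is only one possible arrangement.

```shell
# Dry run: print a spark-submit command built from the example values in the
# table above. Nothing is executed; the function only prints the command.
# Arguments: the program package followed by the program's own arguments.
build_submit_command() {
    printf '%s ' spark-submit \
        --master yarn \
        --class org.apache.spark.examples.SparkPi \
        --driver-memory 2G \
        --executor-memory 2G \
        --num-executors 5 \
        --executor-cores 2 \
        "$@"
    printf '\n'
}
```

Calling build_submit_command examples/jars/spark-examples_2.11-2.3.2-mrs-2.0.jar 10 prints the full command on one line, so it can be reviewed before it is submitted.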
Parameter | Description | Example Value |
---|---|---|
fs.obs.access.key | Key ID for accessing OBS. | - |
fs.obs.secret.key | Key corresponding to the key ID for accessing OBS. | - |
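When preparing the two service parameters above, it can help to keep the AK/SK out of scripts and read them from the environment. The following is a minimal sketch; the variable names OBS_AK and OBS_SK and the function name print_obs_service_params are assumptions for illustration, not MRS settings.

```shell
# Print the two OBS service parameters as key=value lines.
# OBS_AK and OBS_SK are hypothetical environment variable names chosen for
# this sketch; the values are read from the environment, not hard-coded.
# The output contains the secret key, so do not redirect it to shared logs.
print_obs_service_params() {
    if [ -z "${OBS_AK:-}" ] || [ -z "${OBS_SK:-}" ]; then
        echo "OBS_AK and OBS_SK must both be set" >&2
        return 1
    fi
    printf 'fs.obs.access.key=%s\n' "$OBS_AK"
    printf 'fs.obs.secret.key=%s\n' "$OBS_SK"
}
```

The function fails with a message instead of printing an incomplete pair when either variable is missing.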
Parameter | Description |
---|---|
Name | Job name. It contains 1 to 64 characters. Only letters, digits, hyphens (-), and underscores (_) are allowed. NOTE: You are advised to set different names for different jobs. |
Program Path | Path of the program package to be executed. The following requirements must be met: |
Parameters | Key parameters for program execution. They are defined by the user's program; MRS is only responsible for loading them. Separate multiple parameters with spaces. Configuration method: Package name.Class name. The value contains a maximum of 150,000 characters. It cannot contain the special characters ;\|&><'$, but can be left blank. NOTE: When entering a parameter that contains sensitive information (for example, a login password), you can add an at sign (@) before the parameter name to encrypt the parameter value. This prevents the sensitive information from being persisted in plaintext. When you view job information on the MRS console, the sensitive information is displayed as *. Example: username=admin @password=admin_123 |
Import From | Path for inputting data. Data can be stored in HDFS or OBS. The path varies depending on the file system. The value contains a maximum of 1,023 characters, excluding special characters such as ;\|&>,<'$, and can be left blank. |
Export To | Path for outputting data. NOTE: Data can be stored in HDFS or OBS. The path varies depending on the file system. The value contains a maximum of 1,023 characters, excluding special characters such as ;\|&>,<'$, and can be left blank. |
Log Path | Path for storing job logs that record job running status. Data can be stored in HDFS or OBS. The path varies depending on the file system. The value contains a maximum of 1,023 characters, excluding special characters such as ;\|&>,<'$, and can be left blank. |
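The at sign (@) convention described for Parameters can be illustrated with a small local sketch. It only mimics the idea that values of @-prefixed parameters are shown as *; it is not the MRS implementation, the console's actual rendering may differ, and the function name mask_sensitive is hypothetical.

```shell
# Print a space-separated parameter string with the value of every
# @-prefixed name=value parameter replaced by *.
# The body runs in a subshell so that "set -f" (no filename expansion while
# splitting the string into words) does not leak into the calling shell.
mask_sensitive() (
    set -f
    out=""
    for token in $1; do
        case $token in
            # Keep everything up to the first "=", then hide the value.
            @*=*) token="${token%%=*}=*" ;;
        esac
        out="${out:+$out }$token"
    done
    printf '%s\n' "$out"
)
```

With the example from the table, mask_sensitive 'username=admin @password=admin_123' prints username=admin @password=*.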
After the job is created, you can manage it.
In MRS 3.x and later versions, the default installation path of the client is /opt/Bigdata/client. In versions earlier than MRS 3.x, the default installation path is /opt/client. Use the actual installation path in your environment.
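Because the client path depends on the version, a script can probe for it rather than hard-code it. The following sketch looks for the bigdata_env file in the two default paths, or in any directories passed as arguments; the function name find_client_dir is hypothetical.

```shell
# Print the first directory that contains a bigdata_env file.
# With no arguments, the two default client paths are checked.
find_client_dir() {
    [ "$#" -gt 0 ] || set -- /opt/Bigdata/client /opt/client
    for dir in "$@"; do
        if [ -f "$dir/bigdata_env" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    echo "No MRS client found; pass the actual installation path." >&2
    return 1
}
```

A typical use is client_dir=$(find_client_dir) && . "$client_dir/bigdata_env", which loads the environment variables only when a client directory was found.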
This example assumes that a machine-machine user for the development scenario has been created, and that the user groups (hadoop and supergroup), the primary group (supergroup), and the role permissions (System_administrator and default) have been correctly assigned to the user.
tar -xvf MRSTest_xxxxxx_keytab.tar
You will obtain two files: user.keytab and krb5.conf.
source /opt/Bigdata/client/bigdata_env
cd $SPARK_HOME
./bin/spark-submit --master yarn --deploy-mode client --conf spark.yarn.principal=MRSTest --conf spark.yarn.keytab=/opt/user.keytab --class org.apache.spark.examples.SparkPi examples/jars/spark-examples_2.11-2.3.2-mrs-2.0.jar 10
Parameter description: