This section describes how to create an MRS Hive connection between DataArts Studio and the data lake base.
For example, if your data lake service is an MRS cluster, prepare two MRS clusters with the same version, specifications, components, region, VPC, and subnet. If you modify the configuration of one MRS cluster, apply the same change to the other.


Parameter |
Mandatory |
Description |
|---|---|---|
Data Connection Type |
Yes |
MRS Hive is selected by default and cannot be changed. |
Name |
Yes |
Name of the data connection to create. Data connection names can contain a maximum of 100 characters. They can contain only letters, digits, underscores (_), and hyphens (-). |
Tag |
No |
Attribute of the data connection to create. Tags make management easier.
NOTE:
The tag name can contain only letters, digits, and underscores (_) and cannot start with an underscore (_) or contain more than 100 characters. |
Applicable Modules |
Yes |
Select the modules for which this connection is available. |
Basic and Network Connectivity Configuration |
||
Connection Type |
Yes |
Connection type. Proxy connection is recommended.
NOTE:
Select Proxy connection for Connection Type so that the DataArts Architecture, DataArts Quality, DataArts Catalog, and DataArts DataService components can use the MRS connection. |
Manual |
Yes |
This parameter is mandatory when Connection Type is set to Proxy connection. Select the connection mode. If you do not need to access MRS clusters in other projects or enterprise projects, select Cluster Name Mode.
|
Manager IP |
Yes |
This parameter is mandatory when Connection String Mode is selected for Manual. Set this parameter to the floating IP address of MRS Manager. Only MRS clusters are supported. A Hadoop cluster can be connected only after it is managed by MRS.
NOTE:
- DataArts Studio does not support MRS clusters whose Kerberos encryption type is aes256-sha2,aes128-sha2, and only supports MRS clusters whose Kerberos encryption type is aes256-sha1,aes128-sha1.
- You can click Select next to the text box and select an MRS cluster in the same project and enterprise project.
- If you want to access an MRS cluster in another project or enterprise project, obtain and enter the floating IP address of MRS Manager and ensure that the connection's agent (CDM cluster) can communicate with the tenant-plane MRS cluster.
- To obtain the floating IP address of MRS Manager, log in to the active master node of the MRS cluster and run the ifconfig command. In the command output, the IP address of eth0:wsom is the floating IP address of MRS Manager. For details about how to log in to the master node of the MRS cluster, see "Manager Operation Guide" > "Getting Started" > "Logging In to an MRS Cluster Node" in MapReduce Service (MRS) x.x.x User Guide.
- Enter multiple IP addresses based on the scenario in sequence and separate them with commas (,), for example, 127.0.0.1 or 127.0.0.1,127.0.0.2,127.0.0.3.
|
MRS Cluster Name |
Yes |
This parameter is mandatory when MRS API connection is selected for Connection Type or Cluster Name Mode is selected for Manual. Name of the MRS cluster. Select the MRS cluster that Hive belongs to. Only MRS clusters are supported. A Hadoop cluster can be selected only after it is managed by MRS. All MRS clusters with the same project ID and enterprise project are displayed.
NOTE:
DataArts Studio does not support MRS clusters whose Kerberos encryption type is aes256-sha2,aes128-sha2, and only supports MRS clusters whose Kerberos encryption type is aes256-sha1,aes128-sha1. If the connection fails after you select a cluster, check whether the MRS cluster can communicate with the CDM instance which functions as the agent. They can communicate with each other in the following scenarios:
NOTE:
If an agent is connected to multiple MRS clusters and one of the MRS clusters is deleted or abnormal, connections to the other MRS clusters will be affected. Therefore, you are advised to connect an agent to only one MRS cluster. |
KMS Key |
No |
This parameter is mandatory when Connection Type is set to Proxy connection. KMS key used to encrypt and decrypt data source authentication information. Select a default or custom key.
|
Agent |
Yes |
This parameter is mandatory when Connection Type is set to Proxy connection. MRS is not a fully managed service and cannot be directly connected to DataArts Studio. A CDM cluster can provide an agent for DataArts Studio to communicate with non-fully-managed services. Therefore, you need to select a CDM cluster when creating an MRS data connection. If no CDM cluster is available, create one first by referring to Creating a CDM Cluster. As a network proxy, the CDM cluster must be able to communicate with the MRS cluster. To ensure network connectivity, the CDM cluster must be in the same region and AZ and use the same VPC and subnet as the MRS cluster. The security group rule must also allow the CDM cluster to communicate with the MRS cluster.
|
Data Source Authentication and Other Function Configuration |
||
Authentication Method |
Yes |
This parameter is mandatory when Connection String Mode is selected for Manual. It specifies the authentication method used for accessing the MRS cluster. The following options are available:
|
Username |
Yes |
Human-machine user of the MRS cluster. This parameter is mandatory when Connection Type is set to Proxy connection. If a new MRS user is used for the connection, you need to log in to Manager and change the initial password. To create a data connection for an MRS security cluster, do not use the admin user. The admin user is the default management page user and cannot be used as the authentication user of the security cluster. You can create an MRS user whose password never expires by referring to Creating a Kerberos Authentication User for an MRS Security Cluster. When creating an MRS data connection, set Username and Password to the new MRS username and password.
|
Password |
Yes |
The password for accessing the MRS cluster. This parameter is mandatory when Connection Type is set to Proxy connection. |
Enable ldap |
No |
This parameter is available when Connection Type is set to Proxy connection. If LDAP authentication is enabled for an external LDAP server connected to MRS Hive, the LDAP username and password are required for authenticating the connection to MRS Hive. In this case, this option must be enabled. Otherwise, the connection will fail. |
ldapUsername |
Yes |
This parameter is mandatory when Enable ldap is enabled. Enter the username configured when LDAP authentication was enabled for MRS Hive. |
ldapPassword |
Yes |
This parameter is mandatory when Enable ldap is enabled. Enter the password configured when LDAP authentication was enabled for MRS Hive. |
OBS storage support |
No |
This parameter is displayed when DataArts Migration is selected for Applicable Modules. The server must support OBS storage. When creating a Hive table, you can store the table in OBS. |
Use Agency |
No |
This parameter is displayed when DataArts Migration is selected for Applicable Modules. If you enable the agency function, you can create a data connection without having a permanent AK/SK and execute CDM jobs using the scheduling identity configured in DataArts Factory. |
Public agency |
No |
This parameter is displayed when DataArts Migration is selected for Applicable Modules and Use Agency is enabled. The agency is only used to check whether the connection agency function is normal. CDM jobs will be executed using the scheduling identity configured in DataArts Factory. |
AK |
N/A |
This parameter is displayed when DataArts Migration is selected for Applicable Modules and OBS storage support is enabled. AK and SK are used to log in to the OBS server. You need to create an access key for the current account and obtain an AK/SK pair. To obtain an access key, perform the following steps:
|
SK |
N/A |
|
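The format rules stated in the Name, Tag, and Manager IP rows can be checked before the form is submitted. The following is a minimal Python sketch, not part of DataArts Studio: the function names are hypothetical, "letters" is assumed to mean ASCII letters, and the Manager IP check only confirms that each comma-separated entry is a syntactically valid IP address.

```python
import ipaddress
import re
from typing import List

# Name row: letters, digits, underscores, hyphens; at most 100 characters.
# Assumption: "letters" means ASCII letters.
NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,100}")

# Tag row: letters, digits, underscores; must not start with an underscore;
# at most 100 characters.
TAG_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_]{0,99}")


def is_valid_name(name: str) -> bool:
    """Return True if the data connection name satisfies the Name row rules."""
    return NAME_RE.fullmatch(name) is not None


def is_valid_tag(tag: str) -> bool:
    """Return True if the tag name satisfies the Tag row rules."""
    return TAG_RE.fullmatch(tag) is not None


def parse_manager_ips(value: str) -> List[str]:
    """Split a comma-separated Manager IP value and validate each entry.

    Raises ValueError if any entry is empty or not a valid IP address.
    """
    ips = []
    for part in value.split(","):
        ip = part.strip()
        ipaddress.ip_address(ip)  # raises ValueError on malformed input
        ips.append(ip)
    return ips


if __name__ == "__main__":
    print(is_valid_name("mrs_hive-01"))
    print(is_valid_tag("_prod"))
    print(parse_manager_ips("127.0.0.1,127.0.0.2,127.0.0.3"))
```

These checks cover syntax only. They do not verify that a name is unused or that the address is reachable from the CDM cluster.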
The CDM cluster functions as a network agent. MRS data connections that you are going to create need to communicate with CDM.
If the data connection fails, a possible cause is that the CDM cluster is stopped or that a concurrency conflict has occurred. You can switch to another agent to temporarily avoid this issue.
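The Manager IP description says that the floating IP address of MRS Manager is the address of eth0:wsom in the ifconfig output on the active master node. As an illustration, the sketch below extracts that address from captured ifconfig text. The sample output and the function name are hypothetical, and the parser assumes one of the two common ifconfig layouts (`inet 192.168.0.10` or `inet addr:192.168.0.10`); the output on a real node may differ.

```python
import re
from typing import Optional

# Hypothetical ifconfig output, for illustration only.
SAMPLE_IFCONFIG = """\
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.11  netmask 255.255.255.0  broadcast 192.168.0.255

eth0:wsom: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.10  netmask 255.255.255.0  broadcast 192.168.0.255
"""

INET_RE = re.compile(r"inet (?:addr:)?(\d{1,3}(?:\.\d{1,3}){3})")


def find_floating_ip(ifconfig_output: str,
                     interface: str = "eth0:wsom") -> Optional[str]:
    """Return the IPv4 address listed under the given interface, or None."""
    in_block = False
    for line in ifconfig_output.splitlines():
        if line and not line[0].isspace():
            # A non-indented line starts a new interface block.
            name = line.split()[0].rstrip(":")
            in_block = name == interface
        elif in_block:
            match = INET_RE.search(line)
            if match:
                return match.group(1)
    return None


if __name__ == "__main__":
    print(find_floating_ip(SAMPLE_IFCONFIG))  # prints 192.168.0.10
```

The returned address is the value to enter for Manager IP when Connection String Mode is selected.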