doc-exports/docs/dataartsstudio/umn/dataartsstudio_07_004.html
chenxiaoxiong f9e2808b7c DataArts UMN 20250810 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: chenxiaoxiong <chenxiaoxiong@huawei.com>
Co-committed-by: chenxiaoxiong <chenxiaoxiong@huawei.com>
2025-09-02 10:44:13 +00:00

<a name="dataartsstudio_07_004"></a>
<h1 class="topictitle1">Basic Concepts</h1>
<div id="body1555916447705"><div class="section" id="dataartsstudio_07_004__section145593115569"><h4 class="sectiontitle"><span id="dataartsstudio_07_004__text10541145944512">DataArts Studio</span> Instance</h4><p id="dataartsstudio_07_004__p4509132153819">A <span id="dataartsstudio_07_004__text1666254321817">DataArts Studio</span> instance is the minimum unit of compute resources provided for users. You can create, access, and manage multiple <span id="dataartsstudio_07_004__text66921523194610">DataArts Studio</span> instances at the same time. A <span id="dataartsstudio_07_004__text8951814409">DataArts Studio</span> instance allows you to access the following modules: Management Center, DataArts Architecture, DataArts Migration, DataArts Factory, DataArts Quality, and DataArts Catalog. You can obtain <span id="dataartsstudio_07_004__text19306946184613">DataArts Studio</span> instances with specifications tailored to your service requirements.</p>
</div>
<div class="section" id="dataartsstudio_07_004__section336353113115"><h4 class="sectiontitle">Workspace</h4><p id="dataartsstudio_07_004__p237433012315">A workspace enables admins to manage member permissions, resources, and configurations of the underlying compute engines.</p>
<p id="dataartsstudio_07_004__p13374193002312">The workspace is a basic unit for member management as well as role and permission assignment. Each team must have an independent workspace.</p>
<p id="dataartsstudio_07_004__p123741630192314">You can access the Management Center, DataArts Factory, and DataArts Migration modules only after your account is added to a workspace and assigned the permissions required to perform such operations.</p>
</div>
<div class="section" id="dataartsstudio_07_004__section12013220339"><h4 class="sectiontitle">Member and Role</h4><p id="dataartsstudio_07_004__p514263011257">A member is an account that has been assigned the permissions required to access and use a workspace. As an admin, you must assign a role to each member you add to a workspace.</p>
<p id="dataartsstudio_07_004__p3968163172510">A role is a predefined combination of permissions. Different roles have different permission sets. After a role is assigned to a member, the member has all the permissions of that role. Each member must have at least one role, and they can have multiple roles at the same time.</p>
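<p>The rule that a member holds every permission of each assigned role can be sketched in Python as follows. This is only an illustration: the role names, the permission strings, and the <code>effective_permissions</code> helper are hypothetical and are not taken from DataArts Studio.</p>

```python
# Hypothetical role definitions: each role is a predefined set of permissions.
ROLES = {
    "developer": {"job:create", "job:edit", "job:run"},
    "viewer": {"job:view"},
}


def effective_permissions(member_roles):
    """Return the union of the permissions of all roles assigned to a member."""
    if not member_roles:
        # Mirrors the rule that each member must have at least one role.
        raise ValueError("a member must have at least one role")
    permissions = set()
    for role in member_roles:
        permissions |= ROLES[role]
    return permissions
```

<p>For example, a member with both hypothetical roles would hold all four permissions, while a member with no role is rejected.</p>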
</div>
<div class="section" id="dataartsstudio_07_004__section572693114139"><h4 class="sectiontitle">CDM Cluster</h4><p id="dataartsstudio_07_004__p27513141429">A CDM cluster runs on an ECS. You can create data migration tasks in a CDM cluster and migrate data between homogeneous or heterogeneous data sources in the cloud and in on-premises data centers.</p>
</div>
<div class="section" id="dataartsstudio_07_004__section29076483298"><h4 class="sectiontitle">Data Source</h4><p id="dataartsstudio_07_004__p1638316515294">A data source is a medium for storing or processing data, such as a relational database, data warehouse, or data lake. Different data sources use different data storage, transmission, processing, and application modes, as well as different scenarios, technologies, and tools.</p>
</div>
<div class="section" id="dataartsstudio_07_004__section463314618275"><h4 class="sectiontitle">Source Data</h4><p id="dataartsstudio_07_004__p463312602711">Source data is data that has not been processed since it was created. In data management, source data refers to data that comes directly from source files (such as service system databases, offline files, and IoT files) or from copies of source files.</p>
</div>
<div class="section" id="dataartsstudio_07_004__section171110516274"><h4 class="sectiontitle">Data Connection</h4><p id="dataartsstudio_07_004__p191115518279">A data connection is a collection of details required to access the location where data is stored, including the connection type, name, and login information.</p>
</div>
<div class="section" id="dataartsstudio_07_004__section854545714242"><h4 class="sectiontitle">Concurrency</h4><p id="dataartsstudio_07_004__p145451157112416">Concurrency refers to the maximum number of threads that can concurrently read from the source in a data integration job.</p>
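<p>As a rough illustration of a concurrency limit, the Python sketch below caps the number of reader threads with the standard library's <code>ThreadPoolExecutor</code>. The in-memory <code>PARTITIONS</code> list and the <code>read_partition</code> and <code>extract</code> functions are hypothetical stand-ins. They do not show how DataArts Studio implements its readers.</p>

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical source split into partitions; a real job would read from a database or file.
PARTITIONS = [[1, 2], [3, 4], [5, 6], [7, 8]]


def read_partition(index):
    """Stand-in for reading one slice of the source."""
    return PARTITIONS[index]


def extract(concurrency):
    """Read all partitions using at most `concurrency` reader threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() returns results in partition order even though reads may overlap in time.
        chunks = pool.map(read_partition, range(len(PARTITIONS)))
        return [row for chunk in chunks for row in chunk]
```

<p>In this sketch, raising <code>concurrency</code> lets more partitions be read at the same time. The rows returned are the same for any setting.</p>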
</div>
<div class="section" id="dataartsstudio_07_004__section62441169266"><h4 class="sectiontitle">Dirty Data</h4><p id="dataartsstudio_07_004__p32441064264">Dirty data is data that is meaningless to the business or is in an invalid format. For example, if source data of the VARCHAR type is not properly converted, it cannot be written to a destination column of the INT type.</p>
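<p>The VARCHAR-to-INT example can be sketched in Python as follows. The <code>split_dirty_rows</code> helper is hypothetical and assumes that the source values arrive as strings. It only illustrates how values that fail conversion might be set aside as dirty data.</p>

```python
def split_dirty_rows(values):
    """Separate string values that convert to int from those that do not."""
    clean, dirty = [], []
    for value in values:
        try:
            clean.append(int(value))
        except ValueError:
            # The value cannot be written to an INT column, so treat it as dirty.
            dirty.append(value)
    return clean, dirty
```

<p>Here a value such as <code>"abc"</code> ends up in the dirty list, while <code>"12"</code> is converted and kept.</p>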
</div>
<div class="section" id="dataartsstudio_07_004__section176851113122820"><h4 class="sectiontitle">Job (DataArts Factory)</h4><p id="dataartsstudio_07_004__p587393918285">A job is composed of one or more nodes that run together to complete data operations.</p>
</div>
<div class="section" id="dataartsstudio_07_004__section954804713285"><h4 class="sectiontitle">Node</h4><p id="dataartsstudio_07_004__p2169125015282">A node is a definition for the actions to be performed on your data. For example, you can use the MRS Spark node to execute predefined Spark jobs in MRS.</p>
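<p>The relationship between a job and its nodes can be pictured with the following minimal Python model. It assumes that nodes simply run one after another in list order. The <code>Node</code> and <code>Job</code> classes and the example node names are hypothetical and do not represent the DataArts Factory API.</p>

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """A named action to perform on the data (hypothetical model)."""
    name: str
    action: Callable[[list], list]


@dataclass
class Job:
    """A job made of one or more nodes, run here in list order."""
    name: str
    nodes: List[Node] = field(default_factory=list)

    def run(self, data):
        # Pass the output of each node to the next one.
        for node in self.nodes:
            data = node.action(data)
        return data


job = Job("clean_and_double", [
    Node("drop_none", lambda rows: [r for r in rows if r is not None]),
    Node("double", lambda rows: [r * 2 for r in rows]),
])
```

<p>Calling <code>job.run([1, None, 3])</code> in this sketch first drops the missing value and then doubles the remaining rows.</p>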
</div>
<div class="section" id="dataartsstudio_07_004__section1965541319286"><h4 class="sectiontitle">Solution</h4><p id="dataartsstudio_07_004__p947712922819">A solution is a series of convenient and systematic management operations that meet service requirements and objectives. Each solution can contain one or more business-related jobs, and each job can be reused by multiple solutions.</p>
</div>
<div class="section" id="dataartsstudio_07_004__section135651547122816"><h4 class="sectiontitle">Resource</h4><p id="dataartsstudio_07_004__p1237428133014">A resource is the self-defined code or text file that you upload. It is invoked when nodes run.</p>
</div>
<div class="section" id="dataartsstudio_07_004__section11833175217303"><h4 class="sectiontitle">Expression Language (EL)</h4><p id="dataartsstudio_07_004__p2787653123016">Node parameters in data development jobs can be dynamically generated based on the running environment using ELs. An EL often uses simple arithmetic and calculation logic and references embedded objects including job objects and tool objects.</p>
</div>
<div class="section" id="dataartsstudio_07_004__section11468175511306"><h4 class="sectiontitle">Environment Variable</h4><p id="dataartsstudio_07_004__p112110568306">An environment variable is an object with a specific name in the operating system. It contains information to be used by one or more applications.</p>
</div>
<div class="section" id="dataartsstudio_07_004__section155251232123018"><h4 class="sectiontitle">PatchData</h4><p id="dataartsstudio_07_004__p462195215318">PatchData is an instance that was generated in the past by a repeatedly scheduled job.</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dataartsstudio_12_0001.html">Service Overview</a></div>
</div>
</div>