Yang, Tong 6182f91ba8 MRS component operation guide_normal 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-09 14:55:21 +00:00


<a name="mrs_01_1931"></a>
<h1 class="topictitle1">Common Parameters</h1>
<div id="body1595920205328"><div class="section" id="mrs_01_1931__s83a229801d134354bb7cd1f93b6b1da3"><h4 class="sectiontitle">Overview</h4><p id="mrs_01_1931__aefc9cb558bbd4b53acfa356d7a652d23">This section describes common configuration items used in Spark. The subsections are organized by feature so that you can quickly find the configuration items you need. If you use <span id="mrs_01_1931__text130612336220">MRS</span> clusters, most parameters described in this section are already preconfigured and you do not need to configure them again. For details about the parameters that must be configured based on site requirements, see <a href="mrs_01_1930.html">Configuring Parameters Rapidly</a>.</p>
</div>
<div class="section" id="mrs_01_1931__s1688b140aedc4714817dfd815ef65b0f"><h4 class="sectiontitle">Configuring the Number of Stage Retries</h4><p id="mrs_01_1931__a0754729dc1ff4c819b54e2513affe1b1">When a FetchFailedException occurs in a Spark task, a stage retry is triggered. To prevent infinite stage retries, the number of stage retries is limited. You can adjust this limit based on site requirements.</p>
<p id="mrs_01_1931__ab1ebacbfbe6849d2a2ced5eef9b56416">Configure the following parameters in the <span class="filepath" id="mrs_01_1931__f33677041dc354f6981b380bf0abcb6dd"><b>spark-defaults.conf</b></span> file on the Spark client.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__tcf9080f29fff4a0f875407f07f04951c" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r607892e8d4cf4eb58d510c1b1b775bdd"><th align="left" class="cellrowborder" valign="top" width="33.333333333333336%" id="mcps1.3.2.4.2.4.1.1"><p id="mrs_01_1931__a3b297d30afc843fd93fe0103ee50064b"><strong id="mrs_01_1931__ac903ad14603e4ea98b35d00f657bca6a">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="45.36453645364537%" id="mcps1.3.2.4.2.4.1.2"><p id="mrs_01_1931__a2bbcf764557a42b5acdd959d7cf58f8b"><strong id="mrs_01_1931__ac175bb3513414ff99830f6dfb0fb281e">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="21.302130213021304%" id="mcps1.3.2.4.2.4.1.3"><p id="mrs_01_1931__a0716f548774940f38f48b1c3ad4d6eb6"><strong id="mrs_01_1931__a6f10287b27d54381acb9c698375f59c6">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__reedb5ff051f54556868c7f12da3bfd90"><td class="cellrowborder" valign="top" width="33.333333333333336%" headers="mcps1.3.2.4.2.4.1.1 "><p id="mrs_01_1931__a5bb67e012e0d4ef7a3f66101e586144e">spark.stage.maxConsecutiveAttempts</p>
</td>
<td class="cellrowborder" valign="top" width="45.36453645364537%" headers="mcps1.3.2.4.2.4.1.2 "><p id="mrs_01_1931__a159f3381aa2e44d4a26d97dca84df640">Indicates the maximum number of consecutive attempts allowed for a stage before the stage is aborted.</p>
</td>
<td class="cellrowborder" valign="top" width="21.302130213021304%" headers="mcps1.3.2.4.2.4.1.3 "><p id="mrs_01_1931__ab7181237a7204e12a51dbc0d9c9e2935">4</p>
</td>
</tr>
</tbody>
</table>
</div>
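<p>As a minimal sketch, the stage retry limit could be raised in <b>spark-defaults.conf</b> as follows. The value <b>8</b> is only an illustrative assumption, not a recommendation; choose a value that suits your workload.</p>
<pre class="screen"># Illustrative value only: allow up to 8 consecutive attempts per stage (the default is 4).
spark.stage.maxConsecutiveAttempts 8</pre>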
</div>
<div class="section" id="mrs_01_1931__sc480bf1748bb489f84e52e7ba57153c9"><h4 class="sectiontitle">Configuring Whether to Use Cartesian Product</h4><p id="mrs_01_1931__aa74c7cfc457f45979d724ad4435f8149">To enable the Cartesian product function, configure the following parameter in the <span class="filepath" id="mrs_01_1931__fba915d08f97448ca908bf6b0c99b3250"><b>spark-defaults.conf</b></span> configuration file of Spark.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t9896e5ffc3cd4b8db1087e76151d3c9f" frame="border" border="1" rules="all"><caption><b>Table 2 </b>Cartesian product parameters</caption><thead align="left"><tr id="mrs_01_1931__r0429b70acc2048499f25d0bab1940f62"><th align="left" class="cellrowborder" valign="top" width="33.333333333333336%" id="mcps1.3.3.3.2.4.1.1"><p id="mrs_01_1931__ae2b45fc2988e402788bbb2d1e1d1f1ba"><strong id="mrs_01_1931__aae962ba58fb34423a39738dfa18e9cdd">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="45.36453645364537%" id="mcps1.3.3.3.2.4.1.2"><p id="mrs_01_1931__adae614aa441d4572a5ba74b7aacb677e"><strong id="mrs_01_1931__a64430b9c38974ed9b108f96516b7f27f">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="21.302130213021304%" id="mcps1.3.3.3.2.4.1.3"><p id="mrs_01_1931__a8205de34f4e8401a96c98c9152d87834"><strong id="mrs_01_1931__a1ee4340a326b44bfaa8cdc92b8a09f41">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r050ea0fa824048a88cd761eb62ecce54"><td class="cellrowborder" valign="top" width="33.333333333333336%" headers="mcps1.3.3.3.2.4.1.1 "><p id="mrs_01_1931__a947e8ae0d20943669aaaaab3c7ba59b5">spark.sql.crossJoin.enabled</p>
</td>
<td class="cellrowborder" valign="top" width="45.36453645364537%" headers="mcps1.3.3.3.2.4.1.2 "><p id="mrs_01_1931__a52701e8bb9cf46168295c4d640981858">Indicates whether to allow implicit Cartesian product execution.</p>
<ul id="mrs_01_1931__u684998b6672e497799ebf69faf693bf4"><li id="mrs_01_1931__lf31bf10565b34877861be0a8a70576e9"><span class="parmvalue" id="mrs_01_1931__p29db487f59274b58a09810c9ef9e6dd9"><b>true</b></span>: Implicit Cartesian product execution is allowed.</li><li id="mrs_01_1931__lbe8970f7ffaa48dfadafe041f6427770"><span class="parmvalue" id="mrs_01_1931__p653066a1766f4e46a0b179634c3afdd0"><b>false</b></span>: Implicit Cartesian product execution is not allowed. In this case, a Cartesian product can be executed only if the query explicitly contains CROSS JOIN.</li></ul>
</td>
<td class="cellrowborder" valign="top" width="21.302130213021304%" headers="mcps1.3.3.3.2.4.1.3 "><p id="mrs_01_1931__a2911c458d8a54b919e7bfdb6683c7079">true</p>
</td>
</tr>
</tbody>
</table>
</div>
<div class="note" id="mrs_01_1931__note1344216101256"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="mrs_01_1931__ul1488495115615"><li id="mrs_01_1931__li18848511565">For JDBC applications, configure this parameter in the <span class="filepath" id="mrs_01_1931__filepath3720729558"><b>spark-defaults.conf</b></span> configuration file of the server.</li><li id="mrs_01_1931__li1884165110611">For tasks submitted by the Spark client, configure this parameter in the <span class="filepath" id="mrs_01_1931__filepath189164381611"><b>spark-defaults.conf</b></span> configuration file of the client.</li></ul>
</div></div>
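<p>For example, a minimal sketch that forbids implicit Cartesian products in <b>spark-defaults.conf</b> might look like this. Setting the value to <b>false</b> is an illustrative choice; the default listed in Table 2 is <b>true</b>.</p>
<pre class="screen"># Reject implicit Cartesian products; queries must use CROSS JOIN explicitly.
spark.sql.crossJoin.enabled false</pre>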
</div>
<div class="section" id="mrs_01_1931__s801ce708ada547f8bd47864e7f138bf0"><h4 class="sectiontitle">Configuring Security Authentication for Long-Time Spark Tasks</h4><p id="mrs_01_1931__ab099826c9df24a2ba9087c374da3a5df">In security mode, if you run the <strong id="mrs_01_1931__b1996603815113115">kinit</strong> command for security authentication before using the Spark CLI (such as spark-shell, spark-sql, or spark-submit), a long-running task fails when the authentication expires.</p>
<p id="mrs_01_1931__a362fb457ca45442ca5b1884f1ea5b42c">To avoid this problem, set the following parameters in the <span class="filepath" id="mrs_01_1931__f6601020f146344bfbaa50a6350197fb2"><b>spark-defaults.conf</b></span> configuration file on the client. After the configuration is complete, run the Spark CLI again.</p>
<div class="note" id="mrs_01_1931__nbe2fe128dfd64c3d98a5c44442c1e503"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="mrs_01_1931__a0cc079fbe70740969e6b2b31b24a606d">If <strong>spark.security.bigdata.loginOnce</strong> is set to <span class="parmvalue" id="mrs_01_1931__p3a658d6b92c843bfa9de395c44554581"><b>true</b></span>, ensure that the values of <strong id="mrs_01_1931__b1972419859113115">keytab</strong> and <strong id="mrs_01_1931__b2129683128113115">principal</strong> in <span class="filepath" id="mrs_01_1931__fcaa28b2bb4344530b9645e6d9ccedf81"><b>spark-defaults.conf</b></span> and <span class="filepath" id="mrs_01_1931__f16214171ae564e7bbcf4e66eeb3bf7f3"><b>hive-site.xml</b></span> are the same.</p>
</div></div>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__tc6ee995d6c224dc481264e33b11fc885" frame="border" border="1" rules="all"><caption><b>Table 3 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r78f60b23dca64f32ba709be0ab34823b"><th align="left" class="cellrowborder" valign="top" width="25.06%" id="mcps1.3.4.5.2.4.1.1"><p id="mrs_01_1931__afc9dd30d97de48009d6d22cada6934fc"><strong id="mrs_01_1931__a6a99c664443d47e2940b47bc401592bf">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="62.09%" id="mcps1.3.4.5.2.4.1.2"><p id="mrs_01_1931__a703a36b9ee0f4e2d85456c7a9164031e"><strong id="mrs_01_1931__a2448694e62a04b379ca826b393d69ac8">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.85%" id="mcps1.3.4.5.2.4.1.3"><p id="mrs_01_1931__aa8a6ac2f542a41199526fe6c4535f811"><strong id="mrs_01_1931__a2d301096562b4340b866a6d4a9b3ec2a">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__rdfd67eeb69ad4ac1a1fae95b85534ec9"><td class="cellrowborder" valign="top" width="25.06%" headers="mcps1.3.4.5.2.4.1.1 "><p id="mrs_01_1931__a5aafe3e55e564c67bf41eb9772330c5b">spark.kerberos.principal</p>
</td>
<td class="cellrowborder" valign="top" width="62.09%" headers="mcps1.3.4.5.2.4.1.2 "><p id="mrs_01_1931__a00f70be1b68342278239c3c1e52767c0">Indicates the principal user who has the Spark operation permission. Contact the <span id="mrs_01_1931__ph16923308331">system </span>administrator to obtain the principal user.</p>
</td>
<td class="cellrowborder" valign="top" width="12.85%" headers="mcps1.3.4.5.2.4.1.3 "><p id="mrs_01_1931__a61ec041dece8446c9b89809b1e70ca79">-</p>
</td>
</tr>
<tr id="mrs_01_1931__rc6b3b5088ac249cfb23dcf54b957eab2"><td class="cellrowborder" valign="top" width="25.06%" headers="mcps1.3.4.5.2.4.1.1 "><p id="mrs_01_1931__acc61ea50d16a4dbf836236e0c7141ba6">spark.kerberos.keytab</p>
</td>
<td class="cellrowborder" valign="top" width="62.09%" headers="mcps1.3.4.5.2.4.1.2 "><p id="mrs_01_1931__ab50f467579b5417c87a6503fee6a8f05">Indicates the name and path of the keytab file used to configure Spark operation permissions. Contact the <span id="mrs_01_1931__ph11230173314339">system </span>administrator to obtain the keytab file.</p>
</td>
<td class="cellrowborder" valign="top" width="12.85%" headers="mcps1.3.4.5.2.4.1.3 "><p id="mrs_01_1931__a894de984825b4a89889b00ff256ec2b2">-</p>
</td>
</tr>
<tr id="mrs_01_1931__rcd75c28672474529b29974910381ffd9"><td class="cellrowborder" valign="top" width="25.06%" headers="mcps1.3.4.5.2.4.1.1 "><p id="mrs_01_1931__af5394254a746426187fcc3d846b30978">spark.security.bigdata.loginOnce</p>
</td>
<td class="cellrowborder" valign="top" width="62.09%" headers="mcps1.3.4.5.2.4.1.2 "><p id="mrs_01_1931__a9de27aa5a46449698150e542d7da338c">Indicates whether the principal user logs in to the system only once. <strong id="mrs_01_1931__b1833325644113115">true</strong>: single login; <strong id="mrs_01_1931__b1244172671113115">false</strong>: multiple logins.</p>
<p id="mrs_01_1931__a54343e972d51470ca677c822c2fbff52">The difference between a single login and multiple logins is as follows: In the Spark community version, the Kerberos user logs in to the system multiple times. However, the TGT or token may expire, causing long-running applications to fail. The Kerberos login mode of DataSight is modified so that the user logs in only once, which effectively resolves the expiration problem. The restriction is that the principal and keytab configuration items of Hive must be the same as those of Spark.</p>
<div class="note" id="mrs_01_1931__ncb85afcf3a1140019f7c2d6268244693"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="mrs_01_1931__a8ff798812f934e6d96be143680ad5654">If this parameter is set to <strong id="mrs_01_1931__b863646599113115">true</strong>, ensure that the values of <strong id="mrs_01_1931__b1402708748113115">keytab</strong> and <strong id="mrs_01_1931__b1079978908113115">principal</strong> in <span class="filepath" id="mrs_01_1931__f3ec7f412113b4346bfbcb7d4b8ba79d3"><b>spark-defaults.conf</b></span> and <span class="filepath" id="mrs_01_1931__ff168650ac28f4f21b5126c3d646793fd"><b>hive-site.xml</b></span> are the same.</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" width="12.85%" headers="mcps1.3.4.5.2.4.1.3 "><p id="mrs_01_1931__ac34c793db134462b941298c93e009b2b">true</p>
</td>
</tr>
</tbody>
</table>
</div>
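<p>The parameters in Table 3 could be combined as in the following sketch. The principal name and keytab path are hypothetical placeholders; replace them with the values obtained from the system administrator.</p>
<pre class="screen"># Hypothetical principal and keytab path, shown for illustration only.
spark.kerberos.principal sparkuser@EXAMPLE.COM
spark.kerberos.keytab /opt/client/user.keytab
# Log in once (the default). This requires the same keytab and principal in hive-site.xml.
spark.security.bigdata.loginOnce true</pre>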
</div>
<div class="section" id="mrs_01_1931__s11cd9f4fff844f2099065ea850f50154"><h4 class="sectiontitle">Python Spark</h4><p id="mrs_01_1931__ae13baa13503946e3a19785f58d2979d3">Python is the third programming language supported by Spark, in addition to Scala and Java. Unlike Java and Scala, which run on the JVM, Python Spark runs its own Python process in addition to the JVM process. The following configuration items apply only to Python Spark scenarios. Other configuration items also take effect in Python Spark scenarios.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__tcadafb19c7cd46f28600078139e86cba" frame="border" border="1" rules="all"><caption><b>Table 4 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r7d8fbe7285eb4913ab37a5dd521c8e88"><th align="left" class="cellrowborder" valign="top" width="25.09%" id="mcps1.3.5.3.2.4.1.1"><p id="mrs_01_1931__ad65ef2793a2845ae8440e9b4e86d6614"><strong id="mrs_01_1931__a3457491a776b47199e0724a149c3fad1">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="61.75000000000001%" id="mcps1.3.5.3.2.4.1.2"><p id="mrs_01_1931__ac3bcd76a02de4a5eaca2317408996451"><strong id="mrs_01_1931__a67b6febb2d77420d99cbf85a484290af">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="13.16%" id="mcps1.3.5.3.2.4.1.3"><p id="mrs_01_1931__a7e91f99a47214c92a5c05b0e04282a7d"><strong id="mrs_01_1931__a8dd4c98f98b64fe3ad3c0b5df52fcc71">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__ra24298007713444d8649c9203a7077eb"><td class="cellrowborder" valign="top" width="25.09%" headers="mcps1.3.5.3.2.4.1.1 "><p id="mrs_01_1931__a158282292a4c414eb7ff183677957826">spark.python.profile</p>
</td>
<td class="cellrowborder" valign="top" width="61.75000000000001%" headers="mcps1.3.5.3.2.4.1.2 "><p id="mrs_01_1931__a2a850ea9ff804424b06eef69ed79119a">Indicates whether to enable profiling on the Python worker. You can call <strong id="mrs_01_1931__b428527689113115">sc.show_profiles()</strong> to display the profiling results; otherwise, the results are displayed before the driver exits. You can use <strong id="mrs_01_1931__b990036255113115">sc.dump_profiles(path)</strong> to dump the results to a disk. Results that have been displayed manually are not displayed again automatically before the driver exits.</p>
<p id="mrs_01_1931__a737e9f9244de4796ac9e8105e5c2aa3f">By default, <strong id="mrs_01_1931__b2109682214113115">pyspark.profiler.BasicProfiler</strong> is used. To override the default profiler, pass a profiler class as a parameter when initializing SparkContext.</p>
</td>
<td class="cellrowborder" valign="top" width="13.16%" headers="mcps1.3.5.3.2.4.1.3 "><p id="mrs_01_1931__aa18ffbefab7d4c5c996e0d799b3be4e8">false</p>
</td>
</tr>
<tr id="mrs_01_1931__r562e04f3e9bd49619a72ef8a7098adab"><td class="cellrowborder" valign="top" width="25.09%" headers="mcps1.3.5.3.2.4.1.1 "><p id="mrs_01_1931__a2ab1c3c59e4b414ab1220f76e7cb5257">spark.python.worker.memory</p>
</td>
<td class="cellrowborder" valign="top" width="61.75000000000001%" headers="mcps1.3.5.3.2.4.1.2 "><p id="mrs_01_1931__ad53ea79b103e483fa6cf0ba003c824d3">Indicates the memory size that can be used by each Python worker process during aggregation. The value format is the same as that of JVM memory strings, for example, 512m or 2g. If the memory used by a process during aggregation exceeds the value of this parameter, data will be written to disks.</p>
</td>
<td class="cellrowborder" valign="top" width="13.16%" headers="mcps1.3.5.3.2.4.1.3 "><p id="mrs_01_1931__ad5962da5c2a8436e94740f52e36b7991">512m</p>
</td>
</tr>
<tr id="mrs_01_1931__r8df23bebdc624b25947f9d375645838b"><td class="cellrowborder" valign="top" width="25.09%" headers="mcps1.3.5.3.2.4.1.1 "><p id="mrs_01_1931__a912ec2e4709e4d8289193f67551c121e">spark.python.worker.reuse</p>
</td>
<td class="cellrowborder" valign="top" width="61.75000000000001%" headers="mcps1.3.5.3.2.4.1.2 "><p id="mrs_01_1931__aec5a4c5384e94419813e87f1c24ee61c">Indicates whether to reuse Python workers. If the reuse function is enabled, a fixed number of Python workers will be reused by the next batch of submitted tasks instead of forking a Python process for each task. This function is useful when large broadcast variables are used, because the broadcast data does not need to be transferred from the JVM to the Python workers again for the next batch of submitted tasks.</p>
</td>
<td class="cellrowborder" valign="top" width="13.16%" headers="mcps1.3.5.3.2.4.1.3 "><p id="mrs_01_1931__a932004a523b84651815423346b6c6501">true</p>
</td>
</tr>
</tbody>
</table>
</div>
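<p>A minimal sketch of the Python Spark items in <b>spark-defaults.conf</b> is shown below. The values are illustrative assumptions (profiling enabled and a larger aggregation memory), not recommendations.</p>
<pre class="screen"># Enable profiling on Python workers (disabled by default).
spark.python.profile true
# Memory for each Python worker during aggregation; data is written to disks beyond this size.
spark.python.worker.memory 1g
# Keep reusing Python workers across tasks (the default).
spark.python.worker.reuse true</pre>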
</div>
<div class="section" id="mrs_01_1931__s3e6ba9832a9e482d9a0a672e1eed1d75"><h4 class="sectiontitle">Dynamic Allocation</h4><p id="mrs_01_1931__a558c31b34b184f63b2f2fc5b7257b967">Dynamic resource scheduling is available only in the On Yarn mode and can be used only after Yarn External Shuffle is enabled. When Spark runs as a resident service, dynamic resource scheduling greatly improves resource utilization. For example, the JDBCServer process receives no JDBC requests most of the time, so releasing resources during these periods greatly reduces the waste of cluster resources.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t3c7ae7e40d9448e290682b763b239880" frame="border" border="1" rules="all"><caption><b>Table 5 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__raf355b29e6db43f9bb1c4ea0015fabe8"><th align="left" class="cellrowborder" valign="top" width="21.3%" id="mcps1.3.6.3.2.4.1.1"><p id="mrs_01_1931__ac9b91602eddb459bbb7488eb9b33e931"><strong id="mrs_01_1931__a24510b211c894c2dbae28f1d7d961a3f">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="60.99%" id="mcps1.3.6.3.2.4.1.2"><p id="mrs_01_1931__a5473db6a3ea64ccbbb79611e2598b44a"><strong id="mrs_01_1931__a72ca4ab65368472d86aa55ea8838568d">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="17.71%" id="mcps1.3.6.3.2.4.1.3"><p id="mrs_01_1931__a2148657a1c5b49298964037ba808314c"><strong id="mrs_01_1931__ac49ccc1923f54ca586559467a4af2817">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__rdd7b10c7282349c38c67425a7dee8c67"><td class="cellrowborder" valign="top" width="21.3%" headers="mcps1.3.6.3.2.4.1.1 "><p id="mrs_01_1931__a2e0ffe8981d445349bf47d56b16defee">spark.dynamicAllocation.enabled</p>
</td>
<td class="cellrowborder" valign="top" width="60.99%" headers="mcps1.3.6.3.2.4.1.2 "><p id="mrs_01_1931__a0d617bcf826d4d539484733c47735358">Indicates whether to use dynamic resource scheduling, which scales the number of executors registered with the application up and down based on the workload. Currently, this parameter is valid only in Yarn mode.</p>
<p id="mrs_01_1931__a37a5c1adb45e4b20987cccfe3a2e0c9d">To enable dynamic resource scheduling, set <strong id="mrs_01_1931__b47616087113115">spark.shuffle.service.enabled</strong> to <strong id="mrs_01_1931__b632956381113115">true</strong>. Related parameters are as follows: <strong id="mrs_01_1931__b1337259009113115">spark.dynamicAllocation.minExecutors</strong>, <strong id="mrs_01_1931__b1633645046113115">spark.dynamicAllocation.maxExecutors</strong>, and <strong id="mrs_01_1931__b1810705301113115">spark.dynamicAllocation.initialExecutors</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="17.71%" headers="mcps1.3.6.3.2.4.1.3 "><ul id="mrs_01_1931__ul63349384288"><li id="mrs_01_1931__li16334123882816">JDBCServer2x:<p id="mrs_01_1931__p13592122822818"><a name="mrs_01_1931__li16334123882816"></a><a name="li16334123882816"></a>true</p>
</li><li id="mrs_01_1931__li163361648132811">SparkResource2x:<p id="mrs_01_1931__p459514617307"><a name="mrs_01_1931__li163361648132811"></a><a name="li163361648132811"></a>false</p>
</li></ul>
</td>
</tr>
<tr id="mrs_01_1931__r573b3afb78134da58d443b715b8e79b9"><td class="cellrowborder" valign="top" width="21.3%" headers="mcps1.3.6.3.2.4.1.1 "><p id="mrs_01_1931__a52338dc3f3c84e7588207d6dad919b4a">spark.dynamicAllocation.minExecutors</p>
</td>
<td class="cellrowborder" valign="top" width="60.99%" headers="mcps1.3.6.3.2.4.1.2 "><p id="mrs_01_1931__a51e529f3cb9e4c918a349a05b0b45821">Indicates the minimum number of executors.</p>
</td>
<td class="cellrowborder" valign="top" width="17.71%" headers="mcps1.3.6.3.2.4.1.3 "><p id="mrs_01_1931__a694b89983eda40f08c6fb37d1a8be523">0</p>
</td>
</tr>
<tr id="mrs_01_1931__rc4087f29a69148e38417ffde77b5761d"><td class="cellrowborder" valign="top" width="21.3%" headers="mcps1.3.6.3.2.4.1.1 "><p id="mrs_01_1931__a338c65d28a2a4d979ca56f752d7a670e">spark.dynamicAllocation.initialExecutors</p>
</td>
<td class="cellrowborder" valign="top" width="60.99%" headers="mcps1.3.6.3.2.4.1.2 "><p id="mrs_01_1931__a8c2ca46fabd44c2c927620a39cb81961">Indicates the number of initial executors.</p>
</td>
<td class="cellrowborder" valign="top" width="17.71%" headers="mcps1.3.6.3.2.4.1.3 "><p id="mrs_01_1931__a0e6261251cf5411387a66abd3d39502d">spark.dynamicAllocation.minExecutors</p>
</td>
</tr>
<tr id="mrs_01_1931__r1ffb335be78e4499a313a975d19a8203"><td class="cellrowborder" valign="top" width="21.3%" headers="mcps1.3.6.3.2.4.1.1 "><p id="mrs_01_1931__a00f54278ac654f5fb8de9f130df350f6">spark.dynamicAllocation.maxExecutors</p>
</td>
<td class="cellrowborder" valign="top" width="60.99%" headers="mcps1.3.6.3.2.4.1.2 "><p id="mrs_01_1931__aba4e3d09280e48acb0ca035a0fbf3ee5">Indicates the maximum number of executors.</p>
</td>
<td class="cellrowborder" valign="top" width="17.71%" headers="mcps1.3.6.3.2.4.1.3 "><p id="mrs_01_1931__a8181cff9affa4ffd9f225e6d7796287f">2048</p>
</td>
</tr>
<tr id="mrs_01_1931__r6cbfdd1c590c45ebbdc7124eab679fb7"><td class="cellrowborder" valign="top" width="21.3%" headers="mcps1.3.6.3.2.4.1.1 "><p id="mrs_01_1931__a3e3b388ef27f4ce6953448d0ff5aa5c1">spark.dynamicAllocation.schedulerBacklogTimeout</p>
</td>
<td class="cellrowborder" valign="top" width="60.99%" headers="mcps1.3.6.3.2.4.1.2 "><p id="mrs_01_1931__ae916e836396845d9a9fedce238097268">Indicates the first scheduling timeout: if tasks have been backlogged for longer than this period, new executors are requested. The unit is seconds.</p>
</td>
<td class="cellrowborder" valign="top" width="17.71%" headers="mcps1.3.6.3.2.4.1.3 "><p id="mrs_01_1931__a839e3f4e36f3438d96e6d2ef31896fb1">1s</p>
</td>
</tr>
<tr id="mrs_01_1931__r5f12b2609f4b41bba93df79c2f967caa"><td class="cellrowborder" valign="top" width="21.3%" headers="mcps1.3.6.3.2.4.1.1 "><p id="mrs_01_1931__afa3b58bfa82f41bda153c856b5d250d5">spark.dynamicAllocation.sustainedSchedulerBacklogTimeout</p>
</td>
<td class="cellrowborder" valign="top" width="60.99%" headers="mcps1.3.6.3.2.4.1.2 "><p id="mrs_01_1931__a018f4f052e7c4c94947bcb8b3510cfaf">Indicates the timeout interval used for the second and later executor requests while the task backlog persists.</p>
</td>
<td class="cellrowborder" valign="top" width="17.71%" headers="mcps1.3.6.3.2.4.1.3 "><p id="mrs_01_1931__a2aabdbf65d1c4f2eb4dd70555007060e">1s</p>
</td>
</tr>
<tr id="mrs_01_1931__rada95f7c941548b5a1d790b39d3ab324"><td class="cellrowborder" valign="top" width="21.3%" headers="mcps1.3.6.3.2.4.1.1 "><p id="mrs_01_1931__a1887a38d3cee4fd7b77ddd018a88dd0d">spark.dynamicAllocation.executorIdleTimeout</p>
</td>
<td class="cellrowborder" valign="top" width="60.99%" headers="mcps1.3.6.3.2.4.1.2 "><p id="mrs_01_1931__a8074370b2e114277a447261aca6fb5f1">Indicates the idle timeout interval for common executors. The unit is seconds.</p>
<p id="mrs_01_1931__a540f028774f24e7b85adb27ddceef527"></p>
</td>
<td class="cellrowborder" valign="top" width="17.71%" headers="mcps1.3.6.3.2.4.1.3 "><p id="mrs_01_1931__ac8e23271f78147dd8b977ff14fc8cf2f">60</p>
</td>
</tr>
<tr id="mrs_01_1931__rdf77769022cc4e6d8136ea7359c19921"><td class="cellrowborder" valign="top" width="21.3%" headers="mcps1.3.6.3.2.4.1.1 "><p id="mrs_01_1931__addf4bb026af046b4a07ee516c70351b5">spark.dynamicAllocation.cachedExecutorIdleTimeout</p>
</td>
<td class="cellrowborder" valign="top" width="60.99%" headers="mcps1.3.6.3.2.4.1.2 "><p id="mrs_01_1931__ab6895e60376f41a6a7ab2d98bd6a633b">Indicates the idle timeout interval for executors with cached blocks.</p>
</td>
<td class="cellrowborder" valign="top" width="17.71%" headers="mcps1.3.6.3.2.4.1.3 "><ul id="mrs_01_1931__ul756811528278"><li id="mrs_01_1931__li13568115282714">JDBCServer2x: 2147483647s</li><li id="mrs_01_1931__li1324014833519">IndexServer2x: 2147483647s</li><li id="mrs_01_1931__li1256917522270">SparkResource2x: 120</li></ul>
</td>
</tr>
</tbody>
</table>
</div>
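<p>The parameters in Table 5 could be combined as in the following sketch. The executor counts and timeout values are illustrative assumptions and should be sized for the actual cluster.</p>
<pre class="screen"># Dynamic resource scheduling requires the external shuffle service.
spark.shuffle.service.enabled true
spark.dynamicAllocation.enabled true
# Illustrative bounds on the number of executors.
spark.dynamicAllocation.minExecutors 1
spark.dynamicAllocation.initialExecutors 2
spark.dynamicAllocation.maxExecutors 50
# Request executors after tasks have been backlogged for 1s, then again every 1s while the backlog persists.
spark.dynamicAllocation.schedulerBacklogTimeout 1s
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout 1s
# Release executors that have been idle for 60s.
spark.dynamicAllocation.executorIdleTimeout 60s</pre>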
</div>
<div class="section" id="mrs_01_1931__sa381540dd3284a1cbe0923bb907a2b61"><h4 class="sectiontitle">Spark Streaming</h4><p id="mrs_01_1931__a4504cf05178342a5b1c81a1c0b9026b9">Spark Streaming is a streaming data processing function provided by the Spark batch processing platform. It processes data input from external systems in <span class="parmname" id="mrs_01_1931__pa954ea11a0aa43c6a5b2a1a242810699"><b>mini-batch</b></span> mode.</p>
<p id="mrs_01_1931__p9690153420385">Configure the following parameters in the <span class="filepath" id="mrs_01_1931__filepath51771720203115"><b>spark-defaults.conf</b></span> file on the Spark client.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t552725e8cecd4358845aa7b394ed53f5" frame="border" border="1" rules="all"><caption><b>Table 6 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r13f6abc617b84808809befadc526b6ce"><th align="left" class="cellrowborder" valign="top" width="26.41%" id="mcps1.3.7.4.2.4.1.1"><p id="mrs_01_1931__a84a76df6916f482fbe7109f17da54ec3"><strong id="mrs_01_1931__ae5664c5cee144d1397a7162126052cdc">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="61.56%" id="mcps1.3.7.4.2.4.1.2"><p id="mrs_01_1931__af7f9ca48075e46379ece842be861a7e5"><strong id="mrs_01_1931__aa6c6e6de537747fb921a028c1727c536">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.030000000000001%" id="mcps1.3.7.4.2.4.1.3"><p id="mrs_01_1931__a950bf1f87f04432e991c3951f8bbcc3c"><strong id="mrs_01_1931__a7a3fcf58a7764701afc7a1535d6dd248">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r610b95c03cf04736a4b9f4bdb1175617"><td class="cellrowborder" valign="top" width="26.41%" headers="mcps1.3.7.4.2.4.1.1 "><p id="mrs_01_1931__adbae00d337604cd8be2f3ba3b217ecb9">spark.streaming.receiver.writeAheadLog.enable</p>
</td>
<td class="cellrowborder" valign="top" width="61.56%" headers="mcps1.3.7.4.2.4.1.2 "><p id="mrs_01_1931__a3812bf27ef834bc8abdbf796ef6274d8">Indicates whether to enable the write-ahead log (WAL) function. After this function is enabled, all input data received by the receiver is saved in the WAL. WAL ensures that data can be restored if the driver program becomes faulty.</p>
</td>
<td class="cellrowborder" valign="top" width="12.030000000000001%" headers="mcps1.3.7.4.2.4.1.3 "><p id="mrs_01_1931__a638e9c4c9f5e47e5a6fb7c94bf4c14bb">false</p>
</td>
</tr>
<tr id="mrs_01_1931__rc9e9452322734c1382d540988c746359"><td class="cellrowborder" valign="top" width="26.41%" headers="mcps1.3.7.4.2.4.1.1 "><p id="mrs_01_1931__aa926c9bbb5824671a8de428046040858">spark.streaming.unpersist</p>
</td>
<td class="cellrowborder" valign="top" width="61.56%" headers="mcps1.3.7.4.2.4.1.2 "><p id="mrs_01_1931__ae74c56273a634ac8b3cd9e4ee17e6acb">Indicates whether to automatically remove RDDs generated and saved by Spark Streaming from the Spark memory. If this function is enabled, original data received by Spark Streaming is also automatically cleared. If this function is disabled, original data and RDDs are not automatically cleared, so external applications can access the data in Streaming. However, this occupies more Spark memory resources.</p>
</td>
<td class="cellrowborder" valign="top" width="12.030000000000001%" headers="mcps1.3.7.4.2.4.1.3 "><p id="mrs_01_1931__a820ea40a697446da98d0819abee94d05">true</p>
</td>
</tr>
</tbody>
</table>
</div>
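<p>For instance, a sketch that enables the WAL while keeping automatic RDD cleanup could look like this. Enabling the WAL is an assumption made for illustration. The WAL is written to the checkpoint directory, so this sketch also assumes that checkpointing is configured in the application.</p>
<pre class="screen"># Save all data received by receivers to the write-ahead log (disabled by default).
spark.streaming.receiver.writeAheadLog.enable true
# Automatically remove RDDs and received data that are no longer needed (the default).
spark.streaming.unpersist true</pre>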
</div>
<div class="section" id="mrs_01_1931__s6b639b4ea1b34e1fa8a3d1c12ab9e0c7"><h4 class="sectiontitle">Spark Streaming Kafka</h4><p id="mrs_01_1931__ac45d913414874d55a872e14a5ea0b477">The receiver is an important component of Spark Streaming. It receives external data, encapsulates the data into blocks, and provides the blocks for Streaming to consume. The most common data source is Kafka. Spark Streaming integrates Kafka to ensure reliability and can directly use Kafka as the RDD input.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t62db078163ed4883a0b4c3496daf02f4" frame="border" border="1" rules="all"><caption><b>Table 7 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r351310b27ba4482c85cd37a692d84ca7"><th align="left" class="cellrowborder" valign="top" width="28.290000000000003%" id="mcps1.3.8.3.2.4.1.1"><p id="mrs_01_1931__aa0496f539fb64d2b966a95620ac7251b"><strong id="mrs_01_1931__a585f41138b1347619d76aff5e31b184c">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="59.680000000000014%" id="mcps1.3.8.3.2.4.1.2"><p id="mrs_01_1931__ae8481732a51546d2a2e72f8cb6ee89bc"><strong id="mrs_01_1931__a450e8d3baf2e47c182aa137a6aa4ca78">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.030000000000001%" id="mcps1.3.8.3.2.4.1.3"><p id="mrs_01_1931__a38c122f3e1754c5eae858a134122f517"><strong id="mrs_01_1931__a0dad9f00b3fa4f5392384778f49e5f82">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r70e83bb117c84d9680e8ed0b10693ffb"><td class="cellrowborder" valign="top" width="28.290000000000003%" headers="mcps1.3.8.3.2.4.1.1 "><p id="mrs_01_1931__ade6313d459cf447aa5c43f2cbfc86bc3">spark.streaming.kafka.maxRatePerPartition</p>
</td>
<td class="cellrowborder" valign="top" width="59.680000000000014%" headers="mcps1.3.8.3.2.4.1.2 "><p id="mrs_01_1931__a492bd07b471647c9bd7bdd8b8db10563">Indicates the maximum rate (number of records per second) for reading data from each Kafka partition if the Kafka direct stream API is used.</p>
</td>
<td class="cellrowborder" valign="top" width="12.030000000000001%" headers="mcps1.3.8.3.2.4.1.3 "><p id="mrs_01_1931__af4c2201257394fd486401c4c60cc65cb">-</p>
</td>
</tr>
<tr id="mrs_01_1931__r03efd270520e4cedad327250543b37c9"><td class="cellrowborder" valign="top" width="28.290000000000003%" headers="mcps1.3.8.3.2.4.1.1 "><p id="mrs_01_1931__aadc8ac0fbf934e68ae78f9f768c40607">spark.streaming.blockInterval</p>
</td>
<td class="cellrowborder" valign="top" width="59.680000000000014%" headers="mcps1.3.8.3.2.4.1.2 "><p id="mrs_01_1931__a093e282c5eb24a1e84911d25a0d2c3ad">Indicates the interval (ms) for accumulating data received by a Spark Streaming receiver into a data block before the data is stored in Spark. A minimum value of 50 ms is recommended.</p>
</td>
<td class="cellrowborder" valign="top" width="12.030000000000001%" headers="mcps1.3.8.3.2.4.1.3 "><p id="mrs_01_1931__a301cb19947b44a07800dc66a95986095">200ms</p>
</td>
</tr>
<tr id="mrs_01_1931__r8b7be47081ce4545b1ca71629544f120"><td class="cellrowborder" valign="top" width="28.290000000000003%" headers="mcps1.3.8.3.2.4.1.1 "><p id="mrs_01_1931__a6b4c5e9cf008419987c7eb27cf2e7913">spark.streaming.receiver.maxRate</p>
</td>
<td class="cellrowborder" valign="top" width="59.680000000000014%" headers="mcps1.3.8.3.2.4.1.2 "><p id="mrs_01_1931__ac3c7f0b4182e4e5aa7ca2936d6547a26">Indicates the maximum rate (number of records per second) for each receiver to receive data. The value <strong id="mrs_01_1931__b732716924113115">0</strong> or a negative value indicates no limit to the rate.</p>
</td>
<td class="cellrowborder" valign="top" width="12.030000000000001%" headers="mcps1.3.8.3.2.4.1.3 "><p id="mrs_01_1931__ac1b1ce3127f64678bcc8f1aeb11f28ac">-</p>
</td>
</tr>
<tr id="mrs_01_1931__re86d3e96a0b24620b8e44b9ac7c921e0"><td class="cellrowborder" valign="top" width="28.290000000000003%" headers="mcps1.3.8.3.2.4.1.1 "><p id="mrs_01_1931__ac0bf801e4d5f40a8aeaa7324aa3b694b">spark.streaming.receiver.writeAheadLog.enable</p>
</td>
<td class="cellrowborder" valign="top" width="59.680000000000014%" headers="mcps1.3.8.3.2.4.1.2 "><p id="mrs_01_1931__a00fba75f7151480d8c8719dbc81891b7">Indicates whether to enable the write-ahead log for receivers. When it is enabled, ReliableKafkaReceiver is used for receiver-based Kafka streams to ensure the integrity of streaming data.</p>
</td>
<td class="cellrowborder" valign="top" width="12.030000000000001%" headers="mcps1.3.8.3.2.4.1.3 "><p id="mrs_01_1931__a74fc1e7b54c6438c870145dd9a997948">false</p>
</td>
</tr>
</tbody>
</table>
</div>
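<p>The rate and interval parameters above can be related with simple arithmetic. The following is a minimal sketch, not part of Spark: the function names and the sample values (1000 records per second, 8 partitions, a 2-second batch) are hypothetical and chosen only for illustration.</p>

```python
def max_records_per_batch(max_rate_per_partition, num_partitions, batch_interval_s):
    """Upper bound on records read in one batch by the Kafka direct stream.

    max_rate_per_partition corresponds to spark.streaming.kafka.maxRatePerPartition.
    """
    return max_rate_per_partition * num_partitions * batch_interval_s


def blocks_per_batch(batch_interval_ms, block_interval_ms=200):
    """Number of blocks one receiver accumulates per batch interval.

    block_interval_ms corresponds to spark.streaming.blockInterval (200 ms by default).
    """
    return batch_interval_ms // block_interval_ms


# Hypothetical values: 1000 records/s per partition, 8 partitions, 2-second batches.
records_cap = max_records_per_batch(1000, 8, 2)
blocks = blocks_per_batch(2000)
```

<p>Under these assumed values, a lower block interval yields more blocks (and therefore more tasks) per batch, while the per-partition rate limit caps how much data each batch can contain.</p>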
</div>
<div class="section" id="mrs_01_1931__scf2ad03bcc8d4954af747700f37a94c4"><h4 class="sectiontitle">Netty/NIO and Hash/Sort Configuration</h4><p id="mrs_01_1931__a56f22b65ccdd46fc97cbc5a21053becb">Shuffle is critical for big data processing, and the network is critical for the entire shuffle process. Currently, Spark supports two shuffle modes: hash and sort. There are two network modes: Netty and NIO.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__tbe33d85cf3614af6b56b9ce0cc54ed20" frame="border" border="1" rules="all"><caption><b>Table 8 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r0b13ba70667d45f8acbdc5528b7bb551"><th align="left" class="cellrowborder" valign="top" width="28.000000000000004%" id="mcps1.3.9.3.2.4.1.1"><p id="mrs_01_1931__adf319b6842ae4c1181e0384fdca41113"><strong id="mrs_01_1931__a53ab71ba40cf4134a65e8d86d3ce93a2">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="60%" id="mcps1.3.9.3.2.4.1.2"><p id="mrs_01_1931__af9a044d0f26d41cba96c49cfbccdf154"><strong id="mrs_01_1931__a81708f750b1b4788b43309c92432ff18">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12%" id="mcps1.3.9.3.2.4.1.3"><p id="mrs_01_1931__abc8807c9d4f44a079e0a7a559c6a24c8"><strong id="mrs_01_1931__a227eeb31796543a28cf3022afdc7e6ef">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r7fb6bca87fff4310a073a4b32c988354"><td class="cellrowborder" valign="top" width="28.000000000000004%" headers="mcps1.3.9.3.2.4.1.1 "><p id="mrs_01_1931__af37557dc47a9401f86184f76619dba0f">spark.shuffle.manager</p>
</td>
<td class="cellrowborder" valign="top" width="60%" headers="mcps1.3.9.3.2.4.1.2 "><p id="mrs_01_1931__a745f069cd6704343975b2c01f09e97df">Indicates the shuffle implementation used to process data. There are two implementation modes: sort and hash. Sort-based shuffle uses memory more efficiently and is the default option in Spark 1.2 and later versions.</p>
</td>
<td class="cellrowborder" valign="top" width="12%" headers="mcps1.3.9.3.2.4.1.3 "><p id="mrs_01_1931__a012e1fd0740e4b8692871fb587e141e0">SORT</p>
</td>
</tr>
<tr id="mrs_01_1931__r5c936d2c5e7542d59f19a2bdb64a40eb"><td class="cellrowborder" valign="top" width="28.000000000000004%" headers="mcps1.3.9.3.2.4.1.1 "><p id="mrs_01_1931__af928b14cdc184e769b708f5a9d9f7fef">spark.shuffle.consolidateFiles</p>
</td>
<td class="cellrowborder" valign="top" width="60%" headers="mcps1.3.9.3.2.4.1.2 "><p id="mrs_01_1931__a0495abd771674cc18de39911ca0cde5d">(Only in hash mode) To merge intermediate files created during shuffle, set this parameter to <strong id="mrs_01_1931__b481162316113115">true</strong>. Decreasing the number of files to be created can improve the processing performance of the file system and reduce risks. If the <strong id="mrs_01_1931__b2003867281113115">ext4</strong> or <strong id="mrs_01_1931__b412427101113115">xfs</strong> file system is used, you are advised to set this parameter to <strong id="mrs_01_1931__b329976849113115">true</strong>. Due to file system restrictions, this setting on <strong id="mrs_01_1931__b1214872893113115">ext3</strong> may reduce the processing performance of a server with more than eight cores.</p>
</td>
<td class="cellrowborder" valign="top" width="12%" headers="mcps1.3.9.3.2.4.1.3 "><p id="mrs_01_1931__a39787a77efbf44c489569972292a2ab9">false</p>
</td>
</tr>
<tr id="mrs_01_1931__r376ee82b592e4c7ebeb044b13f69ac88"><td class="cellrowborder" valign="top" width="28.000000000000004%" headers="mcps1.3.9.3.2.4.1.1 "><p id="mrs_01_1931__a39f32b76ccce425f8ae663b8dc4bac54">spark.shuffle.sort.bypassMergeThreshold</p>
</td>
<td class="cellrowborder" valign="top" width="60%" headers="mcps1.3.9.3.2.4.1.2 "><p id="mrs_01_1931__a532a10ea7a8940df8dcd33f40acfbfe7">This parameter is valid only when <strong id="mrs_01_1931__b257604284113115">spark.shuffle.manager</strong> is set to <strong id="mrs_01_1931__b1189004723113115">sort</strong>. When map-side aggregation is not performed and the number of partitions for Reduce tasks is less than or equal to the value of this parameter, data is not merged or sorted, which prevents performance deterioration caused by unnecessary sorting.</p>
<p id="mrs_01_1931__a18169db2dab8401bbb548143f159ef5c"></p>
</td>
<td class="cellrowborder" valign="top" width="12%" headers="mcps1.3.9.3.2.4.1.3 "><p id="mrs_01_1931__acfe325d2ed40479b81b8ad6ade5c16ba">200</p>
</td>
</tr>
<tr id="mrs_01_1931__r11d3300412eb4f5da43c236297322dac"><td class="cellrowborder" valign="top" width="28.000000000000004%" headers="mcps1.3.9.3.2.4.1.1 "><p id="mrs_01_1931__a1d54a5b04f10472cbb37aaa6b9a7227f">spark.shuffle.io.maxRetries</p>
</td>
<td class="cellrowborder" valign="top" width="60%" headers="mcps1.3.9.3.2.4.1.2 "><p id="mrs_01_1931__a8ac3d1a1b9f340168fa72f2f21142c9c">(Only in Netty mode) If this parameter is set to a non-zero value, fetch failures caused by I/O-related exceptions are automatically retried. This retry logic helps keep large shuffles stable when long GC pauses or intermittent network disconnections occur.</p>
</td>
<td class="cellrowborder" valign="top" width="12%" headers="mcps1.3.9.3.2.4.1.3 "><p id="mrs_01_1931__a5f29820a5138447391f59946f32f774e">12</p>
</td>
</tr>
<tr id="mrs_01_1931__r16860423891648168e136ef1b00b829a"><td class="cellrowborder" valign="top" width="28.000000000000004%" headers="mcps1.3.9.3.2.4.1.1 "><p id="mrs_01_1931__abb0902a7539d46268e6c45ec8bcc7152">spark.shuffle.io.numConnectionsPerPeer</p>
</td>
<td class="cellrowborder" valign="top" width="60%" headers="mcps1.3.9.3.2.4.1.2 "><p id="mrs_01_1931__aa331b748e0654b7594bd906344482e1b">(Only in Netty mode) Connections between hosts are reused to reduce the number of connections in large clusters. In a cluster with many disks but few hosts, this reuse may prevent concurrent requests from using all disks. In this case, you can increase the value of this parameter.</p>
</td>
<td class="cellrowborder" valign="top" width="12%" headers="mcps1.3.9.3.2.4.1.3 "><p id="mrs_01_1931__ac246cfef2af14031810eda8fd018ec24">1</p>
</td>
</tr>
<tr id="mrs_01_1931__raea9e9d61d944d15a50ca36f7410e704"><td class="cellrowborder" valign="top" width="28.000000000000004%" headers="mcps1.3.9.3.2.4.1.1 "><p id="mrs_01_1931__a9e524a60f91244e5a79d11ae06e66903">spark.shuffle.io.preferDirectBufs</p>
</td>
<td class="cellrowborder" valign="top" width="60%" headers="mcps1.3.9.3.2.4.1.2 "><p id="mrs_01_1931__a4693a00cd73d4f6ca97763a53260b7c5">(Only in Netty mode) The off-heap buffer is used to reduce GC during shuffle and cache block transfer. In an environment where off-heap memory is strictly limited, you can disable this parameter to force all Netty allocations to use heap memory.</p>
</td>
<td class="cellrowborder" valign="top" width="12%" headers="mcps1.3.9.3.2.4.1.3 "><p id="mrs_01_1931__a307c25d5478b4094bf660f355d36f443">true</p>
</td>
</tr>
<tr id="mrs_01_1931__r10b17847b1244cdda540d94940913252"><td class="cellrowborder" valign="top" width="28.000000000000004%" headers="mcps1.3.9.3.2.4.1.1 "><p id="mrs_01_1931__a3b3bc038145040ea91f287ea57da9e44">spark.shuffle.io.retryWait</p>
</td>
<td class="cellrowborder" valign="top" width="60%" headers="mcps1.3.9.3.2.4.1.2 "><p id="mrs_01_1931__a88ecd46b0ccb44658e4cac355ea22d4f">(Only in Netty mode) Specifies the wait time between fetch retries, in seconds. The maximum delay caused by retries is <strong id="mrs_01_1931__b780075325113115">maxRetries</strong> x <strong id="mrs_01_1931__b166685056113115">retryWait</strong>. With the default values in this table (12 retries and a 5-second wait), the maximum delay is 60 seconds.</p>
</td>
<td class="cellrowborder" valign="top" width="12%" headers="mcps1.3.9.3.2.4.1.3 "><p id="mrs_01_1931__ab2c5b5fbe8c5426d8d0f4cec8043dba9">5</p>
</td>
</tr>
</tbody>
</table>
</div>
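<p>Two rules in the table above lend themselves to a short illustration: the retry delay formula (<b>maxRetries</b> x <b>retryWait</b>) and the condition under which sort-based shuffle skips merging and sorting. The following is a minimal sketch that only restates the descriptions in this table; the function names are hypothetical and the code does not call any Spark API.</p>

```python
def max_fetch_retry_delay_s(max_retries=12, retry_wait_s=5):
    """Maximum delay caused by fetch retries, in seconds.

    Defaults mirror spark.shuffle.io.maxRetries (12) and
    spark.shuffle.io.retryWait (5) as listed in this table.
    """
    return max_retries * retry_wait_s


def bypasses_merge_sort(map_side_aggregation, num_reduce_partitions, threshold=200):
    """Return True when merge and sort are skipped in sort mode.

    threshold corresponds to spark.shuffle.sort.bypassMergeThreshold.
    """
    return (not map_side_aggregation) and num_reduce_partitions <= threshold


default_delay = max_fetch_retry_delay_s()
small_shuffle = bypasses_merge_sort(False, 150)
large_shuffle = bypasses_merge_sort(False, 500)
```

<p>With the defaults assumed here, raising either <b>spark.shuffle.io.maxRetries</b> or <b>spark.shuffle.io.retryWait</b> lengthens the time a task may wait before a fetch failure is reported.</p>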
</div>
<div class="section" id="mrs_01_1931__s2bd5b91ead3b4b3b92a1f7fe38d2d7c1"><h4 class="sectiontitle">Common Shuffle Configuration</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__tc5ccf855dfb74cef9405f311543e998f" frame="border" border="1" rules="all"><caption><b>Table 9 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r83153d4d0a0a4877809c2a185c0cb6e4"><th align="left" class="cellrowborder" valign="top" width="28.290000000000003%" id="mcps1.3.10.2.2.4.1.1"><p id="mrs_01_1931__a99694619d110439a9922244cbabbe70b"><strong id="mrs_01_1931__a054859cc32d7493b930cee5ab31ed359">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="59.680000000000014%" id="mcps1.3.10.2.2.4.1.2"><p id="mrs_01_1931__aab6576cae8e64405ba26b72286157014"><strong id="mrs_01_1931__ab59c5ff7b91c4c5982c28cd025b35e7a">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.030000000000001%" id="mcps1.3.10.2.2.4.1.3"><p id="mrs_01_1931__ab032d528f1eb430592bf42dd35972adc"><strong id="mrs_01_1931__a93a3de685d3d48769cc64ab95f8926ec">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r38a7a881df884b76988c7e8c88714240"><td class="cellrowborder" valign="top" width="28.290000000000003%" headers="mcps1.3.10.2.2.4.1.1 "><p id="mrs_01_1931__a8be5dd9a41ca4fec95bd299b436ade3f">spark.shuffle.spill</p>
</td>
<td class="cellrowborder" valign="top" width="59.680000000000014%" headers="mcps1.3.10.2.2.4.1.2 "><p id="mrs_01_1931__a4826fd2a742244e7b7b41378e0cac393">If this parameter is set to <span class="parmvalue" id="mrs_01_1931__pef7a9cdab72f4648ae042c8c61b9172a"><b>true</b></span>, data is spilled to disk to limit the memory usage during a Reduce task.</p>
</td>
<td class="cellrowborder" valign="top" width="12.030000000000001%" headers="mcps1.3.10.2.2.4.1.3 "><p id="mrs_01_1931__ac6a429c150a64e31aef959aca410e6b7">true</p>
</td>
</tr>
<tr id="mrs_01_1931__rb09050db6b5346b4a0e34b8dde100c5c"><td class="cellrowborder" valign="top" width="28.290000000000003%" headers="mcps1.3.10.2.2.4.1.1 "><p id="mrs_01_1931__a055f72c7a4a24d7bac4a92b10ed2f710">spark.shuffle.spill.compress</p>
</td>
<td class="cellrowborder" valign="top" width="59.680000000000014%" headers="mcps1.3.10.2.2.4.1.2 "><p id="mrs_01_1931__afea0a0677a3c463e88f29177c2956ab7">Indicates whether to compress the data spilled to disk during shuffle. The algorithm specified by <strong id="mrs_01_1931__b670836352113116">spark.io.compression.codec</strong> is used for data compression.</p>
</td>
<td class="cellrowborder" valign="top" width="12.030000000000001%" headers="mcps1.3.10.2.2.4.1.3 "><p id="mrs_01_1931__a8013452116764a4b9279c2f68212f74a">true</p>
</td>
</tr>
<tr id="mrs_01_1931__r7aeff67901d64ccc8ddef97b951ec855"><td class="cellrowborder" valign="top" width="28.290000000000003%" headers="mcps1.3.10.2.2.4.1.1 "><p id="mrs_01_1931__a7ad384dfbe114b3dac0648764c889fb7">spark.shuffle.file.buffer</p>
</td>
<td class="cellrowborder" valign="top" width="59.680000000000014%" headers="mcps1.3.10.2.2.4.1.2 "><p id="mrs_01_1931__a5b78454a5fbd43f6beabe8aee348bd9d">Specifies the size of the in-memory buffer for the output stream of each shuffle file, in KB. These buffers reduce the number of disk seeks and system calls made during the creation of intermediate shuffle files. You can also set this parameter by setting <strong id="mrs_01_1931__b1786589099113116">spark.shuffle.file.buffer.kb</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="12.030000000000001%" headers="mcps1.3.10.2.2.4.1.3 "><p id="mrs_01_1931__a4808c377af6e4f1487916056662a0fba">32KB</p>
</td>
</tr>
<tr id="mrs_01_1931__rb7afd493e88f40b1a4e4462a022c43af"><td class="cellrowborder" valign="top" width="28.290000000000003%" headers="mcps1.3.10.2.2.4.1.1 "><p id="mrs_01_1931__ab504f1101439454c8a80f6cda344ed0a">spark.shuffle.compress</p>
</td>
<td class="cellrowborder" valign="top" width="59.680000000000014%" headers="mcps1.3.10.2.2.4.1.2 "><p id="mrs_01_1931__aaf33dba41aa94b94aabcd2daaf19aef8">Indicates whether to compress the output files of a Map task. You are advised to enable compression. The algorithm specified by <strong id="mrs_01_1931__b506790833113116">spark.io.compression.codec</strong> is used for data compression.</p>
</td>
<td class="cellrowborder" valign="top" width="12.030000000000001%" headers="mcps1.3.10.2.2.4.1.3 "><p id="mrs_01_1931__a97840c07c0e34e09a9055bc316ecab85">true</p>
</td>
</tr>
<tr id="mrs_01_1931__r08728360a1f04c4fa972c1aae20e0146"><td class="cellrowborder" valign="top" width="28.290000000000003%" headers="mcps1.3.10.2.2.4.1.1 "><p id="mrs_01_1931__ae8bbd08cab154570843ee050533ab4bd">spark.reducer.maxSizeInFlight</p>
</td>
<td class="cellrowborder" valign="top" width="59.680000000000014%" headers="mcps1.3.10.2.2.4.1.2 "><p id="mrs_01_1931__aa5e66be9eab3455aa3279822dc25faf1">Specifies the maximum total size of Map outputs that each Reduce task fetches simultaneously, in MB. Each output requires a buffer, which is a fixed memory overhead of each Reduce task. Therefore, keep the value small unless a large amount of memory is available. You can also set this parameter by setting <strong id="mrs_01_1931__b327074468113116">spark.reducer.maxMbInFlight</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="12.030000000000001%" headers="mcps1.3.10.2.2.4.1.3 "><p id="mrs_01_1931__a7b02c5b936d5415f97d1a41138356814">48MB</p>
</td>
</tr>
</tbody>
</table>
</div>
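<p>Because the fetch buffer is a fixed overhead of each Reduce task, its total cost grows with the number of Reduce tasks that run concurrently in one executor. The following is a rough sketch of that estimate. It assumes that the number of concurrent tasks equals the number of executor cores (4 here, a hypothetical value), and the function name is illustrative only.</p>

```python
def in_flight_buffer_mb(concurrent_reduce_tasks, max_size_in_flight_mb=48):
    """Rough upper bound on fetch buffer memory in one executor, in MB.

    max_size_in_flight_mb corresponds to spark.reducer.maxSizeInFlight
    (48 MB by default); concurrent_reduce_tasks is an assumed value.
    """
    return concurrent_reduce_tasks * max_size_in_flight_mb


# Hypothetical executor running 4 Reduce tasks at the same time.
estimate_mb = in_flight_buffer_mb(4)
```

<p>An estimate of this kind can help decide whether raising <b>spark.reducer.maxSizeInFlight</b> is affordable for a given executor memory size.</p>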
</div>
<div class="section" id="mrs_01_1931__s990e0d21e78e41109198511e99faf173"><h4 class="sectiontitle">Driver Configuration</h4><p id="mrs_01_1931__a76f06545d2bb4d43b4435cfb1dd9ef52">The Spark driver can be regarded as the client of a Spark application. All code parsing is completed in this process, so its parameters are especially important. The driver parameters fall into the following categories:</p>
<ul id="mrs_01_1931__u4ddab913a2ee47ae8a2a40916c7a2421"><li id="mrs_01_1931__lb22ff7649bdf4f03913750c5b7bb13c7"><strong id="mrs_01_1931__b1852145599113116">JavaOptions</strong>: parameter following <span class="parmvalue" id="mrs_01_1931__p9bd200db836447f586714ab7fd6fda8a"><b>-D</b></span> in the Java command, which can be obtained by <strong id="mrs_01_1931__b450827215113116">System.getProperty</strong></li><li id="mrs_01_1931__l3909ddec67cb45f8ab58729c5c4b07bf"><strong id="mrs_01_1931__b1066136752113116">ClassPath</strong>: path for loading the Java classes and Native library</li><li id="mrs_01_1931__lc82f855a8bab4e4590e55e2b4d5c1971"><strong id="mrs_01_1931__b889517959113116">Java Memory and Cores</strong>: memory and CPU usage of the Java process</li><li id="mrs_01_1931__l44405e5a9b054f7cb517ee99ea6c642b"><strong id="mrs_01_1931__b1444415643113116">Spark Configuration</strong>: Spark internal parameter, which is irrelevant to the Java process</li></ul>
<div class="tablenoborder"><a name="mrs_01_1931__t846a81171d4c4af1908c5cf55578f022"></a><a name="t846a81171d4c4af1908c5cf55578f022"></a><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t846a81171d4c4af1908c5cf55578f022" frame="border" border="1" rules="all"><caption><b>Table 10 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r8295959eb9d445099a0efb28b7e5df3f"><th align="left" class="cellrowborder" valign="top" width="23.400000000000002%" id="mcps1.3.11.4.2.4.1.1"><p id="mrs_01_1931__aede8a4be82894b18bc6162211b87fa93"><strong id="mrs_01_1931__a92af67423b14482ca5b412ef87435d0d">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="57.42%" id="mcps1.3.11.4.2.4.1.2"><p id="mrs_01_1931__a864abc0fd306460ebf5d8e27c99df6ff"><strong id="mrs_01_1931__a6b70ba253d4a4b9a9d5ad20b682ddd94">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="19.18%" id="mcps1.3.11.4.2.4.1.3"><p id="mrs_01_1931__afe0045aea5464d3ca783dfaa07b6b770"><strong id="mrs_01_1931__a14f272e40f31495ba0a764fa729c0a57">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__re286ab75887c451393bf2c1e485503d0"><td class="cellrowborder" valign="top" width="23.400000000000002%" headers="mcps1.3.11.4.2.4.1.1 "><p id="mrs_01_1931__a55a1a813b1df4f86aa58ebb49fa66c03">spark.driver.extraJavaOptions</p>
</td>
<td class="cellrowborder" valign="top" width="57.42%" headers="mcps1.3.11.4.2.4.1.2 "><p id="mrs_01_1931__ab6821797529a49008dc578ba89142049">Indicates a series of extra JVM options passed to the driver, for example, GC settings and logging.</p>
<p id="mrs_01_1931__a5158db00e12e48db9e152a262c1c77f0">Note: In client mode, this configuration cannot be set directly in the application using SparkConf because the driver JVM has been started. You can use <strong id="mrs_01_1931__b797228868113116">--driver-java-options</strong> or the default property file to set the parameter.</p>
</td>
<td class="cellrowborder" valign="top" width="19.18%" headers="mcps1.3.11.4.2.4.1.3 "><p id="mrs_01_1931__a118acd2b943f4c958a5536c678fe5401">For details, see <a href="mrs_01_1930.html">Configuring Parameters Rapidly</a>.</p>
</td>
</tr>
<tr id="mrs_01_1931__rd1246d2f31b945a69a3b248d84feb7d7"><td class="cellrowborder" valign="top" width="23.400000000000002%" headers="mcps1.3.11.4.2.4.1.1 "><p id="mrs_01_1931__a7f6b53dcf4594cb49871c89e09e04c44">spark.driver.extraClassPath</p>
</td>
<td class="cellrowborder" valign="top" width="57.42%" headers="mcps1.3.11.4.2.4.1.2 "><p id="mrs_01_1931__aedb2e3eaf2f744cbaa1b36af6c2ad427">Indicates the extra class path entries attached to the class path of the driver.</p>
<p id="mrs_01_1931__a166dff8c7348438ea3a16f8fffe6699e">Note: In client mode, this configuration cannot be set directly in the application using SparkConf because the driver JVM has been started. You can use <strong id="mrs_01_1931__b61112951113116">--driver-class-path</strong> or the default property file to set the parameter.</p>
</td>
<td class="cellrowborder" valign="top" width="19.18%" headers="mcps1.3.11.4.2.4.1.3 "><p id="mrs_01_1931__p124045883912">For details, see <a href="mrs_01_1930.html">Configuring Parameters Rapidly</a>.</p>
</td>
</tr>
<tr id="mrs_01_1931__rfd37645bfccb4b79a8afab1f51a51894"><td class="cellrowborder" valign="top" width="23.400000000000002%" headers="mcps1.3.11.4.2.4.1.1 "><p id="mrs_01_1931__a01968fc1c0174f06a9e5d551db06cc80">spark.driver.userClassPathFirst</p>
</td>
<td class="cellrowborder" valign="top" width="57.42%" headers="mcps1.3.11.4.2.4.1.2 "><p id="mrs_01_1931__ab3e1612d9c134c428f4cb58081ea7552">(Trial) Indicates whether to allow JAR files added by users to take precedence over Spark JAR files when classes are loaded in the driver. This feature can be used to mitigate conflicts between Spark dependencies and user dependencies. This feature is in the trial phase and is used only in cluster mode.</p>
</td>
<td class="cellrowborder" valign="top" width="19.18%" headers="mcps1.3.11.4.2.4.1.3 "><p id="mrs_01_1931__a89e42fbf000a4752bb6f5af299e03f0c">false</p>
</td>
</tr>
<tr id="mrs_01_1931__rb45579b87a8b4a3f98db7d29bdd08cb9"><td class="cellrowborder" valign="top" width="23.400000000000002%" headers="mcps1.3.11.4.2.4.1.1 "><p id="mrs_01_1931__a8632db243aff4779b4ed110b6701bf6e">spark.driver.extraLibraryPath</p>
</td>
<td class="cellrowborder" valign="top" width="57.42%" headers="mcps1.3.11.4.2.4.1.2 "><p id="mrs_01_1931__aec06689a2eb94420b4c26732fe38ffc4">Sets a special library path for starting the driver JVM.</p>
<p id="mrs_01_1931__aed9653c913e8477795e769ea67b62245">Note: In client mode, this configuration cannot be set directly in the application using SparkConf because the driver JVM has been started. You can use <strong id="mrs_01_1931__b2055353439113116">--driver-library-path</strong> or the default property file to set the parameter.</p>
</td>
<td class="cellrowborder" valign="top" width="19.18%" headers="mcps1.3.11.4.2.4.1.3 "><ul id="mrs_01_1931__ul16242185128"><li id="mrs_01_1931__li6679144981215">JDBCServer2x:<p id="mrs_01_1931__p2351852181211"><a name="mrs_01_1931__li6679144981215"></a><a name="li6679144981215"></a>${SPARK_INSTALL_HOME}/spark/native</p>
</li><li id="mrs_01_1931__li118403811124">SparkResource2x:<p id="mrs_01_1931__p20561417151318"><a name="mrs_01_1931__li118403811124"></a><a name="li118403811124"></a>${DATA_NODE_INSTALL_HOME}/hadoop/lib/native</p>
</li></ul>
</td>
</tr>
<tr id="mrs_01_1931__rf3ee95f63c244e228cbb3ff972cf7c23"><td class="cellrowborder" valign="top" width="23.400000000000002%" headers="mcps1.3.11.4.2.4.1.1 "><p id="mrs_01_1931__a9ce9988eb0ec404aa91f5dcafdba0297">spark.driver.cores</p>
</td>
<td class="cellrowborder" valign="top" width="57.42%" headers="mcps1.3.11.4.2.4.1.2 "><p id="mrs_01_1931__adda49d00f41241ed9a2544dd656a93e3">Specifies the number of cores used by the driver process. This parameter is available only in cluster mode.</p>
</td>
<td class="cellrowborder" valign="top" width="19.18%" headers="mcps1.3.11.4.2.4.1.3 "><p id="mrs_01_1931__aa71ade98d3e74c9dbd4ae02c0cc0d882">1</p>
</td>
</tr>
<tr id="mrs_01_1931__r7da7eee514954fe4af5fade61419130f"><td class="cellrowborder" valign="top" width="23.400000000000002%" headers="mcps1.3.11.4.2.4.1.1 "><p id="mrs_01_1931__a1835600b2e514d2a847d3eefa9f02def">spark.driver.memory</p>
</td>
<td class="cellrowborder" valign="top" width="57.42%" headers="mcps1.3.11.4.2.4.1.2 "><p id="mrs_01_1931__a485f98377ae24013b6f3beebd6b80006">Indicates the amount of memory used by the driver process, that is, the process in which SparkContext is initialized (for example, 512M or 2G).</p>
<p id="mrs_01_1931__a8c8cf912bee541a58dfcd13e5b08347e">Note: In client mode, this configuration cannot be set directly in the application using SparkConf because the driver JVM has been started. You can use <strong id="mrs_01_1931__b167469609113116">--driver-memory</strong> or the default property file to set the parameter.</p>
</td>
<td class="cellrowborder" valign="top" width="19.18%" headers="mcps1.3.11.4.2.4.1.3 "><p id="mrs_01_1931__a29bf63e18a0347dbaf219dd4e9ed8fc4">4G</p>
</td>
</tr>
<tr id="mrs_01_1931__r9e7544e92f8644febec534624592f588"><td class="cellrowborder" valign="top" width="23.400000000000002%" headers="mcps1.3.11.4.2.4.1.1 "><p id="mrs_01_1931__a9c0405bd4fee403786a4a45ed49e4f3b">spark.driver.maxResultSize</p>
</td>
<td class="cellrowborder" valign="top" width="57.42%" headers="mcps1.3.11.4.2.4.1.2 "><p id="mrs_01_1931__a3a9d79264cfd44a2830748dbdfdad211">Indicates the maximum total size of the serialized results of all partitions for each Spark action operation (for example, collect). The value must be at least 1 MB. If this parameter is set to <strong id="mrs_01_1931__b149151958113116">0</strong>, the size is not limited. If the total size exceeds this limit, the job is aborted. If the limit is too high, the driver may run out of memory (depending on <strong id="mrs_01_1931__b1649106688113116">spark.driver.memory</strong> and the memory overhead of objects in the JVM). Set a proper limit to ensure sufficient memory for the driver.</p>
</td>
<td class="cellrowborder" valign="top" width="19.18%" headers="mcps1.3.11.4.2.4.1.3 "><p id="mrs_01_1931__a49ce6d3540a24d3687bdab1a72f3ad59">1G</p>
</td>
</tr>
<tr id="mrs_01_1931__r449c9a5b757e4962948e54a56f339d8d"><td class="cellrowborder" valign="top" width="23.400000000000002%" headers="mcps1.3.11.4.2.4.1.1 "><p id="mrs_01_1931__a218fe8725c8949ecaf0b9cd0c929947b">spark.driver.host</p>
</td>
<td class="cellrowborder" valign="top" width="57.42%" headers="mcps1.3.11.4.2.4.1.2 "><p id="mrs_01_1931__a274ae3e9398b409c9c4ba7d8f2e0cb74">Specifies the host name or IP address for the driver to listen on, which is used for the driver to communicate with the executor.</p>
</td>
<td class="cellrowborder" valign="top" width="19.18%" headers="mcps1.3.11.4.2.4.1.3 "><p id="mrs_01_1931__a6b7aa6ecad7a4766ac04018fb9043308">(local hostname)</p>
<p id="mrs_01_1931__a3b67d60491154b45971a0e549fd7886c"></p>
</td>
</tr>
<tr id="mrs_01_1931__re28d63ae282e43939c45f416caa38a11"><td class="cellrowborder" valign="top" width="23.400000000000002%" headers="mcps1.3.11.4.2.4.1.1 "><p id="mrs_01_1931__a9d16dc8dd04642578f2d4fe291560dd2">spark.driver.port</p>
</td>
<td class="cellrowborder" valign="top" width="57.42%" headers="mcps1.3.11.4.2.4.1.2 "><p id="mrs_01_1931__afd95dca10f2e491cac890c1011533754">Specifies the port for the driver to listen on, which is used for the driver to communicate with the executor.</p>
</td>
<td class="cellrowborder" valign="top" width="19.18%" headers="mcps1.3.11.4.2.4.1.3 "><p id="mrs_01_1931__a85f74546e2f144edb2b0e21fb7fe6e53">(random)</p>
</td>
</tr>
</tbody>
</table>
</div>
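<p>In client mode, the driver JVM settings must be supplied before the driver starts, for example on the <b>spark-submit</b> command line. The following is a minimal sketch of how such a command line could be assembled. The helper function and the sample values are hypothetical; the options it emits (<b>--driver-memory</b>, <b>--driver-java-options</b>, <b>--driver-class-path</b>, and <b>--driver-library-path</b>) are the standard spark-submit options that correspond to <b>spark.driver.memory</b>, <b>spark.driver.extraJavaOptions</b>, <b>spark.driver.extraClassPath</b>, and <b>spark.driver.extraLibraryPath</b>.</p>

```python
def driver_submit_args(deploy_mode, memory=None, java_options=None,
                       class_path=None, library_path=None):
    """Build the leading part of a spark-submit command as a list of arguments.

    Only options that were supplied are added. The application JAR and
    its arguments are intentionally left out of this sketch.
    """
    args = ["spark-submit", "--master", "yarn", "--deploy-mode", deploy_mode]
    if memory:
        args += ["--driver-memory", memory]
    if java_options:
        args += ["--driver-java-options", java_options]
    if class_path:
        args += ["--driver-class-path", class_path]
    if library_path:
        args += ["--driver-library-path", library_path]
    return args


# Hypothetical values, for illustration only.
cmd = driver_submit_args("client", memory="4G",
                         java_options="-Dexample.flag=true")
```

<p>Passing the arguments as a list rather than a single string avoids shell quoting problems when a Java option contains spaces.</p>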
</div>
<div class="section" id="mrs_01_1931__s80b21017c9f049b2bdf1795f164b45b9"><h4 class="sectiontitle">ExecutorLauncher Configuration</h4><p id="mrs_01_1931__af80e6314ae2c4e729ec32c7b19f29e35">ExecutorLauncher exists only in Yarn-client mode. In Yarn-client mode, ExecutorLauncher and the driver are not in the same process. Therefore, you need to configure parameters for ExecutorLauncher.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t1ef0774261d14a90a95272b739713705" frame="border" border="1" rules="all"><caption><b>Table 11 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r638e026d85434dd29732123ba2e22822"><th align="left" class="cellrowborder" valign="top" width="28.29%" id="mcps1.3.12.3.2.4.1.1"><p id="mrs_01_1931__a855eddf44d9a4c988d5635485b908068"><strong id="mrs_01_1931__ac9461a6e328c457c8e62ed0fbd477245">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="58.18%" id="mcps1.3.12.3.2.4.1.2"><p id="mrs_01_1931__a3c0fedfe997f425bb98991b7a46d5002"><strong id="mrs_01_1931__a002f39a8e19942b896860ce25fceed61">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="13.530000000000001%" id="mcps1.3.12.3.2.4.1.3"><p id="mrs_01_1931__a9d703bc553b64226bf14a303f72848e3"><strong id="mrs_01_1931__ae57adcddda9740fcb1877ddb55cbe7d9">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__ra69fb56da7344cac9ce65fdb62ed6222"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.12.3.2.4.1.1 "><p id="mrs_01_1931__aeae9d24fefbd44418632786a9a79298a">spark.yarn.am.extraJavaOptions</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.12.3.2.4.1.2 "><p id="mrs_01_1931__a66cf537fe1e146b588bca9cae6c9072d">Indicates a string of extra JVM options to pass to the YARN ApplicationMaster in client mode. Use <strong id="mrs_01_1931__b830107071113116">spark.driver.extraJavaOptions</strong> in cluster mode.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.12.3.2.4.1.3 "><p id="mrs_01_1931__p1612316408453">For details, see <a href="mrs_01_1930.html">Configuring Parameters Rapidly</a>.</p>
</td>
</tr>
<tr id="mrs_01_1931__ra015a36bfd674441b46b1f4557123bbb"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.12.3.2.4.1.1 "><p id="mrs_01_1931__a3052112d6f8548efb7bdd737f918f6a0">spark.yarn.am.memory</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.12.3.2.4.1.2 "><p id="mrs_01_1931__ad6e06ad2b8914f12b71b8a08f01c48a0">Indicates the amount of memory to use for the YARN ApplicationMaster in client mode, in the same format as JVM memory strings (for example, 512m or 2g). In cluster mode, use <strong id="mrs_01_1931__b835892616113116">spark.driver.memory</strong> instead.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.12.3.2.4.1.3 "><p id="mrs_01_1931__a9c79836f6561417392c44db32440ec85">1G</p>
</td>
</tr>
<tr id="mrs_01_1931__rb8ec6c19ce7047b2ab43f0166f9499f4"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.12.3.2.4.1.1 "><p id="mrs_01_1931__afbbfd6984e154762bc4c27eb74ba5f99">spark.yarn.am.memoryOverhead</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.12.3.2.4.1.2 "><p id="mrs_01_1931__a6be2a782b49640c9875f9dceea4a634f">This parameter is the same as <span class="parmname" id="mrs_01_1931__p697de52ae3674b718e4408b02eeab6c5"><b>spark.yarn.driver.memoryOverhead</b></span>. However, this parameter applies only to ApplicationMaster in client mode.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.12.3.2.4.1.3 "><p id="mrs_01_1931__a66415168fce441f198c2ca0c3cf98e4c">-</p>
</td>
</tr>
<tr id="mrs_01_1931__rf861720168fe457099bffb92c6adc913"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.12.3.2.4.1.1 "><p id="mrs_01_1931__a85337dcddcce46ef8641ab7c367fde43">spark.yarn.am.cores</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.12.3.2.4.1.2 "><p id="mrs_01_1931__a1e0a9f2f491f4c219a0740046c89408e">Indicates the number of cores to use for the YARN ApplicationMaster in client mode. Use <strong id="mrs_01_1931__b1106285449113116">spark.driver.cores</strong> in cluster mode.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.12.3.2.4.1.3 "><p id="mrs_01_1931__a1b15294254d3432081593b607a470afb">1</p>
</td>
</tr>
</tbody>
</table>
</div>
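<p>The table above pairs each client-mode ApplicationMaster parameter with the driver parameter to use in cluster mode. The following is a small sketch of that pairing as a lookup table. The dictionary and function names are hypothetical; the parameter names are those listed in this section.</p>

```python
# Parameter to set for the ApplicationMaster process, by deploy mode.
AM_PARAMS_BY_MODE = {
    "client": {
        "java_options": "spark.yarn.am.extraJavaOptions",
        "memory": "spark.yarn.am.memory",
        "cores": "spark.yarn.am.cores",
    },
    "cluster": {
        "java_options": "spark.driver.extraJavaOptions",
        "memory": "spark.driver.memory",
        "cores": "spark.driver.cores",
    },
}


def am_param(deploy_mode, setting):
    """Return the parameter name that controls the given setting."""
    return AM_PARAMS_BY_MODE[deploy_mode][setting]


client_memory_key = am_param("client", "memory")
cluster_memory_key = am_param("cluster", "memory")
```

<p>A lookup of this kind can be useful in submission scripts that must support both deploy modes without duplicating configuration logic.</p>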
</div>
<div class="section" id="mrs_01_1931__sc117672c09774809ab1da11fa1055ddd"><h4 class="sectiontitle">Executor Configuration</h4><p id="mrs_01_1931__a101cf78231b54d2ab2143f30f8bef6d0">An executor is a Java process. However, unlike the driver and ApplicationMaster, multiple executor processes can run at the same time. Spark supports only identical configurations for them. That is, the process parameters of all executors must be the same.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t8a7e567c60ff4f148b5721f76b512791" frame="border" border="1" rules="all"><caption><b>Table 12 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r3e4bb0cae1854a2ea32795732163f305"><th align="left" class="cellrowborder" valign="top" width="28.29%" id="mcps1.3.13.3.2.4.1.1"><p id="mrs_01_1931__a464975e84e994960ac4d65c8f3934517"><strong id="mrs_01_1931__a994aa0a98a934cee989ab1a2f96b431e">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="58.18%" id="mcps1.3.13.3.2.4.1.2"><p id="mrs_01_1931__aaba8dc12da6b426b910b3fc6c2b90680"><strong id="mrs_01_1931__ac06d4ed2fd474b9ba98a3b1eab865079">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="13.530000000000001%" id="mcps1.3.13.3.2.4.1.3"><p id="mrs_01_1931__a0f9a5bc8ba63441387b9f83d13b70d42"><strong id="mrs_01_1931__aaebcfa0d75614cbdb1bba582aa4c6d0f">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__rdb66c178f69f448b903afe5c1993a451"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.13.3.2.4.1.1 "><p id="mrs_01_1931__afd6eac5e6bc7497d930081bff2c1ab11">spark.executor.extraJavaOptions</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.13.3.2.4.1.2 "><p id="mrs_01_1931__a4e021a041e024fedb228a1fb826333bf">Indicates extra JVM options passed to executors, for example, GC settings and logging options. Do not set Spark properties or the heap size using this option. Instead, set Spark properties using the SparkConf object or the <strong id="mrs_01_1931__b1966447375113116">spark-defaults.conf</strong> file used when the spark-submit script is called. Set the heap size using <strong id="mrs_01_1931__b661552857113116">spark.executor.memory</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.13.3.2.4.1.3 "><p id="mrs_01_1931__a90a961473bdb479e990f1e171ed9262d">For details, see <a href="mrs_01_1930.html">Configuring Parameters Rapidly</a>.</p>
</td>
</tr>
<tr id="mrs_01_1931__r0a552b96474444e3bb431dbc9487dbfe"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.13.3.2.4.1.1 "><p id="mrs_01_1931__a3fa4bd9af2f747bc8dfa8926abcc3b9f">spark.executor.extraClassPath</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.13.3.2.4.1.2 "><p id="mrs_01_1931__a6bb11e0126f14864bcd72b237a264b8c">Indicates extra classpath entries added to the executor classpath. This parameter exists mainly for backward compatibility with earlier Spark versions. Generally, you do not need to set this parameter.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.13.3.2.4.1.3 "><p id="mrs_01_1931__a0eecac39cc5b45179de6fdc824861c12">-</p>
</td>
</tr>
<tr id="mrs_01_1931__r4502a369c486475fbe7bc2bc950e07e0"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.13.3.2.4.1.1 "><p id="mrs_01_1931__ae84b648aa0f34482a1c3724c494d1e8d">spark.executor.extraLibraryPath</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.13.3.2.4.1.2 "><p id="mrs_01_1931__a97950153a0d9473e951cd1694d7a052d">Sets the special library path used when the executor JVM is started.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.13.3.2.4.1.3 "><p id="mrs_01_1931__p1150312235483">For details, see <a href="mrs_01_1930.html">Configuring Parameters Rapidly</a>.</p>
</td>
</tr>
<tr id="mrs_01_1931__r92927a22e8164744b37d5f734b4b190c"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.13.3.2.4.1.1 "><p id="mrs_01_1931__a9b564f425907497e995a515a39d317b0">spark.executor.userClassPathFirst</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.13.3.2.4.1.2 "><p id="mrs_01_1931__a0f5394ee4f4649a6b08be25210431a44">(Experimental) Same function as <strong id="mrs_01_1931__b427107963113116">spark.driver.userClassPathFirst</strong>. However, this parameter applies to executor instances.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.13.3.2.4.1.3 "><p id="mrs_01_1931__a4afb01ca655648419bcace4100256e00">false</p>
</td>
</tr>
<tr id="mrs_01_1931__r0b0020d2dbf24e90a430f5d1f45a4999"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.13.3.2.4.1.1 "><p id="mrs_01_1931__a601246d5ef984bf987d1d55efc5ecfe7">spark.executor.memory</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.13.3.2.4.1.2 "><p id="mrs_01_1931__a39ef2ebe37c54010906b7172865640cf">Indicates the amount of memory used by each executor process. The value is a string in the same format as JVM memory strings (for example, 512m or 2g).</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.13.3.2.4.1.3 "><p id="mrs_01_1931__a6a39042699fd45fe83c832776827f942">4G</p>
</td>
</tr>
<tr id="mrs_01_1931__r219142fd5828404ebb31293db96ca077"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.13.3.2.4.1.1 "><p id="mrs_01_1931__a470f02145e22400b915cc126c0bea155">spark.executorEnv.[EnvironmentVariableName]</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.13.3.2.4.1.2 "><p id="mrs_01_1931__ade2dfaad5fca441fb540135c0d9021b1">Adds the environment variable specified by <strong id="mrs_01_1931__b1900808321113116">EnvironmentVariableName</strong> to the executor process. You can specify multiple environment variables.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.13.3.2.4.1.3 "><p id="mrs_01_1931__a239e1040e82d4a4eae75c3ab5964f7f4">-</p>
</td>
</tr>
<tr id="mrs_01_1931__r528bf28c16914068bb9e91c760791da4"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.13.3.2.4.1.1 "><p id="mrs_01_1931__a0deb0fef50c545d99b935c82ccf6b4ad">spark.executor.logs.rolling.maxRetainedFiles</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.13.3.2.4.1.2 "><p id="mrs_01_1931__acf1e0b442e9c4b828cbf647b3bc83253">Sets the number of latest log files to be retained by the system during rolling. The old log files are deleted. This function is disabled by default.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.13.3.2.4.1.3 "><p id="mrs_01_1931__aee05cd8a6e2a449ab93d205f67c10e64">-</p>
</td>
</tr>
<tr id="mrs_01_1931__re5bd3136925c4b7aa2fd1b16f0602159"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.13.3.2.4.1.1 "><p id="mrs_01_1931__a66eede6c0c874f45a01422088e6bef91">spark.executor.logs.rolling.size.maxBytes</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.13.3.2.4.1.2 "><p id="mrs_01_1931__a2c9f357b00114a5cb0452452887ff164">Sets the maximum size of the executor log file for rolling. This function is disabled by default. The value is in bytes. To automatically clear old logs, see <strong id="mrs_01_1931__b919141855113116">spark.executor.logs.rolling.maxRetainedFiles</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.13.3.2.4.1.3 "><p id="mrs_01_1931__aa4f52ff6b5bd4659be98c91078f17b52">-</p>
</td>
</tr>
<tr id="mrs_01_1931__r82d5c02fb8f0454e925e1f7003b201f2"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.13.3.2.4.1.1 "><p id="mrs_01_1931__a05db71bf2f3445a2828922e023001bdd">spark.executor.logs.rolling.strategy</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.13.3.2.4.1.2 "><p id="mrs_01_1931__a3e7fbbdc1f8444b2ac952fce6c09b8d3">Sets the executor log rolling policy. Rolling is disabled by default. The value can be <strong id="mrs_01_1931__b457514208113116">time</strong> (time-based rolling) or <strong id="mrs_01_1931__b1001462513113116">size</strong> (size-based rolling). If this parameter is set to <strong id="mrs_01_1931__b576496085113116">time</strong>, the value of the <strong id="mrs_01_1931__b349667786113116">spark.executor.logs.rolling.time.interval</strong> attribute is used as the log rolling interval. If this parameter is set to <strong id="mrs_01_1931__b1790546423113116">size</strong>, <strong id="mrs_01_1931__b1378970792113116">spark.executor.logs.rolling.size.maxBytes</strong> is used to set the maximum size of the file for rolling.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.13.3.2.4.1.3 "><p id="mrs_01_1931__a6d8fd126ec7a4aee8140f784db56e3d0">-</p>
</td>
</tr>
<tr id="mrs_01_1931__r9947d743968c412588f1dfd4e60e4627"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.13.3.2.4.1.1 "><p id="mrs_01_1931__a4f81449b7b2947b0bd186c2b7a5ccf04">spark.executor.logs.rolling.time.interval</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.13.3.2.4.1.2 "><p id="mrs_01_1931__ae44aa96ca8224a478108a0d8574c7f82">Sets the time interval for executor log rolling. This function is disabled by default. The value can be <strong id="mrs_01_1931__b1074536419113116">daily</strong>, <strong id="mrs_01_1931__b1415027228113116">hourly</strong>, <strong id="mrs_01_1931__b78166901113116">minutely</strong>, or any number of seconds. To automatically clear old logs, see <strong id="mrs_01_1931__b1475178267113116">spark.executor.logs.rolling.maxRetainedFiles</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.13.3.2.4.1.3 "><p id="mrs_01_1931__a71447cbc722c4ab5837bc4c8bb6f610d">daily</p>
</td>
</tr>
</tbody>
</table>
</div>
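<p>The following is a minimal sketch, not part of Spark or MRS, of how the executor settings in the table above fit together. The helper name <strong>executor_conf_lines</strong> and its arguments are hypothetical. It renders lines in the key-value form used by <strong>spark-defaults.conf</strong>, rejects a <strong>-Xmx</strong> flag in the extra JVM options (heap size belongs in <strong>spark.executor.memory</strong>), and emits only the rolling parameter that matches the selected rolling strategy.</p>

```python
# Hypothetical helper for illustration only; it is not a Spark or MRS API.
def executor_conf_lines(memory, java_opts="", rolling=None,
                        interval=None, max_bytes=None, retained=None):
    """Return spark-defaults.conf lines for the executor settings."""
    # Heap size must be set with spark.executor.memory, not with JVM options.
    for opt in java_opts.split():
        if opt.startswith("-Xmx"):
            raise ValueError("set heap size with spark.executor.memory, not " + opt)

    lines = ["spark.executor.memory " + memory]
    if java_opts:
        lines.append("spark.executor.extraJavaOptions " + java_opts)

    if rolling == "time":
        # Time-based rolling uses spark.executor.logs.rolling.time.interval.
        lines.append("spark.executor.logs.rolling.strategy time")
        lines.append("spark.executor.logs.rolling.time.interval "
                     + str(interval or "daily"))
    elif rolling == "size":
        # Size-based rolling uses spark.executor.logs.rolling.size.maxBytes.
        if max_bytes is None:
            raise ValueError("size-based rolling needs max_bytes")
        lines.append("spark.executor.logs.rolling.strategy size")
        lines.append("spark.executor.logs.rolling.size.maxBytes " + str(max_bytes))
    elif rolling is not None:
        raise ValueError("rolling must be 'time', 'size', or None")

    # Old rolled files are cleared only if a retention count is given.
    if rolling and retained is not None:
        lines.append("spark.executor.logs.rolling.maxRetainedFiles " + str(retained))
    return lines


if __name__ == "__main__":
    # The values below are arbitrary examples, not recommended settings.
    for line in executor_conf_lines("4G", "-XX:+UseG1GC", rolling="size",
                                    max_bytes=134217728, retained=5):
        print(line)
```

<p>Because all executors must share the same process parameters, one set of lines like this applies to every executor of the application.</p>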
</div>
<div class="section" id="mrs_01_1931__sa850f74db37948e3981a5ce2449e6833"><h4 class="sectiontitle">WebUI</h4><p id="mrs_01_1931__add4d24117da344739887a391e3e5cf3c">The Web UI displays the running process and status of the Spark application.</p>
<div class="tablenoborder"><a name="mrs_01_1931__t681877b034a54c50a58b9e1864345ee4"></a><a name="t681877b034a54c50a58b9e1864345ee4"></a><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t681877b034a54c50a58b9e1864345ee4" frame="border" border="1" rules="all"><caption><b>Table 13 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r2e2dfb7b7f744220ae16c49866a6f95a"><th align="left" class="cellrowborder" valign="top" width="28.29%" id="mcps1.3.14.3.2.4.1.1"><p id="mrs_01_1931__a10a131446519457d95d83df6d29783b4"><strong id="mrs_01_1931__a7c3b58a1e4eb49ddb88ee6be1a8b42c7">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="58.18%" id="mcps1.3.14.3.2.4.1.2"><p id="mrs_01_1931__ab888d01693e54d608a931dccb2433104"><strong id="mrs_01_1931__a8f6670a389604d368bbca32bc22b67e7">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="13.530000000000001%" id="mcps1.3.14.3.2.4.1.3"><p id="mrs_01_1931__ad783c0f2162a4a6f8154af045f457152"><strong id="mrs_01_1931__aa901087d3a50428e81be9b54603147e7">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r531baf01e5e947d2b0c721dfb2e5858b"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.14.3.2.4.1.1 "><p id="mrs_01_1931__a202e02074e714a36a341106b2b0057c6">spark.ui.killEnabled</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.14.3.2.4.1.2 "><p id="mrs_01_1931__a73d5ec93a54d4a10bfc7eb6dfbaac974">Allows stages and jobs to be stopped on the web UI.</p>
<div class="note" id="mrs_01_1931__n8f69b7bc25164b90bb06d8a017687530"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="mrs_01_1931__a30a73a962a94433d9954f19584adf58e">For security purposes, the default value of this parameter is set to <strong id="mrs_01_1931__b1482558711113116">false</strong> to prevent misoperations. To enable this function, set this parameter to <strong id="mrs_01_1931__b314643566113116">true</strong> in the <strong id="mrs_01_1931__b674737933113116">spark-defaults.conf</strong> configuration file. Exercise caution when performing this operation.</p>
</div></div>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.14.3.2.4.1.3 "><p id="mrs_01_1931__a011ad8fe28424a93817310e368b1504a">true</p>
</td>
</tr>
<tr id="mrs_01_1931__r9e714909e3a94851aeb463c1c825b0f6"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.14.3.2.4.1.1 "><p id="mrs_01_1931__ac1162a420b084d2590c03e0f156ff1e7">spark.ui.port</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.14.3.2.4.1.2 "><p id="mrs_01_1931__ae75cc4ba2a454d3aa0d9c7a39a5b345c">Specifies the port for your application's dashboard, which displays memory and workload data.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.14.3.2.4.1.3 "><p id="mrs_01_1931__a8d9e7b95894247e6b33b57a8b408ade3"></p>
<ul id="mrs_01_1931__ubdf38b51c9604d5690fd357f652ed4ac"><li id="mrs_01_1931__l632e5933e0f34950b67161e836fb39ca">JDBCServer2x: <strong id="mrs_01_1931__b2414153412525">4040</strong></li><li id="mrs_01_1931__l5a59d7fe6ab34d5185ce54ded451b0d8">SparkResource2x: 0</li><li id="mrs_01_1931__li1114171211536">IndexServer2x: 22901</li></ul>
</td>
</tr>
<tr id="mrs_01_1931__r8c4773027d314cfdaa0bf7d275107bb0"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.14.3.2.4.1.1 "><p id="mrs_01_1931__aab78a6ffb83e4940a75c6423753ed602">spark.ui.retainedJobs</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.14.3.2.4.1.2 "><p id="mrs_01_1931__a6e105e4acc954f40b9b7e1e194d627ba">Specifies the number of jobs that the Spark UI and status API retain before garbage collection.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.14.3.2.4.1.3 "><p id="mrs_01_1931__a3dc7fa160bee49ab98470965ffcc0d40">1000</p>
</td>
</tr>
<tr id="mrs_01_1931__rb532fd5647844b81be7c6b390b866911"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.14.3.2.4.1.1 "><p id="mrs_01_1931__a41ff4858b7b8479b898c3bfd363c7da8">spark.ui.retainedStages</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.14.3.2.4.1.2 "><p id="mrs_01_1931__a00600ea19ff94c2d8773f51b670e2473">Specifies the number of stages that the Spark UI and status API retain before garbage collection.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.14.3.2.4.1.3 "><p id="mrs_01_1931__af9b6d99b75f9487cbb9ea8cae7d03d54">1000</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="mrs_01_1931__sca30cd5653134732b7cd5cc5bc994058"><h4 class="sectiontitle">HistoryServer</h4><p id="mrs_01_1931__a5454a8cce7e5482ca7fb5f2d8f16cdac">A History Server reads the <strong id="mrs_01_1931__b1951354192113116">EventLog</strong> file in the file system and displays the running status of the Spark application.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t5e011e7472e947b895a0e2d4a788db0c" frame="border" border="1" rules="all"><caption><b>Table 14 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r2a2ce206363f49d6a8c24440c05c121a"><th align="left" class="cellrowborder" valign="top" width="28.29%" id="mcps1.3.15.3.2.4.1.1"><p id="mrs_01_1931__ac6e3b12cb7fb4924bef0e51bca68693a"><strong id="mrs_01_1931__a768a4cac3a9948bc8cb933055be6b227">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="58.18%" id="mcps1.3.15.3.2.4.1.2"><p id="mrs_01_1931__a55a86ba5561345c9998289755f4f9562"><strong id="mrs_01_1931__acbbd5c2b0d314943bdf19286643ea0be">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="13.530000000000001%" id="mcps1.3.15.3.2.4.1.3"><p id="mrs_01_1931__a46db229b06fe46eb851c912cf4605ef6"><strong id="mrs_01_1931__a2e65eeca0cc745479acbb65651b1e266">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r5f21f442f5534fc1a82f9cfa2cd28ab4"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.15.3.2.4.1.1 "><p id="mrs_01_1931__acf4f232a989243eeb4d2995b87e06d57">spark.history.fs.logDirectory</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.15.3.2.4.1.2 "><p id="mrs_01_1931__a81ce84b1d6fe48deba131843bca4cd3e">Specifies the log directory of a History Server.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.15.3.2.4.1.3 "><p id="mrs_01_1931__a31f9cd2f2617414fb30e3fecaade60dd">-</p>
</td>
</tr>
<tr id="mrs_01_1931__r6cf31c8815fc4b91aadc7c242198cf4e"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.15.3.2.4.1.1 "><p id="mrs_01_1931__a8a8c660d86f14cc8a80ab2dd2c61e8db">spark.history.ui.port</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.15.3.2.4.1.2 "><p id="mrs_01_1931__a4606125ab9b54f2397a628610db7480b">Specifies the port on which JobHistory listens for connections.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.15.3.2.4.1.3 "><p id="mrs_01_1931__p18350192918275">18080</p>
</td>
</tr>
<tr id="mrs_01_1931__r40f9fbe599fa4c33918db79ca6071a4d"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.15.3.2.4.1.1 "><p id="mrs_01_1931__a4521a73f3c3b4caeb5ec02eec502f39d">spark.history.fs.updateInterval</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.15.3.2.4.1.2 "><p id="mrs_01_1931__af1f2aa38329b437c94468105444a2d36">Specifies the update interval of the information displayed on a History Server, in seconds. Each update checks for changes made to the event logs in the persistent store.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.15.3.2.4.1.3 "><p id="mrs_01_1931__aa9986751365f41009af6f0a8d6aa9983">10s</p>
</td>
</tr>
<tr id="mrs_01_1931__r62f856c6cdcb4c1ea494bdc9a7050fe0"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.15.3.2.4.1.1 "><p id="mrs_01_1931__a67c4e8c6b1224f099d40b6e46c543099">spark.history.fs.update.interval.seconds</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.15.3.2.4.1.2 "><p id="mrs_01_1931__aed83e1173cc94d90b197488fa2bacfdf">Specifies the interval for checking the update of each event log. This parameter has the same function as <strong id="mrs_01_1931__b169774127113116">spark.history.fs.updateInterval</strong>. <strong id="mrs_01_1931__b928206589113116">spark.history.fs.updateInterval</strong> is recommended.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.15.3.2.4.1.3 "><p id="mrs_01_1931__a519bd7c09d114c07901bfa63da795ab0">10s</p>
</td>
</tr>
<tr id="mrs_01_1931__r2a2832f2a0b1457e97ef5128eeaacb2d"><td class="cellrowborder" valign="top" width="28.29%" headers="mcps1.3.15.3.2.4.1.1 "><p id="mrs_01_1931__abe066a479ec044e0b8de46b6a7c174f8">spark.history.updateInterval</p>
</td>
<td class="cellrowborder" valign="top" width="58.18%" headers="mcps1.3.15.3.2.4.1.2 "><p id="mrs_01_1931__a66003088d09d46219e048e347615454b">This parameter has the same function as <strong id="mrs_01_1931__b749334320113116">spark.history.fs.update.interval.seconds</strong> and <strong id="mrs_01_1931__b348880081113116">spark.history.fs.updateInterval</strong>. <strong id="mrs_01_1931__b1649540383113116">spark.history.fs.updateInterval</strong> is recommended.</p>
</td>
<td class="cellrowborder" valign="top" width="13.530000000000001%" headers="mcps1.3.15.3.2.4.1.3 "><p id="mrs_01_1931__af8b2fafa701c4149902f34d248dd85d6">10s</p>
</td>
</tr>
</tbody>
</table>
</div>
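<p>The table above lists three parameters with the same function and recommends <strong>spark.history.fs.updateInterval</strong>. The following is a minimal sketch of collapsing the other two names into the recommended one before a configuration is written. The function name is hypothetical, and the rule that an explicitly set recommended key takes precedence is an assumption of this sketch, not documented Spark behavior.</p>

```python
# Hypothetical helper for illustration only; it is not a Spark or MRS API.
RECOMMENDED = "spark.history.fs.updateInterval"
ALIASES = (
    "spark.history.fs.update.interval.seconds",
    "spark.history.updateInterval",
)


def normalize_update_interval(conf):
    """Return a copy of conf that uses only the recommended interval key."""
    result = dict(conf)
    for alias in ALIASES:
        if alias in result:
            value = result.pop(alias)
            # Assumption of this sketch: an explicit value for the
            # recommended key is kept and the alias value is dropped.
            result.setdefault(RECOMMENDED, value)
    return result


if __name__ == "__main__":
    print(normalize_update_interval({"spark.history.updateInterval": "30s"}))
```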
</div>
<div class="section" id="mrs_01_1931__se68abbb3b81548c0bbc6cdcc17254024"><h4 class="sectiontitle">History Server UI Timeout and Maximum Number of Access Times</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t189419c804ed44629413d6af1319dd0f" frame="border" border="1" rules="all"><caption><b>Table 15 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r9835629e3aef44458492b516ef419144"><th align="left" class="cellrowborder" valign="top" width="32.24%" id="mcps1.3.16.2.2.4.1.1"><p id="mrs_01_1931__ad25711b16baf443a8cf32cc92932c93d"><strong id="mrs_01_1931__a1e571f58247c4460a49d6da63f020a45">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="54.6%" id="mcps1.3.16.2.2.4.1.2"><p id="mrs_01_1931__a90c98da9b1ce4b4281b44de1f0c74a5a"><strong id="mrs_01_1931__a013188ba8fe549daab53cfa3d709e5bf">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="13.16%" id="mcps1.3.16.2.2.4.1.3"><p id="mrs_01_1931__aac39687d09f44815a6a965f1c615e51f"><strong id="mrs_01_1931__ad9d2a7b255404eff97dca283bd500007">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r0a8aa9b9995143849dd55848d7173255"><td class="cellrowborder" valign="top" width="32.24%" headers="mcps1.3.16.2.2.4.1.1 "><p id="mrs_01_1931__a0093ce781d464292a9f7b3d97b93524f">spark.session.maxAge</p>
</td>
<td class="cellrowborder" valign="top" width="54.6%" headers="mcps1.3.16.2.2.4.1.2 "><p id="mrs_01_1931__ae3665da1b85c466690a11fa45d2be35d">Specifies the session timeout interval, in seconds. This parameter applies only to the security mode. This parameter cannot be set in normal mode.</p>
</td>
<td class="cellrowborder" valign="top" width="13.16%" headers="mcps1.3.16.2.2.4.1.3 "><p id="mrs_01_1931__a3876cead438e43f9ab28646098056c79">600</p>
</td>
</tr>
<tr id="mrs_01_1931__r31194dba6f2648ca99dcffbe1803b99d"><td class="cellrowborder" valign="top" width="32.24%" headers="mcps1.3.16.2.2.4.1.1 "><p id="mrs_01_1931__a2a9591655cb64c1abc73e16131efafeb">spark.connection.maxRequest</p>
</td>
<td class="cellrowborder" valign="top" width="54.6%" headers="mcps1.3.16.2.2.4.1.2 "><p id="mrs_01_1931__aaa04fad506244d6eaae46ade716b6d00">Specifies the maximum number of concurrent client access requests to JobHistory.</p>
</td>
<td class="cellrowborder" valign="top" width="13.16%" headers="mcps1.3.16.2.2.4.1.3 "><p id="mrs_01_1931__a57ccc80ade3d4945a43f2702eddf9bac">5000</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="mrs_01_1931__s4e17242bc9794d6d89ffe53800c5fdb8"><h4 class="sectiontitle">EventLog</h4><p id="mrs_01_1931__ad7fa8e82782241e3a83c6bff9a6b9fdb">While a Spark application is running, its running status is written to the file system in JSON format in real time. The History Server service reads these files to reproduce the application running status.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__tee3be067ce56486e8ad5c70c8f2acaf9" frame="border" border="1" rules="all"><caption><b>Table 16 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__rec9a271caee44249a34deee4d7f9a148"><th align="left" class="cellrowborder" valign="top" width="22.58%" id="mcps1.3.17.3.2.4.1.1"><p id="mrs_01_1931__a73d6965db4ae4481b99fbe311eb9e482"><strong id="mrs_01_1931__a20f5235a613e485ebfc587339eda5a82">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="64.03%" id="mcps1.3.17.3.2.4.1.2"><p id="mrs_01_1931__a158a8d05cc2e4e51a1b9cf491013ea32"><strong id="mrs_01_1931__a500a57fa8d0f49e492232f22613650e9">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="13.389999999999999%" id="mcps1.3.17.3.2.4.1.3"><p id="mrs_01_1931__a32613209de9b45eb82d140e6a7ab8c11"><strong id="mrs_01_1931__a83fdea545cac46d690f2d25bdf885c5e">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r612575f3e90944aaac57598fca0340df"><td class="cellrowborder" valign="top" width="22.58%" headers="mcps1.3.17.3.2.4.1.1 "><p id="mrs_01_1931__af7f4cdedaaf443eda4b541ca7bb46d84">spark.eventLog.enabled</p>
</td>
<td class="cellrowborder" valign="top" width="64.03%" headers="mcps1.3.17.3.2.4.1.2 "><p id="mrs_01_1931__a76ae99fe399a4e90a536412c9571e93b">Indicates whether to log Spark events, which are used to reconstruct the web UI after the application execution is complete.</p>
</td>
<td class="cellrowborder" valign="top" width="13.389999999999999%" headers="mcps1.3.17.3.2.4.1.3 "><p id="mrs_01_1931__a2add98ec7d40452ea8a0904a33a27dd1">true</p>
</td>
</tr>
<tr id="mrs_01_1931__r0cef38ba903e4f648f345bf3a8abad42"><td class="cellrowborder" valign="top" width="22.58%" headers="mcps1.3.17.3.2.4.1.1 "><p id="mrs_01_1931__a8f79c631f16e46f4a74d707e83fcca27">spark.eventLog.dir</p>
</td>
<td class="cellrowborder" valign="top" width="64.03%" headers="mcps1.3.17.3.2.4.1.2 "><p id="mrs_01_1931__aea6fe4ccbaac45d18eb4840f003ad704">Indicates the directory for logging Spark events if <strong id="mrs_01_1931__b171433103113116">spark.eventLog.enabled</strong> is set to <strong id="mrs_01_1931__b1055423225113116">true</strong>. In this directory, Spark creates a subdirectory for each application and logs events of the application in the subdirectory. You can set this parameter to a unified location, such as an HDFS directory, so that the History Server can read the history files.</p>
</td>
<td class="cellrowborder" valign="top" width="13.389999999999999%" headers="mcps1.3.17.3.2.4.1.3 "><p id="mrs_01_1931__a23ca1d28e43a44c689a1aa2be48f8f3b">hdfs://hacluster/spark2xJobHistory2x</p>
</td>
</tr>
<tr id="mrs_01_1931__rbc30415210954d059906328bcd9ab36e"><td class="cellrowborder" valign="top" width="22.58%" headers="mcps1.3.17.3.2.4.1.1 "><p id="mrs_01_1931__a283f7af868444262b5b008db8b315807">spark.eventLog.compress</p>
</td>
<td class="cellrowborder" valign="top" width="64.03%" headers="mcps1.3.17.3.2.4.1.2 "><p id="mrs_01_1931__a88cec0d2961e41b081c2e098f70ba70d">Indicates whether to compress logged events when <strong id="mrs_01_1931__b290271883113116">spark.eventLog.enabled</strong> is set to <strong id="mrs_01_1931__b154081073113116">true</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="13.389999999999999%" headers="mcps1.3.17.3.2.4.1.3 "><p id="mrs_01_1931__af8f080bfe1914e259f5080bcd67c4077">false</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="mrs_01_1931__s9fd5277d37d047d892370e1223565c07"><h4 class="sectiontitle">Periodic Clearing of Event Logs</h4><p id="mrs_01_1931__a67e3b8660a9b44488d13aa9017cc60f3">The number of event logs on JobHistory grows with the number of submitted tasks, so event log files accumulate over time. Spark provides a function for periodically clearing event logs. You can enable this function and set the clearing interval using the following parameters.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__td3ac854d53f445bc913ce0f80d71de80" frame="border" border="1" rules="all"><caption><b>Table 17 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r9491923030ae4e15864190e97fe047af"><th align="left" class="cellrowborder" valign="top" width="32.24%" id="mcps1.3.18.3.2.4.1.1"><p id="mrs_01_1931__ad87dd8481d3f41d3ac4c4013e3991849"><strong id="mrs_01_1931__affa4cd57038c44959eb2b09b5a764e87">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="46.89%" id="mcps1.3.18.3.2.4.1.2"><p id="mrs_01_1931__a7b2081d4347f45afb4d56e64f2ef8908"><strong id="mrs_01_1931__ab414657e87bf44b980d71798beef695b">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20.87%" id="mcps1.3.18.3.2.4.1.3"><p id="mrs_01_1931__a1b0d9d4ec3c24e17b84e7398bad099aa"><strong id="mrs_01_1931__a2e3d930575ce435a8095a2df24516321">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r0e3c6ec71df5486f99333a1b3ecb477a"><td class="cellrowborder" valign="top" width="32.24%" headers="mcps1.3.18.3.2.4.1.1 "><p id="mrs_01_1931__ab9b315ce88e64aee91ec455f54516b94">spark.history.fs.cleaner.enabled</p>
</td>
<td class="cellrowborder" valign="top" width="46.89%" headers="mcps1.3.18.3.2.4.1.2 "><p id="mrs_01_1931__a06130cfbd3e44c85bcc00a51ecea8bb2">Indicates whether to enable the clearing function.</p>
</td>
<td class="cellrowborder" valign="top" width="20.87%" headers="mcps1.3.18.3.2.4.1.3 "><p id="mrs_01_1931__ae8a6f7f00ce540bca034b46051613025">true</p>
</td>
</tr>
<tr id="mrs_01_1931__rf0e5d11c58e74cbb8c6a6e6c99b2de32"><td class="cellrowborder" valign="top" width="32.24%" headers="mcps1.3.18.3.2.4.1.1 "><p id="mrs_01_1931__a8b2c5948ca1f469b9355634d2e7ac467">spark.history.fs.cleaner.interval</p>
</td>
<td class="cellrowborder" valign="top" width="46.89%" headers="mcps1.3.18.3.2.4.1.2 "><p id="mrs_01_1931__aaa013c1f6a684c64bba5bcf44e164204">Indicates the check interval of the clearing function.</p>
</td>
<td class="cellrowborder" valign="top" width="20.87%" headers="mcps1.3.18.3.2.4.1.3 "><p id="mrs_01_1931__afce1f4f370d0462f89c8cc3542512d26">1d</p>
</td>
</tr>
<tr id="mrs_01_1931__rd667db3aa39641e9925cdb5111c8aaf1"><td class="cellrowborder" valign="top" width="32.24%" headers="mcps1.3.18.3.2.4.1.1 "><p id="mrs_01_1931__a412807abd28b4b3f95f69227cb215e48">spark.history.fs.cleaner.maxAge</p>
</td>
<td class="cellrowborder" valign="top" width="46.89%" headers="mcps1.3.18.3.2.4.1.2 "><p id="mrs_01_1931__aa0ea24c7e32f45a4945827d6d3366c4d">Indicates the maximum duration for storing logs.</p>
</td>
<td class="cellrowborder" valign="top" width="20.87%" headers="mcps1.3.18.3.2.4.1.3 "><p id="mrs_01_1931__a1f368d29f8e84256a659f4215fef957d">4d</p>
</td>
</tr>
</tbody>
</table>
</div>
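<p>The interaction between the check interval and the maximum log age can be illustrated with a small calculation. The following is a simplified sketch, not the History Server implementation. The function names are hypothetical, and the parser handles only a whole number followed by one of the suffixes <strong>s</strong>, <strong>m</strong>, <strong>h</strong>, or <strong>d</strong>. It assumes that a log becomes eligible for clearing once it is older than <strong>spark.history.fs.cleaner.maxAge</strong> and that eligible logs are removed at the next check, so a log may remain for up to roughly the maximum age plus one check interval.</p>

```python
# Simplified illustration only; this is not the History Server code.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(value):
    """Convert a duration such as '1d' or '4d' to seconds."""
    number, unit = value[:-1], value[-1]
    return int(number) * UNIT_SECONDS[unit]


def is_expired(log_age_seconds, max_age="4d"):
    """Return True if a log of this age is older than max_age."""
    return log_age_seconds > to_seconds(max_age)


def worst_case_retention(interval="1d", max_age="4d"):
    """Upper bound, in seconds, under the assumptions stated above."""
    return to_seconds(max_age) + to_seconds(interval)


if __name__ == "__main__":
    five_days = 5 * 86400
    print(is_expired(five_days))
    print(worst_case_retention())
```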
</div>
<div class="section" id="mrs_01_1931__s517f4aec14934a77b5ffdc9e7c48c5ad"><h4 class="sectiontitle">Kryo</h4><p id="mrs_01_1931__ae452d9163eaa4a6da4a523286eb35c00">Kryo is a highly efficient Java serialization framework, which is integrated into Spark by default. Most Spark performance tuning involves switching the default Spark serializer to the Kryo serializer. Kryo serialization supports only serialization at the Spark data layer. To configure Kryo serialization, set <span class="parmname" id="mrs_01_1931__p8cc7c9cbf7c1421f80efa6f300cdf3aa"><b>spark.serializer</b></span> to <span class="parmvalue" id="mrs_01_1931__peb2ec9b25a5f43a68066b929d00187bf"><b>org.apache.spark.serializer.KryoSerializer</b></span> and configure the following parameters to optimize Kryo serialization performance:</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t07b41ac47c8f4fa9ac64858d3c25ab18" frame="border" border="1" rules="all"><caption><b>Table 18 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r26937ffc183d4ea19be4c697390ed325"><th align="left" class="cellrowborder" valign="top" width="25.840000000000003%" id="mcps1.3.19.3.2.4.1.1"><p id="mrs_01_1931__a22059de339ea4a5cb90038eb442b7819"><strong id="mrs_01_1931__a696843dd7b3d42c3a50aa17bb0f97850">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="61.38%" id="mcps1.3.19.3.2.4.1.2"><p id="mrs_01_1931__a8e0bb4a9acef40cfa39cb0f8eea441fc"><strong id="mrs_01_1931__af28375b215ac4a45afb1e274565dda2f">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.78%" id="mcps1.3.19.3.2.4.1.3"><p id="mrs_01_1931__a60abbb3da8c347129db92de4ba27b001"><strong id="mrs_01_1931__a95afe8ec7f5448fa97eb827e66245366">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r6334f90c6bf9484e9c6fd520e3213f90"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.19.3.2.4.1.1 "><p id="mrs_01_1931__a57fe133f44db49bb8f285090c36ca79a">spark.kryo.classesToRegister</p>
</td>
<td class="cellrowborder" valign="top" width="61.38%" headers="mcps1.3.19.3.2.4.1.2 "><p id="mrs_01_1931__a47d4fc70ee2b4dfea75bb789998874e2">Specifies the names of the classes to be registered with Kryo when Kryo serialization is used. Separate multiple class names with commas (,).</p>
</td>
<td class="cellrowborder" valign="top" width="12.78%" headers="mcps1.3.19.3.2.4.1.3 "><p id="mrs_01_1931__a39d65b9d5b394677b8b13653258f5314">-</p>
</td>
</tr>
<tr id="mrs_01_1931__r9c430f83e28f40f9ba59f2449344c063"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.19.3.2.4.1.1 "><p id="mrs_01_1931__a811bd935a9e6473096a73c6a4c08071e">spark.kryo.referenceTracking</p>
</td>
<td class="cellrowborder" valign="top" width="61.38%" headers="mcps1.3.19.3.2.4.1.2 "><p id="mrs_01_1931__a42a76131b20840d490d953e53b99602b">Indicates whether to track references to the same object when Kryo is used to serialize data. This function is required if the object graph has circular references, and it improves efficiency if the graph contains multiple copies of the same object. Otherwise, you can disable this function to improve performance.</p>
</td>
<td class="cellrowborder" valign="top" width="12.78%" headers="mcps1.3.19.3.2.4.1.3 "><p id="mrs_01_1931__a3a94efeb5a1d4b36aefa462e20c20b1d">true</p>
</td>
</tr>
<tr id="mrs_01_1931__rb4db8208d4fc4beda2b392e573d4a2d3"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.19.3.2.4.1.1 "><p id="mrs_01_1931__aebb9e497d650498490ab7488d926f0c5">spark.kryo.registrationRequired</p>
</td>
<td class="cellrowborder" valign="top" width="61.38%" headers="mcps1.3.19.3.2.4.1.2 "><p id="mrs_01_1931__a2afe12a35e5648b093e8612d2385fa1e">Indicates whether registration with Kryo is required. When this parameter is set to <strong id="mrs_01_1931__b1668391481113116">true</strong>, Kryo throws an exception if an unregistered class is serialized. When it is set to <strong id="mrs_01_1931__b806618278113116">false</strong> (default value), Kryo writes the unregistered class name along with each serialized object, which causes significant performance overhead. Enable this option to strictly enforce that no class has been omitted from registration.</p>
</td>
<td class="cellrowborder" valign="top" width="12.78%" headers="mcps1.3.19.3.2.4.1.3 "><p id="mrs_01_1931__a64ec9f1388ce44299f0ac93e83fb740e">false</p>
</td>
</tr>
<tr id="mrs_01_1931__rea928997f68d48c9a9c2159561cdf42e"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.19.3.2.4.1.1 "><p id="mrs_01_1931__a8c0a7e51ca8a4a12b6cea5c4e680844f">spark.kryo.registrator</p>
</td>
<td class="cellrowborder" valign="top" width="61.38%" headers="mcps1.3.19.3.2.4.1.2 "><p id="mrs_01_1931__a77fc44e08e8441fcafe7859e91c1c054">If Kryo serialization is used, specifies the class that registers your custom classes with Kryo. Use this property if you need to register classes in a custom way, such as specifying a custom field serializer. Otherwise, use <strong id="mrs_01_1931__b162498793113116">spark.kryo.classesToRegister</strong>, which is simpler. Set this parameter to a class that extends KryoRegistrator.</p>
</td>
<td class="cellrowborder" valign="top" width="12.78%" headers="mcps1.3.19.3.2.4.1.3 "><p id="mrs_01_1931__ad309d3a048b94b54bcd33950505736c8">-</p>
</td>
</tr>
<tr id="mrs_01_1931__rbdf439ad56754eb7bcbf6c90deee0a9b"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.19.3.2.4.1.1 "><p id="mrs_01_1931__ae13cf800fbae4b27bbebe76e04476b04">spark.kryoserializer.buffer.max</p>
</td>
<td class="cellrowborder" valign="top" width="61.38%" headers="mcps1.3.19.3.2.4.1.2 "><p id="mrs_01_1931__ae64b8f86d6a94c68ba58dbd81ed233a4">Specifies the maximum size of the Kryo serialization buffer, in MB. The value must be greater than the size of any object that you attempt to serialize. If the error "buffer limit exceeded" occurs in Kryo, increase the value of this parameter.</p>
</td>
<td class="cellrowborder" valign="top" width="12.78%" headers="mcps1.3.19.3.2.4.1.3 "><p id="mrs_01_1931__ae3d2afe19eb245dfa5d20b7c87b09baf">64MB</p>
</td>
</tr>
<tr id="mrs_01_1931__r68d79626020d460fb2c46e35247ca329"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.19.3.2.4.1.1 "><p id="mrs_01_1931__af4aa67d47b934b5e973b5a781acab453">spark.kryoserializer.buffer</p>
</td>
<td class="cellrowborder" valign="top" width="61.38%" headers="mcps1.3.19.3.2.4.1.2 "><p id="mrs_01_1931__a9183baef28fb417987040a468e3793e4">Specifies the initial size of the Kryo serialization buffer, in KB. Each core of each worker has one buffer. If necessary, the buffer grows up to the value of <strong id="mrs_01_1931__b1653898643113116">spark.kryoserializer.buffer.max</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="12.78%" headers="mcps1.3.19.3.2.4.1.3 "><p id="mrs_01_1931__a64bc1f8380f74f21b28be9333ba433ad">64KB</p>
</td>
</tr>
</tbody>
</table>
</div>
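<p>As an illustration of how the Kryo parameters in the table above fit together, the following is a minimal sketch of a <b>spark-defaults.conf</b> fragment. It assumes that Kryo is enabled through <b>spark.serializer</b>. The class names <b>com.example.MyRecord</b> and <b>com.example.MyKey</b> are hypothetical placeholders, and the buffer sizes are example values only, not recommendations.</p>
<pre class="screen"># Assumption: Kryo serialization is enabled for the application.
spark.serializer                   org.apache.spark.serializer.KryoSerializer
# Hypothetical application classes, separated by commas.
spark.kryo.classesToRegister       com.example.MyRecord,com.example.MyKey
# Fail fast on unregistered classes. Enable only after all serialized classes are registered.
spark.kryo.registrationRequired    true
# Initial buffer size and the upper limit that the buffer may grow to (example values).
spark.kryoserializer.buffer        64k
spark.kryoserializer.buffer.max    128m</pre>
<p>If "buffer limit exceeded" is still reported with such a configuration, increasing <b>spark.kryoserializer.buffer.max</b> is the usual first step.</p>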
</div>
<div class="section" id="mrs_01_1931__se75354dd6fff406f9bb3719cb07f449e"><h4 class="sectiontitle">Broadcast</h4><p id="mrs_01_1931__ac685bddea05c45fb93ecf0fe4d8b5eb1">Broadcast is used to transmit data blocks between Spark processes. In Spark, broadcast can be used for JAR packages, files, closures, and returned results. Broadcast supports two modes: Torrent and HTTP. The Torrent mode divides data into small fragments and distributes them across the cluster. A node obtains the fragments remotely when it needs them. The HTTP mode saves files to the local disk and transfers entire files to the remote end through HTTP when required. Torrent is more stable than HTTP and is therefore the default broadcast mode.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t3be35c616b46464fb63c6498738a5fde" frame="border" border="1" rules="all"><caption><b>Table 19 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r60043feb906b422d8de1322d17ab426f"><th align="left" class="cellrowborder" valign="top" width="25.840000000000003%" id="mcps1.3.20.3.2.4.1.1"><p id="mrs_01_1931__a1bd3a7fd58e34d9ab41bbdbaf539d1e5"><strong id="mrs_01_1931__abf584b815cf14a138625ea541f9e75d2">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="53.49%" id="mcps1.3.20.3.2.4.1.2"><p id="mrs_01_1931__ae5117b005421457a96d649e30b3f0926"><strong id="mrs_01_1931__abdef0eaa2ba24aaf8a46eefe72809f6c">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20.669999999999998%" id="mcps1.3.20.3.2.4.1.3"><p id="mrs_01_1931__a494c4b397d2b4fc5b703b8a60f2ba977"><strong id="mrs_01_1931__a30dcd8ab8bf04fbd94258b4cef4a8e11">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r44df10354d9b48dbb9bccec8a0fbd688"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.20.3.2.4.1.1 "><p id="mrs_01_1931__a531fda02491c4636952b69532afd0c3c">spark.broadcast.factory</p>
</td>
<td class="cellrowborder" valign="top" width="53.49%" headers="mcps1.3.20.3.2.4.1.2 "><p id="mrs_01_1931__a548e0bd7df19455abd4d387f39f6cd47">Indicates the broadcast mode.</p>
</td>
<td class="cellrowborder" valign="top" width="20.669999999999998%" headers="mcps1.3.20.3.2.4.1.3 "><p id="mrs_01_1931__afdce512a0efc4a9081902cdcbcd2c9ad">org.apache.spark.broadcast.TorrentBroadcastFactory</p>
</td>
</tr>
<tr id="mrs_01_1931__ra0b3e1b2fb0d4c0aa5fe37deb4cf483b"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.20.3.2.4.1.1 "><p id="mrs_01_1931__a1253860b076449fa80d877bf984975b8">spark.broadcast.blockSize</p>
</td>
<td class="cellrowborder" valign="top" width="53.49%" headers="mcps1.3.20.3.2.4.1.2 "><p id="mrs_01_1931__ae0b08dd2549e42a39063869e0f9ad9c3">Indicates the block size of <strong id="mrs_01_1931__b350436373113116">TorrentBroadcastFactory</strong>, in KB. If the value is too large, parallelism during broadcast is reduced and the broadcast becomes slower. If the value is too small, BlockManager performance may be affected.</p>
</td>
<td class="cellrowborder" valign="top" width="20.669999999999998%" headers="mcps1.3.20.3.2.4.1.3 "><p id="mrs_01_1931__a064e908aa5734f8f82bde7050c2e9fc2">4096</p>
</td>
</tr>
<tr id="mrs_01_1931__r0cfe5b17bafd49be8bae426004d7f6ce"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.20.3.2.4.1.1 "><p id="mrs_01_1931__a191303cea59748daaedd1899ae17e2a0">spark.broadcast.compress</p>
</td>
<td class="cellrowborder" valign="top" width="53.49%" headers="mcps1.3.20.3.2.4.1.2 "><p id="mrs_01_1931__abb5dae3105094354b7586336e4e7c7d9">Indicates whether to compress broadcast variables before sending them. You are advised to compress the broadcast variables.</p>
</td>
<td class="cellrowborder" valign="top" width="20.669999999999998%" headers="mcps1.3.20.3.2.4.1.3 "><p id="mrs_01_1931__a57831d21aa4046be87bd4f3aa9be9919">true</p>
</td>
</tr>
</tbody>
</table>
</div>
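<p>The following is a minimal sketch of a <b>spark-defaults.conf</b> fragment that adjusts the broadcast parameters in the table above. The block size of <b>8192</b> is an illustrative value chosen for this example, not a recommendation. A suitable value depends on the size of the broadcast variables and on the cluster.</p>
<pre class="screen"># Larger pieces reduce parallelism during broadcast; smaller pieces put more load on BlockManager.
spark.broadcast.blockSize    8192
# Compress broadcast variables before they are sent.
spark.broadcast.compress     true</pre>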
</div>
<div class="section" id="mrs_01_1931__scbd203c766a64dff8f37d31e0791904f"><h4 class="sectiontitle">Storage</h4><p id="mrs_01_1931__ab8bf9599cf8d4d0884845c54d33b8654">Spark features in-memory computing. Spark Storage manages memory resources and stores the data blocks generated during RDD caching. Because the JVM heap is a single shared pool, <span class="parmname" id="mrs_01_1931__p6f12e106abe54cc09cd4620167a25041"><b>Storage Memory Size</b></span> is an important concept in Spark Storage management.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t2fcf863cccd34a598277eccb915aa47c" frame="border" border="1" rules="all"><caption><b>Table 20 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__rb9ef5c9d6d47494c9f4a0a3a11e8bb5e"><th align="left" class="cellrowborder" valign="top" width="25.840000000000003%" id="mcps1.3.21.3.2.4.1.1"><p id="mrs_01_1931__ab0e9a7a697514b27bd73fb308bc8f78d"><strong id="mrs_01_1931__a7e77cfd3cbc040d1a7d261a84230ad49">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="61.760000000000005%" id="mcps1.3.21.3.2.4.1.2"><p id="mrs_01_1931__a061484e901184d3e976761758bc2d553"><strong id="mrs_01_1931__afb5e761b13a741cb8c84c70816d40a96">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.4%" id="mcps1.3.21.3.2.4.1.3"><p id="mrs_01_1931__ac5e0535feb494f8999d9d8691b9072ef"><strong id="mrs_01_1931__a3f7b1d3ab9624b2399655a4fd6ef30d2">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__rf67faf6f491149cd924fc9958377c150"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.21.3.2.4.1.1 "><p id="mrs_01_1931__a28d6a42c457740739312f6f42d6d04bc">spark.storage.memoryMapThreshold</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.21.3.2.4.1.2 "><p id="mrs_01_1931__a652bc021cf4d4b90ae2f4431fcd2faf9">Specifies the block size threshold for memory mapping. If the size of a block exceeds the value of this parameter, Spark memory-maps the block when reading it from disk. This prevents Spark from memory-mapping very small blocks. Generally, memory mapping has high overhead for blocks whose size is close to or less than the page size of the operating system.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.21.3.2.4.1.3 "><p id="mrs_01_1931__aac9c897035044e07bc02954a26187aae">2m</p>
</td>
</tr>
</tbody>
</table>
</div>
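<p>The following <b>spark-defaults.conf</b> fragment is a minimal sketch of how the threshold in the table above can be raised. The value <b>8m</b> is an assumption made for illustration only.</p>
<pre class="screen"># Memory-map only blocks larger than 8 MB when they are read from disk (example value).
spark.storage.memoryMapThreshold    8m</pre>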
</div>
<div class="section" id="mrs_01_1931__sadd8b128c7da4477bbe3b42434447ef0"><h4 class="sectiontitle">Port</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__tf774bf59a9384ed0876e408ae446ebeb" frame="border" border="1" rules="all"><caption><b>Table 21 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r5e1532192fdd4fa1a0d9700d1b63b8b9"><th align="left" class="cellrowborder" valign="top" width="25.840000000000003%" id="mcps1.3.22.2.2.4.1.1"><p id="mrs_01_1931__ac429e2a7d26644428bb61bfcc0bae148"><strong id="mrs_01_1931__a939311e32c744cf983fdefe64745f210">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="61.760000000000005%" id="mcps1.3.22.2.2.4.1.2"><p id="mrs_01_1931__a3da9e36be8244da889a8f75f68acb57d"><strong id="mrs_01_1931__a4fb12a5635e34661bbd932f7458933df">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.4%" id="mcps1.3.22.2.2.4.1.3"><p id="mrs_01_1931__aa23f6f0177b84bf7a51dcbcf7989f9ac"><strong id="mrs_01_1931__abef84c22267444a0a55ac5fc032eb4cc">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r3577963041a84b67919ae27479f44fd1"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.22.2.2.4.1.1 "><p id="mrs_01_1931__ad2d57ab568b443bf9a20ed3dc3561666">spark.ui.port</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.22.2.2.4.1.2 "><p id="mrs_01_1931__a0528cdb8d07b46b6bb66e01f18a969a8">Specifies the port for your application's dashboard, which displays memory and workload data.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.22.2.2.4.1.3 "><p id="mrs_01_1931__a493fc7c69ce541adb8285ada7da5aa88"></p>
<ul id="mrs_01_1931__u8c890d9f33dd4175b8045210d12d395f"><li id="mrs_01_1931__lf1ef7039e18241e5806794379976bcb5">JDBCServer2x: <strong id="mrs_01_1931__b182196178012">4040</strong></li><li id="mrs_01_1931__lc93cf8b7343541749286a26e80da78e0">SparkResource2x: 0</li></ul>
</td>
</tr>
<tr id="mrs_01_1931__r6d989b7c1d24400d97cd84b59b9c3fb2"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.22.2.2.4.1.1 "><p id="mrs_01_1931__a0b5f9b0bc228460981ec1d06bf34518a">spark.blockManager.port</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.22.2.2.4.1.2 "><p id="mrs_01_1931__a25f54591218746cc8d53f28af31cad15">Specifies the port on which all BlockManagers listen. These ports exist on both the driver and the executors.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.22.2.2.4.1.3 "><p id="mrs_01_1931__ae52ebc4d2e9b431b9c1b2b275d92a0ff"><a href="#mrs_01_1931__s8305221320854535b9528b54f2edfc32">Range of Random Ports</a></p>
</td>
</tr>
<tr id="mrs_01_1931__r27abc0e380ee414cb663684b6e8c2718"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.22.2.2.4.1.1 "><p id="mrs_01_1931__a76d3064a62c9476fb58aeeb9b73e84d3">spark.driver.port</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.22.2.2.4.1.2 "><p id="mrs_01_1931__af62677c91ac148f9973acd8306bb29d4">Specifies the port on which the driver listens. The driver uses this port to communicate with the executors.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.22.2.2.4.1.3 "><p id="mrs_01_1931__a7b6aba6ce73d4170b8117f9fc3d2de2b"><a href="#mrs_01_1931__s8305221320854535b9528b54f2edfc32">Range of Random Ports</a></p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="mrs_01_1931__s8305221320854535b9528b54f2edfc32"><a name="mrs_01_1931__s8305221320854535b9528b54f2edfc32"></a><a name="s8305221320854535b9528b54f2edfc32"></a><h4 class="sectiontitle">Range of Random Ports</h4><p id="mrs_01_1931__aa7d2f805bc8145a49a9e3d714dc80e61">All random ports must be within a certain range.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t0985de5c847e4371bfc384dff16a9263" frame="border" border="1" rules="all"><caption><b>Table 22 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__rb377270b5a304ff1bd3443d9ad41887d"><th align="left" class="cellrowborder" valign="top" width="32.05%" id="mcps1.3.23.3.2.4.1.1"><p id="mrs_01_1931__a70c7d3652a9943259b875f0248764169"><strong id="mrs_01_1931__af4718e953ceb438e8e16d19da95e118e">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="53.290000000000006%" id="mcps1.3.23.3.2.4.1.2"><p id="mrs_01_1931__ac177535a5a40411ab9d3a8497cb8aa4b"><strong id="mrs_01_1931__a5e68835c72414687819fcbe962098b25">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="14.66%" id="mcps1.3.23.3.2.4.1.3"><p id="mrs_01_1931__a9618029aa3004e4cacbf2824f10e257b"><strong id="mrs_01_1931__abd26c7edd1024a3e8de7673a1c7830d5">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__re9e6e5e82a0b4564a1c968f23ba151ba"><td class="cellrowborder" valign="top" width="32.05%" headers="mcps1.3.23.3.2.4.1.1 "><p id="mrs_01_1931__a2f4b16f7d9a342569828e5f5ed03e5a8">spark.random.port.min</p>
</td>
<td class="cellrowborder" valign="top" width="53.290000000000006%" headers="mcps1.3.23.3.2.4.1.2 "><p id="mrs_01_1931__a3be975302fe2466586dec683ad0317ec">Sets the minimum random port.</p>
</td>
<td class="cellrowborder" valign="top" width="14.66%" headers="mcps1.3.23.3.2.4.1.3 "><p id="mrs_01_1931__ae49fb3a95c66469aa77fa26d353cddc7">22600</p>
</td>
</tr>
<tr id="mrs_01_1931__ra4a27dc812ab4ef8b40074cf13d2183c"><td class="cellrowborder" valign="top" width="32.05%" headers="mcps1.3.23.3.2.4.1.1 "><p id="mrs_01_1931__a259dda9cfa0549fca9b48d5b81b4e378">spark.random.port.max</p>
</td>
<td class="cellrowborder" valign="top" width="53.290000000000006%" headers="mcps1.3.23.3.2.4.1.2 "><p id="mrs_01_1931__ac6b51ea625e54ed482c2bfb6bda1d889">Sets the maximum random port.</p>
</td>
<td class="cellrowborder" valign="top" width="14.66%" headers="mcps1.3.23.3.2.4.1.3 "><p id="mrs_01_1931__a3e321e1563da499d94356a59bf4deb4b">22899</p>
</td>
</tr>
</tbody>
</table>
</div>
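<p>As a sketch of how the range in the table above might be changed, the following <b>spark-defaults.conf</b> fragment moves the random ports to a different, wider range. The values <b>23000</b> and <b>23999</b> are assumptions chosen for illustration. Before using any range, confirm that it does not overlap with ports used by other services on the nodes.</p>
<pre class="screen"># Random ports are allocated between the minimum and maximum values (example range).
spark.random.port.min    23000
spark.random.port.max    23999</pre>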
</div>
<div class="section" id="mrs_01_1931__sf70e60936b3a4f08b9b3b9d6d7da352f"><h4 class="sectiontitle">Timeout</h4><p id="mrs_01_1931__a0d65f38706bd4842b3b5e391cc1c37ea">The default Spark configuration is suited to computing tasks that process medium-scale data. If the data volume is too large, tasks may fail due to timeout. In scenarios with a large amount of data, set the timeout parameters in Spark to larger values.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t3eaaee5cd4bf4a8fbfe2079d07c3f791" frame="border" border="1" rules="all"><caption><b>Table 23 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r70db388e917148459c6ae131e52ffc56"><th align="left" class="cellrowborder" valign="top" width="25.840000000000003%" id="mcps1.3.24.3.2.4.1.1"><p id="mrs_01_1931__a1ef49f4c3b834435bcf9521fb75f2e12"><strong id="mrs_01_1931__a43f9c7b8e21a420e91cef31abfb32a82">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="61.760000000000005%" id="mcps1.3.24.3.2.4.1.2"><p id="mrs_01_1931__a74cedc7198164750a241982f066879c2"><strong id="mrs_01_1931__a4e868e0af7ca42b0896e8f578e1468fb">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.4%" id="mcps1.3.24.3.2.4.1.3"><p id="mrs_01_1931__ae9dfb58b1c384ddf95030f8df2ff38cc"><strong id="mrs_01_1931__aacde62c5ef8242acb5af252eb2c0638c">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r7321199d2aae4dd9b702095387765de8"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.24.3.2.4.1.1 "><p id="mrs_01_1931__a44d30d398f6442748a60581095e156f0">spark.files.fetchTimeout</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.24.3.2.4.1.2 "><p id="mrs_01_1931__a3450bec4699c4392a85f248986221c90">Specifies the communication timeout, in seconds, for fetching files from the driver that were added using <strong id="mrs_01_1931__b208422670113116">SparkContext.addFile()</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.24.3.2.4.1.3 "><p id="mrs_01_1931__a3121a0189e6f43baaaec0c537b52be73">60s</p>
</td>
</tr>
<tr id="mrs_01_1931__r44a6f03e311944f087002a804c35c072"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.24.3.2.4.1.1 "><p id="mrs_01_1931__a8f48b137a1a84a50a58e7716bf1abe17">spark.network.timeout</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.24.3.2.4.1.2 "><p id="mrs_01_1931__ade7cdcec6d374c29b05ab86c5d88d3bc">Specifies the default timeout for all network interactions, in seconds. You can use this parameter to replace <strong id="mrs_01_1931__b1788234412113116">spark.core.connection.ack.wait.timeout</strong>, <strong id="mrs_01_1931__b944773840113116">spark.akka.timeout</strong>, <strong id="mrs_01_1931__b1957561750113116">spark.storage.blockManagerSlaveTimeoutMs</strong>, or <strong id="mrs_01_1931__b264790427113116">spark.shuffle.io.connectionTimeout</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.24.3.2.4.1.3 "><p id="mrs_01_1931__ad02bfac90f8c4e988ea2aeb838b1071e">360s</p>
</td>
</tr>
<tr id="mrs_01_1931__re5ea316c7696487599f933d31e64a61a"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.24.3.2.4.1.1 "><p id="mrs_01_1931__a9badce4df0344583a50c020410a26156">spark.core.connection.ack.wait.timeout</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.24.3.2.4.1.2 "><p id="mrs_01_1931__a5cd4c1cfcacd4c64a75f051bdce17dc0">Specifies how long a connection waits for an acknowledgment before timing out, in seconds. To avoid unwanted timeouts caused by long pauses such as GC, you can set this parameter to a larger value.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.24.3.2.4.1.3 "><p id="mrs_01_1931__a947e6f5093424f2f91b7ade7de89beaf">60</p>
</td>
</tr>
</tbody>
</table>
</div>
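<p>For a job that processes a large amount of data, the timeouts in the table above could be raised as in the following <b>spark-defaults.conf</b> sketch. The values <b>600s</b> and <b>120s</b> are illustrative assumptions. Suitable values depend on the data volume and on the GC behavior of the job.</p>
<pre class="screen"># Default timeout for all network interactions (example value).
spark.network.timeout       600s
# Timeout for fetching files added through SparkContext.addFile() (example value).
spark.files.fetchTimeout    120s</pre>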
</div>
<div class="section" id="mrs_01_1931__s876c84758f3b43f39bbc1b481a6fd76a"><h4 class="sectiontitle">Encryption</h4><p id="mrs_01_1931__a3821811d7df04af887aa52cea56f135f">Spark supports SSL for Akka and HTTP (for the broadcast and file server) protocols, but does not support SSL for the web UI and block transfer service.</p>
<p id="mrs_01_1931__af5f9dd9c892c4768b94bd0efe6c0c4df">SSL must be configured on each node and configured for each component involved in communication using a particular protocol.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__tea3def85e96d455c8153edb0c38c5f90" frame="border" border="1" rules="all"><caption><b>Table 24 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r9fd962a2c98a4c5194c4f7d7c281c960"><th align="left" class="cellrowborder" valign="top" width="19.82%" id="mcps1.3.25.4.2.4.1.1"><p id="mrs_01_1931__a8633b1d79e074d1bba1b0b37a863ddb2"><strong id="mrs_01_1931__a409b26f848ea45d998dadd9bcc0ef3f6">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="67.78%" id="mcps1.3.25.4.2.4.1.2"><p id="mrs_01_1931__a19f15a12c5444e7db14f2eb7c6820a36"><strong id="mrs_01_1931__aac6d3249e8574da480c407321029b400">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.400000000000002%" id="mcps1.3.25.4.2.4.1.3"><p id="mrs_01_1931__a1ac0797cefe547bcaec945a8e37526fb"><strong id="mrs_01_1931__a224429d3fdb4486ea2c56f6344de9b45">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r9a8cab50c05149cc9166133dac03d9c9"><td class="cellrowborder" valign="top" width="19.82%" headers="mcps1.3.25.4.2.4.1.1 "><p id="mrs_01_1931__a93a00f22eb2048d5a2e3b0d1b82ae6ab">spark.ssl.enabled</p>
</td>
<td class="cellrowborder" valign="top" width="67.78%" headers="mcps1.3.25.4.2.4.1.2 "><p id="mrs_01_1931__a51665b08bc0c436699f33a275469d21f">Indicates whether to enable SSL connections for all supported protocols.</p>
<p id="mrs_01_1931__a108d6044597f487495ecc1beb0094458">All SSL settings in the form of <strong id="mrs_01_1931__b418933004113116">spark.ssl.</strong><em id="mrs_01_1931__i754290832113116">xxx</em> indicate the global configuration of all supported protocols. To override the global configuration for a particular protocol, you must set the property in the namespace of that protocol.</p>
<p id="mrs_01_1931__a9aa71bdc609c462e8caf80f93dd53030">Use <span class="parmname" id="mrs_01_1931__pd08d3bd5948242648c941e417791efc1"><b>spark.ssl.YYY.XXX</b></span> to override the global configuration for the particular protocol specified by <strong id="mrs_01_1931__b1539756425113116">YYY</strong>. <strong id="mrs_01_1931__b83725049113116">YYY</strong> can be either <strong id="mrs_01_1931__b824689128113116">akka</strong> for Akka-based connections or <strong id="mrs_01_1931__b172051947113116">fs</strong> for the broadcast and file server.</p>
</td>
<td class="cellrowborder" valign="top" width="12.400000000000002%" headers="mcps1.3.25.4.2.4.1.3 "><p id="mrs_01_1931__abc999e5708724c7888f65a26ac40cd2d">false</p>
</td>
</tr>
<tr id="mrs_01_1931__r6a87774d04f04e3980af6fc3993150c1"><td class="cellrowborder" valign="top" width="19.82%" headers="mcps1.3.25.4.2.4.1.1 "><p id="mrs_01_1931__a05b62a34a672441db6b98f82f4f4ff72">spark.ssl.enabledAlgorithms</p>
</td>
<td class="cellrowborder" valign="top" width="67.78%" headers="mcps1.3.25.4.2.4.1.2 "><p id="mrs_01_1931__a09028fd9b4e5433e9b7580d88d598360">Indicates the comma-separated list of ciphers. The specified ciphers must be supported by the JVM.</p>
</td>
<td class="cellrowborder" valign="top" width="12.400000000000002%" headers="mcps1.3.25.4.2.4.1.3 "><p id="mrs_01_1931__a5d6a578ceb8e4fd9bae29ee38651be34">-</p>
</td>
</tr>
<tr id="mrs_01_1931__rb0fdd064b88e4fc2aefe5c5477fbdd93"><td class="cellrowborder" valign="top" width="19.82%" headers="mcps1.3.25.4.2.4.1.1 "><p id="mrs_01_1931__ae946db9388134e79ad777066112e732e">spark.ssl.keyPassword</p>
</td>
<td class="cellrowborder" valign="top" width="67.78%" headers="mcps1.3.25.4.2.4.1.2 "><p id="mrs_01_1931__a8ba7690816304623832ed0bd2186c30c">Specifies the password of a private key in the keystore.</p>
</td>
<td class="cellrowborder" valign="top" width="12.400000000000002%" headers="mcps1.3.25.4.2.4.1.3 "><p id="mrs_01_1931__a204c4abb80114880b1056c72f38ffea7">-</p>
</td>
</tr>
<tr id="mrs_01_1931__rf1d2197950fc434c830d246333c1ff9c"><td class="cellrowborder" valign="top" width="19.82%" headers="mcps1.3.25.4.2.4.1.1 "><p id="mrs_01_1931__a926d910f74da44d6995552d30da35679">spark.ssl.keyStore</p>
</td>
<td class="cellrowborder" valign="top" width="67.78%" headers="mcps1.3.25.4.2.4.1.2 "><p id="mrs_01_1931__a2a4de0251e67423b98a7d8dbbb1be404">Specifies the path of the keystore file. The path can be absolute or relative to the directory where the component is started.</p>
</td>
<td class="cellrowborder" valign="top" width="12.400000000000002%" headers="mcps1.3.25.4.2.4.1.3 "><p id="mrs_01_1931__ac1ecc3703004409788832a7c3528abad">-</p>
</td>
</tr>
<tr id="mrs_01_1931__ra8cab889b3f1494c99dda87d63063bb3"><td class="cellrowborder" valign="top" width="19.82%" headers="mcps1.3.25.4.2.4.1.1 "><p id="mrs_01_1931__aa92fa46b39a748039d46cab948848d17">spark.ssl.keyStorePassword</p>
</td>
<td class="cellrowborder" valign="top" width="67.78%" headers="mcps1.3.25.4.2.4.1.2 "><p id="mrs_01_1931__a8f1e012f445e40318469fc7be96f2a0a">Specifies the password of the keystore.</p>
</td>
<td class="cellrowborder" valign="top" width="12.400000000000002%" headers="mcps1.3.25.4.2.4.1.3 "><p id="mrs_01_1931__a6da787a243034c778aa2f7df51639903">-</p>
</td>
</tr>
<tr id="mrs_01_1931__r09cfee75852240febd3697f55863bdd9"><td class="cellrowborder" valign="top" width="19.82%" headers="mcps1.3.25.4.2.4.1.1 "><p id="mrs_01_1931__a713563f41ca0409f9b25da9166f24cb8">spark.ssl.protocol</p>
</td>
<td class="cellrowborder" valign="top" width="67.78%" headers="mcps1.3.25.4.2.4.1.2 "><p id="mrs_01_1931__a8aca7dc81cc74ce09260c76b41b40c71">Specifies the protocol name. This protocol must be supported by the JVM. For the reference list of protocol names, see the standard algorithm name documentation of the JVM version in use.</p>
</td>
<td class="cellrowborder" valign="top" width="12.400000000000002%" headers="mcps1.3.25.4.2.4.1.3 "><p id="mrs_01_1931__a736b7de85bc6462c861002903acbfee4">-</p>
</td>
</tr>
<tr id="mrs_01_1931__r8d08dae678b9446ea7532e52554ea9dc"><td class="cellrowborder" valign="top" width="19.82%" headers="mcps1.3.25.4.2.4.1.1 "><p id="mrs_01_1931__a1d3b49cbb4ab4124b49420c03543ccdc">spark.ssl.trustStore</p>
</td>
<td class="cellrowborder" valign="top" width="67.78%" headers="mcps1.3.25.4.2.4.1.2 "><p id="mrs_01_1931__a0081ebb79ffd45d8a3a1c04aa7c5abb9">Specifies the path of the truststore file. The path can be absolute or relative to the directory where the component is started.</p>
</td>
<td class="cellrowborder" valign="top" width="12.400000000000002%" headers="mcps1.3.25.4.2.4.1.3 "><p id="mrs_01_1931__a362f43b7db89481093cab43450d3290f">-</p>
</td>
</tr>
<tr id="mrs_01_1931__rca15cfa26b20457faecdbf6b7a1f0180"><td class="cellrowborder" valign="top" width="19.82%" headers="mcps1.3.25.4.2.4.1.1 "><p id="mrs_01_1931__ad80c025c94e84314a6e21f857dc09c47">spark.ssl.trustStorePassword</p>
</td>
<td class="cellrowborder" valign="top" width="67.78%" headers="mcps1.3.25.4.2.4.1.2 "><p id="mrs_01_1931__a804e80cf5095486a952834c63fb23759">Specifies the password of the truststore.</p>
</td>
<td class="cellrowborder" valign="top" width="12.400000000000002%" headers="mcps1.3.25.4.2.4.1.3 "><p id="mrs_01_1931__a6113b64f182d4d66806caebb02560bda">-</p>
</td>
</tr>
</tbody>
</table>
</div>
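<p>The following <b>spark-defaults.conf</b> fragment is a minimal sketch of a global SSL configuration with one protocol-specific override, based on the parameters in the table above. The file paths, the password placeholders in angle brackets, the protocol <b>TLSv1.2</b>, and the cipher name are assumptions made for illustration. Replace them with values that match your keystore and that your JVM supports.</p>
<pre class="screen"># Global settings that apply to all supported protocols.
spark.ssl.enabled               true
spark.ssl.protocol              TLSv1.2
spark.ssl.enabledAlgorithms     TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
# Hypothetical keystore and truststore locations.
spark.ssl.keyStore              /path/to/keystore.jks
spark.ssl.keyStorePassword      &lt;keystore-password&gt;
spark.ssl.keyPassword           &lt;key-password&gt;
spark.ssl.trustStore            /path/to/truststore.jks
spark.ssl.trustStorePassword    &lt;truststore-password&gt;
# Override in the fs namespace: disable SSL for the broadcast and file server only.
spark.ssl.fs.enabled            false</pre>
<p>Because SSL must be configured on each node, the same settings and the referenced keystore and truststore files need to be available on every node that takes part in the communication.</p>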
</div>
<div class="section" id="mrs_01_1931__s140b486b9b3c4a9ab734686f9769a0ec"><h4 class="sectiontitle">Security</h4><p id="mrs_01_1931__ade6633cc3084458d850196468820e5b0">Spark supports shared key-based authentication. You can use <strong id="mrs_01_1931__b850360982113116">spark.authenticate</strong> to configure authentication. This parameter controls whether the Spark communication protocol uses the shared key for authentication. This authentication is a basic handshake that ensures that both sides have the same shared key and are allowed to communicate. If the shared keys are different, the communication is not allowed. You can create shared keys as follows:</p>
<ul id="mrs_01_1931__u43629b4fb96140e6a9913d59cf13d92d"><li id="mrs_01_1931__l68906cc2880e4d9e81f383c089e89831">For Spark on Yarn deployments, set <strong id="mrs_01_1931__b75676946113116">spark.authenticate</strong> to <strong id="mrs_01_1931__b634425057113116">true</strong>. Shared keys are then generated and distributed automatically. Each application uses its own unique shared key.</li><li id="mrs_01_1931__l52cf46ba36c444828f36d161fe348f2c">For other types of Spark deployments, configure the Spark parameter <strong id="mrs_01_1931__b87181652113116">spark.authenticate.secret</strong> on each node. All masters, workers, and applications use this key.</li></ul>
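<p>For the second case in the list above (deployments other than Spark on Yarn), the configuration can be sketched as the following <b>spark-defaults.conf</b> fragment. The secret is shown as a placeholder in angle brackets. This sketch assumes that the same file content is applied on every node.</p>
<pre class="screen"># Enable shared key-based authentication for internal connections.
spark.authenticate           true
# Placeholder: the same shared key must be configured on each node.
spark.authenticate.secret    &lt;shared-secret&gt;</pre>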
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t895bbdff479a46d198d04532f42ea815" frame="border" border="1" rules="all"><caption><b>Table 25 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__ra0be08314ed847b3b584aee3c05fcb2d"><th align="left" class="cellrowborder" valign="top" width="22.46%" id="mcps1.3.26.4.2.4.1.1"><p id="mrs_01_1931__a816b5e7bdbff4f10910847c37c90146a"><strong id="mrs_01_1931__aa523e0567a754e2fb3cfb23999bbd2f6">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="65.14%" id="mcps1.3.26.4.2.4.1.2"><p id="mrs_01_1931__a190cc82d6fe547c2a37489c5c534fefb"><strong id="mrs_01_1931__a8924e8ada5b14dc49516a54c9c510082">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.4%" id="mcps1.3.26.4.2.4.1.3"><p id="mrs_01_1931__a6118daf67c8a4018b95223f018891fcb"><strong id="mrs_01_1931__a5906a7ac75224d6d8ea946036ab87d77">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__rdeddd3056ff94d1e892873aa448611a8"><td class="cellrowborder" valign="top" width="22.46%" headers="mcps1.3.26.4.2.4.1.1 "><p id="mrs_01_1931__a1f30e0e618e047f2853528d48a95b017">spark.acls.enable</p>
</td>
<td class="cellrowborder" valign="top" width="65.14%" headers="mcps1.3.26.4.2.4.1.2 "><p id="mrs_01_1931__a04795a44ce874eebb9b9d37aad632c74">Indicates whether to enable Spark ACLs. If Spark ACLs are enabled, the system checks whether the user has the permission to access and modify jobs. Note that this requires the user to be known. If the user cannot be identified (the user is null), no check is performed. Filters can be used on the UI to authenticate and set the user.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.26.4.2.4.1.3 "><p id="mrs_01_1931__ace2948e264b54d0bafd578d6b85d238b">true</p>
</td>
</tr>
<tr id="mrs_01_1931__r316d510357fe493eae2230bda53ef504"><td class="cellrowborder" valign="top" width="22.46%" headers="mcps1.3.26.4.2.4.1.1 "><p id="mrs_01_1931__a0b8cd984d0084bf29363c3e33d98946c">spark.admin.acls</p>
</td>
<td class="cellrowborder" valign="top" width="65.14%" headers="mcps1.3.26.4.2.4.1.2 "><p id="mrs_01_1931__adc2c180c617c4b7c8560ba10ec425303">Specifies the comma-separated list of users/<span id="mrs_01_1931__ph31760224359">spark </span>administrators that have the permissions to view and modify all Spark jobs. This list can be used if you are running on a shared cluster and working with the help of a <span id="mrs_01_1931__ph03931929183517">spark </span>administrator or developer.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.26.4.2.4.1.3 "><p id="mrs_01_1931__a2ca6aab5a883407e8c443dc33c2e4bba">admin</p>
</td>
</tr>
<tr id="mrs_01_1931__rd60a3200ed49416d890a0e592b22b987"><td class="cellrowborder" valign="top" width="22.46%" headers="mcps1.3.26.4.2.4.1.1 "><p id="mrs_01_1931__aeda4565f047b4ae0a74c45204830c6c2">spark.authenticate</p>
</td>
<td class="cellrowborder" valign="top" width="65.14%" headers="mcps1.3.26.4.2.4.1.2 "><p id="mrs_01_1931__a691d14e33f324482a032c5fb03ca6c73">Indicates whether Spark authenticates its internal connections. If the application is not running on Yarn, see <strong id="mrs_01_1931__b944688427113116">spark.authenticate.secret</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.26.4.2.4.1.3 "><p id="mrs_01_1931__ac9d51fc8fee2484a91f7cc26df7aa4e5">true</p>
</td>
</tr>
<tr id="mrs_01_1931__rad7909974d8b4f32854a1c6d4fcd8ffd"><td class="cellrowborder" valign="top" width="22.46%" headers="mcps1.3.26.4.2.4.1.1 "><p id="mrs_01_1931__a280a464aaf94479c8b38b59cb037a813">spark.authenticate.secret</p>
</td>
<td class="cellrowborder" valign="top" width="65.14%" headers="mcps1.3.26.4.2.4.1.2 "><p id="mrs_01_1931__a94570d17fbe945dda8b26ef3bce8a20a">Sets the key for authentication between Spark components. This parameter must be set if Spark does not run on Yarn and authentication is enabled.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.26.4.2.4.1.3 "><p id="mrs_01_1931__aa7cbe0bd32ab49608438e01d89a32b72">-</p>
</td>
</tr>
<tr id="mrs_01_1931__r5a1f661b5c09446a870f04afb7996b38"><td class="cellrowborder" valign="top" width="22.46%" headers="mcps1.3.26.4.2.4.1.1 "><p id="mrs_01_1931__a9e4205a4991a4845be73d4e890b8936d">spark.modify.acls</p>
</td>
<td class="cellrowborder" valign="top" width="65.14%" headers="mcps1.3.26.4.2.4.1.2 "><p id="mrs_01_1931__a747d926d31d34b6ea1dde87dee7a35bf">Specifies the comma-separated list of users who have the permission to modify Spark jobs. By default, only the user who started a Spark job has the permission to modify it (for example, to kill it).</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.26.4.2.4.1.3 "><p id="mrs_01_1931__a51b8219554754a19b8a7e31c6f26571a">-</p>
</td>
</tr>
<tr id="mrs_01_1931__ra4e6159967214c289494aef02e5e61ea"><td class="cellrowborder" valign="top" width="22.46%" headers="mcps1.3.26.4.2.4.1.1 "><p id="mrs_01_1931__aeda64388a0484530886bf3470cea3c20">spark.ui.view.acls</p>
</td>
<td class="cellrowborder" valign="top" width="65.14%" headers="mcps1.3.26.4.2.4.1.2 "><p id="mrs_01_1931__a71ffb82e3f1846aebb2231f2e955091d">Specifies the comma-separated list of users who have the permission to access the Spark web UI. By default, only the user who started a Spark job has the access permission.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.26.4.2.4.1.3 "><p id="mrs_01_1931__a4f0978871754435b842aeb86d0d6813b">-</p>
</td>
</tr>
</tbody>
</table>
</div>
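<p>The following fragment is a minimal sketch of how the parameters in Table 25 might appear together in <b>spark-defaults.conf</b>. The user names <b>alice</b>, <b>bob</b>, and <b>carol</b> are hypothetical placeholders and must be replaced with real users at your site.</p>
<pre class="screen"># Enable ACL checks and authentication of internal connections (defaults in Table 25)
spark.acls.enable     true
spark.authenticate    true
# Administrators who can view and modify all Spark jobs
spark.admin.acls      admin
# Hypothetical users allowed to modify Spark jobs
spark.modify.acls     alice,bob
# Hypothetical users allowed to access the Spark web UI
spark.ui.view.acls    alice,bob,carol</pre>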
</div>
<div class="section" id="mrs_01_1931__s6284d8c3efa049488db07870e057ff7b"><h4 class="sectiontitle">Enabling the Authentication Mechanism Between Spark Processes</h4><p id="mrs_01_1931__a8001436038ff43658eaf1b83d34b6f41">Spark processes support shared key-based authentication. You can configure <strong id="mrs_01_1931__b981330369113116">spark.authenticate</strong> to control whether Spark performs authentication during communication. This authentication mode uses a simple handshake to verify that the two communication parties share the same key.</p>
<p id="mrs_01_1931__p0369201318574">Configure the following parameters in the <span class="filepath" id="mrs_01_1931__filepath220292233119"><b>spark-defaults.conf</b></span> file on the Spark client.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__tae615170a0b7462c8cc258aa7fb6a2dc" frame="border" border="1" rules="all"><caption><b>Table 26 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r2e2a94cbf8b744698b056ab6b4a81866"><th align="left" class="cellrowborder" valign="top" width="23.78%" id="mcps1.3.27.4.2.4.1.1"><p id="mrs_01_1931__ae972f853216340ed96f03f277b37887c"><strong id="mrs_01_1931__a3df564073991484ba6bd277fd201adf2">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="63.629999999999995%" id="mcps1.3.27.4.2.4.1.2"><p id="mrs_01_1931__a5878406fbc0441248938e2897e919819"><strong id="mrs_01_1931__a074bac9940be4b1a80da79933fe9fd44">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.590000000000002%" id="mcps1.3.27.4.2.4.1.3"><p id="mrs_01_1931__ab6001eb3d6324fd893084eeca0a89110"><strong id="mrs_01_1931__aec19d1577f2145be99b0f0f764042d1a">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r9804d03863204c05830c4b5acbc9d9be"><td class="cellrowborder" valign="top" width="23.78%" headers="mcps1.3.27.4.2.4.1.1 "><p id="mrs_01_1931__adad9bb0e1b204d18ba57963e78dfdb01">spark.authenticate</p>
</td>
<td class="cellrowborder" valign="top" width="63.629999999999995%" headers="mcps1.3.27.4.2.4.1.2 "><p id="mrs_01_1931__a3d56893c48094f5daf2b0f42c1d4d264">For Spark on Yarn deployments, set this parameter to <strong id="mrs_01_1931__b938024084113116">true</strong>. Then, keys are automatically generated and distributed, and each application uses a unique key.</p>
</td>
<td class="cellrowborder" valign="top" width="12.590000000000002%" headers="mcps1.3.27.4.2.4.1.3 "><p id="mrs_01_1931__a2ce15216dd9b4b68a7385776c88b35fc">true</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="mrs_01_1931__s039b73942fdf48d784d903152d5c7f0b"><h4 class="sectiontitle">Compression</h4><p id="mrs_01_1931__a50df113df690403d8a604dfc2338e02f">Data compression is a policy that optimizes memory usage at the expense of CPU. Therefore, when the Spark memory is severely insufficient (this issue is common due to the characteristics of in-memory computing), data compression can greatly improve performance. Spark supports three compression algorithms: Snappy, LZ4, and LZF. The codec in use is set by <strong>spark.io.compression.codec</strong>, whose default value is lz4. Snappy invokes native methods to compress and decompress data, so when Snappy is used in Yarn mode, pay attention to the impact of non-heap memory on the container process.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__tb1170c2915ed4988a7b5a3be5c669002" frame="border" border="1" rules="all"><caption><b>Table 27 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__r51aef49d77094906a4aaeb094d4aca2e"><th align="left" class="cellrowborder" valign="top" width="25.840000000000003%" id="mcps1.3.28.3.2.4.1.1"><p id="mrs_01_1931__aaa432cc73ebd49b48cde4b8ce77dc697"><strong id="mrs_01_1931__a17516c24550c4440b95516f2132f8e5b">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="61.760000000000005%" id="mcps1.3.28.3.2.4.1.2"><p id="mrs_01_1931__a01057f498e5e43d793c86724a8a65534"><strong id="mrs_01_1931__ab7acc9d529004d459370a2ecba1af6d5">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="12.4%" id="mcps1.3.28.3.2.4.1.3"><p id="mrs_01_1931__aca847be0a5b4422c80dfa0aabddd4229"><strong id="mrs_01_1931__a8cef30c872df47e69c0f30ee5b0287d9">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__r739d697cb68c46a3bb1743094f2d941f"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.28.3.2.4.1.1 "><p id="mrs_01_1931__ac8470463da2d4b6da9b9102bf63cbe78">spark.io.compression.codec</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.28.3.2.4.1.2 "><p id="mrs_01_1931__a6f9a78704f994d1bad83f9a28237b512">Indicates the codec for compressing internal data, such as RDD partitions, broadcast variables, and shuffle output. By default, Spark supports three types of compression algorithm: LZ4, LZF, and Snappy. You can specify algorithms using fully qualified class names, such as <strong id="mrs_01_1931__b1835319289113116">org.apache.spark.io.LZ4CompressionCodec</strong>, <strong id="mrs_01_1931__b2143381190113116">org.apache.spark.io.LZFCompressionCodec</strong>, and <strong id="mrs_01_1931__b1502383458113116">org.apache.spark.io.SnappyCompressionCodec</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.28.3.2.4.1.3 "><p id="mrs_01_1931__aef4a4c9a120b4e7a8adaef93addb0ac0">lz4</p>
</td>
</tr>
<tr id="mrs_01_1931__rb97602acbff3405c998bede8557dbf69"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.28.3.2.4.1.1 "><p id="mrs_01_1931__a48f70748f5724a54b7d382cdd0a491df">spark.io.compression.lz4.block.size</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.28.3.2.4.1.2 "><p id="mrs_01_1931__a57543f9f78d5465cb1247af6127c8263">Indicates the block size (in bytes) used when the LZ4 compression algorithm is used. Reducing the block size also reduces the shuffle memory usage.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.28.3.2.4.1.3 "><p id="mrs_01_1931__af62c325047f34ca08338e7eda11b17b8">32768</p>
</td>
</tr>
<tr id="mrs_01_1931__re492a58f44814d45984743fc64a01886"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.28.3.2.4.1.1 "><p id="mrs_01_1931__a37aeef1bc3224ed6a802177936642293">spark.io.compression.snappy.block.size</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.28.3.2.4.1.2 "><p id="mrs_01_1931__afaf7cdc0dcf64c81b7824e182dc26aa7">Indicates the block size (in bytes) used when the Snappy compression algorithm is used. Reducing the block size also reduces the shuffle memory usage.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.28.3.2.4.1.3 "><p id="mrs_01_1931__ac537b379cbef4b109d8ebb66369894c5">32768</p>
</td>
</tr>
<tr id="mrs_01_1931__rcb3ee987fe484c8380028a91817c1323"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.28.3.2.4.1.1 "><p id="mrs_01_1931__ad9fe09490efa4f3da367b651724b870c">spark.shuffle.compress</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.28.3.2.4.1.2 "><p id="mrs_01_1931__ae9f4a507501049a68abad94e348f61a5">Indicates whether to compress the output files of a Map task. You are advised to compress these files. Compression uses <strong id="mrs_01_1931__b2041101007113116">spark.io.compression.codec</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.28.3.2.4.1.3 "><p id="mrs_01_1931__a9598cd0ec8e04dbc85d28ef577769436">true</p>
</td>
</tr>
<tr id="mrs_01_1931__r6510d2b1f5f8481ab8e78394527bafdf"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.28.3.2.4.1.1 "><p id="mrs_01_1931__a9811fc87dbde4aad8c35b547c5bef00b">spark.shuffle.spill.compress</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.28.3.2.4.1.2 "><p id="mrs_01_1931__a196f3cf3c62940ec99dbe886a59d6c9b">Indicates whether to compress the data spilled during shuffles. Compression uses <strong id="mrs_01_1931__b1145119801113116">spark.io.compression.codec</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.28.3.2.4.1.3 "><p id="mrs_01_1931__a781db13cb124478e9874551070ae6b99">true</p>
</td>
</tr>
<tr id="mrs_01_1931__rd9a3910908b9496c9ccceac66c9601d4"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.28.3.2.4.1.1 "><p id="mrs_01_1931__a4ae09ac5a4bd40dbb9169fd02e63f961">spark.eventLog.compress</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.28.3.2.4.1.2 "><p id="mrs_01_1931__a15ee983ca18b4c4aaacddb8cc8816204">Indicates whether to compress logged events when <strong id="mrs_01_1931__b425127401113116">spark.eventLog.enabled</strong> is set to <strong id="mrs_01_1931__b382466275113116">true</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.28.3.2.4.1.3 "><p id="mrs_01_1931__a4d249186eb85439d94abd71ac05c2f8d">false</p>
</td>
</tr>
<tr id="mrs_01_1931__rd402f162d4004ca7a9398fedccd3e279"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.28.3.2.4.1.1 "><p id="mrs_01_1931__a83874ef9f91541fc9f9ee1650ee85afd">spark.broadcast.compress</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.28.3.2.4.1.2 "><p id="mrs_01_1931__a44fe47d2ab9f497abd3ca2bf41139c0a">Indicates whether to compress broadcast variables before sending them. You are advised to compress the broadcast variables.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.28.3.2.4.1.3 "><p id="mrs_01_1931__aa60222c898f94157894f9e0324b30161">true</p>
</td>
</tr>
<tr id="mrs_01_1931__r020237780cdd407eb798e8727f2e4a0c"><td class="cellrowborder" valign="top" width="25.840000000000003%" headers="mcps1.3.28.3.2.4.1.1 "><p id="mrs_01_1931__a00b3fe7445e34ff58ace775112d3950e">spark.rdd.compress</p>
</td>
<td class="cellrowborder" valign="top" width="61.760000000000005%" headers="mcps1.3.28.3.2.4.1.2 "><p id="mrs_01_1931__a65d50a815f584c4c95a13dfda9c21f80">Indicates whether to compress serialized RDD partitions (for example, partitions stored with <strong id="mrs_01_1931__b1538037764113116">StorageLevel.MEMORY_ONLY_SER</strong>). Substantial space can be saved at the cost of some extra CPU time.</p>
</td>
<td class="cellrowborder" valign="top" width="12.4%" headers="mcps1.3.28.3.2.4.1.3 "><p id="mrs_01_1931__a64fa9b1dfe484235be6620c2501769c6">false</p>
</td>
</tr>
</tbody>
</table>
</div>
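<p>The following fragment is a minimal sketch of a compression configuration in <b>spark-defaults.conf</b> that uses the parameters in Table 27. The block size <b>16384</b> and the choice to enable <b>spark.rdd.compress</b> are illustrative assumptions, not recommended values. Tune them based on the memory and CPU resources available at your site.</p>
<pre class="screen"># Codec for internal data (RDD partitions, broadcast variables, and shuffle output)
spark.io.compression.codec             org.apache.spark.io.LZ4CompressionCodec
# Illustrative value: a smaller block size reduces shuffle memory usage
spark.io.compression.lz4.block.size    16384
# Compress Map task output files and data spilled during shuffles
spark.shuffle.compress                 true
spark.shuffle.spill.compress           true
# Compress broadcast variables before sending them
spark.broadcast.compress               true
# Illustrative choice: save space for serialized RDD partitions at the cost of extra CPU time
spark.rdd.compress                     true</pre>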
</div>
<div class="section" id="mrs_01_1931__se1e2c79b44ac4cfe9888dd1274f04289"><h4 class="sectiontitle">Reducing the Probability of Abnormal Client Application Operations When Resources Are Insufficient</h4><p id="mrs_01_1931__acc8532f4e8674d6f9bcd034b539298ef">When resources are insufficient, ApplicationMaster tasks must wait and will not be processed until enough resources are available for use. If the actual waiting time exceeds the configured waiting time, the ApplicationMaster tasks will be deleted. Adjust the following parameters to reduce the probability of abnormal client application operation.</p>
</div>
<p id="mrs_01_1931__a33492af620554226b834bf000918141d">Configure the following parameters in the <span class="filepath" id="mrs_01_1931__fa6c2fa5a488644ae937e117b0ff33efb"><b>spark-defaults.conf</b></span> file on the client.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1931__t58e24e8b7f7046e69f643661c41c79ac" frame="border" border="1" rules="all"><caption><b>Table 28 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1931__ra96cba67c2a546fab4870447501709cb"><th align="left" class="cellrowborder" valign="top" width="20.93%" id="mcps1.3.31.2.4.1.1"><p id="mrs_01_1931__a780b4a11b7394628ad5ea072920f3878"><strong id="mrs_01_1931__a7af6931b2807445b8d0aae2ab5932e9d">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="63.22%" id="mcps1.3.31.2.4.1.2"><p id="mrs_01_1931__aed2d5dcd74c24243aba24ccc33ac0b20"><strong id="mrs_01_1931__aafdd24ed4d0d4fa8ba2c374069a0e09b">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="15.85%" id="mcps1.3.31.2.4.1.3"><p id="mrs_01_1931__ac39d1171f9c2417b975b403432347da8"><strong id="mrs_01_1931__a2da44c67239b4d07ac221254be52bf1b">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1931__rdd0e7fae0c7c4085b15a5b8591d4147c"><td class="cellrowborder" valign="top" width="20.93%" headers="mcps1.3.31.2.4.1.1 "><p id="mrs_01_1931__ac674d71fc7e148e3a62d1d323501c3a6">spark.yarn.applicationMaster.waitTries</p>
</td>
<td class="cellrowborder" valign="top" width="63.22%" headers="mcps1.3.31.2.4.1.2 "><p id="mrs_01_1931__a12a995e78c6741409ac4d31840072e45">Specifies the number of times that ApplicationMaster waits for the Spark master, which is also the number of times that ApplicationMaster waits for SparkContext initialization. Increase this value to prevent ApplicationMaster tasks from being deleted and reduce the probability of abnormal client application operations.</p>
</td>
<td class="cellrowborder" valign="top" width="15.85%" headers="mcps1.3.31.2.4.1.3 "><p id="mrs_01_1931__ac88c24aa94b941068fd8f0d1b47c60d2">10</p>
</td>
</tr>
<tr id="mrs_01_1931__rd4ba8cf79af843489705d1ad3b9e4d6b"><td class="cellrowborder" valign="top" width="20.93%" headers="mcps1.3.31.2.4.1.1 "><p id="mrs_01_1931__a7c587f7a66ad4626b3de91af139c935c">spark.yarn.am.memory</p>
</td>
<td class="cellrowborder" valign="top" width="63.22%" headers="mcps1.3.31.2.4.1.2 "><p id="mrs_01_1931__a8242b485564f463d99092c3f3512a4e0">Specifies the ApplicationMaster memory. Increase this value to prevent ApplicationMaster tasks from being deleted by ResourceManager due to insufficient memory and reduce the probability of abnormal client application operations.</p>
</td>
<td class="cellrowborder" valign="top" width="15.85%" headers="mcps1.3.31.2.4.1.3 "><p id="mrs_01_1931__a48bce86e4c9d4395b4ac33929bf58f77">1G</p>
</td>
</tr>
</tbody>
</table>
</div>
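<p>The following fragment is a minimal sketch of how the parameters in Table 28 might be raised in <b>spark-defaults.conf</b> on the client. The values <b>20</b> and <b>2G</b> are illustrative assumptions only. Choose values based on how long resources typically take to become available in your cluster.</p>
<pre class="screen"># Illustrative value: wait longer for SparkContext initialization (default: 10)
spark.yarn.applicationMaster.waitTries    20
# Illustrative value: give ApplicationMaster more memory (default: 1G)
spark.yarn.am.memory                      2G</pre>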
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1928.html">Basic Operation</a></div>
</div>
</div>