forked from docs/doc-exports
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-committed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
<a name="dli_05_0062"></a><a name="dli_05_0062"></a>
<h1 class="topictitle1">Calling UDAFs in Spark SQL Jobs</h1>
<div id="body8662426"><div class="section" id="dli_05_0062__en-us_topic_0000001488009165_section647692455619"><h4 class="sectiontitle">Scenario</h4><p id="dli_05_0062__en-us_topic_0000001488009165_p4438202975617">DLI allows you to use a Hive User Defined Aggregation Function (UDAF) to process multiple rows of data. A Hive UDAF is typically used together with GROUP BY. Like the SUM() and AVG() functions commonly used in SQL, it is an aggregation function that combines multiple input rows into a single result.</p>
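<p>As a rough illustration of what an aggregation function does, the following plain-Java sketch groups rows by a key and reduces each group to one value, much as SUM() with GROUP BY would. It does not use DLI or Hive; the class name, the row layout, and the sample values are assumptions made for this example.</p>

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone sketch: not part of the DLI or Hive API.
class GroupByAggregationSketch {

    // A row with a grouping key and a numeric value (illustrative layout).
    static final class Row {
        final String key;
        final long value;

        Row(String key, long value) {
            this.key = key;
            this.value = value;
        }
    }

    // Collapses all rows that share a key into a single sum per key.
    static Map<String, Long> sumByKey(List<Row> rows) {
        Map<String, Long> sums = new TreeMap<>();
        for (Row row : rows) {
            sums.merge(row.key, row.value, Long::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("a", 1L), new Row("b", 10L), new Row("a", 2L));
        System.out.println(sumByKey(rows)); // prints {a=3, b=10}
    }
}
```

<p>Three input rows become two output rows, one per key. A UDAF plays the role of the summing step, with the engine taking care of the grouping.</p>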
</div>
<div class="section" id="dli_05_0062__en-us_topic_0000001488009165_section1193021082411"><h4 class="sectiontitle">Constraints</h4><ul id="dli_05_0062__en-us_topic_0000001488009165_ul3398194318243"><li id="dli_05_0062__en-us_topic_0000001488009165_li1478174512249">To perform UDAF-related operations on DLI, you must create a SQL queue rather than use the default queue.</li><li id="dli_05_0062__en-us_topic_0000001488009165_li16697105011537">When a UDAF is used across accounts, every user other than its creator must be granted permission before using it.<p id="dli_05_0062__en-us_topic_0000001488009165_p694585115538"><a name="dli_05_0062__en-us_topic_0000001488009165_li16697105011537"></a><a name="en-us_topic_0000001488009165_li16697105011537"></a>To grant the required permissions, log in to the DLI console and choose <strong id="dli_05_0062__b154915499161254">Data Management</strong> > <strong id="dli_05_0062__b74438397161254">Package Management</strong>. On the displayed page, select your UDAF Jar package and click <strong id="dli_05_0062__b203682677161254">Manage Permissions</strong> in the <strong id="dli_05_0062__b44287203261254">Operation</strong> column. On the permission management page, click <strong id="dli_05_0062__b199723475361254">Grant Permission</strong> in the upper right corner and select the required permissions.</p>
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li19560143132519">If you use a static class or interface in a UDF, wrap the code in a <strong id="dli_05_0062__b78521962761254">try catch</strong> block to capture exceptions. Otherwise, package conflicts may occur.</li></ul>
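<p>One way to apply the last constraint is sketched below: the static helper is touched only inside a try-catch block, so a class that fails to load or initialize degrades to a fallback value instead of propagating the error. This is an illustration under stated assumptions, not DLI code; SafeStaticAccess, the class names passed to it, and the fallback value are hypothetical.</p>

```java
// Hypothetical sketch: guards access to a static helper class with try-catch.
class SafeStaticAccess {

    // Tries to load a helper class by name; returns a fallback label on failure.
    static String describeHelper(String className) {
        try {
            Class<?> helper = Class.forName(className);
            return "loaded " + helper.getSimpleName();
        } catch (ClassNotFoundException | LinkageError e) {
            // A missing or conflicting dependency ends up here instead of
            // propagating out of the function.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeHelper("java.lang.String"));       // prints "loaded String"
        System.out.println(describeHelper("com.example.NoSuchUtil")); // prints "unavailable"
    }
}
```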
</div>
<div class="section" id="dli_05_0062__en-us_topic_0000001488009165_section1289584132818"><h4 class="sectiontitle">Environment Preparations</h4><p id="dli_05_0062__en-us_topic_0000001488009165_p15789124822812">Before you start, set up the development environment.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="dli_05_0062__en-us_topic_0000001488009165_table16607550103414" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Development environment</caption><thead align="left"><tr id="dli_05_0062__en-us_topic_0000001488009165_row1260855083418"><th align="left" class="cellrowborder" valign="top" width="26.72%" id="mcps1.3.3.3.2.3.1.1"><p id="dli_05_0062__en-us_topic_0000001488009165_p106081750173418">Item</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="73.28%" id="mcps1.3.3.3.2.3.1.2"><p id="dli_05_0062__en-us_topic_0000001488009165_p20608165013348">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="dli_05_0062__en-us_topic_0000001488009165_row156081050163413"><td class="cellrowborder" valign="top" width="26.72%" headers="mcps1.3.3.3.2.3.1.1 "><p id="dli_05_0062__en-us_topic_0000001488009165_p196081350193415">OS</p>
</td>
<td class="cellrowborder" valign="top" width="73.28%" headers="mcps1.3.3.3.2.3.1.2 "><p id="dli_05_0062__en-us_topic_0000001488009165_p13473171914392">Windows 7 or later</p>
</td>
</tr>
<tr id="dli_05_0062__en-us_topic_0000001488009165_row760835017341"><td class="cellrowborder" valign="top" width="26.72%" headers="mcps1.3.3.3.2.3.1.1 "><p id="dli_05_0062__en-us_topic_0000001488009165_p36081750153419">JDK</p>
</td>
<td class="cellrowborder" valign="top" width="73.28%" headers="mcps1.3.3.3.2.3.1.2 "><p id="dli_05_0062__en-us_topic_0000001488009165_p360935011349">JDK 1.8 (<a href="https://www.oracle.com/java/technologies/javase-downloads.html" target="_blank" rel="noopener noreferrer">Java downloads</a>).</p>
</td>
</tr>
<tr id="dli_05_0062__en-us_topic_0000001488009165_row560915504341"><td class="cellrowborder" valign="top" width="26.72%" headers="mcps1.3.3.3.2.3.1.1 "><p id="dli_05_0062__en-us_topic_0000001488009165_p17609125016347">IntelliJ IDEA</p>
</td>
<td class="cellrowborder" valign="top" width="73.28%" headers="mcps1.3.3.3.2.3.1.2 "><p id="dli_05_0062__en-us_topic_0000001488009165_p16609145043415"><a href="https://www.jetbrains.com/idea/" target="_blank" rel="noopener noreferrer">IntelliJ IDEA</a> is used for application development. The version of the tool must be 2019.1 or later.</p>
</td>
</tr>
<tr id="dli_05_0062__en-us_topic_0000001488009165_row7609205063410"><td class="cellrowborder" valign="top" width="26.72%" headers="mcps1.3.3.3.2.3.1.1 "><p id="dli_05_0062__en-us_topic_0000001488009165_p1321921083618">Maven</p>
</td>
<td class="cellrowborder" valign="top" width="73.28%" headers="mcps1.3.3.3.2.3.1.2 "><p id="dli_05_0062__en-us_topic_0000001488009165_p760945011348">Maven is used for project management throughout the software development lifecycle and is a basic part of the development environment. To get started, see <a href="https://maven.apache.org/download.cgi" target="_blank" rel="noopener noreferrer">Downloading Apache Maven</a> and <a href="https://maven.apache.org/install.html" target="_blank" rel="noopener noreferrer">Installing Apache Maven</a>.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="dli_05_0062__en-us_topic_0000001488009165_section14465145811424"><h4 class="sectiontitle">Development Process</h4><p id="dli_05_0062__en-us_topic_0000001488009165_p028917191431">The following figure shows the process of developing a UDAF.</p>
<div class="fignone" id="dli_05_0062__en-us_topic_0000001488009165_fig173705751414"><span class="figcap"><b>Figure 1 </b>Development process</span><br><span><img id="dli_05_0062__en-us_topic_0000001488009165_image1128016718244" src="en-us_image_0000001487274748.png"></span></div>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="dli_05_0062__en-us_topic_0000001488009165_table199217229155" frame="border" border="1" rules="all"><caption><b>Table 2 </b>Process description</caption><thead align="left"><tr id="dli_05_0062__en-us_topic_0000001488009165_row399252218151"><th align="left" class="cellrowborder" valign="top" width="6.35%" id="mcps1.3.4.4.2.5.1.1"><p id="dli_05_0062__en-us_topic_0000001488009165_p9992152215153">No.</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="28.71%" id="mcps1.3.4.4.2.5.1.2"><p id="dli_05_0062__en-us_topic_0000001488009165_p1699242211517">Phase</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="18.42%" id="mcps1.3.4.4.2.5.1.3"><p id="dli_05_0062__en-us_topic_0000001488009165_p109921122161520">Tool</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="46.52%" id="mcps1.3.4.4.2.5.1.4"><p id="dli_05_0062__en-us_topic_0000001488009165_p4719192682315">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="dli_05_0062__en-us_topic_0000001488009165_row17992152261511"><td class="cellrowborder" valign="top" width="6.35%" headers="mcps1.3.4.4.2.5.1.1 "><p id="dli_05_0062__en-us_topic_0000001488009165_p099213229158">1</p>
</td>
<td class="cellrowborder" valign="top" width="28.71%" headers="mcps1.3.4.4.2.5.1.2 "><p id="dli_05_0062__en-us_topic_0000001488009165_p8277350101617">Create a Maven project and configure the POM file.</p>
</td>
<td class="cellrowborder" rowspan="3" valign="top" width="18.42%" headers="mcps1.3.4.4.2.5.1.3 "><p id="dli_05_0062__en-us_topic_0000001488009165_p1196994912180">IntelliJ IDEA</p>
</td>
<td class="cellrowborder" rowspan="3" valign="top" width="46.52%" headers="mcps1.3.4.4.2.5.1.4 "><p id="dli_05_0062__en-us_topic_0000001488009165_p17720826192313">Write, compile, and package the UDAF code as described in <a href="#dli_05_0062__en-us_topic_0000001488009165_section1255957113015">Procedure</a>.</p>
</td>
</tr>
<tr id="dli_05_0062__en-us_topic_0000001488009165_row1299222218153"><td class="cellrowborder" valign="top" headers="mcps1.3.4.4.2.5.1.1 "><p id="dli_05_0062__en-us_topic_0000001488009165_p59921122171516">2</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.4.4.2.5.1.2 "><p id="dli_05_0062__en-us_topic_0000001488009165_p1627815506168">Write the UDAF code.</p>
</td>
</tr>
<tr id="dli_05_0062__en-us_topic_0000001488009165_row19993182291512"><td class="cellrowborder" valign="top" headers="mcps1.3.4.4.2.5.1.1 "><p id="dli_05_0062__en-us_topic_0000001488009165_p1099319228156">3</p>
</td>
<td class="cellrowborder" valign="top" headers="mcps1.3.4.4.2.5.1.2 "><p id="dli_05_0062__en-us_topic_0000001488009165_p92781850131619">Debug and compile the code, and package it into a Jar file.</p>
</td>
</tr>
<tr id="dli_05_0062__en-us_topic_0000001488009165_row6993422151515"><td class="cellrowborder" valign="top" width="6.35%" headers="mcps1.3.4.4.2.5.1.1 "><p id="dli_05_0062__en-us_topic_0000001488009165_p139931022141517">4</p>
</td>
<td class="cellrowborder" valign="top" width="28.71%" headers="mcps1.3.4.4.2.5.1.2 "><p id="dli_05_0062__en-us_topic_0000001488009165_p227805011620">Upload the Jar package to OBS.</p>
</td>
<td class="cellrowborder" valign="top" width="18.42%" headers="mcps1.3.4.4.2.5.1.3 "><p id="dli_05_0062__en-us_topic_0000001488009165_p99931522101511">OBS console</p>
</td>
<td class="cellrowborder" valign="top" width="46.52%" headers="mcps1.3.4.4.2.5.1.4 "><p id="dli_05_0062__en-us_topic_0000001488009165_p13946133492513">Upload the UDAF Jar file to an OBS path.</p>
</td>
</tr>
<tr id="dli_05_0062__en-us_topic_0000001488009165_row19993102241514"><td class="cellrowborder" valign="top" width="6.35%" headers="mcps1.3.4.4.2.5.1.1 "><p id="dli_05_0062__en-us_topic_0000001488009165_p4993822191514">5</p>
</td>
<td class="cellrowborder" valign="top" width="28.71%" headers="mcps1.3.4.4.2.5.1.2 "><p id="dli_05_0062__en-us_topic_0000001488009165_p20278150151620">Create a DLI package.</p>
</td>
<td class="cellrowborder" valign="top" width="18.42%" headers="mcps1.3.4.4.2.5.1.3 "><p id="dli_05_0062__en-us_topic_0000001488009165_p0993822111520">DLI console</p>
</td>
<td class="cellrowborder" valign="top" width="46.52%" headers="mcps1.3.4.4.2.5.1.4 "><p id="dli_05_0062__en-us_topic_0000001488009165_p37202264236">Select the UDAF Jar file that has been uploaded to OBS for management.</p>
</td>
</tr>
<tr id="dli_05_0062__en-us_topic_0000001488009165_row7316134914175"><td class="cellrowborder" valign="top" width="6.35%" headers="mcps1.3.4.4.2.5.1.1 "><p id="dli_05_0062__en-us_topic_0000001488009165_p1931634913173">6</p>
</td>
<td class="cellrowborder" valign="top" width="28.71%" headers="mcps1.3.4.4.2.5.1.2 "><p id="dli_05_0062__en-us_topic_0000001488009165_p208418016189">Create a UDAF on DLI.</p>
</td>
<td class="cellrowborder" valign="top" width="18.42%" headers="mcps1.3.4.4.2.5.1.3 "><p id="dli_05_0062__en-us_topic_0000001488009165_p7316449121713">DLI console</p>
</td>
<td class="cellrowborder" valign="top" width="46.52%" headers="mcps1.3.4.4.2.5.1.4 "><p id="dli_05_0062__en-us_topic_0000001488009165_p15720192610230">Create a UDAF on the SQL job management page of the DLI console.</p>
</td>
</tr>
<tr id="dli_05_0062__en-us_topic_0000001488009165_row115504265167"><td class="cellrowborder" valign="top" width="6.35%" headers="mcps1.3.4.4.2.5.1.1 "><p id="dli_05_0062__en-us_topic_0000001488009165_p6551152651612">7</p>
</td>
<td class="cellrowborder" valign="top" width="28.71%" headers="mcps1.3.4.4.2.5.1.2 "><p id="dli_05_0062__en-us_topic_0000001488009165_p172781650121618">Verify and use the UDAF.</p>
</td>
<td class="cellrowborder" valign="top" width="18.42%" headers="mcps1.3.4.4.2.5.1.3 "><p id="dli_05_0062__en-us_topic_0000001488009165_p55514264163">DLI console</p>
</td>
<td class="cellrowborder" valign="top" width="46.52%" headers="mcps1.3.4.4.2.5.1.4 "><p id="dli_05_0062__en-us_topic_0000001488009165_p472018263233">Use the UDAF in your DLI job.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="dli_05_0062__en-us_topic_0000001488009165_section1255957113015"><a name="dli_05_0062__en-us_topic_0000001488009165_section1255957113015"></a><a name="en-us_topic_0000001488009165_section1255957113015"></a><h4 class="sectiontitle">Procedure</h4><ol id="dli_05_0062__en-us_topic_0000001488009165_ol59071214135610"><li id="dli_05_0062__en-us_topic_0000001488009165_li57841417165617">Create a Maven project and configure the POM file. This step uses IntelliJ IDEA 2020.2 as an example.<ol type="a" id="dli_05_0062__en-us_topic_0000001488009165_ol1122718208712"><li id="dli_05_0062__en-us_topic_0000001488009165_li102261020872">Start IntelliJ IDEA and choose <strong id="dli_05_0062__b187283160461254">File</strong> > <strong id="dli_05_0062__b128816888961254">New</strong> > <strong id="dli_05_0062__b118412570861254">Project</strong>.<div class="fignone" id="dli_05_0062__en-us_topic_0000001488009165_fig152264201173"><span class="figcap"><b>Figure 2 </b>Creating a project</span><br><span><img id="dli_05_0062__en-us_topic_0000001488009165_image4226120975" src="en-us_image_0000001538394645.png"></span></div>
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li1422615201975">Choose <strong id="dli_05_0062__b61583963661254">Maven</strong>, set <strong id="dli_05_0062__b4765858861254">Project SDK</strong> to <strong id="dli_05_0062__b54370602361254">1.8</strong>, and click <strong id="dli_05_0062__b103027165261254">Next</strong>.<div class="fignone" id="dli_05_0062__en-us_topic_0000001488009165_fig1922614206714"><span class="figcap"><b>Figure 3 </b>Configuring the project SDK</span><br><span><img id="dli_05_0062__en-us_topic_0000001488009165_image152265201275" src="en-us_image_0000001487114864.png"></span></div>
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li1222617201872">Specify the project name and the project path, and click <strong id="dli_05_0062__b10606144761254">Create</strong>. On the displayed page, click <strong id="dli_05_0062__b210420277261254">Finish</strong>.<div class="fignone" id="dli_05_0062__en-us_topic_0000001488009165_fig82261820179"><span class="figcap"><b>Figure 4 </b>Setting project information</span><br><span><img id="dli_05_0062__en-us_topic_0000001488009165_image0226102017718" src="en-us_image_0000001538514573.png"></span></div>
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li0227820670">Add the following content to the <strong id="dli_05_0062__b22791902861254">pom.xml</strong> file.<pre class="screen" id="dli_05_0062__en-us_topic_0000001488009165_screen16227202016719"><dependencies>
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>1.2.1</version>
    </dependency>
</dependencies></pre>
<div class="fignone" id="dli_05_0062__en-us_topic_0000001488009165_fig1322712209719"><span class="figcap"><b>Figure 5 </b>Adding configurations to the POM file</span><br><span><img id="dli_05_0062__en-us_topic_0000001488009165_image16227520973" src="en-us_image_0000001487434660.png"></span></div>
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li10227122015712">Choose <strong id="dli_05_0062__b41140885261254">src</strong> > <strong id="dli_05_0062__b50709723561254">main</strong> and right-click the <strong id="dli_05_0062__b68809952061254">java</strong> folder. Choose <strong id="dli_05_0062__b71459364161254">New</strong> > <strong id="dli_05_0062__b51017391061254">Package</strong> to create a package and a class file.<div class="p" id="dli_05_0062__en-us_topic_0000001488009165_p0474198125615">Set <strong id="dli_05_0062__b150366838961254">Package</strong> as required. In this example, set <strong id="dli_05_0062__b154842289861254">Package</strong> to <strong id="dli_05_0062__b208673333761254">com.dli.demo</strong>.<div class="fignone" id="dli_05_0062__en-us_topic_0000001488009165_fig1016112715564"><span class="figcap"><b>Figure 6 </b>Creating a package</span><br><span><img id="dli_05_0062__en-us_topic_0000001488009165_image51611678561" src="en-us_image_0000001487594580.png"></span></div>
</div>
<p id="dli_05_0062__en-us_topic_0000001488009165_p516187195620">Create a Java Class file in the package path. In this example, the Java Class file is <strong id="dli_05_0062__b44856234461254">AvgFilterUDAFDemo</strong>.</p>
<div class="fignone" id="dli_05_0062__en-us_topic_0000001488009165_fig1216247175618"><span class="figcap"><b>Figure 7 </b>Creating a class</span><br><span><img id="dli_05_0062__en-us_topic_0000001488009165_image121623712565" src="en-us_image_0000001538354737.png"></span></div>
</li></ol>
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li682043171114">Write UDAF code. Pay attention to the following requirements when you implement the UDAF:<ul id="dli_05_0062__en-us_topic_0000001488009165_ul066452817574"><li id="dli_05_0062__en-us_topic_0000001488009165_li19664102818574">The UDAF implementation relies on the <strong id="dli_05_0062__b210246698861254">org.apache.hadoop.hive.ql.exec.UDAF</strong> class and the <strong id="dli_05_0062__b176474032961254">org.apache.hadoop.hive.ql.exec.UDAFEvaluator</strong> interface. The function class must inherit from the UDAF class, and the <strong id="dli_05_0062__b55329477461254">Evaluator</strong> class must implement the <strong id="dli_05_0062__b67605007061254">UDAFEvaluator</strong> interface.</li><li id="dli_05_0062__en-us_topic_0000001488009165_li26643285574">The <strong id="dli_05_0062__b104763781161254">Evaluator</strong> class must provide the <strong id="dli_05_0062__b184371166061254">init</strong>, <strong id="dli_05_0062__b95809176361254">iterate</strong>, <strong id="dli_05_0062__b60271042861254">terminatePartial</strong>, <strong id="dli_05_0062__b28256125361254">merge</strong>, and <strong id="dli_05_0062__b153601023761254">terminate</strong> functions. Of these, only init is declared by the <strong id="dli_05_0062__b94758136661254">UDAFEvaluator</strong> interface.<ul id="dli_05_0062__en-us_topic_0000001488009165_ul187121921105714"><li id="dli_05_0062__en-us_topic_0000001488009165_li1171214210577">The <strong id="dli_05_0062__b14716015361254">init</strong> function implements the <strong id="dli_05_0062__b89483286061254">init</strong> function of the <strong id="dli_05_0062__b94239691061254">UDAFEvaluator</strong> interface and initializes the aggregation state.</li><li id="dli_05_0062__en-us_topic_0000001488009165_li1971362135720">The <strong id="dli_05_0062__en-us_topic_0000001488009165_b4923192241311">iterate</strong> function receives the input parameters and folds them into the internal state.</li><li id="dli_05_0062__en-us_topic_0000001488009165_li11713102135717">The <strong id="dli_05_0062__b203228771161254">terminatePartial</strong> function takes no parameters. It returns the partial result accumulated once the <strong id="dli_05_0062__b131833216361254">iterate</strong> traversal is complete. <strong id="dli_05_0062__b191342500061254">terminatePartial</strong> is similar to a Hadoop <strong id="dli_05_0062__b32178098061254">Combiner</strong>.</li><li id="dli_05_0062__en-us_topic_0000001488009165_li1671382145717">The <strong id="dli_05_0062__b84540950561254">merge</strong> function receives the return value of <strong id="dli_05_0062__b137387091161254">terminatePartial</strong> and merges it into the current state.</li><li id="dli_05_0062__en-us_topic_0000001488009165_li1871320210572">The <strong id="dli_05_0062__en-us_topic_0000001488009165_b97934719146">terminate</strong> function returns the final aggregated result.</li></ul>
<p id="dli_05_0062__en-us_topic_0000001488009165_p791921717144">For details about how to implement the UDAF, see the following sample code:</p>
<pre class="screen" id="dli_05_0062__en-us_topic_0000001488009165_screen26639375257">package com.dli.demo;

import org.apache.hadoop.hive.ql.exec.UDAF;
import org.apache.hadoop.hive.ql.exec.UDAFEvaluator;

/***
 * @jdk jdk1.8.0
 * @version 1.0
 ***/
public class AvgFilterUDAFDemo extends UDAF {

    /**
     * Defines the static inner class <strong id="dli_05_0062__b34501339461254">PartialResult</strong>, which holds the intermediate aggregation state.
     */
    public static class PartialResult
    {
        public Long sum;
    }

    public static class VarianceEvaluator implements UDAFEvaluator {

        // Holds the <strong id="dli_05_0062__b64030062461254">PartialResult</strong> object of this evaluator.
        private AvgFilterUDAFDemo.PartialResult partial;

        // Declares a <strong id="dli_05_0062__b16648491561254">VarianceEvaluator</strong> constructor that has no parameters.
        public VarianceEvaluator(){
            this.partial = new AvgFilterUDAFDemo.PartialResult();
            init();
        }

        /**
         * Initializes the UDAF, which is similar to a constructor.
         */
        @Override
        public void init() {
            // Sets the initial value of <strong id="dli_05_0062__b1920369661254">sum</strong>.
            this.partial.sum = 0L;
        }

        /**
         * Receives input parameters for internal iteration.
         * @param x the input value of the current row; null values are skipped
         */
        public void iterate(Long x) {
            if (x == null) {
                return;
            }
            // Note: this sample combines values with bitwise OR (|).
            // Replace | with + to compute an arithmetic sum.
            this.partial.sum = this.partial.sum | x;
        }

        /**
         * Returns the data obtained after the <strong id="dli_05_0062__b213564760161254">iterate</strong> traversal is complete.
         * <strong id="dli_05_0062__b193626176361254">terminatePartial</strong> is similar to Hadoop <strong id="dli_05_0062__b114639375661254">Combiner</strong>.
         * @return the partial result accumulated so far
         */
        public AvgFilterUDAFDemo.PartialResult terminatePartial()
        {
            return this.partial;
        }

        /**
         * Receives the return values of <strong id="dli_05_0062__b106022850461254">terminatePartial</strong> and merges the data.
         * @param pr the partial result returned by another evaluator
         */
        public void merge(AvgFilterUDAFDemo.PartialResult pr)
        {
            // Skips empty partial results to avoid a NullPointerException.
            if (pr == null || pr.sum == null) {
                return;
            }
            // Uses the same operator as iterate so that both stages agree.
            this.partial.sum = this.partial.sum | pr.sum;
        }

        /**
         * Returns the aggregated result.
         * @return the final value, or 0 if no state is available
         */
        public Long terminate()
        {
            if (this.partial.sum == null) {
                return 0L;
            }
            return this.partial.sum;
        }
    }
}</pre>
</li></ul>
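<p>The call order described above can be traced without a cluster. The following sketch is a plain-Java stand-in, assuming two partitions that are aggregated separately and then merged. It mirrors the method names of the evaluator but does not extend the Hive classes, uses arithmetic addition as the aggregation, and is driven by a hypothetical aggregate helper rather than by the Spark or Hive runtime.</p>

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a UDAF evaluator; it does not depend on Hive.
class EvaluatorLifecycleSketch {

    // Intermediate state exchanged between partial aggregations.
    static final class PartialSum {
        long sum;
    }

    private PartialSum partial;

    EvaluatorLifecycleSketch() {
        init();
    }

    // Resets the aggregation state.
    void init() {
        partial = new PartialSum();
    }

    // Folds one input value into the state; null inputs are skipped.
    void iterate(Long x) {
        if (x != null) {
            partial.sum += x;
        }
    }

    // Hands over the state accumulated so far.
    PartialSum terminatePartial() {
        return partial;
    }

    // Combines the state produced by another evaluator into this one.
    void merge(PartialSum other) {
        if (other != null) {
            partial.sum += other.sum;
        }
    }

    // Produces the final result.
    Long terminate() {
        return partial.sum;
    }

    // Simulates the runtime: one evaluator per partition, then a final merge.
    static Long aggregate(List<List<Long>> partitions) {
        EvaluatorLifecycleSketch finalStage = new EvaluatorLifecycleSketch();
        for (List<Long> partition : partitions) {
            EvaluatorLifecycleSketch mapSide = new EvaluatorLifecycleSketch();
            for (Long value : partition) {
                mapSide.iterate(value);
            }
            finalStage.merge(mapSide.terminatePartial());
        }
        return finalStage.terminate();
    }

    public static void main(String[] args) {
        List<List<Long>> partitions = Arrays.asList(
                Arrays.asList(1L, 2L, null),
                Arrays.asList(3L, 4L));
        System.out.println(aggregate(partitions)); // prints 10
    }
}
```

<p>Because merge only ever sees the output of terminatePartial, the partial result type must carry everything needed to finish the computation. An average, for example, would need both a sum and a count.</p>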
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li17751185518266">Use IntelliJ IDEA to compile the code and package it into a JAR file.<ol type="a" id="dli_05_0062__en-us_topic_0000001488009165_ol369523152718"><li id="dli_05_0062__en-us_topic_0000001488009165_li118510322918">Click <strong id="dli_05_0062__b212794875961254">Maven</strong> in the toolbar on the right, and then click <strong id="dli_05_0062__b79316762161254">clean</strong> and <strong id="dli_05_0062__b40572191661254">compile</strong> to compile the code.<p id="dli_05_0062__en-us_topic_0000001488009165_p18851139298">After the compilation succeeds, click <strong id="dli_05_0062__b23997056661254">package</strong>.</p>
<div class="fignone" id="dli_05_0062__en-us_topic_0000001488009165_fig157441227163112"><span class="figcap"><b>Figure 8 </b>Exporting the Jar file</span><br><span><img id="dli_05_0062__en-us_topic_0000001488009165_image1874432703113" src="en-us_image_0000001538394649.png"></span></div>
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li201810498335">The generated JAR package is stored in the <strong id="dli_05_0062__b178268642761254">target</strong> directory. In this example, <strong id="dli_05_0062__b207902989161254">MyUDAF-1.0-SNAPSHOT.jar</strong> is stored in <strong id="dli_05_0062__b95721518261254">D:\DLITest\MyUDAF\target</strong>.</li></ol>
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li963172810347">Log in to the OBS console and upload the JAR file to an OBS path.<div class="note" id="dli_05_0062__en-us_topic_0000001488009165_note364414226366"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="dli_05_0062__en-us_topic_0000001488009165_p102910218370">The OBS bucket to which the Jar package is uploaded must be in the same region as the DLI queue. Cross-region operations are not allowed.</p>
</div></div>
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li1073875911212">(Optional) Upload the file to DLI for package management.<ol type="a" id="dli_05_0062__en-us_topic_0000001488009165_ol581320251731"><li id="dli_05_0062__en-us_topic_0000001488009165_li928318201531">Log in to the DLI management console and choose <strong id="dli_05_0062__b57038131561254">Data Management</strong> > <strong id="dli_05_0062__b29909131661254">Package Management</strong>.</li><li id="dli_05_0062__en-us_topic_0000001488009165_li1128312208316">On the <strong id="dli_05_0062__b139610950161254">Package Management</strong> page, click <strong id="dli_05_0062__b54006927561254">Create</strong> in the upper right corner.</li><li id="dli_05_0062__en-us_topic_0000001488009165_li1731916351419">In the <strong id="dli_05_0062__b120732969561254">Create Package</strong> dialog, set the following parameters:<ul id="dli_05_0062__en-us_topic_0000001488009165_ul4484181932715"><li id="dli_05_0062__en-us_topic_0000001488009165_li348431915277"><strong id="dli_05_0062__b206382130961254">Type</strong>: Select <strong id="dli_05_0062__b140017201061254">JAR</strong>.</li><li id="dli_05_0062__en-us_topic_0000001488009165_li17484131911277"><strong id="dli_05_0062__b154219453961254">OBS Path</strong>: Specify the OBS path for storing the package.</li><li id="dli_05_0062__en-us_topic_0000001488009165_li1048417198274">Set <strong id="dli_05_0062__b95598880461254">Group</strong> and <strong id="dli_05_0062__b164734307061254">Group Name</strong> as required for package identification and management.</li></ul>
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li1017113321450">Click <strong id="dli_05_0062__b49922647661254">OK</strong>.</li></ol>
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li13507814611"><a name="dli_05_0062__en-us_topic_0000001488009165_li13507814611"></a><a name="en-us_topic_0000001488009165_li13507814611"></a>Create the UDAF on DLI.<ol type="a" id="dli_05_0062__en-us_topic_0000001488009165_ol587820351787"><li id="dli_05_0062__en-us_topic_0000001488009165_li1399374110815">Log in to the DLI management console and create a SQL queue and a database.</li><li id="dli_05_0062__en-us_topic_0000001488009165_li11877103519810">Log in to the DLI console and choose <strong id="dli_05_0062__b2375525961254">SQL Editor</strong>. Set <strong id="dli_05_0062__b158663424761254">Engine</strong> to <strong id="dli_05_0062__b9042545661254">spark</strong>, and select the created SQL queue and database.</li><li id="dli_05_0062__en-us_topic_0000001488009165_li108786351183">In the SQL editing area, enter the following statement to create the UDAF, and then click <strong id="dli_05_0062__b116981222661254">Execute</strong>.<div class="note" id="dli_05_0062__en-us_topic_0000001488009165_note118771135487"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="dli_05_0062__en-us_topic_0000001488009165_p087715353815">If the reloading function of the UDAF is enabled, create the function with the CREATE OR REPLACE FUNCTION form of the statement.</p>
</div></div>
<pre class="screen" id="dli_05_0062__en-us_topic_0000001488009165_screen8878173512810">CREATE FUNCTION AvgFilterUDAFDemo AS 'com.dli.demo.AvgFilterUDAFDemo' USING JAR 'obs://dli-test-obs01/MyUDAF-1.0-SNAPSHOT.jar';</pre>
<p id="dli_05_0062__p1915174092314">Or</p>
<pre class="screen" id="dli_05_0062__screen9885154262320">CREATE OR REPLACE FUNCTION AvgFilterUDAFDemo AS 'com.dli.demo.AvgFilterUDAFDemo' USING JAR 'obs://dli-test-obs01/MyUDAF-1.0-SNAPSHOT.jar';</pre>
</li></ol>
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li675502913919">Restart the original SQL queue for the added function to take effect.<ol type="a" id="dli_05_0062__en-us_topic_0000001488009165_ol760819311793"><li id="dli_05_0062__en-us_topic_0000001488009165_li2778832803">Log in to the DLI management console and choose <strong id="dli_05_0062__b73494451361254">Resources</strong> > <strong id="dli_05_0062__b8780883361254">Queue Management</strong> from the navigation pane. In the <strong id="dli_05_0062__b157218854061254">Operation</strong> column of the SQL queue, click <strong id="dli_05_0062__b34734956761254">Restart</strong>.</li><li id="dli_05_0062__en-us_topic_0000001488009165_li52511753698">In the <strong id="dli_05_0062__b173285253061254">Restart</strong> dialog box, click <strong id="dli_05_0062__b178525720161254">OK</strong>.</li></ol>
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li3839192481013">Use the UDAF.<p id="dli_05_0062__en-us_topic_0000001488009165_p11755171213288"><a name="dli_05_0062__en-us_topic_0000001488009165_li3839192481013"></a><a name="en-us_topic_0000001488009165_li3839192481013"></a>Use the UDAF created in step <a href="#dli_05_0062__en-us_topic_0000001488009165_li13507814611">6</a> in a query statement:</p>
<pre class="screen" id="dli_05_0062__en-us_topic_0000001488009165_screen145271112120">SELECT AvgFilterUDAFDemo(real_stock_rate) AS show_rate FROM dw_ad_estimate_real_stock_rate LIMIT 1000;</pre>
</li><li id="dli_05_0062__en-us_topic_0000001488009165_li1876813021218">(Optional) Delete the UDAF.<p id="dli_05_0062__en-us_topic_0000001488009165_p876823071211"><a name="dli_05_0062__en-us_topic_0000001488009165_li1876813021218"></a><a name="en-us_topic_0000001488009165_li1876813021218"></a>If the UDAF is no longer used, run the following statement to delete it:</p>
<pre class="screen" id="dli_05_0062__en-us_topic_0000001488009165_screen991012381122">DROP FUNCTION AvgFilterUDAFDemo;</pre>
</li></ol>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dli_09_0120.html">SQL Jobs</a></div>
</div>
</div>