Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Su, Xiaomeng <suxiaomeng1@huawei.com> Co-committed-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
232 lines
18 KiB
HTML
<a name="dli_08_15016"></a>
<h1 class="topictitle1">Avro</h1>
<div id="body0000001309855881"><div class="section" id="dli_08_15016__section1699332117215"><h4 class="sectiontitle">Function</h4><p id="dli_08_15016__p457512386218">Flink can read and write Avro data based on an Avro schema. The Avro schema is derived from the table schema.</p>
<p id="dli_08_15016__p31920232323">For details, see <a href="https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/table/formats/avro/" target="_blank" rel="noopener noreferrer">Avro Format</a>.</p>
</div>
<div class="section" id="dli_08_15016__section0152105362113"><h4 class="sectiontitle">Supported Connectors</h4><ul id="dli_08_15016__ul7219843143018"><li id="dli_08_15016__li421954373011">Kafka</li><li id="dli_08_15016__li3158115212300">Upsert Kafka</li><li id="dli_08_15016__li776010427481">FileSystem</li></ul>
</div>
<div class="section" id="dli_08_15016__section7549418211"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="dli_08_15016__table6232172462311" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameters</caption><thead align="left"><tr id="dli_08_15016__row18232132462319"><th align="left" class="cellrowborder" valign="top" width="10.83%" id="mcps1.3.3.2.2.6.1.1"><p id="dli_08_15016__p2232142432313">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="29.17%" id="mcps1.3.3.2.2.6.1.2"><p id="dli_08_15016__p323210243235">Mandatory</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.2.6.1.3"><p id="dli_08_15016__p323232415231">Default value</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.2.6.1.4"><p id="dli_08_15016__p14232132416238">Type</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.2.6.1.5"><p id="dli_08_15016__p42321224172315">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="dli_08_15016__row17232182412231"><td class="cellrowborder" valign="top" width="10.83%" headers="mcps1.3.3.2.2.6.1.1 "><p id="dli_08_15016__p54831039182715">format</p>
</td>
<td class="cellrowborder" valign="top" width="29.17%" headers="mcps1.3.3.2.2.6.1.2 "><p id="dli_08_15016__p9483639172712">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.3 "><p id="dli_08_15016__p11483103911278">None</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.4 "><p id="dli_08_15016__p4483039152719">String</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.5 "><p id="dli_08_15016__p2483103915272">Format to be used. Set the value to <strong id="dli_08_15016__b1067516101393">avro</strong>.</p>
</td>
</tr>
<tr id="dli_08_15016__row1023212410233"><td class="cellrowborder" valign="top" width="10.83%" headers="mcps1.3.3.2.2.6.1.1 "><p id="dli_08_15016__p648333942714">avro.codec</p>
</td>
<td class="cellrowborder" valign="top" width="29.17%" headers="mcps1.3.3.2.2.6.1.2 "><p id="dli_08_15016__p16483183918279">No</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.3 "><p id="dli_08_15016__p154834393274">None</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.4 "><p id="dli_08_15016__p1848363902711">String</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.5 "><p id="dli_08_15016__p12483143992710">Compression codec for Avro. This parameter applies only to the FileSystem connector. If it is not set, Snappy compression is used. Valid values are <strong id="dli_08_15016__b561253664820">null</strong>, <strong id="dli_08_15016__b126931138134810">deflate</strong>, <strong id="dli_08_15016__b3839339134818">snappy</strong>, <strong id="dli_08_15016__b191811042124819">bzip2</strong>, and <strong id="dli_08_15016__b8163847114813">xz</strong>.</p>
</td>
</tr>
</tbody>
</table>
</div>
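<p>As a minimal sketch of how the two parameters fit together, the following statement defines a FileSystem sink that writes Avro files compressed with <strong>deflate</strong>. The table name <strong>fsSink</strong>, the columns, and the OBS path are placeholders assumed for illustration; replace them with your own values.</p>
<pre class="screen">-- Hypothetical FileSystem sink; fsSink and the path are illustrative placeholders.
CREATE TABLE fsSink (
  order_id string,
  pay_amount double
) WITH (
  'connector' = 'filesystem',
  'path' = 'obs://bucketName/outputPath',  -- placeholder path, assumed to be an OBS location
  'format' = 'avro',
  'avro.codec' = 'deflate'                 -- one of: null, deflate, snappy, bzip2, xz
);</pre>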
</div>
<div class="section" id="dli_08_15016__section6691185734518"><h4 class="sectiontitle">Data Type Mapping</h4><p id="dli_08_15016__p12512158104613">Currently, the Avro schema is always derived from the table schema and cannot be explicitly defined. The following table lists the mappings between Flink and Avro types.</p>
<p id="dli_08_15016__p1247461719478">In addition to the following types, Flink supports reading and writing nullable types. Flink maps a nullable type to Avro <strong id="dli_08_15016__b868418311532">union(something, null)</strong>, where <strong id="dli_08_15016__b97297525462">something</strong> is the Avro type converted from the Flink type.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="dli_08_15016__table127011632174812" frame="border" border="1" rules="all"><caption><b>Table 2 </b>Data type mapping</caption><thead align="left"><tr id="dli_08_15016__row6701153244811"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.4.4.2.4.1.1"><p id="dli_08_15016__p3701203218486">Flink SQL Type</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.4.4.2.4.1.2"><p id="dli_08_15016__p2070153214817">Avro Type</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.4.4.2.4.1.3"><p id="dli_08_15016__p137011632114817">Avro Logical Type</p>
</th>
</tr>
</thead>
<tbody><tr id="dli_08_15016__row19701173224810"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p370112322488">CHAR/VARCHAR/STRING</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p270183214484">string</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p77011932174814">-</p>
</td>
</tr>
<tr id="dli_08_15016__row147011332174815"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p137011332154819">BOOLEAN</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p370112325486">boolean</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p770110325482">-</p>
</td>
</tr>
<tr id="dli_08_15016__row17011232104817"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p17011532104816">BINARY/VARBINARY</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p3701532134810">bytes</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p070183217489">-</p>
</td>
</tr>
<tr id="dli_08_15016__row67013329481"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p97011322488">DECIMAL</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p1670183213488">fixed</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p15701532154815">decimal</p>
</td>
</tr>
<tr id="dli_08_15016__row19701153284818"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p177015322486">TINYINT</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p5701173210482">int</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p1770163254810">-</p>
</td>
</tr>
<tr id="dli_08_15016__row4701173224815"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p9701123214484">SMALLINT</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p11701103214819">int</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p11701332154814">-</p>
</td>
</tr>
<tr id="dli_08_15016__row107018327484"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p17017323481">INT</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p1701133234816">int</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p187011132114817">-</p>
</td>
</tr>
<tr id="dli_08_15016__row18701332144820"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p4701173216484">BIGINT</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p370133254820">long</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p127011632144818">-</p>
</td>
</tr>
<tr id="dli_08_15016__row177011432134819"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p470173224817">FLOAT</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p4701163215481">float</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p117018325484">-</p>
</td>
</tr>
<tr id="dli_08_15016__row670133224818"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p3701173274814">DOUBLE</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p1170123211486">double</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p13701732174819">-</p>
</td>
</tr>
<tr id="dli_08_15016__row570103210486"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p87014325483">DATE</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p15701103214489">int</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p147010322486">date</p>
</td>
</tr>
<tr id="dli_08_15016__row147012032104818"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p20701103264818">TIME</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p1270163224810">int</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p18701123210482">time-millis</p>
</td>
</tr>
<tr id="dli_08_15016__row770123214811"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p27016329485">TIMESTAMP</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p20701173264814">long</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p1370103244811">timestamp-millis</p>
</td>
</tr>
<tr id="dli_08_15016__row6701532114817"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p2701173214483">ARRAY</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p19701532164817">array</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p1270116324482">-</p>
</td>
</tr>
<tr id="dli_08_15016__row127015324484"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p0701163216480">MAP (keys must be of the string, char, or varchar type.)</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p16701143244811">map</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p177018324480">-</p>
</td>
</tr>
<tr id="dli_08_15016__row1570163264815"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p1070163217486">MULTISET (elements must be of the string, char, or varchar type.)</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p13702832194820">map</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p12702133244815">-</p>
</td>
</tr>
<tr id="dli_08_15016__row137021432184818"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.1 "><p id="dli_08_15016__p10702232124819">ROW</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.2 "><p id="dli_08_15016__p8702203219487">record</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.4.4.2.4.1.3 "><p id="dli_08_15016__p47024324482">-</p>
</td>
</tr>
</tbody>
</table>
</div>
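<p>To make the mapping concrete, the sketch below annotates a table definition with the Avro types that Table 2 and the nullable rule would be expected to produce. The table name <strong>avroTypesDemo</strong> and its columns are hypothetical, and the Kafka options reuse the placeholders from the example in this topic.</p>
<pre class="screen">-- Illustrative only: the comments restate Table 2; they are not generated output.
CREATE TABLE avroTypesDemo (
  id         BIGINT NOT NULL,   -- Avro long
  name       STRING,            -- nullable: union(string, null)
  paid       BOOLEAN NOT NULL,  -- Avro boolean
  order_date DATE,              -- nullable: union(int with logical type date, null)
  created_at TIMESTAMP(3)       -- nullable: union(long with logical type timestamp-millis, null)
) WITH (
  'connector' = 'kafka',
  'topic' = 'kafkaTopic',
  'properties.bootstrap.servers' = 'KafkaAddress1:KafkaPort,KafkaAddress2:KafkaPort',
  'properties.group.id' = 'GroupId',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'avro'
);</pre>
<p>Declaring a column <strong>NOT NULL</strong> is what keeps it out of a union in this sketch; every other column is nullable by default.</p>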
</div>
<div class="section" id="dli_08_15016__section1451265719219"><h4 class="sectiontitle">Example</h4><p id="dli_08_15016__p10979854205910">Read data from Kafka, deserialize the data from the Avro format, and output the data to Print.</p>
<ol id="dli_08_15016__ol1276162418571"><li id="dli_08_15016__li04031578234"><span>Create a datasource connection for access to the VPC and subnet where Kafka is located and bind the connection to the queue. Set a security group and inbound rule to allow access from the queue, and test the connectivity of the queue using the Kafka IP address. For example, locate a general-purpose queue where the job runs and choose <strong id="dli_08_15016__b797218545485">More</strong> > <strong id="dli_08_15016__b897235418485">Test Address Connectivity</strong> in the <strong id="dli_08_15016__b997275434810">Operation</strong> column. If the connection is successful, the datasource is bound to the queue. Otherwise, the binding fails.</span></li><li id="dli_08_15016__li1599913011242"><span>Create a Flink OpenSource SQL job and select Flink 1.15. Copy the following statement and submit the job:</span><p><pre class="screen" id="dli_08_15016__screen175599544570">CREATE TABLE kafkaSource (
  order_id string,
  order_channel string,
  order_time string,
  pay_amount double,
  real_pay double,
  pay_time string,
  user_id string,
  user_name string,
  area_id string
) WITH (
  'connector' = 'kafka',
  'topic' = '<em id="dli_08_15016__i217094374613"><strong id="dli_08_15016__b177598427467">kafkaTopic</strong></em>',
  'properties.bootstrap.servers' = '<em id="dli_08_15016__i366724417446"><strong id="dli_08_15016__b1466716449449">KafkaAddress1:KafkaPort,KafkaAddress2:KafkaPort</strong></em>',
  'properties.group.id' = '<em id="dli_08_15016__i2107121104713"><strong id="dli_08_15016__b13675001471">GroupId</strong></em>',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'avro'
);

CREATE TABLE printSink (
  order_id string,
  order_channel string,
  order_time string,
  pay_amount double,
  real_pay double,
  pay_time string,
  user_id string,
  user_name string,
  area_id string
) WITH (
  'connector' = 'print'
);

insert into printSink select * from kafkaSource;</pre>
</p></li><li id="dli_08_15016__li122251732830"><span>Insert the following data into Kafka using Avro data serialization:</span><p><pre class="screen" id="dli_08_15016__screen144861201921">{"order_id":"202103241000000001","order_channel":"webShop","order_time":"2021-03-24 10:00:00","pay_amount":100.0,"real_pay":100.0,"pay_time":"2021-03-24 10:02:03","user_id":"0001","user_name":"Alice","area_id":"330106"}
{"order_id":"202103241606060001","order_channel":"appShop","order_time":"2021-03-24 16:06:06","pay_amount":200.0,"real_pay":180.0,"pay_time":"2021-03-24 16:10:06","user_id":"0001","user_name":"Alice","area_id":"330106"}</pre>
</p></li><li id="dli_08_15016__li4353143193117"><span>Perform the following operations to view the data result in the <strong id="dli_08_15016__b27117147892656">taskmanager.out</strong> file:</span><p><ol type="a" id="dli_08_15016__ol864115198285"><li id="dli_08_15016__li10901621122819">Log in to the DLI console. In the navigation pane, choose <strong id="dli_08_15016__b125992276391918">Job Management</strong> > <strong id="dli_08_15016__b141370065891918">Flink Jobs</strong>.</li><li id="dli_08_15016__li1912163912282">Click the name of the corresponding Flink job, choose <strong id="dli_08_15016__b7858005591929">Run Log</strong>, click <strong id="dli_08_15016__b36279761991929">OBS Bucket</strong>, and locate the folder of the log you want to view according to the date.</li><li id="dli_08_15016__li0641191914285">Go to the folder of the date, find the folder whose name contains <strong id="dli_08_15016__b164230351242750">taskmanager</strong>, download the <strong id="dli_08_15016__b100660138742750">.out</strong> file, and view result logs.</li></ol>
<pre class="screen" id="dli_08_15016__screen399124820519">+I[202103241000000001, webShop, 2021-03-24 10:00:00, 100.0, 100.0, 2021-03-24 10:02:03, 0001, Alice, 330106]
+I[202103241606060001, appShop, 2021-03-24 16:06:06, 200.0, 180.0, 2021-03-24 16:10:06, 0001, Alice, 330106]</pre>
</p></li></ol>
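<p>The steps above cover reading only. As a hedged sketch of the write direction, a separate job could serialize rows to the same topic in Avro format, which is also one possible way to produce the test data used in this example. The table name <strong>kafkaSink</strong> is hypothetical, and its columns are assumed to match <strong>kafkaSource</strong> exactly, because the Avro schema is derived from the table schema on both sides.</p>
<pre class="screen">-- Hypothetical sink; column definitions mirror kafkaSource so that the derived Avro schemas agree.
CREATE TABLE kafkaSink (
  order_id string,
  order_channel string,
  order_time string,
  pay_amount double,
  real_pay double,
  pay_time string,
  user_id string,
  user_name string,
  area_id string
) WITH (
  'connector' = 'kafka',
  'topic' = 'kafkaTopic',
  'properties.bootstrap.servers' = 'KafkaAddress1:KafkaPort,KafkaAddress2:KafkaPort',
  'format' = 'avro'
);

-- Write the two sample records; the amounts are cast explicitly to match the double columns.
INSERT INTO kafkaSink VALUES
  ('202103241000000001', 'webShop', '2021-03-24 10:00:00', CAST(100.0 AS DOUBLE), CAST(100.0 AS DOUBLE), '2021-03-24 10:02:03', '0001', 'Alice', '330106'),
  ('202103241606060001', 'appShop', '2021-03-24 16:06:06', CAST(200.0 AS DOUBLE), CAST(180.0 AS DOUBLE), '2021-03-24 16:10:06', '0001', 'Alice', '330106');</pre>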
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dli_08_15014.html">Formats</a></div>
</div>
</div>