doc-exports/docs/dli/sqlreference/dli_08_15026.html
Su, Xiaomeng be9eabe464 dli_sqlreference_20250305
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
Co-committed-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
2025-03-25 09:06:21 +00:00


<a name="dli_08_15026"></a>
<h1 class="topictitle1">Raw</h1>
<div id="body0000001262495798"><div class="section" id="dli_08_15026__section21131854311"><h4 class="sectiontitle">Function</h4><p id="dli_08_15026__p1106133684412">The Raw format allows you to read and write raw (byte-based) values as a single column.</p>
<div class="note" id="dli_08_15026__note173491744114816"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="dli_08_15026__ul12961416134910"><li id="dli_08_15026__li29661616496">This format encodes null values as a null of the byte[] type. This can be a limitation when the format is used with <strong id="dli_08_15026__b15604149594277">upsert-kafka</strong>, because <strong id="dli_08_15026__b8028846374277">upsert-kafka</strong> treats null values as tombstone messages (DELETE on the key). Therefore, avoid combining the <strong id="dli_08_15026__b10791740125014">upsert-kafka</strong> connector with the <strong id="dli_08_15026__b1517704945018">raw</strong> format as <strong id="dli_08_15026__b182375218500">value.format</strong> if the field can have a null value.</li><li id="dli_08_15026__li199661616495">The raw format is built in, so no additional dependencies are required. For details, see <a href="https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/table/formats/raw/" target="_blank" rel="noopener noreferrer">Raw Format</a>.</li></ul>
</div></div>
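<p>Because the format maps one byte sequence to one column, it can also be applied to only one part of a Kafka record. The following is a minimal sketch, assuming the Kafka connector options <strong>key.format</strong>, <strong>key.fields</strong>, <strong>value.format</strong>, and <strong>value.fields-include</strong> are available in your environment. The table name, column names, and connection values are hypothetical placeholders.</p>
<pre class="screen">-- Sketch only: the Kafka message key is decoded with the raw format into user_id,
-- while the message value is decoded as JSON into the remaining columns.
CREATE TABLE kafkaKeyedSource (
  user_id STRING,
  item_id BIGINT,
  behavior STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'kafkaTopic',
  'properties.bootstrap.servers' = 'KafkaAddress1:KafkaPort,KafkaAddress2:KafkaPort',
  'properties.group.id' = 'GroupId',
  'scan.startup.mode' = 'latest-offset',
  'key.format' = 'raw',
  'key.fields' = 'user_id',
  'value.format' = 'json',
  'value.fields-include' = 'EXCEPT_KEY'
);</pre>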
</div>
<div class="section" id="dli_08_15026__section122491371116"><h4 class="sectiontitle">Supported Connectors</h4><ul id="dli_08_15026__ul188074312166"><li id="dli_08_15026__li14357112884017">Kafka</li><li id="dli_08_15026__li109046505172">Upsert Kafka</li><li id="dli_08_15026__li161416115184">FileSystem</li></ul>
</div>
<div class="section" id="dli_08_15026__section861233454311"><h4 class="sectiontitle">Parameter Description</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="dli_08_15026__table9325840115315" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameter description</caption><thead align="left"><tr id="dli_08_15026__row1732694010539"><th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.2.6.1.1"><p id="dli_08_15026__p93267409538">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.2.6.1.2"><p id="dli_08_15026__p173261640155311">Mandatory</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.2.6.1.3"><p id="dli_08_15026__p10341171813543">Default Value</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.2.6.1.4"><p id="dli_08_15026__p1032617403532">Type</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="20%" id="mcps1.3.3.2.2.6.1.5"><p id="dli_08_15026__p053368175414">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="dli_08_15026__row123261240135316"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.1 "><p id="dli_08_15026__p1660815540549">format</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.2 "><p id="dli_08_15026__p7608854165417">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.3 "><p id="dli_08_15026__p2608954125415">None</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.4 "><p id="dli_08_15026__p86085545542">String</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.5 "><p id="dli_08_15026__p156081554105413">Format to be used. Set this parameter to <strong id="dli_08_15026__b194972128524">raw</strong>.</p>
</td>
</tr>
<tr id="dli_08_15026__row14326114075311"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.1 "><p id="dli_08_15026__p1608195435416">raw.charset</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.2 "><p id="dli_08_15026__p660985455411">No</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.3 "><p id="dli_08_15026__p1160905485412">UTF-8</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.4 "><p id="dli_08_15026__p176091354145410">String</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.5 "><p id="dli_08_15026__p126091954205414">Charset used to encode the text string.</p>
</td>
</tr>
<tr id="dli_08_15026__row13326194014531"><td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.1 "><p id="dli_08_15026__p060985485415">raw.endianness</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.2 "><p id="dli_08_15026__p17609254165418">No</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.3 "><p id="dli_08_15026__p460995413547">big-endian</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.4 "><p id="dli_08_15026__p20609175415544">String</p>
</td>
<td class="cellrowborder" valign="top" width="20%" headers="mcps1.3.3.2.2.6.1.5 "><p id="dli_08_15026__p19609145415543">Endianness used to encode the bytes of numeric values. Valid values are <strong id="dli_08_15026__b191534417545">big-endian</strong> and <strong id="dli_08_15026__b1350245419535">little-endian</strong>. For details, see <a href="https://en.wikipedia.org/wiki/Endianness" target="_blank" rel="noopener noreferrer">endianness</a>.</p>
</td>
</tr>
</tbody>
</table>
</div>
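<p>The optional parameters in the table above are set in the <strong>WITH</strong> clause next to <strong>format</strong>. The following is a minimal sketch rather than a definitive configuration: the table names, the column names, and the <strong>ISO-8859-1</strong> charset value are assumptions chosen for illustration, and the connection values are placeholders.</p>
<pre class="screen">-- Sketch 1: a single INT column decoded from four little-endian bytes.
-- For the value 1, the bytes are 01 00 00 00 in little-endian
-- and 00 00 00 01 in big-endian (the default).
CREATE TABLE kafkaCounterSource (
  cnt INT
) WITH (
  'connector' = 'kafka',
  'topic' = 'kafkaTopic',
  'properties.bootstrap.servers' = 'KafkaAddress1:KafkaPort,KafkaAddress2:KafkaPort',
  'properties.group.id' = 'GroupId',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'raw',
  'raw.endianness' = 'little-endian'
);

-- Sketch 2: a single STRING column decoded with a non-default charset.
CREATE TABLE kafkaTextSource (
  log STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'kafkaTopic',
  'properties.bootstrap.servers' = 'KafkaAddress1:KafkaPort,KafkaAddress2:KafkaPort',
  'properties.group.id' = 'GroupId',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'raw',
  'raw.charset' = 'ISO-8859-1'
);</pre>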
</div>
<div class="section" id="dli_08_15026__section1051753152015"><h4 class="sectiontitle">Data Type Mapping</h4><p id="dli_08_15026__p1978815710203">The table below lists the SQL types that the format supports and describes how the value of each type is encoded and decoded.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="dli_08_15026__table12172416122119" frame="border" border="1" rules="all"><caption><b>Table 2 </b>Data type mapping</caption><thead align="left"><tr id="dli_08_15026__row16172716112113"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.4.3.2.3.1.1"><p id="dli_08_15026__p9173131682115">Flink SQL Type</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.4.3.2.3.1.2"><p id="dli_08_15026__p1517315161214">Value</p>
</th>
</tr>
</thead>
<tbody><tr id="dli_08_15026__row1617318169212"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.1 "><p id="dli_08_15026__p1417321615219">CHAR / VARCHAR / STRING</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.2 "><p id="dli_08_15026__p131739169216">A UTF-8 (by default) encoded text string. The encoding charset can be configured by <strong id="dli_08_15026__b1619312314121">raw.charset</strong>.</p>
</td>
</tr>
<tr id="dli_08_15026__row1617341632114"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.1 "><p id="dli_08_15026__p41731168217">BINARY / VARBINARY / BYTES</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.2 "><p id="dli_08_15026__p121735161214">The sequence of bytes itself.</p>
</td>
</tr>
<tr id="dli_08_15026__row5173916182116"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.1 "><p id="dli_08_15026__p151739163215">BOOLEAN</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.2 "><p id="dli_08_15026__p2173161619218">A single byte that indicates the boolean value: <strong id="dli_08_15026__b527415810129">0</strong> means <strong id="dli_08_15026__b23532002138">false</strong> and <strong id="dli_08_15026__b15349834131">1</strong> means <strong id="dli_08_15026__b175153419135">true</strong>.</p>
</td>
</tr>
<tr id="dli_08_15026__row18173151632116"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.1 "><p id="dli_08_15026__p1717317167218">TINYINT</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.2 "><p id="dli_08_15026__p2017341622116">A single byte of the signed number value.</p>
</td>
</tr>
<tr id="dli_08_15026__row17173131642111"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.1 "><p id="dli_08_15026__p1717321602118">SMALLINT</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.2 "><p id="dli_08_15026__p18173191612219">Two bytes with big-endian (by default) encoding. The endianness can be configured by <strong id="dli_08_15026__b9396177203">raw.endianness</strong>.</p>
</td>
</tr>
<tr id="dli_08_15026__row2173111617212"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.1 "><p id="dli_08_15026__p171731216132118">INT</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.2 "><p id="dli_08_15026__p1017313169217">Four bytes with big-endian (by default) encoding. The endianness can be configured by <strong id="dli_08_15026__b11416632182014">raw.endianness</strong>.</p>
</td>
</tr>
<tr id="dli_08_15026__row13173161614215"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.1 "><p id="dli_08_15026__p10173171614213">BIGINT</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.2 "><p id="dli_08_15026__p7173916202115">Eight bytes with big-endian (by default) encoding. The endianness can be configured by <strong id="dli_08_15026__b43443585209">raw.endianness</strong>.</p>
</td>
</tr>
<tr id="dli_08_15026__row13173141642111"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.1 "><p id="dli_08_15026__p517361611212">FLOAT</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.2 "><p id="dli_08_15026__p12173151613212">Four bytes with IEEE 754 format and big-endian (by default) encoding. The endianness can be configured by <strong id="dli_08_15026__b1518720165217">raw.endianness</strong>.</p>
</td>
</tr>
<tr id="dli_08_15026__row317316166216"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.1 "><p id="dli_08_15026__p1317371616218">DOUBLE</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.2 "><p id="dli_08_15026__p1617331622119">Eight bytes with IEEE 754 format and big-endian (by default) encoding. The endianness can be configured by <strong id="dli_08_15026__b147991337172116">raw.endianness</strong>.</p>
</td>
</tr>
<tr id="dli_08_15026__row16173131622115"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.1 "><p id="dli_08_15026__p18173131617219">RAW</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.3.2.3.1.2 "><p id="dli_08_15026__p16173181617218">The sequence of bytes serialized by the underlying TypeSerializer of the RAW type.</p>
</td>
</tr>
</tbody>
</table>
</div>
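<p>The declared column type decides how the message bytes are interpreted. As a minimal sketch of the <strong>BYTES</strong> mapping in the table above, the following statements pass each Kafka message value through unchanged as a byte sequence. The table and column names are hypothetical, and the connection values are placeholders.</p>
<pre class="screen">-- Sketch only: with a BYTES column, the raw format applies no charset
-- or endianness conversion; the column holds the message bytes as they are.
CREATE TABLE kafkaBytesSource (
  payload BYTES
) WITH (
  'connector' = 'kafka',
  'topic' = 'kafkaTopic',
  'properties.bootstrap.servers' = 'KafkaAddress1:KafkaPort,KafkaAddress2:KafkaPort',
  'properties.group.id' = 'GroupId',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'raw'
);
CREATE TABLE bytesPrintSink (
  payload BYTES
) WITH (
  'connector' = 'print'
);
INSERT INTO bytesPrintSink SELECT payload FROM kafkaBytesSource;</pre>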
</div>
<div class="section" id="dli_08_15026__section2958175584318"><h4 class="sectiontitle">Example</h4><p id="dli_08_15026__p1375334113113">Read data from Kafka and output it to the Print connector.</p>
<ol id="dli_08_15026__ol14231744133117"><li id="dli_08_15026__li04031578234"><span>Create a datasource connection to the VPC and subnet where Kafka is located and bind the connection to the queue. Set a security group and inbound rule to allow access from the queue, and test the connectivity of the queue using the Kafka IP address. For example, locate the general-purpose queue where the job runs and choose <strong id="dli_08_15026__b6555621192614">More</strong> &gt; <strong id="dli_08_15026__b75551721132618">Test Address Connectivity</strong> in the <strong id="dli_08_15026__b16555821192619">Operation</strong> column. If the connection is successful, the datasource is bound to the queue. Otherwise, the binding fails.</span></li><li id="dli_08_15026__li1599913011242"><span>Create a Flink OpenSource SQL job and select Flink 1.15. Copy the following statement and submit the job:</span><p><pre class="screen" id="dli_08_15026__screen525011213328">CREATE TABLE kafkaSource (
log string
) WITH (
'connector' = 'kafka',
'topic' = '<em id="dli_08_15026__i2092805817216"><strong id="dli_08_15026__b269205811212">kafkaTopic</strong></em>',
'properties.bootstrap.servers' = '<em id="dli_08_15026__i13283145113512"><strong id="dli_08_15026__b4283751135115">KafkaAddress1:KafkaPort,KafkaAddress2:KafkaPort</strong></em>',
'properties.group.id' = '<em id="dli_08_15026__i99651417310"><strong id="dli_08_15026__b443871734">GroupId</strong></em>',
'scan.startup.mode' = 'latest-offset',
'format' = 'raw'
);
CREATE TABLE printSink (
log string
) WITH (
'connector' = 'print'
);
insert into printSink select * from kafkaSource; </pre>
</p></li><li id="dli_08_15026__li18839185317311"><span>Send the following data to the corresponding topic in Kafka:</span><p><pre class="screen" id="dli_08_15026__screen56532035203216">47.29.201.179 - - [28/Feb/2019:13:17:10 +0000] "GET /?p=1 HTTP/2.0" 200 5316 "https://domain.com/?p=1" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36" "2.75"</pre>
</p></li><li id="dli_08_15026__li4353143193117"><span>Perform the following operations to view the result in the <strong id="dli_08_15026__b10233708999275">taskmanager.out</strong> file:</span><p><ol type="a" id="dli_08_15026__ol864115198285"><li id="dli_08_15026__li10901621122819">Log in to the DLI console. In the navigation pane, choose <strong id="dli_08_15026__b173092608391927">Job Management</strong> &gt; <strong id="dli_08_15026__b100570924291927">Flink Jobs</strong>.</li><li id="dli_08_15026__li1912163912282">Click the name of the corresponding Flink job, choose <strong id="dli_08_15026__b190426799891937">Run Log</strong>, click <strong id="dli_08_15026__b210703689891937">OBS Bucket</strong>, and locate the log folder for the date you want to view.</li><li id="dli_08_15026__li0641191914285">Open that folder, find the folder whose name contains <strong id="dli_08_15026__b14088613114277">taskmanager</strong>, download the <strong id="dli_08_15026__b1797810474277">.out</strong> file, and view the result logs.</li></ol>
<pre class="screen" id="dli_08_15026__screen13510115814327">+I[47.29.201.179 - - [28/Feb/2019:13:17:10 +0000] "GET /?p=1 HTTP/2.0" 200 5316 "https://domain.com/?p=1" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36" "2.75"]</pre>
</p></li></ol>
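<p>Because the raw format delivers each message as one string, any field extraction happens in the query. The following is a minimal sketch that builds on the <strong>kafkaSource</strong> table from the example, assuming the built-in <strong>REGEXP_EXTRACT</strong> function is available in your Flink version. The sink table name, the column names, and the regular expressions are hypothetical and are written for the sample log line only.</p>
<pre class="screen">-- Sketch only: pull the client IP address and the HTTP status code out of the raw log line.
CREATE TABLE parsedPrintSink (
  client_ip STRING,
  status STRING
) WITH (
  'connector' = 'print'
);
INSERT INTO parsedPrintSink
SELECT
  -- Leading run of non-space characters (the client IP address in the sample line).
  REGEXP_EXTRACT(log, '^([^ ]+)', 1) AS client_ip,
  -- Three digits that follow the closing quote of the request string.
  REGEXP_EXTRACT(log, '" ([0-9]{3}) ', 1) AS status
FROM kafkaSource;</pre>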
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dli_08_15014.html">Formats</a></div>
</div>
</div>