Yang, Tong 6182f91ba8 MRS component operation guide_normal 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-09 14:55:21 +00:00


<a name="mrs_01_24071"></a>
<h1 class="topictitle1">Parquet/Avro Schema Mismatch Is Reported When Updated Data Is Written</h1>
<div id="body0000001146783027"><div class="section" id="mrs_01_24071__section1453810451898"><h4 class="sectiontitle">Question</h4><p id="mrs_01_24071__p178111501326">The following error is reported when data is written:</p>
<pre class="screen" id="mrs_01_24071__screen256782103318">org.apache.parquet.io.InvalidRecordException: Parquet/Avro schema mismatch: Avro field 'col1' not found</pre>
</div>
<div class="section" id="mrs_01_24071__section11310239101"><h4 class="sectiontitle">Answer</h4><p id="mrs_01_24071__p182882343158">You are advised to evolve schemas in a backward-compatible way when using Hudi. This error usually occurs when you delete a column, such as <strong id="mrs_01_24071__b17144812151318">col1</strong>, in a backward-incompatible way and then update a record in a Parquet file that was written with the old schema, which still contains <strong id="mrs_01_24071__b6145212111311">col1</strong>. In this case, Parquet tries to find all the fields present in the existing file in the incoming record. Because <strong id="mrs_01_24071__b114571214135">col1</strong> does not exist in the incoming record, the preceding exception is thrown.</p>
<p id="mrs_01_24071__p8288153410151">To solve this problem, create an uber schema that contains the fields of all the schema versions the table has evolved through, and use this uber schema as the target schema. One way to do this is to obtain the table schema from Hive MetaStore and merge it with the current schema.</p>
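<p>The merge step can be sketched as follows. This is a minimal illustration in plain Python, not a Hudi or Hive API. The function names <strong>merge_avro_schemas</strong> and <strong>make_nullable</strong>, the record name <strong>trip</strong>, and the fields <strong>id</strong> and <strong>col2</strong> are hypothetical, and both schemas are written inline as Avro JSON instead of being fetched from Hive MetaStore. The sketch assumes that any field missing from at least one version should become nullable with a null default, so that records written with either schema can still be resolved.</p>
<pre class="screen">import json


def make_nullable(field):
    """Return a copy of the field whose type is a union starting with null."""
    field_type = field["type"]
    branches = field_type if isinstance(field_type, list) else [field_type]
    non_null = [b for b in branches if b != "null"]
    # Avro requires a default to match the first branch of a union,
    # so "null" goes first and the default is null (None in Python).
    return {**field, "type": ["null"] + non_null, "default": None}


def merge_avro_schemas(schemas):
    """Build an uber schema from record schemas ordered oldest to newest."""
    merged = {}  # field name mapped to its definition, in first-seen order
    for schema in schemas:
        for field in schema["fields"]:
            # A later version overrides an earlier definition of the same field.
            merged[field["name"]] = dict(field)

    uber_fields = []
    for name, field in merged.items():
        in_every_version = all(
            any(f["name"] == name for f in s["fields"]) for s in schemas
        )
        # Fields dropped or added along the way must be optional.
        uber_fields.append(field if in_every_version else make_nullable(field))

    return {"type": "record", "name": schemas[-1]["name"], "fields": uber_fields}


# Hypothetical old schema (for example, as stored in Hive MetaStore).
old_schema = {"type": "record", "name": "trip", "fields": [
    {"name": "id", "type": "string"},
    {"name": "col1", "type": "string"},
]}
# Hypothetical current schema, in which col1 was deleted and col2 was added.
new_schema = {"type": "record", "name": "trip", "fields": [
    {"name": "id", "type": "string"},
    {"name": "col2", "type": "long"},
]}

uber_schema = merge_avro_schemas([old_schema, new_schema])
print(json.dumps(uber_schema, indent=2))</pre>
<p>In this sketch, the resulting uber schema keeps <strong>col1</strong> as an optional field next to the new <strong>col2</strong>, so an incoming record that lacks <strong>col1</strong> no longer conflicts with Parquet files written with the old schema. Type conflicts between versions are not handled here and would need to be resolved separately.</p>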
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_24070.html">Data Write</a></div>
</div>
</div>