[Bug] Got a "can not write FileMetaData" exception for parquet format #1730

Closed
prm-xingcan opened this issue Aug 3, 2023 · 6 comments
Labels: bug

prm-xingcan commented Aug 3, 2023

Search before asking

  • I searched in the issues and found nothing similar.

Paimon version

0.5-SNAPSHOT

Compute Engine

Flink 1.16.2

Minimal reproduce step

  • The table schema is really complicated.

  • We use MinIO for storage (an illustrative setup sketch follows).
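
For orientation, here is a minimal sketch of the kind of setup implied here: a Flink Table API job writing a Paimon primary-key table into a MinIO-backed s3a warehouse. Everything in it (catalog, table, and column names, and the nested-array column suggested by the comments further down) is an illustrative assumption, not the reporter's actual schema:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ReproSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // Paimon catalog on a MinIO-backed s3a warehouse (endpoint and
        // credentials come from the usual s3a Hadoop options, omitted here).
        tEnv.executeSql(
                "CREATE CATALOG paimon WITH ("
                        + " 'type' = 'paimon',"
                        + " 'warehouse' = 's3a://path-to-warehouse'"
                        + ")");
        tEnv.useCatalog("paimon");
        // A primary-key table (the trace below goes through MergeTreeWriter)
        // with a nested array column; parquet is the format that fails,
        // while avro and orc reportedly work.
        tEnv.executeSql(
                "CREATE TABLE t ("
                        + " id BIGINT,"
                        + " payload ROW<items ARRAY<STRING>>,"
                        + " PRIMARY KEY (id) NOT ENFORCED"
                        + ") WITH ('file.format' = 'parquet')");
    }
}
```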

What doesn't meet your expectations?

2023-08-03 14:55:37 [Writer -> view -> Global Committer -> view -> Sink: end (1/1)#42] WARN  org.apache.paimon.io.SingleFileWriter                        [] - Exception occurs when closing file s3a://path-to-table/foo.parquet. Cleaning up.
java.io.IOException: can not write FileMetaData(version:1, schema:[*******])
	at org.apache.paimon.shade.org.apache.parquet.format.Util.write(Util.java:376) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.shade.org.apache.parquet.format.Util.writeFileMetaData(Util.java:143) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.shade.org.apache.parquet.format.Util.writeFileMetaData(Util.java:138) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileWriter.serializeFooter(ParquetFileWriter.java:1333) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:1196) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.shade.org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:132) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetWriter.close(ParquetWriter.java:319) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.format.parquet.writer.ParquetBulkWriter.finish(ParquetBulkWriter.java:57) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.io.SingleFileWriter.close(SingleFileWriter.java:145) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.io.RollingFileWriter.closeCurrentWriter(RollingFileWriter.java:108) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.io.RollingFileWriter.close(RollingFileWriter.java:145) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.mergetree.MergeTreeWriter.flushWriteBuffer(MergeTreeWriter.java:208) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.mergetree.MergeTreeWriter.prepareCommit(MergeTreeWriter.java:229) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.operation.AbstractFileStoreWrite.prepareCommit(AbstractFileStoreWrite.java:178) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.table.sink.TableWriteImpl.prepareCommit(TableWriteImpl.java:157) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.flink.sink.StoreSinkWriteImpl.prepareCommit(StoreSinkWriteImpl.java:187) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.flink.sink.TableWriteOperator.prepareCommit(TableWriteOperator.java:118) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.flink.sink.RowDataStoreWriteOperator.prepareCommit(RowDataStoreWriteOperator.java:183) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.flink.sink.PrepareCommitOperator.emitCommittables(PrepareCommitOperator.java:110) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.paimon.flink.sink.PrepareCommitOperator.prepareSnapshotPreBarrier(PrepareCommitOperator.java:90) ~[blob_p-2258027c6ed64f64bd7d0647a2f49d33ca0de848-101b36893f6ba48422020fc2e8febf80:?]
	at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.prepareSnapshotPreBarrier(RegularOperatorChain.java:89) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointState(SubtaskCheckpointCoordinatorImpl.java:334) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$13(StreamTask.java:1286) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:1274) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:1231) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:147) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.triggerCheckpoint(SingleCheckpointBarrierHandler.java:287) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.access$100(SingleCheckpointBarrierHandler.java:64) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler$ControllerImpl.triggerGlobalCheckpoint(SingleCheckpointBarrierHandler.java:488) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.io.checkpointing.AbstractAlignedBarrierHandlerState.triggerGlobalCheckpoint(AbstractAlignedBarrierHandlerState.java:74) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.io.checkpointing.AbstractAlignedBarrierHandlerState.barrierReceived(AbstractAlignedBarrierHandlerState.java:66) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.lambda$processBarrier$2(SingleCheckpointBarrierHandler.java:234) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.markCheckpointAlignedAndTransformState(SingleCheckpointBarrierHandler.java:262) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.processBarrier(SingleCheckpointBarrierHandler.java:231) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:181) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:159) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:110) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:545) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:836) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:785) ~[flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935) [flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:914) [flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728) [flink-dist-1.16.2.jar:1.16.2]
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550) [flink-dist-1.16.2.jar:1.16.2]
	at java.lang.Thread.run(Unknown Source) [?:?]
2023-08-03 22:31:46
java.io.IOException: Could not perform checkpoint 2 for operator Writer -> delivery_log -> Global Committer -> delivery_log -> Sink: end (1/1)#1.
	at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:1243)
	at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:147)
	at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.triggerCheckpoint(SingleCheckpointBarrierHandler.java:287)
	at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.access$100(SingleCheckpointBarrierHandler.java:64)
	at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler$ControllerImpl.triggerGlobalCheckpoint(SingleCheckpointBarrierHandler.java:488)
	at org.apache.flink.streaming.runtime.io.checkpointing.AbstractAlignedBarrierHandlerState.triggerGlobalCheckpoint(AbstractAlignedBarrierHandlerState.java:74)
	at org.apache.flink.streaming.runtime.io.checkpointing.AbstractAlignedBarrierHandlerState.barrierReceived(AbstractAlignedBarrierHandlerState.java:66)
	at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.lambda$processBarrier$2(SingleCheckpointBarrierHandler.java:234)
	at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.markCheckpointAlignedAndTransformState(SingleCheckpointBarrierHandler.java:262)
	at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.processBarrier(SingleCheckpointBarrierHandler.java:231)
	at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:181)
	at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:159)
	at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:110)
	at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:545)
	at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:836)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:785)
	at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
	at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:914)
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
	at java.base/java.lang.Thread.run(Unknown Source)
	at org.apache.paimon.flink.sink.StoreSinkWriteImpl.prepareCommit(StoreSinkWriteImpl.java:192)
	at org.apache.paimon.flink.sink.TableWriteOperator.prepareCommit(TableWriteOperator.java:118)
	at org.apache.paimon.flink.sink.RowDataStoreWriteOperator.prepareCommit(RowDataStoreWriteOperator.java:183)
	at org.apache.paimon.flink.sink.PrepareCommitOperator.emitCommittables(PrepareCommitOperator.java:110)
	at org.apache.paimon.flink.sink.PrepareCommitOperator.prepareSnapshotPreBarrier(PrepareCommitOperator.java:90)
	at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.prepareSnapshotPreBarrier(RegularOperatorChain.java:89)
	at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointState(SubtaskCheckpointCoordinatorImpl.java:334)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$13(StreamTask.java:1286)
	at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:1274)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:1231)
	... 22 more
Caused by: java.io.IOException: can not write FileMetaData(version:1, schema:[*****])
	at org.apache.paimon.shade.org.apache.parquet.format.Util.write(Util.java:376)
	at org.apache.paimon.shade.org.apache.parquet.format.Util.writeFileMetaData(Util.java:143)
	at org.apache.paimon.shade.org.apache.parquet.format.Util.writeFileMetaData(Util.java:138)
	at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileWriter.serializeFooter(ParquetFileWriter.java:1333)
	at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:1196)
	at org.apache.paimon.shade.org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:132)
	at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetWriter.close(ParquetWriter.java:319)
	at org.apache.paimon.format.parquet.writer.ParquetBulkWriter.finish(ParquetBulkWriter.java:57)
	at org.apache.paimon.io.SingleFileWriter.close(SingleFileWriter.java:145)
	at org.apache.paimon.io.RollingFileWriter.closeCurrentWriter(RollingFileWriter.java:108)
	at org.apache.paimon.io.RollingFileWriter.close(RollingFileWriter.java:145)
	at org.apache.paimon.mergetree.MergeTreeWriter.flushWriteBuffer(MergeTreeWriter.java:208)
	at org.apache.paimon.mergetree.MergeTreeWriter.prepareCommit(MergeTreeWriter.java:229)
	at org.apache.paimon.operation.AbstractFileStoreWrite.prepareCommit(AbstractFileStoreWrite.java:178)
	at org.apache.paimon.table.sink.TableWriteImpl.prepareCommit(TableWriteImpl.java:157)
	at org.apache.paimon.flink.sink.StoreSinkWriteImpl.prepareCommit(StoreSinkWriteImpl.java:187)
	... 32 more
Caused by: org.apache.paimon.shade.parquet.org.apache.thrift.transport.TTransportException: java.io.IOException: Filesystem WriteOperationHelper {bucket=event-logs} closed
	at org.apache.paimon.shade.parquet.org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:199)
	at org.apache.paimon.shade.parquet.org.apache.thrift.protocol.TCompactProtocol.writeByteDirect(TCompactProtocol.java:482)
	at org.apache.paimon.shade.parquet.org.apache.thrift.protocol.TCompactProtocol.writeByteDirect(TCompactProtocol.java:489)
	at org.apache.paimon.shade.parquet.org.apache.thrift.protocol.TCompactProtocol.writeFieldBeginInternal(TCompactProtocol.java:263)
	at org.apache.paimon.shade.parquet.org.apache.thrift.protocol.TCompactProtocol.writeFieldBegin(TCompactProtocol.java:245)
	at org.apache.paimon.shade.org.apache.parquet.format.InterningProtocol.writeFieldBegin(InterningProtocol.java:71)
	at org.apache.paimon.shade.org.apache.parquet.format.FileMetaData$FileMetaDataStandardScheme.write(FileMetaData.java:1390)
	at org.apache.paimon.shade.org.apache.parquet.format.FileMetaData$FileMetaDataStandardScheme.write(FileMetaData.java:1240)
	at org.apache.paimon.shade.org.apache.parquet.format.FileMetaData.write(FileMetaData.java:1118)
	at org.apache.paimon.shade.org.apache.parquet.format.Util.write(Util.java:373)
	... 47 more
Caused by: java.io.IOException: Filesystem WriteOperationHelper {bucket=event-logs} closed
	at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.checkOpen(S3ABlockOutputStream.java:250)
	at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.write(S3ABlockOutputStream.java:301)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:62)
	at java.base/java.io.DataOutputStream.write(Unknown Source)
	at org.apache.flink.fs.s3hadoop.common.HadoopDataOutputStream.write(HadoopDataOutputStream.java:47)
	at org.apache.flink.core.fs.FSDataOutputStreamWrapper.write(FSDataOutputStreamWrapper.java:65)
	at org.apache.paimon.flink.FlinkFileIO$FlinkPositionOutputStream.write(FlinkFileIO.java:189)
	at org.apache.paimon.format.parquet.writer.PositionOutputStreamAdapter.write(PositionOutputStreamAdapter.java:54)
	at org.apache.paimon.shade.parquet.org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:197)
	... 56 more

Anything else?

  • I'm writing multiple tables in a single job with a bunch of INSERT INTO statements, and some tables write parquet files without any issue;
  • Used default settings for tables;
  • It works well for both avro and orc formats (see the workaround sketch below).
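
Given that avro and orc are unaffected, a possible stopgap is to pin the problematic table to one of those formats through Paimon's per-table file.format option. A hedged sketch with a made-up schema:

```java
import org.apache.flink.table.api.TableEnvironment;

final class FormatWorkaroundSketch {
    // Recreate the failing table with orc (or avro) until the parquet
    // writer issue is resolved; table and column names are illustrative.
    static void pinToOrc(TableEnvironment tEnv) {
        tEnv.executeSql(
                "CREATE TABLE t_orc ("
                        + " id BIGINT,"
                        + " payload ROW<items ARRAY<STRING>>,"
                        + " PRIMARY KEY (id) NOT ENFORCED"
                        + ") WITH ('file.format' = 'orc')");
    }
}
```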

Are you willing to submit a PR?

  • I'm willing to submit a PR!
prm-xingcan added the bug label on Aug 3, 2023
JingsongLi (Contributor) commented

@prm-xingcan Hi, this is weird; I've never encountered it before. It looks like an environmental issue.

prm-xingcan (Author) commented

Hi @JingsongLi, yes, it's weird. I'll try the same job in more environments.

prm-xingcan (Author) commented Aug 7, 2023

@JingsongLi It turns out to be caused by another exception, thrown when writing out a JoinedRow:

java.lang.RuntimeException: org.apache.paimon.shade.org.apache.parquet.io.ParquetEncodingException: empty fields are illegal, the field should be ommited completely instead
	at org.apache.paimon.format.parquet.writer.ParquetRowDataBuilder$ParquetWriteSupport.write(ParquetRowDataBuilder.java:77) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.format.parquet.writer.ParquetRowDataBuilder$ParquetWriteSupport.write(ParquetRowDataBuilder.java:56) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.shade.org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:138) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:310) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.format.parquet.writer.ParquetBulkWriter.addElement(ParquetBulkWriter.java:47) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.io.SingleFileWriter.writeImpl(SingleFileWriter.java:100) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.io.StatsCollectingSingleFileWriter.write(StatsCollectingSingleFileWriter.java:70) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.io.KeyValueDataFileWriter.write(KeyValueDataFileWriter.java:106) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.io.KeyValueDataFileWriter.write(KeyValueDataFileWriter.java:52) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.io.RollingFileWriter.write(RollingFileWriter.java:82) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.mergetree.SortBufferWriteBuffer.forEach(SortBufferWriteBuffer.java:135) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.mergetree.MergeTreeWriter.flushWriteBuffer(MergeTreeWriter.java:199) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.mergetree.MergeTreeWriter.prepareCommit(MergeTreeWriter.java:229) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.operation.AbstractFileStoreWrite.prepareCommit(AbstractFileStoreWrite.java:178) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.table.sink.TableWriteImpl.prepareCommit(TableWriteImpl.java:157) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.flink.sink.StoreSinkWriteImpl.prepareCommit(StoreSinkWriteImpl.java:187) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.flink.sink.TableWriteOperator.prepareCommit(TableWriteOperator.java:118) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.flink.sink.RowDataStoreWriteOperator.prepareCommit(RowDataStoreWriteOperator.java:183) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.flink.sink.PrepareCommitOperator.emitCommittables(PrepareCommitOperator.java:110) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.flink.sink.PrepareCommitOperator.endInput(PrepareCommitOperator.java:98) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
Caused by: org.apache.paimon.shade.org.apache.parquet.io.ParquetEncodingException: empty fields are illegal, the field should be ommited completely instead
	at org.apache.paimon.shade.org.apache.parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.endField(MessageColumnIO.java:329) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.format.parquet.writer.ParquetRowDataWriter$ArrayWriter.write(ParquetRowDataWriter.java:475) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.format.parquet.writer.ParquetRowDataWriter$RowWriter.write(ParquetRowDataWriter.java:510) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.format.parquet.writer.ParquetRowDataWriter$RowWriter.write(ParquetRowDataWriter.java:520) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.format.parquet.writer.ParquetRowDataWriter$RowWriter.write(ParquetRowDataWriter.java:510) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.format.parquet.writer.ParquetRowDataWriter.write(ParquetRowDataWriter.java:73) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	at org.apache.paimon.format.parquet.writer.ParquetRowDataBuilder$ParquetWriteSupport.write(ParquetRowDataBuilder.java:75) ~[paimon-flink-1.16-0.5-20230801.002429-104.jar:0.5-SNAPSHOT]
	... 31 more

I tried using Hudi to write out the same record (RowData) in parquet format and it succeeded.
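
For reference, a minimal sketch of the RecordConsumer contract behind that exception, written against the unshaded Parquet API and the standard three-level list layout; the guard shown is illustrative and is not claimed to be Paimon's actual writer code or the eventual fix:

```java
import org.apache.parquet.io.api.RecordConsumer;

final class ArrayWriterSketch {

    // Writes one array-typed field as <name> (LIST) -> repeated group
    // "list" -> "element". The crucial detail: for an empty array the inner
    // repeated field must not be started at all. Calling startField(...)
    // and then endField(...) with no value in between is exactly what makes
    // MessageColumnIO throw "empty fields are illegal, the field should be
    // ommited completely instead" (the typo is in Parquet itself).
    static void writeArray(
            RecordConsumer consumer, String name, int index, long[] elements) {
        consumer.startField(name, index);
        consumer.startGroup();
        if (elements.length > 0) { // guard that avoids the empty-field error
            consumer.startField("list", 0);
            for (long e : elements) {
                consumer.startGroup();
                consumer.startField("element", 0);
                consumer.addLong(e);
                consumer.endField("element", 0);
                consumer.endGroup();
            }
            consumer.endField("list", 0);
        }
        consumer.endGroup();
        consumer.endField(name, index);
    }
}
```

The trace above shows ParquetRowDataWriter$ArrayWriter.write failing inside endField, which is consistent with an empty array reaching a writer that lacks such a guard.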

prm-xingcan (Author) commented

Hi @JingsongLi, is it possible to add nested map & array support in the next release?

prm-xingcan (Author) commented

This should have been fixed in 4009ede
