Commit

[doc] Add changelog merging into changelog-producer
JingsongLi committed Nov 27, 2024
1 parent a67bab1 commit 7a39013
Showing 2 changed files with 11 additions and 9 deletions.
9 changes: 0 additions & 9 deletions docs/content/maintenance/write-performance.md
@@ -160,12 +160,3 @@ You can use fine-grained-resource-management of Flink to increase committer heap
1. Configure Flink Configuration `cluster.fine-grained-resource-management.enabled: true`. (This is default after Flink 1.18)
2. Configure Paimon Table Options: `sink.committer-memory`, for example 300 MB, depends on your `TaskManager`.
(`sink.committer-cpu` is also supported)

## Changelog Compaction

If Flink's checkpoint interval is short (for example, 30 seconds) and the number of buckets is large,
each snapshot may produce lots of small changelog files.
Too many files may put a burden on the distributed storage cluster.

In order to compact small changelog files into large ones, you can set the table option `changelog.precommit-compact = true`.
Default value of this option is false, if true, it will add a compact coordinator and worker operator after the writer operator, which copies changelog files into large ones.
11 changes: 11 additions & 0 deletions docs/content/primary-key-table/changelog-producer.md
@@ -130,3 +130,14 @@ efficient as the input changelog producer and the latency to produce changelog m

Full-compaction changelog-producer supports `changelog-producer.row-deduplicate` to avoid generating -U, +U
changelog for the same record.

## Changelog Merging

Applies to the `input`, `lookup`, and `full-compaction` values of `changelog-producer`.

If Flink's checkpoint interval is short (for example, 30 seconds) and the number of buckets is large, each snapshot may
produce lots of small changelog files. Too many files may put a burden on the distributed storage cluster.

To compact small changelog files into larger ones, set the table option `changelog.precommit-compact = true`.
This option defaults to `false`. When it is `true`, a compact coordinator and a worker operator are added after the
writer operator, and they copy the small changelog files into larger ones.
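
To make the small-file concern above concrete, here is a rough back-of-the-envelope sketch. It assumes one snapshot per checkpoint and one changelog file per active bucket per snapshot; both are simplifying assumptions, and the bucket count of 128 is a hypothetical value, not a figure from Paimon.

```python
SECONDS_PER_DAY = 86_400


def changelog_files_per_day(checkpoint_interval_s: int,
                            active_buckets: int,
                            files_per_bucket: int = 1) -> int:
    """Estimate changelog files written per day without precommit compaction.

    Assumes (for illustration only) that every checkpoint creates one
    snapshot and every active bucket writes `files_per_bucket` changelog
    files in each snapshot.
    """
    if checkpoint_interval_s <= 0:
        raise ValueError("checkpoint_interval_s must be positive")
    snapshots_per_day = SECONDS_PER_DAY // checkpoint_interval_s
    return snapshots_per_day * active_buckets * files_per_bucket


# 30-second checkpoints with a hypothetical 128 active buckets.
print(changelog_files_per_day(30, 128))  # 368640
```

Under these assumptions the file count grows linearly with both the bucket count and the checkpoint frequency, which is why the option matters most for short intervals and many buckets.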
