From 37bf58be1e9b61030e08b9b3801450ab8508174e Mon Sep 17 00:00:00 2001
From: Jingsong
Date: Wed, 7 Aug 2024 15:10:01 +0800
Subject: [PATCH] [doc] Introduce Full Compaction in compaction doc

---
 docs/content/primary-key-table/compaction.md | 13 +++++++++++++
 docs/content/primary-key-table/table-mode.md |  5 +----
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/docs/content/primary-key-table/compaction.md b/docs/content/primary-key-table/compaction.md
index e5040a03aadd..0ba607bca049 100644
--- a/docs/content/primary-key-table/compaction.md
+++ b/docs/content/primary-key-table/compaction.md
@@ -79,6 +79,19 @@ In compaction, you can configure record-Level expire time to expire records, you
 
 Expiration happens in compaction, and there is no strong guarantee to expire records in time.
 
+## Full Compaction
+
+Paimon compaction uses [Universal-Compaction](https://github.com/facebook/rocksdb/wiki/Universal-Compaction).
+By default, Full Compaction is performed automatically when too much incremental data accumulates, so you usually
+don't have to worry about it.
+
+Paimon also provides options for running Full Compaction regularly:
+
+1. 'compaction.optimization-interval': How often to perform an optimization full compaction. This option is used
+   to ensure the query timeliness of the read-optimized system table.
+2. 'full-compaction.delta-commits': Full compaction is triggered repeatedly after the configured number of delta
+   commits. Its disadvantage is that it can only compact synchronously, which affects write efficiency.
+
 ## Compaction Options
 
 ### Number of Sorted Runs to Pause Writing
diff --git a/docs/content/primary-key-table/table-mode.md b/docs/content/primary-key-table/table-mode.md
index a96fe7790d04..d7bc2efb9109 100644
--- a/docs/content/primary-key-table/table-mode.md
+++ b/docs/content/primary-key-table/table-mode.md
@@ -109,10 +109,7 @@ So by default, compaction is synchronous, and if asynchronous is turned on, ther
 
 If you don't want to use Deletion Vectors mode, you want to query fast enough in MOR mode, but can only find older data, you can also:
 
-1. Configure 'compaction.optimization-interval' when writing data. For streaming jobs, optimized compaction will then
-   be performed periodically; For batch jobs, optimized compaction will be carried out when the job ends. (Or configure
-   `'full-compaction.delta-commits'`, its disadvantage is that it can only perform compaction synchronously, which will
-   affect writing efficiency)
+1. Configure 'compaction.optimization-interval' when writing data.
 2. Query from [read-optimized system table]({{< ref "maintenance/system-tables#read-optimized-table" >}}).
    Reading from results of optimized files avoids merging records with the same key, thus improving reading
    performance.
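
For reviewers who want to try the two options named in this patch, here is a minimal sketch of how they might be set as table options. The patch does not supply any of the following: the table name `my_table`, the values `1 h` and `10`, and the helper `render_set_options` are all hypothetical. The helper only renders a Flink SQL-style `ALTER TABLE ... SET` statement as a string and does not call any Paimon API.

```python
# Illustrative sketch only: the table name and option values are assumptions,
# not taken from the patch. The helper just builds a SQL string.

def render_set_options(table, options):
    """Render a Flink SQL-style ALTER TABLE ... SET statement for table options."""
    pairs = ",\n".join(
        f"    '{key}' = '{value}'" for key, value in options.items()
    )
    return f"ALTER TABLE {table} SET (\n{pairs}\n);"


# Option 1: run an optimization full compaction at a regular interval.
periodic = {"compaction.optimization-interval": "1 h"}  # hypothetical interval

# Option 2: trigger full compaction by delta commits. Per the doc text,
# this compacts synchronously and therefore affects write efficiency.
by_commits = {"full-compaction.delta-commits": "10"}  # hypothetical count

statement = render_set_options("my_table", periodic)
print(statement)
```

Whether the chosen values suit a given workload depends on the job. The trade-off described in the doc text, periodic optimization versus synchronous compaction driven by commits, is the main thing to weigh.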