[core] Introduce data-file.thin-mode in primary key table write #4666
paimon-core/src/main/java/org/apache/paimon/io/KeyValueDataFileWriter.java
paimon-core/src/main/java/org/apache/paimon/mergetree/MergeTreeWriter.java
paimon-core/src/test/java/org/apache/paimon/table/PrimaryKeyFileStoreTableTest.java
paimon-core/src/main/java/org/apache/paimon/io/KeyValueThinDataFileWriterImpl.java
        return toRow(record.sequenceNumber(), record.valueKind(), record.value());
    }

    public InternalRow toRow(long sequenceNumber, RowKind valueKind, InternalRow value) {
Suggest adding a comment here: record.key() is not written into the row.
Already commented before the class definition.
paimon-core/src/main/java/org/apache/paimon/io/KeyValueThinDataFileWriterImpl.java
paimon-core/src/main/java/org/apache/paimon/utils/StatsCollectorFactories.java
+1 cc @tsreaper to take a look
        || (options.dataFileThinMode()
                && keyNames.contains(SpecialFields.KEY_FIELD_PREFIX + field))) {
Add a comment explaining why we need this condition.
        Options options = new Options();
        options.set(CoreOptions.DATA_FILE_THIN_MODE, false);
The default value of CoreOptions.DATA_FILE_THIN_MODE is false, so why do we need this change? Even if the default value changes to true, will it break the test?
        } else if (SpecialFields.isSystemField(field)) {
        } else if (SpecialFields.isSystemField(field)
                ||
                // If we config METADATA_STATS_MODE to true, we need to maintain the
METADATA_STATS_MODE?
@@ -192,6 +192,8 @@ public void testRewriteSuccess(boolean rewriteChangelog) throws Exception {

    private KeyValueFileWriterFactory createWriterFactory(
            Path path, RowType keyType, RowType valueType) {
        Options options = new Options();
We don't need this change now.
@@ -143,6 +143,7 @@ private void recreateMergeTree(long targetFileSize) {
        options.set(
                CoreOptions.NUM_SORTED_RUNS_STOP_TRIGGER,
                options.get(CoreOptions.NUM_SORTED_RUNS_COMPACTION_TRIGGER) + 1);
        options.set(CoreOptions.DATA_FILE_THIN_MODE, false);
Why do we need this change?
Purpose
When writing a primary key table, we currently generate redundant columns.
For example:
A table with columns a, b, c, d, e, f, g and primary key a, b.
In the data file, _KEY_a and _KEY_b are exactly equal to a and b.
Therefore, we introduce data-file.thin-mode here.
Note: if thin-mode is on, the files it writes cannot be read by earlier Paimon SDK versions.
Linked issue: #4651
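
To make the column saving concrete, here is a minimal standalone sketch of the two file layouts for the example table. It does not use the Paimon classes touched by this PR. The class name ThinModeSchemaSketch and its methods are hypothetical. The _KEY_ prefix comes from the description above. The system column names _SEQUENCE_NUMBER and _VALUE_KIND, and the column ordering, are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of the Paimon code base. */
public class ThinModeSchemaSketch {

    // Prefix taken from the _KEY_a / _KEY_b names in the PR description.
    static final String KEY_FIELD_PREFIX = "_KEY_";

    // Assumed names of the system columns written with every record.
    static final String SEQUENCE_NUMBER = "_SEQUENCE_NUMBER";
    static final String VALUE_KIND = "_VALUE_KIND";

    /** Layout without thin mode: key copies first, then system and value columns. */
    public static List<String> fullSchema(List<String> primaryKeys, List<String> valueFields) {
        List<String> columns = new ArrayList<>();
        for (String key : primaryKeys) {
            columns.add(KEY_FIELD_PREFIX + key);
        }
        columns.addAll(thinSchema(valueFields));
        return columns;
    }

    /** Layout with thin mode: the key copies are dropped, the value columns already hold them. */
    public static List<String> thinSchema(List<String> valueFields) {
        List<String> columns = new ArrayList<>();
        columns.add(SEQUENCE_NUMBER);
        columns.add(VALUE_KIND);
        columns.addAll(valueFields);
        return columns;
    }

    public static void main(String[] args) {
        List<String> primaryKeys = List.of("a", "b");
        List<String> valueFields = List.of("a", "b", "c", "d", "e", "f", "g");
        System.out.println("full: " + fullSchema(primaryKeys, valueFields));
        System.out.println("thin: " + thinSchema(valueFields));
    }
}
```

Under these assumptions the thin layout saves one column per primary key field. A reader that expects the _KEY_ columns to be present cannot decode a thin file, which is consistent with the compatibility note above.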