
[core] Add auto increase sequence padding #2512

Closed
wants to merge 1 commit into from

Conversation


@Zouxxyy Zouxxyy (Contributor) commented Dec 15, 2023

Purpose

Linked issue: close #2471

Add a new sequence padding mode, inc-seq, which pads the sequence field with an auto-increase sequence, to solve the following problems:

When the user defines sequence.field:

  1. For records with the same sequence value (for example, when sequence.field is a second-precision timestamp and multiple records arrive within the same second), their order cannot be determined.
  2. When performing row-level changes, sequence.field must stay the same, so the order of the changes cannot be guaranteed.

A better solution would be to introduce a new comparison field. This PR is a trade-off: it reuses the _SEQUENCE_NUMBER and divides it into two parts. The upper 32 bits store the auto-increment sequence, and the lower 32 bits store the user-defined sequence.field (converted to a long).

Notice !!!

  1. Since only 32 bits store the user sequence, it is truncated to 32 bits. This loses accuracy: for timestamps, only second precision is supported, up to 2106-02-07T06:28:15.
  2. The auto-increment sequence is limited to 32 bits.
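The split described above can be sketched as follows. This is a minimal illustration of the layout, not the PR's actual code; `pack`, `incPart`, and `userPart` are illustrative names:

```java
// Sketch of the _SEQUENCE_NUMBER layout described above:
// upper 32 bits = auto-increment sequence, lower 32 bits = user sequence.
class SequencePadding {
    static final long USER_SEQ_MASK = 0xFFFFFFFFL; // lower 32 bits

    // Pack the two parts; the user value is truncated to 32 bits,
    // which is where the precision loss described in the notice comes from.
    static long pack(long incSeq, long userSeq) {
        return (incSeq << 32) | (userSeq & USER_SEQ_MASK);
    }

    static long incPart(long packed) {
        return packed >>> 32; // unsigned shift recovers the counter
    }

    static long userPart(long packed) {
        return packed & USER_SEQ_MASK;
    }

    public static void main(String[] args) {
        long packed = pack(7L, 1_700_000_000L); // epoch seconds fit in 32 bits
        System.out.println(incPart(packed));  // 7
        System.out.println(userPart(packed)); // 1700000000
    }
}
```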

Tests

API and Format

Documentation

@Zouxxyy Zouxxyy marked this pull request as draft December 15, 2023 01:38
@Zouxxyy Zouxxyy marked this pull request as ready for review December 15, 2023 06:26
@@ -459,7 +459,9 @@ public class CoreOptions implements Serializable {
         text(
                 "3. \"millis-to-micro\": Pads the sequence field that indicates time with precision of milli-second to micro-second."),
         text(
-                "4. Composite pattern: for example, \"second-to-micro,row-kind-flag\"."))
+                "4. \"inc-seq\": Pads the sequence field with auto increase sequence."),
Contributor:

So for inc-seq, must int and long sequence fields be in seconds? You should document this.

-                "4. Composite pattern: for example, \"second-to-micro,row-kind-flag\"."))
+                "4. \"inc-seq\": Pads the sequence field with auto increase sequence."),
+        text(
+                "5. Composite pattern: for example, \"second-to-micro,row-kind-flag\"."))
Contributor:

Is inc-seq,row-kind-flag supported? I think this may be the most popular composite pattern.

Contributor (Author):

inc-seq + row-kind-flag would further reduce the accuracy. Could there be a situation where +U comes first and then -U comes within the same second?

Contributor:

It should throw an exception...

if (maxSequenceNumber == -1) {
    // Fresh state: start the auto-increment sequence from zero.
    this.nextIncSequenceNumber = 0;
} else if (incSeqPadding) {
    // Keep the auto-increment part (upper 32 bits) and advance it by one step.
    this.nextIncSequenceNumber = (maxSequenceNumber & INC_SEQ_MASK) + INC;
}
Contributor:

We should merge this into SequenceGenerator; some refactoring is needed.


spark.sql("DELETE FROM T WHERE id = 1")

val rows1 = spark.sql("SELECT * FROM T").collectAsList()
Contributor:

Please use checkDataset or checkAnswer instead of collectAsList + assertThat.

+                "4. \"inc-seq\": Pads the sequence field with auto increase sequence."),
Contributor:

seconds-and-inc

@hekaifei (Contributor):

@Zouxxyy If the amount of data is relatively large, say 5 billion records, and the auto-increment is only 32 bits, what happens when it is not enough?

@Zouxxyy (Contributor, Author) commented Dec 29, 2023:

> @Zouxxyy If the amount of data is relatively large, say 5 billion records, and the auto-increment is only 32 bits, what happens when it is not enough?

@hekaifei The system will crash; the current design only supports 2,147,483,647 records per bucket. Are your 5 billion records in one bucket or in total?
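Rough arithmetic on that per-bucket limit; the write rate below is an assumed number for illustration, not from the PR:

```java
// Back-of-envelope lifetime of a 32-bit per-bucket counter at a steady
// write rate, using the 2,147,483,647 limit stated above.
class CounterLifetime {
    // Days until the counter is exhausted at ratePerSec writes per second.
    static long days(long limit, long ratePerSec) {
        return limit / ratePerSec / 86_400; // 86,400 seconds per day
    }

    public static void main(String[] args) {
        long limit = Integer.MAX_VALUE; // 2,147,483,647 records per bucket
        System.out.println(days(limit, 1_000L) + " days"); // prints "24 days"
    }
}
```

At a sustained 1,000 writes per second into one bucket, the counter lasts under a month, which illustrates why a long-running, frequently updated job is a concern.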

@hekaifei (Contributor):

> @Zouxxyy If the amount of data is relatively large, say 5 billion records, and the auto-increment is only 32 bits, what happens when it is not enough?

> @hekaifei The system will crash; the current design only supports 2,147,483,647 records per bucket. Are your 5 billion records in one bucket or in total?

@Zouxxyy A few million per bucket right now. My worry is that with a long-running job and frequently updated data, 32 bits will not be enough.

@hekaifei (Contributor):

@Zouxxyy There are several tables with more than 5 billion records, divided into 1,000 to 2,000 buckets.

@hekaifei (Contributor):

@Zouxxyy I think userSeq should be the upper 32 bits. Otherwise, when the same key is inserted later with a smaller userSeq, the earlier data will be overwritten, which is not expected.

@Zouxxyy (Contributor, Author) commented Jan 2, 2024:

@hekaifei

> A few million per bucket right now. My worry is that with a long-running job and frequently updated data, 32 bits will not be enough.

Yes, frequent modifications will also cause this solution to break down.

> Otherwise, when the same key is inserted later with a smaller userSeq, the earlier data will be overwritten.

When you want row-level updates, you must use 'merge-engine' = 'deduplicate', which means the larger userSeq overrides the smaller one.

@hekaifei (Contributor) commented Jan 3, 2024:

@Zouxxyy

> When you want row-level updates, you must use 'merge-engine' = 'deduplicate', which means the larger userSeq overrides the smaller one.

In the inc-seq solution, the upper 32 bits are the auto-increasing sequence, which can cause a smaller userSeq to override a larger one.
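This point can be checked with quick arithmetic under the layout proposed in this PR (a minimal sketch; the `pack` helper is illustrative, not the PR's code):

```java
// With the auto-increment counter in the upper 32 bits, the counter
// dominates comparisons: a later write with a smaller user sequence still
// gets the larger packed value, so 'deduplicate' keeps the later row.
class IncSeqOrdering {
    static long pack(long incSeq, long userSeq) {
        return (incSeq << 32) | (userSeq & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        long first  = pack(1L, 200L); // earlier write, larger userSeq
        long second = pack(2L, 100L); // later write, smaller userSeq
        System.out.println(second > first); // prints "true"
    }
}
```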

@JingsongLi (Contributor):

The final solution should be #2811.

I will close this PR; it was a temporary solution.

@JingsongLi JingsongLi closed this Feb 23, 2024
Successfully merging this pull request may close these issues.

[Bug] spark deletes dynamic bucket table, the result is incorrect