Merge branch 'master' into flink_hilbert
TaoZex authored Mar 7, 2024
2 parents 4d09b59 + ec5ebeb commit 2584871
Showing 159 changed files with 4,367 additions and 2,955 deletions.
2 changes: 1 addition & 1 deletion .asf.yaml
@@ -18,7 +18,7 @@
 # See: https://cwiki.apache.org/confluence/display/INFRA/git+-+.asf.yaml+features

 github:
-  description: "Apache Paimon(incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking and efficient real-time analytics."
+  description: "Apache Paimon(incubating) is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations."
   homepage: https://paimon.apache.org/
   labels:
     - paimon
7 changes: 6 additions & 1 deletion .github/workflows/utitcase-flink.yml
@@ -51,6 +51,11 @@ jobs:
 . .github/workflows/utils.sh
 jvm_timezone=$(random_timezone)
 echo "JVM timezone is set to $jvm_timezone"
-mvn -T 1C -B clean install -pl 'org.apache.paimon:paimon-flink-common' -Duser.timezone=$jvm_timezone
+test_modules=""
+for suffix in 1.14 1.15 1.16 1.17 1.18 common; do
+  test_modules+="org.apache.paimon:paimon-flink-${suffix},"
+done
+test_modules="${test_modules%,}"
+mvn -T 1C -B clean install -pl "${test_modules}" -Duser.timezone=$jvm_timezone
 env:
   MAVEN_OPTS: -Xmx4096m
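For reference, the new loop builds a single comma-separated `-pl` (project list) argument for Maven. A standalone bash sketch of the expansion, with the module coordinates copied from the diff above:

```shell
#!/usr/bin/env bash
# Reproduce the workflow's module-list construction.
test_modules=""
for suffix in 1.14 1.15 1.16 1.17 1.18 common; do
  test_modules+="org.apache.paimon:paimon-flink-${suffix},"
done
test_modules="${test_modules%,}"  # drop the trailing comma
echo "$test_modules"
# → org.apache.paimon:paimon-flink-1.14,org.apache.paimon:paimon-flink-1.15,org.apache.paimon:paimon-flink-1.16,org.apache.paimon:paimon-flink-1.17,org.apache.paimon:paimon-flink-1.18,org.apache.paimon:paimon-flink-common
```

The net effect of the change is that one CI step now builds and tests all Flink version modules instead of only `paimon-flink-common`.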
2 changes: 2 additions & 0 deletions LICENSE
@@ -237,6 +237,8 @@ paimon-service/paimon-service-client/src/main/java/org/apache/paimon/service/net
 paimon-service/paimon-service-client/src/main/java/org/apache/paimon/service/network/NetworkServer.java
 from http://flink.apache.org/ version 1.17.0

+paimon-common/src/main/java/org/apache/paimon/client/ClientPool.java
+paimon-core/src/main/java/org/apache/paimon/jdbc/JdbcCatalog.java
 paimon-core/src/main/java/org/apache/paimon/utils/ZOrderByteUtils.java
 paimon-core/src/test/java/org/apache/paimon/utils/TestZOrderByteUtil.java
 paimon-hive/paimon-hive-common/src/test/java/org/apache/paimon/hive/TestHiveMetastore.java
10 changes: 5 additions & 5 deletions README.md
@@ -3,12 +3,14 @@
 [![License](https://img.shields.io/badge/license-Apache%202-4EB1BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html)
 [![Get on Slack](https://img.shields.io/badge/slack-join-orange.svg)](https://the-asf.slack.com/archives/C053Q2NCW8G)

-Paimon is a streaming data lake platform that supports high-speed data ingestion, change data tracking and efficient real-time analytics.
+Apache Paimon(incubating) is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark
+for both streaming and batch operations. Paimon innovatively combines lake format and LSM structure, bringing realtime
+streaming updates into the lake architecture.

 Background and documentation are available at https://paimon.apache.org

-`Paimon`'s former name was `Flink Table Store`, developed from the Flink community. The architecture refers to some design concepts of Iceberg. Thanks to Apache Flink and Apache Iceberg.
+`Paimon`'s former name was `Flink Table Store`, developed from the Flink community. The architecture refers to some
+design concepts of Iceberg. Thanks to Apache Flink and Apache Iceberg.

 ## Collaboration

@@ -64,8 +66,6 @@ You can join the Paimon community on Slack. Paimon channel is in ASF Slack works
 - If you don't have an @apache.org email address, you can email to `[email protected]` to apply for an
   [ASF Slack invitation](https://infra.apache.org/slack.html). Then join [Paimon channel](https://the-asf.slack.com/archives/C053Q2NCW8G).

-Don't forget to introduce yourself in channel.
-
 ## Building

 JDK 8/11 is required for building the project.
19 changes: 13 additions & 6 deletions docs/content/_index.md
@@ -24,15 +24,22 @@ under the License.

 # Apache Paimon

-Apache Paimon(incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking and efficient real-time analytics.
+Apache Paimon(incubating) is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark
+for both streaming and batch operations. Paimon innovatively combines lake format and LSM (Log-structured merge-tree)
+structure, bringing realtime streaming updates into the lake architecture.

 Paimon offers the following core capabilities:

-- Unified Batch & Streaming: Paimon supports batch write and batch read, as well as streaming write changes and streaming read table changelogs.
-- Data Lake: As a data lake storage, Paimon has the following advantages: low cost, high reliability, and scalable metadata.
-- Merge Engines: Paimon supports rich Merge Engines. By default, the last entry of the primary key is reserved. You can also use the "partial-update" or "aggregation" engine.
-- Changelog producer: Paimon supports rich Changelog producers, such as "lookup" and "full-compaction". The correct changelog can simplify the construction of a streaming pipeline.
-- Append Only Tables: Paimon supports Append Only tables, automatically compact small files, and provides orderly stream reading. You can use this to replace message queues.
+- Realtime updates:
+  - Primary key table supports writing of large-scale updates, has very high update performance, typically through Flink Streaming.
+  - Support defining Merge Engines, update records however you like. Deduplicate to keep last row, or partial-update, or aggregate records, or first-row, you decide.
+  - Support defining changelog-producer, produce correct and complete changelog in updates for merge engines, simplifying your streaming analytics.
+- Huge Append Data Processing:
+  - Append table (no primary-key) provides large scale batch & streaming processing capability. Automatic Small File Merge.
+  - Supports Data Compaction with z-order sorting to optimize file layout, provides fast queries based on data skipping using indexes such as minmax.
+- Data Lake Capabilities:
+  - Scalable metadata: supports storing Petabyte large-scale datasets and storing a large number of partitions.
+  - Supports ACID Transactions & Time Travel & Schema Evolution.

 {{< columns >}}

46 changes: 31 additions & 15 deletions docs/content/concepts/basic-concepts.md
@@ -26,34 +26,50 @@ under the License.

 # Basic Concepts

+## File Layouts
+
+All files of a table are stored under one base directory. Paimon files are organized in a layered style. The following image illustrates the file layout. Starting from a snapshot file, Paimon readers can recursively access all records from the table.
+
+{{< img src="/img/file-layout.png">}}
+
 ## Snapshot

-A snapshot captures the state of a table at some point in time. Users can access the latest data of a table through the latest snapshot. By time traveling, users can also access the previous state of a table through an earlier snapshot.
+All snapshot files are stored in the `snapshot` directory.
+
+A snapshot file is a JSON file containing information about this snapshot, including
+
+* the schema file in use
+* the manifest list containing all changes of this snapshot
+
+A snapshot captures the state of a table at some point in time. Users can access the latest data of a table through the
+latest snapshot. By time traveling, users can also access the previous state of a table through an earlier snapshot.
+
+## Manifest Files
+
+All manifest lists and manifest files are stored in the `manifest` directory.
+
+A manifest list is a list of manifest file names.
+
+A manifest file is a file containing changes about LSM data files and changelog files. For example, which LSM data file is created and which file is deleted in the corresponding snapshot.
+
+## Data Files
+
+Data files are grouped by partitions. Currently, Paimon supports using orc (default), parquet and avro as data file's format.

 ## Partition

 Paimon adopts the same partitioning concept as Apache Hive to separate data.

 Partitioning is an optional way of dividing a table into related parts based on the values of particular columns like date, city, and department. Each table can have one or more partition keys to identify a particular partition.

-By partitioning, users can efficiently operate on a slice of records in the table. See [file layouts]({{< ref "concepts/file-layouts" >}}) for how files are divided into multiple partitions.
-
-{{< hint info >}}
-If you need cross partition upsert (primary keys not contain all partition fields), see [Cross partition Upsert]({{< ref "concepts/primary-key-table/data-distribution#cross-partitions-upsert-dynamic-bucket-mode">}}) mode.
-{{< /hint >}}
-
-## Bucket
-
-Unpartitioned tables, or partitions in partitioned tables, are sub-divided into buckets, to provide extra structure to the data that may be used for more efficient querying.
-
-The range for a bucket is determined by the hash value of one or more columns in the records. Users can specify bucketing columns by providing the [`bucket-key` option]({{< ref "maintenance/configurations#coreoptions" >}}). If no `bucket-key` option is specified, the primary key (if defined) or the complete record will be used as the bucket key.
-
-A bucket is the smallest storage unit for reads and writes, so the number of buckets limits the maximum processing parallelism. This number should not be too big, though, as it will result in lots of small files and low read performance. In general, the recommended data size in each bucket is about 200MB - 1GB.
-
-See [file layouts]({{< ref "concepts/file-layouts" >}}) for how files are divided into buckets. Also, see [rescale bucket]({{< ref "maintenance/rescale-bucket" >}}) if you want to adjust the number of buckets after a table is created.
+By partitioning, users can efficiently operate on a slice of records in the table.

 ## Consistency Guarantees

-Paimon writers use two-phase commit protocol to atomically commit a batch of records to the table. Each commit produces at most two [snapshots]({{< ref "concepts/basic-concepts#snapshot" >}}) at commit time.
+Paimon writers use two-phase commit protocol to atomically commit a batch of records to the table. Each commit produces
+at most two [snapshots]({{< ref "concepts/basic-concepts#snapshot" >}}) at commit time.

-For any two writers modifying a table at the same time, as long as they do not modify the same bucket, their commits can occur in parallel. If they modify the same bucket, only snapshot isolation is guaranteed. That is, the final table state may be a mix of the two commits, but no changes are lost.
+For any two writers modifying a table at the same time, as long as they do not modify the same partition, their commits
+can occur in parallel. If they modify the same partition, only snapshot isolation is guaranteed. That is, the final table
+state may be a mix of the two commits, but no changes are lost.
+See [dedicated compaction job]({{< ref "maintenance/dedicated-compaction#dedicated-compaction-job" >}}) for more info.
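The layered layout described in this page can be sketched as a directory tree. The following is an illustrative mock-up, not output generated by Paimon: the `snapshot` and `manifest` directory names come from the text above, while the partition/bucket paths, file names, and JSON field names are hypothetical.

```shell
#!/usr/bin/env bash
# Mock up a Paimon-style table directory: a snapshot file references a
# manifest list, manifest files record data-file changes, and data files
# sit under their partition directories.
base="$(mktemp -d)/my_table"
mkdir -p "$base/snapshot" "$base/manifest" "$base/dt=2024-03-07/bucket-0"
# Hypothetical snapshot JSON; the real field names may differ.
echo '{"id": 1, "schemaId": 0, "baseManifestList": "manifest-list-1"}' \
  > "$base/snapshot/snapshot-1"
touch "$base/manifest/manifest-list-1" "$base/manifest/manifest-1"
touch "$base/dt=2024-03-07/bucket-0/data-0.orc"
find "$base" -mindepth 1 | sed "s|$base/||" | sort
```

Reading top-down through such a tree mirrors how a reader resolves a snapshot to its manifest list, then to manifest files, then to data files.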
80 changes: 0 additions & 80 deletions docs/content/concepts/file-layouts.md

This file was deleted.

15 changes: 8 additions & 7 deletions docs/content/concepts/overview.md
@@ -26,9 +26,7 @@ under the License.

 # Overview

-Apache Paimon(incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking and efficient real-time analytics.
-
-## Architecture
+Apache Paimon(incubating)'s Architecture:

 {{< img src="/img/architecture.png">}}

@@ -39,14 +37,17 @@ As shown in the architecture above:
   - from historical snapshots (in batch mode),
   - from the latest offset (in streaming mode), or
   - reading incremental snapshots in a hybrid way.
-- For writes, it supports streaming synchronization from the changelog of databases (CDC) or batch
-  insert/overwrite from offline data.
+- For writes, it supports
+  - streaming synchronization from the changelog of databases (CDC)
+  - batch insert/overwrite from offline data.

 **Ecosystem:** In addition to Apache Flink, Paimon also supports read by other computation
 engines like Apache Hive, Apache Spark and Trino.

-**Internal:** Under the hood, Paimon stores the columnar files on the filesystem/object-store and uses
-the LSM tree structure to support a large volume of data updates and high-performance queries.
+**Internal:**
+- Under the hood, Paimon stores the columnar files on the filesystem/object-store
+- The metadata of the file is saved in the manifest file, providing large-scale storage and data skipping.
+- For primary key table, uses the LSM tree structure to support a large volume of data updates and high-performance queries.

 ## Unified Storage
@@ -52,7 +52,7 @@ will be very costly and should be avoided. (You can force removing "normalize" o

 ## Input

-By specifying `'changelog-producer' = 'input'`, Paimon writers rely on their inputs as a source of complete changelog. All input records will be saved in separated [changelog files]({{< ref "concepts/file-layouts" >}}) and will be given to the consumers by Paimon sources.
+By specifying `'changelog-producer' = 'input'`, Paimon writers rely on their inputs as a source of complete changelog. All input records will be saved in separated changelog files and will be given to the consumers by Paimon sources.

 `input` changelog producer can be used when Paimon writers' inputs are complete changelog, such as from a database CDC, or generated by Flink stateful computation.
@@ -31,7 +31,7 @@ By default, Paimon table only has one bucket, which means it only provides singl
 Please configure the bucket strategy to your table.
 {{< /hint >}}

-A bucket is the smallest storage unit for reads and writes, each bucket directory contains an [LSM tree]({{< ref "concepts/file-layouts#lsm-trees" >}}).
+A bucket is the smallest storage unit for reads and writes, each bucket directory contains an [LSM tree]({{< ref "concepts/primary-key-table/overview#lsm-trees" >}}).

 ## Fixed Bucket
Expand Down
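The idea that a record's bucket is derived from a hash of its bucket key can be sketched as follows. This is a toy illustration: `cksum` stands in for Paimon's actual hash function, and `num_buckets=4` is an arbitrary choice, so only the shape of the idea matches Paimon.

```shell
#!/usr/bin/env bash
# Toy bucket assignment: hash the bucket key, then take it modulo the
# bucket count. Paimon's real hash function differs.
num_buckets=4
bucket_of() {
  local h
  h=$(printf '%s' "$1" | cksum | cut -d' ' -f1)  # stand-in hash
  echo $(( h % num_buckets ))
}
bucket_of "user_42"   # the same key always lands in the same bucket
```

Because the assignment is deterministic, all records with the same bucket key end up in the same bucket directory, which is what lets each bucket hold a self-contained LSM tree.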