[core] Rename paimon: Table Store to Paimon
JingsongLi committed Mar 18, 2023
1 parent 0524726 commit 8856b61
Showing 64 changed files with 173 additions and 172 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -1,8 +1,8 @@
# Flink Table Store
# Paimon

Flink Table Store is a data lake storage for streaming updates/deletes changelog ingestion and high-performance queries in real time.
Paimon is a data lake storage for streaming updates/deletes changelog ingestion and high-performance queries in real time.

Flink Table Store is developed under the umbrella of [Apache Flink](https://flink.apache.org/).
Paimon is developed under the umbrella of [Apache Flink](https://flink.apache.org/).

## Documentation & Getting Started

6 changes: 3 additions & 3 deletions docs/README.md
@@ -1,7 +1,7 @@
This README gives an overview of how to build and contribute to the
documentation of Flink Table Store.
documentation of Paimon.

The documentation is included with the source of Flink Table Store in order to ensure
The documentation is included with the source of Paimon in order to ensure
that you always have docs corresponding to your checked-out version.

# Requirements
@@ -85,7 +85,7 @@ the page:

### ShortCodes

Flink Table Store uses [shortcodes](https://gohugo.io/content-management/shortcodes/) to add
Paimon uses [shortcodes](https://gohugo.io/content-management/shortcodes/) to add
custom functionality to its documentation markdown.

Its implementation and documentation can be found at
16 changes: 8 additions & 8 deletions docs/content/_index.md
@@ -1,5 +1,5 @@
---
title: Apache Flink Table Store
title: Apache Paimon
type: docs
bookToc: false
---
@@ -22,26 +22,26 @@ specific language governing permissions and limitations
under the License.
-->

# Apache Flink Table Store
# Apache Paimon

Flink Table Store is a unified storage to build dynamic tables for both streaming and
Paimon is a unified storage to build dynamic tables for both streaming and
batch processing in Flink, supporting high-speed data ingestion and timely data query.
Table Store offers the following core capabilities:
Paimon offers the following core capabilities:
- Support storage of large datasets and allow read/write in both batch and streaming mode.
- Support streaming queries with minimum latency down to milliseconds.
- Support Batch/OLAP queries with minimum latency down to the second level.
- Support incremental snapshots for stream consumption by default, so users do not need to combine different pipelines by themselves.

{{< columns >}}
## Try Table Store
## Try Paimon

If you’re interested in playing around with Flink Table Store, check out our
If you’re interested in playing around with Paimon, check out our
quick start guide with [Flink]({{< ref "engines/flink" >}}), [Spark]({{< ref "engines/spark3" >}}) or [Hive]({{< ref "engines/hive" >}}). It provides a step by
step introduction to the APIs and guides you through real applications.

<--->

## Get Help with Table Store
## Get Help with Paimon

If you get stuck, check out our [community support
resources](https://flink.apache.org/community.html). In particular, Apache
@@ -50,5 +50,5 @@ any Apache project, and is a great way to get help quickly.

{{< /columns >}}

Flink Table Store is developed under the umbrella of
Paimon is developed under the umbrella of
[Apache Flink](https://flink.apache.org/).
4 changes: 2 additions & 2 deletions docs/content/concepts/basic-concepts.md
@@ -32,7 +32,7 @@ A snapshot captures the state of a table at some point in time. Users can access

## Partition

Table Store adopts the same partitioning concept as Apache Hive to separate data.
Paimon adopts the same partitioning concept as Apache Hive to separate data.

Partitioning is an optional way of dividing a table into related parts based on the values of particular columns like date, city, and department. Each table can have one or more partition keys to identify a particular partition.
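
A minimal Flink SQL sketch of a partitioned table, using the standard `PARTITIONED BY` clause; the table and column names are hypothetical:

```sql
-- Hypothetical sales table partitioned by date and city.
-- Queries that filter on the partition columns only need to read the matching partitions.
CREATE TABLE sales (
    order_id BIGINT,
    amount   DECIMAL(10, 2),
    dt       STRING,
    city     STRING
) PARTITIONED BY (dt, city);
```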

@@ -56,6 +56,6 @@ See [file layouts]({{< ref "concepts/file-layouts" >}}) for how files are divide

## Consistency Guarantees

Table Store writers uses two-phase commit protocol to atomically commit a batch of records to the table. Each commit produces at most two [snapshots]({{< ref "concepts/basic-concepts#snapshot" >}}) at commit time.
Paimon writers use a two-phase commit protocol to atomically commit a batch of records to the table. Each commit produces at most two [snapshots]({{< ref "concepts/basic-concepts#snapshot" >}}) at commit time.

For any two writers modifying a table at the same time, as long as they do not modify the same bucket, their commits are serializable. If they modify the same bucket, only snapshot isolation is guaranteed. That is, the final table state may be a mix of the two commits, but no changes are lost.
4 changes: 2 additions & 2 deletions docs/content/concepts/external-log-systems.md
@@ -26,7 +26,7 @@ under the License.

# External Log Systems

Aside from [underlying table files]({{< ref "concepts/primary-key-table#changelog-producers" >}}), changelog of Table Store can also be stored into or consumed from an external log system, such as Kafka. By specifying `log.system` table property, users can choose which external log system to use.
Aside from [underlying table files]({{< ref "concepts/primary-key-table#changelog-producers" >}}), changelog of Paimon can also be stored into or consumed from an external log system, such as Kafka. By specifying `log.system` table property, users can choose which external log system to use.

If an external log system is used, all records written into table files will also be written into the log system. Changes produced by the streaming queries will thus come from the log system instead of table files.

@@ -36,7 +36,7 @@ By default, changes in the log systems are visible to consumers only after a sna

However, users can also specify the table property `'log.consistency' = 'eventual'` so that changelog written into the log system can be immediately consumed by the consumers, without waiting for the next snapshot. This behavior decreases the latency of changelog, but it can only guarantee the at-least-once semantics (that is, consumers might see duplicated records) due to possible failures.

If `'log.consistency' = 'eventual'` is set, in order to achieve correct results, Table Store source in Flink will automatically adds a "normalize" operator for deduplication. This operator persists the values of each key in states. As one can easily tell, this operator will be very costly and should be avoided.
If `'log.consistency' = 'eventual'` is set, in order to achieve correct results, the Paimon source in Flink will automatically add a "normalize" operator for deduplication. This operator persists the value of each key in state. As one can easily tell, this operator will be very costly and should be avoided.
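
For illustration, a hedged Flink SQL sketch of a table backed by Kafka as the external log system. The `log.system` and `log.consistency` properties come from this page; the table definition and the `kafka.*` option names are assumptions made for the example:

```sql
CREATE TABLE orders (
    order_id BIGINT,
    status   STRING,
    PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
    'log.system' = 'kafka',                      -- also write the changelog to an external log system
    'kafka.bootstrap.servers' = 'broker1:9092',  -- assumed option name for the Kafka endpoint
    'kafka.topic' = 'orders_log',                -- assumed option name for the target topic
    'log.consistency' = 'eventual'               -- changelog visible before the next snapshot (at-least-once)
);
```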

## Supported Log Systems

8 changes: 4 additions & 4 deletions docs/content/concepts/file-layouts.md
@@ -26,7 +26,7 @@ under the License.

# File Layouts

All files of a table are stored under one base directory. Table Store files are organized in a layered style. The following image illustrates the file layout. Starting from a snapshot file, Table Store readers can recursively access all records from the table.
All files of a table are stored under one base directory. Paimon files are organized in a layered style. The following image illustrates the file layout. Starting from a snapshot file, Paimon readers can recursively access all records from the table.

{{< img src="/img/file-layout.png">}}

@@ -53,7 +53,7 @@ Data files are grouped by partitions and buckets. Each bucket directory contains

## LSM Trees

Table Store adapts the LSM tree (log-structured merge-tree) as the data structure for file storage. This documentation briefly introduces the concepts about LSM trees.
Paimon adopts the LSM tree (log-structured merge-tree) as the data structure for file storage. This documentation briefly introduces the concepts behind LSM trees.

### Sorted Runs

@@ -73,6 +73,6 @@ When more and more records are written into the LSM tree, the number of sorted r

To limit the number of sorted runs, we have to merge several sorted runs into one big sorted run once in a while. This procedure is called compaction.

However, compaction is a resource intensive procedure which consumes a certain amount of CPU time and disk IO, so too frequent compaction may in turn result in slower writes. It is a trade-off between query and write performance. Table Store currently adapts a compaction strategy similar to Rocksdb's [universal compaction](https://github.com/facebook/rocksdb/wiki/Universal-Compaction).
However, compaction is a resource-intensive procedure which consumes a certain amount of CPU time and disk IO, so overly frequent compaction may in turn result in slower writes. It is a trade-off between query and write performance. Paimon currently adopts a compaction strategy similar to RocksDB's [universal compaction](https://github.com/facebook/rocksdb/wiki/Universal-Compaction).

By default, when Table Store writers append records to the LSM tree, they'll also perform compactions as needed. Users can also choose to perform all compactions in a dedicated compaction job. See [dedicated compaction job]({{< ref "maintenance/write-performance#dedicated-compaction-job" >}}) for more info.
By default, when Paimon writers append records to the LSM tree, they'll also perform compactions as needed. Users can also choose to perform all compactions in a dedicated compaction job. See [dedicated compaction job]({{< ref "maintenance/write-performance#dedicated-compaction-job" >}}) for more info.
10 changes: 5 additions & 5 deletions docs/content/concepts/overview.md
@@ -26,7 +26,7 @@ under the License.

# Overview

Flink Table Store is a unified storage to build dynamic tables for both streaming and
Paimon is a unified storage to build dynamic tables for both streaming and
batch processing in Flink, supporting high-speed data ingestion and timely data query.

## Architecture
@@ -35,18 +35,18 @@ batch processing in Flink, supporting high-speed data ingestion and timely data

As shown in the architecture above:

**Read/Write:** Table Store supports a versatile way to read/write data and perform OLAP queries.
**Read/Write:** Paimon supports a versatile way to read/write data and perform OLAP queries.
- For reads, it supports consuming data
- from historical snapshots (in batch mode),
- from the latest offset (in streaming mode), or
- reading incremental snapshots in a hybrid way.
- For writes, it supports streaming synchronization from the changelog of databases (CDC) or batch
insert/overwrite from offline data.

**Ecosystem:** In addition to Apache Flink, Table Store also supports read by other computation
**Ecosystem:** In addition to Apache Flink, Paimon also supports reads by other computation
engines like Apache Hive, Apache Spark and Trino.

**Internal:** Under the hood, Table Store uses a hybrid storage architecture with a lake format to store
**Internal:** Under the hood, Paimon uses a hybrid storage architecture with a lake format to store
historical data and a queue system to store incremental data. The former stores the columnar files on
the filesystem/object-store and uses the LSM tree structure to support a large volume of data updates
and high-performance queries. The latter uses Apache Kafka to capture data in real-time.
@@ -62,7 +62,7 @@ There are three types of connectors in Flink SQL.
- Batch storage, such as Apache Hive, which supports various operations
of traditional batch processing, including `INSERT OVERWRITE`.

Flink Table Store provides table abstraction. It is used in a way that
Paimon provides table abstraction. It is used in a way that
does not differ from the traditional database:
- In Flink `batch` execution mode, it acts like a Hive table and
supports various operations of Batch SQL. Query it to see the
18 changes: 9 additions & 9 deletions docs/content/concepts/primary-key-table.md
@@ -28,13 +28,13 @@ under the License.

Changelog table is the default table type when creating a table. Users can insert, update or delete records in the table.

Primary keys are a set of columns that are unique for each record. Table Store imposes an ordering of data, which means the system will sort the primary key within each bucket. Using this feature, users can achieve high performance by adding filter conditions on the primary key.
Primary keys are a set of columns that are unique for each record. Paimon imposes an ordering of data, which means the system will sort the primary key within each bucket. Using this feature, users can achieve high performance by adding filter conditions on the primary key.

By [defining primary keys]({{< ref "how-to/creating-tables#tables-with-primary-keys" >}}) on a changelog table, users can access the following features.
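
As a hedged sketch, a changelog table with a primary key might be declared as follows in Flink SQL; the table, columns, and the `'bucket'` option are assumptions for illustration, while the primary-key syntax follows the creating-tables guide linked above:

```sql
CREATE TABLE users (
    user_id BIGINT,
    name    STRING,
    dt      STRING,
    PRIMARY KEY (dt, user_id) NOT ENFORCED  -- records are sorted by the primary key within each bucket
) PARTITIONED BY (dt)
WITH (
    'bucket' = '4'  -- assumed bucket count; each bucket keeps its data sorted by the key
);

-- Filters on the primary key can skip most of the data, giving fast point lookups.
SELECT * FROM users WHERE dt = '2023-03-18' AND user_id = 42;
```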

## Merge Engines

When Table Store sink receives two or more records with the same primary keys, it will merge them into one record to keep primary keys unique. By specifying the `merge-engine` table property, users can choose how records are merged together.
When Paimon sink receives two or more records with the same primary keys, it will merge them into one record to keep primary keys unique. By specifying the `merge-engine` table property, users can choose how records are merged together.

{{< hint info >}}
Always set `table.exec.sink.upsert-materialize` to `NONE` in Flink SQL TableConfig; sink upsert-materialize may
@@ -44,15 +44,15 @@ result in strange behavior. When the input is out of order, we recommend that yo

### Deduplicate

`deduplicate` merge engine is the default merge engine. Table Store will only keep the latest record and throw away other records with the same primary keys.
`deduplicate` merge engine is the default merge engine. Paimon will only keep the latest record and throw away other records with the same primary keys.

Specifically, if the latest record is a `DELETE` record, all records with the same primary keys will be deleted.
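
A hedged example of this default behavior; the table and values are hypothetical, and declaring `'merge-engine' = 'deduplicate'` explicitly is optional since it is the default:

```sql
CREATE TABLE dedup_demo (
    id INT,
    v  STRING,
    PRIMARY KEY (id) NOT ENFORCED
) WITH (
    'merge-engine' = 'deduplicate'
);

-- Two records share the same primary key: only the latest one, (1, 'updated'), is kept.
INSERT INTO dedup_demo VALUES (1, 'initial');
INSERT INTO dedup_demo VALUES (1, 'updated');
```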

### Partial Update

By specifying `'merge-engine' = 'partial-update'`, users can set columns of a record across multiple updates and finally get a complete record. Specifically, value fields are updated to the latest data one by one under the same primary key, but null values are not overwritten.

For example, let's say Table Store receives three records:
For example, let's say Paimon receives three records:
- `<1, 23.0, 10, NULL>`
- `<1, NULL, NULL, 'This is a book'>`
- `<1, 25.2, NULL, NULL>`
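
A hedged sketch of a table that could hold the three records above, with hypothetical column names; only `'merge-engine' = 'partial-update'` is taken from this page:

```sql
CREATE TABLE partial_demo (
    id          INT,
    price       DOUBLE,
    quantity    INT,
    description STRING,
    PRIMARY KEY (id) NOT ENFORCED
) WITH (
    'merge-engine' = 'partial-update'
);
-- Following the rule above (latest non-null value wins per field), the three writes
-- accumulate under key 1 into a merged row like <1, 25.2, 10, 'This is a book'>.
```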
@@ -126,7 +126,7 @@ The `changelog-producer` table property only affects changelog from files. It do

### None

By default, no extra changelog producer will be applied to the writer of table. Table Store source can only see the merged changes across snapshots, like what keys are removed and what are the new values of some keys.
By default, no extra changelog producer will be applied to the writer of table. Paimon source can only see the merged changes across snapshots, like what keys are removed and what are the new values of some keys.

However, these merged changes cannot form a complete changelog, because we can't read the old values of the keys directly from them. Merged changes require the consumers to "remember" the values of each key and to rewrite the values without seeing the old ones. Some consumers, however, need the old values to ensure correctness or efficiency.

Expand All @@ -138,9 +138,9 @@ To conclude, `none` changelog producers are best suited for consumers such as a

### Input

By specifying `'changelog-producer' = 'input'`, Table Store writers rely on their inputs as a source of complete changelog. All input records will be saved in separated [changelog files]({{< ref "concepts/file-layouts" >}}) and will be given to the consumers by Table Store sources.
By specifying `'changelog-producer' = 'input'`, Paimon writers rely on their inputs as a source of complete changelog. All input records will be saved in separate [changelog files]({{< ref "concepts/file-layouts" >}}) and will be given to the consumers by Paimon sources.

`input` changelog producer can be used when Table Store writers' inputs are complete changelog, such as from a database CDC, or generated by Flink stateful computation.
The `input` changelog producer can be used when Paimon writers' inputs are complete changelogs, such as those from database CDC or generated by Flink stateful computation.
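
A hedged sketch of declaring this producer on a table fed by CDC input; the table and column names are hypothetical, while the property value comes from this page:

```sql
CREATE TABLE cdc_mirror (
    id INT,
    v  STRING,
    PRIMARY KEY (id) NOT ENFORCED
) WITH (
    -- the input (e.g. a database CDC stream) is already a complete changelog, so it is
    -- written to separate changelog files and served to downstream consumers as-is
    'changelog-producer' = 'input'
);
```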

{{< img src="/img/changelog-producer-input.png">}}

Expand All @@ -152,7 +152,7 @@ This is an experimental feature.

If your input can’t produce a complete changelog but you still want to get rid of the costly normalize operator, you may consider using the `'lookup'` changelog producer.

By specifying `'changelog-producer' = 'lookup'`, Table Store will generate changelog through `'lookup'` before committing the data writing.
By specifying `'changelog-producer' = 'lookup'`, Paimon will generate changelog through `'lookup'` before committing the data writing.

{{< img src="/img/changelog-producer-lookup.png">}}

Expand Down Expand Up @@ -194,7 +194,7 @@ Lookup will cache data on the memory and local disk, you can use the following o
If you think the resource consumption of 'lookup' is too large, you can consider using the 'full-compaction' changelog producer,
which decouples data writing from changelog generation and is better suited for scenarios with high latency (for example, 10 minutes).

By specifying `'changelog-producer' = 'full-compaction'`, Table Store will compare the results between full compactions and produce the differences as changelog. The latency of changelog is affected by the frequency of full compactions.
By specifying `'changelog-producer' = 'full-compaction'`, Paimon will compare the results between full compactions and produce the differences as changelog. The latency of changelog is affected by the frequency of full compactions.

By specifying the `changelog-producer.compaction-interval` table property, users can define the maximum interval between two full compactions and thus bound changelog latency. The interval is `0s` by default, so every checkpoint triggers a full compaction and generates a changelog.
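
A hedged example combining the two properties described above; the table definition and the interval value are illustrative only:

```sql
CREATE TABLE metrics (
    k INT,
    v BIGINT,
    PRIMARY KEY (k) NOT ENFORCED
) WITH (
    'changelog-producer' = 'full-compaction',
    -- force a full compaction at least every 10 minutes, bounding changelog latency
    'changelog-producer.compaction-interval' = '10 min'
);
```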

12 changes: 6 additions & 6 deletions docs/content/engines/flink.md
@@ -26,11 +26,11 @@ under the License.

# Flink

This documentation is a guide for using Table Store in Flink.
This documentation is a guide for using Paimon in Flink.

## Preparing Table Store Jar File
## Preparing Paimon Jar File

Table Store currently supports Flink 1.16, 1.15 and 1.14. We recommend the latest Flink version for a better experience.
Paimon currently supports Flink 1.16, 1.15 and 1.14. We recommend the latest Flink version for a better experience.

{{< stable >}}

@@ -48,7 +48,7 @@ You can also manually build bundled jar from the source code.

{{< unstable >}}

You are using an unreleased version of Table Store so you need to manually build bundled jar from the source code.
You are using an unreleased version of Paimon, so you need to manually build the bundled jar from the source code.

{{< /unstable >}}

Expand All @@ -69,7 +69,7 @@ If you haven't downloaded Flink, you can [download Flink 1.16](https://flink.apa
tar -xzf flink-*.tgz
```

**Step 2: Copy Table Store Bundled Jar**
**Step 2: Copy Paimon Bundled Jar**

Copy the Paimon bundled jar to the `lib` directory of your Flink home.

@@ -111,7 +111,7 @@ You can now start Flink SQL client to execute SQL scripts.
**Step 5: Create a Catalog and a Table**

```sql
-- if you're trying out Table Store in a distributed environment,
-- if you're trying out Paimon in a distributed environment,
-- warehouse path should be set to a shared file system, such as HDFS or OSS
CREATE CATALOG my_catalog WITH (
'type'='paimon',