[hive] Introduce metastore.tag-to-partition for Hive metastore #2134
Purpose
Linked issue: close #2125
A Tag retains the manifests and data files of a snapshot.
A typical use is to create a tag every day, so that each day's historical data is kept for batch reading.
Primary key tables are often left non-partitioned so that they can mirror and synchronize upstream database tables and
absorb their updates. This lets users query the latest data. Traditional Hive data warehouses work differently: offline
data warehouses need an immutable view for each day to keep calculations idempotent. The Tag mechanism was created to
produce these views.
However, users of traditional Hive data warehouses are more accustomed to selecting such a view through a partition
than through a Tag, and to querying with Hive computing engines.
So we introduce 'metastore.tag-to-partition', which maps a non-partitioned primary key table to a partitioned table in the Hive metastore and maps the partition field to the name of the Tag, so that the table is fully compatible with Hive.
Example
Step 1: Create table and tag in Flink SQL
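A minimal sketch of this step, not necessarily the exact statements from the PR. The catalog name `my_hive`, the database `mydb`, the table `T` and its columns, the partition field name `dt`, and the tag name `2023-10-16` are illustrative assumptions. It also assumes a Flink version that supports `CALL` procedures and a Hive metastore reachable through the default `HiveConf` settings.

```sql
-- Flink SQL (all names below are assumptions for illustration)
CREATE CATALOG my_hive WITH (
    'type' = 'paimon',
    'metastore' = 'hive'
    -- add 'uri' / 'warehouse' here if they are not provided by HiveConf
);

USE CATALOG my_hive;
CREATE DATABASE IF NOT EXISTS mydb;

-- Non-partitioned primary key table; tags will be exposed to Hive
-- as values of a partition field named 'dt'.
CREATE TABLE mydb.T (
    pk INT,
    col1 STRING,
    col2 STRING,
    PRIMARY KEY (pk) NOT ENFORCED
) WITH (
    'metastore.tag-to-partition' = 'dt'
);

-- Run the insert in batch mode so it commits a snapshot when the job finishes.
SET 'execution.runtime-mode' = 'batch';
INSERT INTO mydb.T VALUES (1, '10', '100'), (2, '20', '200');

-- Create tag '2023-10-16' from snapshot 1 (assumed to be the snapshot written above).
CALL sys.create_tag('mydb.T', '2023-10-16', 1);
```

Naming tags after dates keeps the Hive-side partition values in the familiar `dt=yyyy-MM-dd` style.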
Step 2: Query table in Hive with Partition Pruning
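A sketch of the Hive side, assuming a table `T` in database `mydb` that was created with 'metastore.tag-to-partition' = 'dt' and that has a tag named `2023-10-16`. These names are illustrative assumptions, and the Paimon Hive connector is assumed to be already installed in Hive.

```sql
-- Hive SQL (names are assumptions for illustration)
USE mydb;

-- Each tag of the table should be listed as a partition of the 'dt' field.
SHOW PARTITIONS T;

-- The filter on 'dt' selects the tag to read, so only that
-- tag's snapshot is scanned (partition pruning).
SELECT * FROM T WHERE dt = '2023-10-16';
```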