[hive] Introduce metastore.tag-to-partition for Hive metastore
1 parent
d2ddaa9
commit 32f0842
Showing 20 changed files with 560 additions and 42 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
--- | ||
title: Migration | ||
icon: <i class="fa fa-briefcase title maindish" aria-hidden="true"></i> | ||
bold: true | ||
bookCollapseSection: true | ||
weight: 8 | ||
--- | ||
<!-- | ||
Licensed to the Apache Software Foundation (ASF) under one | ||
or more contributor license agreements. See the NOTICE file | ||
distributed with this work for additional information | ||
regarding copyright ownership. The ASF licenses this file | ||
to you under the Apache License, Version 2.0 (the | ||
"License"); you may not use this file except in compliance | ||
with the License. You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, | ||
software distributed under the License is distributed on an | ||
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
KIND, either express or implied. See the License for the | ||
specific language governing permissions and limitations | ||
under the License. | ||
--> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,99 @@ | ||
--- | ||
title: "Upsert To Partitioned" | ||
weight: 1 | ||
type: docs | ||
aliases: | ||
- /migration/upsert-to-partitioned.html | ||
--- | ||
<!-- | ||
Licensed to the Apache Software Foundation (ASF) under one | ||
or more contributor license agreements. See the NOTICE file | ||
distributed with this work for additional information | ||
regarding copyright ownership. The ASF licenses this file | ||
to you under the Apache License, Version 2.0 (the | ||
"License"); you may not use this file except in compliance | ||
with the License. You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, | ||
software distributed under the License is distributed on an | ||
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
KIND, either express or implied. See the License for the | ||
specific language governing permissions and limitations | ||
under the License. | ||
--> | ||
|
||
# Upsert To Partitioned | ||
|
||
[Tag Management]({{< ref "maintenance/manage-tags" >}}) maintains the manifests and data files of a snapshot. | ||
A typical usage is to create a tag every day, so that the historical data of each day is retained for batch reading. | ||
|
||
Primary key tables are often kept non-partitioned so that they can absorb updates while mirroring and synchronizing | ||
upstream database tables. This allows users to query the latest data. Traditional Hive data warehouses work | ||
differently: offline data warehouses require an immutable view every day to ensure that calculations are idempotent. | ||
The Tag mechanism was created to output these views. | ||
|
||
However, users of traditional Hive data warehouses are more accustomed to selecting such a view through a partition | ||
than through a Tag, and are more accustomed to using Hive computing engines. | ||
|
||
So we introduce `'metastore.tag-to-partition'` to map a non-partitioned primary key table to a partitioned table | ||
in Hive metastore, and to map the partition field to the name of the Tag, so that the table is fully compatible with Hive. | ||
|
||
## Example | ||
|
||
**Step 1: Create table and tag in Flink SQL** | ||
|
||
{{< tabs "Create table and tag in Flink SQL" >}} | ||
{{< tab "Flink" >}} | ||
```sql | ||
CREATE CATALOG my_hive WITH ( | ||
    'type' = 'paimon', | ||
    'metastore' = 'hive', | ||
    'uri' = 'thrift://<hive-metastore-host-name>:<port>', | ||
    -- 'hive-conf-dir' = '...', this is recommended in the kerberos environment | ||
    -- 'hadoop-conf-dir' = '...', this is recommended in the kerberos environment | ||
    'warehouse' = 'hdfs:///path/to/warehouse' | ||
); | ||
|
||
USE CATALOG my_hive; | ||
|
||
CREATE TABLE mydb.T ( | ||
    pk INT, | ||
    col1 STRING, | ||
    col2 STRING | ||
) WITH ( | ||
    'bucket' = '-1', | ||
    'metastore.tag-to-partition' = 'dt' | ||
); | ||
|
||
INSERT INTO mydb.T VALUES (1, '10', '100'), (2, '20', '200'); | ||
|
||
-- create tag '2023-10-16' for snapshot 1 | ||
CALL my_hive.system.create_tag('mydb.T', '2023-10-16', 1); | ||
``` | ||
|
||
{{< /tab >}} | ||
{{< /tabs >}} | ||
|
||
**Step 2: Query table in Hive with Partition Pruning** | ||
|
||
{{< tabs "Query table in Hive with Partition Pruning" >}} | ||
{{< tab "Hive" >}} | ||
```sql | ||
SHOW PARTITIONS T; | ||
/* | ||
OK | ||
dt=2023-10-16 | ||
*/ | ||
|
||
SELECT * FROM T WHERE dt='2023-10-16'; | ||
/* | ||
OK | ||
1 10 100 2023-10-16 | ||
2 20 200 2023-10-16 | ||
*/ | ||
``` | ||
|
||
{{< /tab >}} | ||
{{< /tabs >}} |
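
The mapping described in this page can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `TagToPartitionSketch`, `partitionSpec` and `partitionName` are hypothetical names that do not exist in Paimon, and the `key=value` rendering is a simplification of how Hive prints partitions (it skips any escaping of special characters). It shows that each tag name becomes the value of the single partition field configured by `'metastore.tag-to-partition'`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration class; not part of Paimon. */
class TagToPartitionSketch {

    /** Builds the single-entry partition spec for one tag, e.g. {dt=2023-10-16}. */
    static LinkedHashMap<String, String> partitionSpec(String partitionField, String tagName) {
        LinkedHashMap<String, String> spec = new LinkedHashMap<>();
        spec.put(partitionField, tagName);
        return spec;
    }

    /** Renders a spec as key=value pairs joined by '/', without any escaping. */
    static String partitionName(Map<String, String> spec) {
        List<String> parts = new ArrayList<>();
        for (Map.Entry<String, String> entry : spec.entrySet()) {
            parts.add(entry.getKey() + "=" + entry.getValue());
        }
        return String.join("/", parts);
    }

    public static void main(String[] args) {
        // 'dt' is the value of 'metastore.tag-to-partition' in the example table,
        // and '2023-10-16' is the tag created in Step 1.
        System.out.println(partitionName(partitionSpec("dt", "2023-10-16")));
    }
}
```

With the values from the example, the rendered name is `dt=2023-10-16`, which matches the partition listed by `SHOW PARTITIONS T` in Step 2.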
51 changes: 51 additions & 0 deletions
paimon-core/src/main/java/org/apache/paimon/metastore/AddPartitionTagCallback.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one | ||
* or more contributor license agreements. See the NOTICE file | ||
* distributed with this work for additional information | ||
* regarding copyright ownership. The ASF licenses this file | ||
* to you under the Apache License, Version 2.0 (the | ||
* "License"); you may not use this file except in compliance | ||
* with the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
package org.apache.paimon.metastore; | ||
|
||
import org.apache.paimon.table.sink.TagCallback; | ||
|
||
import java.util.LinkedHashMap; | ||
|
||
/** A {@link TagCallback} to add newly created partitions to metastore. */ | ||
public class AddPartitionTagCallback implements TagCallback { | ||
|
||
    private final MetastoreClient client; | ||
    private final String partitionField; | ||
|
||
    public AddPartitionTagCallback(MetastoreClient client, String partitionField) { | ||
        this.client = client; | ||
        this.partitionField = partitionField; | ||
    } | ||
|
||
    @Override | ||
    public void notifyCreation(String tagName) { | ||
        LinkedHashMap<String, String> partitionSpec = new LinkedHashMap<>(); | ||
        partitionSpec.put(partitionField, tagName); | ||
        try { | ||
            client.addPartition(partitionSpec); | ||
        } catch (Exception e) { | ||
            throw new RuntimeException(e); | ||
        } | ||
    } | ||
|
||
    @Override | ||
    public void close() throws Exception { | ||
        client.close(); | ||
    } | ||
} |
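
To see how this callback behaves, here is a minimal, self-contained sketch that exercises the same logic against an in-memory client. Everything outside the callback body is an assumption made for illustration: the nested `MetastoreClient` and `TagCallback` interfaces are simplified stand-ins that declare only the methods used in the diff above (the real Paimon interfaces may declare more), and `RecordingClient`, `TagCallbackSketch` and `run` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;

/** Hypothetical harness; the nested interfaces are simplified stand-ins, not Paimon's. */
class TagCallbackSketch {

    /** Stand-in for MetastoreClient, limited to the calls used by the callback. */
    interface MetastoreClient extends AutoCloseable {
        void addPartition(LinkedHashMap<String, String> partitionSpec) throws Exception;
    }

    /** Stand-in for TagCallback, limited to the methods overridden in the diff. */
    interface TagCallback extends AutoCloseable {
        void notifyCreation(String tagName);
    }

    /** Same logic as the class in the diff: one partition per created tag. */
    static class AddPartitionTagCallback implements TagCallback {
        private final MetastoreClient client;
        private final String partitionField;

        AddPartitionTagCallback(MetastoreClient client, String partitionField) {
            this.client = client;
            this.partitionField = partitionField;
        }

        @Override
        public void notifyCreation(String tagName) {
            LinkedHashMap<String, String> partitionSpec = new LinkedHashMap<>();
            partitionSpec.put(partitionField, tagName);
            try {
                client.addPartition(partitionSpec);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }

        @Override
        public void close() throws Exception {
            client.close();
        }
    }

    /** In-memory client that records every partition spec it receives. */
    static class RecordingClient implements MetastoreClient {
        final List<LinkedHashMap<String, String>> added = new ArrayList<>();
        boolean closed;

        @Override
        public void addPartition(LinkedHashMap<String, String> partitionSpec) {
            added.add(new LinkedHashMap<>(partitionSpec));
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Notifies the callback once per tag, closes it, and returns the recording client. */
    static RecordingClient run(String partitionField, String... tagNames) {
        RecordingClient client = new RecordingClient();
        try (AddPartitionTagCallback callback =
                new AddPartitionTagCallback(client, partitionField)) {
            for (String tagName : tagNames) {
                callback.notifyCreation(tagName);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return client;
    }

    public static void main(String[] args) {
        RecordingClient client = run("dt", "2023-10-16", "2023-10-17");
        System.out.println(client.added);
    }
}
```

In this sketch each created tag yields exactly one `addPartition` call whose spec has the configured field as its only key, and closing the callback closes the underlying client.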