[core][hive] decrease the size of FileIo serialization (#3348)
wg1026688210 authored May 20, 2024
1 parent 9efb58d commit b0b634f
Showing 3 changed files with 16 additions and 3 deletions.
4 changes: 3 additions & 1 deletion docs/content/engines/hive.md
@@ -81,7 +81,9 @@ There are several ways to add this jar to Hive.

NOTE:

-* If you are using HDFS, make sure that the environment variable `HADOOP_HOME` or `HADOOP_CONF_DIR` is set.
+* If you are using HDFS :
+  * Make sure that the environment variable `HADOOP_HOME` or `HADOOP_CONF_DIR` is set.
+  * You can set `paimon.hadoop-load-default-config` =`false` to disable loading the default value from `core-default.xml``hdfs-default.xml`, which may lead smaller size for split.
* With hive cbo, it may lead to some incorrect query results, such as to query `struct` type with `not null` predicate, you can disable the cbo by `set hive.cbo.enable=false;` command.

## Hive SQL: access Paimon Tables already in Hive metastore
@@ -47,6 +47,13 @@ public class HadoopUtils {
                    .defaultValue(HadoopConfigLoader.ALL)
                    .withDescription("Specifies the way of loading hadoop config.");

+    public static final ConfigOption<Boolean> HADOOP_LOAD_DEFAULT_CONFIG =
+            key("hadoop-load-default-config")
+                    .booleanType()
+                    .defaultValue(true)
+                    .withDescription(
+                            "Specifies whether load the default configuration from core-default.xml、hdfs-default.xml, which may lead larger size for the serialization of table.");
+
    private static final String[] CONFIG_PREFIXES = {"hadoop."};
    public static final String HADOOP_HOME_ENV = "HADOOP_HOME";
    public static final String HADOOP_CONF_ENV = "HADOOP_CONF_DIR";
@@ -59,7 +66,11 @@ public static Configuration getHadoopConfiguration(Options options) {
        // Instantiate an HdfsConfiguration to load the hdfs-site.xml and hdfs-default.xml
        // from the classpath

-        Configuration result = new HdfsConfiguration();
+        Boolean loadDefaultConfig = options.get(HADOOP_LOAD_DEFAULT_CONFIG);
+        if (loadDefaultConfig) {
+            LOG.debug("Load the default value for configuration.");
+        }
+        Configuration result = new HdfsConfiguration(loadDefaultConfig);
        boolean foundHadoopConfiguration = false;

        // We need to load both core-site.xml and hdfs-site.xml to determine the default fs path and
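
The size effect behind `hadoop-load-default-config` can be illustrated without Hadoop on the classpath. The following is a minimal sketch, not the commit's code: a plain `HashMap` stands in for a Hadoop `Configuration`, `buildConfig` and `serializedSize` are hypothetical helpers, and the 1000 synthetic entries are an arbitrary stand-in for whatever `core-default.xml` and `hdfs-default.xml` would contribute.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;

// Stand-in sketch: a HashMap plays the role of a Hadoop Configuration.
// Nothing here calls the real Hadoop or Paimon APIs.
class DefaultConfigSizeSketch {

    // Hypothetical helper: when loadDefaults is true, add synthetic "default"
    // entries (an arbitrary 1000 of them), then add one site-specific key.
    static HashMap<String, String> buildConfig(boolean loadDefaults) {
        HashMap<String, String> conf = new HashMap<>();
        if (loadDefaults) {
            for (int i = 0; i < 1000; i++) {
                conf.put("default.key." + i, "default-value-" + i);
            }
        }
        conf.put("fs.defaultFS", "hdfs://namenode:8020");
        return conf;
    }

    // Serialize the map with Java serialization and return the byte count.
    static int serializedSize(HashMap<String, String> conf) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(conf);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        System.out.println("with defaults:    " + serializedSize(buildConfig(true)) + " bytes");
        System.out.println("without defaults: " + serializedSize(buildConfig(false)) + " bytes");
    }
}
```

The exact byte counts depend on the synthetic entries, so only the ordering is meaningful: the map built without defaults serializes to fewer bytes. How much a real table or split shrinks depends on the actual Hadoop defaults and on how the configuration is serialized.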
@@ -54,7 +54,7 @@ public class HiveUtils {
    public static FileStoreTable createFileStoreTable(JobConf jobConf) {
        Options options = extractCatalogConfig(jobConf);
        options.set(CoreOptions.PATH, LocationKeyExtractor.getPaimonLocation(jobConf));
-        CatalogContext catalogContext = CatalogContext.create(options, jobConf);
+        CatalogContext catalogContext = CatalogContext.create(options);
        return FileStoreTableFactory.create(catalogContext);
    }
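
The docs change names the setting `paimon.hadoop-load-default-config`, while the option key in `HadoopUtils` is `hadoop-load-default-config`. This suggests that `extractCatalogConfig` strips a `paimon.` prefix from job configuration keys, although the diff does not show that method. A rough sketch under that assumption, with a plain `Map` standing in for `JobConf` and `Options`, and with `extractOptions` and `loadDefaultConfig` as hypothetical helper names:

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in sketch: plain maps replace JobConf and Options.
// The prefix handling is an assumption inferred from the docs change.
class PaimonOptionSketch {

    static final String PREFIX = "paimon.";                 // assumed prefix
    static final String KEY = "hadoop-load-default-config"; // key from the commit

    // Keep only entries starting with the prefix, with the prefix removed.
    static Map<String, String> extractOptions(Map<String, String> jobConf) {
        Map<String, String> options = new HashMap<>();
        for (Map.Entry<String, String> e : jobConf.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                options.put(e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return options;
    }

    // Read the boolean option, falling back to the default of true from the commit.
    static boolean loadDefaultConfig(Map<String, String> options) {
        String raw = options.get(KEY);
        return raw == null ? true : Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        jobConf.put("paimon.hadoop-load-default-config", "false");
        jobConf.put("hive.cbo.enable", "false");
        System.out.println(loadDefaultConfig(extractOptions(jobConf)));
    }
}
```

With the key absent, the sketch falls back to `true`, matching the `defaultValue(true)` in the commit, so existing setups keep loading the defaults unless the option is set explicitly.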
