[core] Add parquet write page limit parameter #4632
Merged
Purpose
Linked issue: close #4586
Added the `parquet.page.row.count.limit` parameter for Parquet file writing. Page boundaries in a written Parquet file are determined jointly by `parquet.page.size` and `parquet.page.row.count.limit`. If only `parquet.page.size` is set, the setting may have no effect, or it may lead to misaligned pages, which hurts performance.
Tests
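To illustrate the interaction described under Purpose, here is a toy Python model. It is not a test from this PR and it does not call Parquet or Paimon code. The function `split_into_pages` is hypothetical, and it assumes a simplified writer that closes a page as soon as either the byte threshold or the row-count threshold is reached. The values 1 MiB and 20,000 rows are illustrative choices. Real writers check sizes less often and also account for encoding and compression, so actual page boundaries will differ.

```python
def split_into_pages(row_sizes, page_size, row_count_limit):
    """Toy model: return the number of rows in each page.

    A page is closed as soon as its buffered bytes reach page_size
    OR its buffered rows reach row_count_limit (an assumed, simplified rule).
    """
    pages = []
    rows_in_page = 0
    bytes_in_page = 0
    for size in row_sizes:
        rows_in_page += 1
        bytes_in_page += size
        if bytes_in_page >= page_size or rows_in_page >= row_count_limit:
            pages.append(rows_in_page)
            rows_in_page = 0
            bytes_in_page = 0
    if rows_in_page:
        # Flush the final, partially filled page.
        pages.append(rows_in_page)
    return pages


# 200,000 narrow rows of 8 bytes each (illustrative data).
rows = [8] * 200_000

# Row limit is the binding constraint: 20,000 rows is only 160,000 bytes.
small = split_into_pages(rows, page_size=1 << 20, row_count_limit=20_000)

# Raising only the page size changes nothing, because the row limit still wins.
large = split_into_pages(rows, page_size=8 << 20, row_count_limit=20_000)

# Raising the row limit as well lets the page size take effect.
raised = split_into_pages(rows, page_size=1 << 20, row_count_limit=1_000_000)

print(len(small), len(large), raised)
```

In this model, changing only the page size leaves the page layout untouched for narrow rows, which is the "may have no effect" case from the description. Exposing the row-count limit as a parameter lets both thresholds be tuned together.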
API and Format
Documentation