[Core] Append-Only table use zstd compression default reduce storage in parquet format #3359

Closed
2 tasks done
xuzifu666 opened this issue May 21, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@xuzifu666
Member

xuzifu666 commented May 21, 2024

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

While testing an append-only table in streaming scenarios, we found that the Paimon table takes about 1.3x the storage of the equivalent Hive table, which is a little high. After we changed the compression from snappy (which is the default) to zstd, the ratio dropped from 1.3x to 1.04x, which is what we expected.
We made the same change on another lake engine, and it also reduced storage there while the streaming job stayed as stable as before.
So should we make zstd the default Parquet compression in append-only scenarios?
1716275038422.png

snappy : 499GB
hive: 372GB
snappy / hive = 1.34

1716275033959.png
zstd: 390GB
hive: 372GB

zstd / hive = 1.04
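
For anyone who wants to re-check the ratios, here is a minimal sketch that recomputes them from the sizes reported above. The names `SIZES_GB`, `ratio_to_hive`, and `saving_vs_snappy` are made up for illustration; only the three GB figures come from this issue. Note that 390 / 372 is roughly 1.048, so the 1.04 above appears to be truncated rather than rounded.

```python
# Table sizes in GB, copied from the measurements in this issue.
SIZES_GB = {"hive": 372, "snappy": 499, "zstd": 390}


def ratio_to_hive(codec: str) -> float:
    """Return the Paimon table size for `codec` relative to the Hive table."""
    return SIZES_GB[codec] / SIZES_GB["hive"]


def saving_vs_snappy() -> float:
    """Return the fraction of storage saved by zstd compared with snappy."""
    return 1 - SIZES_GB["zstd"] / SIZES_GB["snappy"]


if __name__ == "__main__":
    for codec in ("snappy", "zstd"):
        print(f"{codec} / hive = {ratio_to_hive(codec):.3f}")
    print(f"zstd saving vs snappy = {saving_vs_snappy():.1%}")
```

In other words, on this dataset zstd takes a bit more than a fifth off the snappy footprint.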

Solution

Use zstd as the default compression for append-only tables.
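
Until the default changes, the setting can be applied per table. The sketch below only renders a Flink SQL `CREATE TABLE` statement as a string; it assumes the relevant Paimon table options are `file.format` and `file.compression`, which should be checked against the documentation for the Paimon version in use. The helper name `append_table_ddl` and the columns `id` and `payload` are hypothetical.

```python
def append_table_ddl(table: str, compression: str = "zstd") -> str:
    """Render a CREATE TABLE statement for a table without a primary key.

    Assumption: 'file.format' and 'file.compression' are the option keys
    that control the data file format and codec; verify them in the docs.
    """
    options = {
        "file.format": "parquet",
        "file.compression": compression,
    }
    with_clause = ",\n".join(f"    '{k}' = '{v}'" for k, v in options.items())
    return (
        f"CREATE TABLE {table} (\n"
        "    id BIGINT,\n"
        "    payload STRING\n"
        ") WITH (\n"
        f"{with_clause}\n"
        ")"
    )


if __name__ == "__main__":
    # Print the statement so it can be pasted into a SQL client.
    print(append_table_ddl("append_only_demo"))
```

No primary key is declared, which is what makes the table append-only in this sketch.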

Anything else?

none

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@xuzifu666 xuzifu666 added the enhancement New feature or request label May 21, 2024
@xuzifu666 xuzifu666 changed the title from "[Core] Append-Only table use zstd compression default instead of snappy for reduce storage in parquet format" to "[Core] Append-Only table use zstd compression default reduce storage in parquet format" May 21, 2024
@xuzifu666
Member Author

xuzifu666 commented May 22, 2024

Closing this for now, since the documentation already points this out.
