[Feature] Introduce secondary index for paimon #2925

Open
2 tasks done
leaves12138 opened this issue Feb 29, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@leaves12138
Contributor

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Up to now, Paimon uses z-order and order sort compaction to speed up queries. After sort compaction, files are sorted by the specified columns. But this does not cover every situation. For example, we may have tens of columns that can appear in a filter: sometimes all of them appear together, sometimes just a few of them. Z-order or order compaction can't handle this, because sorting on too many columns reduces the effect of the sort. So if the cardinality (number of distinct values) of these columns is small, we can use a bloom filter or other indexes to speed up queries. That's why this PIP comes up. I want to introduce an index framework that gives Paimon a flexible index system.
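
As a rough illustration of the idea only (not a proposed design), a per-file bloom filter index could let a reader skip files that cannot match an equality filter, regardless of how many columns are indexed. The sketch below is plain Python; `BloomFilter`, `build_file_index`, `files_to_read`, the file names, and the sample rows are all hypothetical and are not Paimon APIs.

```python
import hashlib


class BloomFilter:
    """Tiny bloom filter backed by a Python int used as a bitset."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, value):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def build_file_index(rows, columns):
    """Build one bloom filter per indexed column for a single data file."""
    index = {column: BloomFilter() for column in columns}
    for row in rows:
        for column in columns:
            index[column].add(row[column])
    return index


def files_to_read(file_indexes, predicates):
    """Return files that may match all equality predicates (column -> value).

    Predicates on columns without an index cannot be used for skipping,
    so they are ignored here and must still be applied when reading.
    """
    return [
        name
        for name, index in file_indexes.items()
        if all(
            index[column].might_contain(value)
            for column, value in predicates.items()
            if column in index
        )
    ]


# Hypothetical data files and rows, for illustration only.
file_indexes = {
    "file-a": build_file_index(
        [{"city": "Berlin", "status": "ok"}, {"city": "Paris", "status": "ok"}],
        ["city", "status"],
    ),
    "file-b": build_file_index(
        [{"city": "Tokyo", "status": "failed"}],
        ["city", "status"],
    ),
}

# The filter may use any subset of the indexed columns.
candidates = files_to_read(file_indexes, {"city": "Berlin"})
```

A bloom filter never produces false negatives, so `file-a` is always among the candidates for `city = 'Berlin'`; other files are skipped unless they hit a false positive. How the index is stored, and at what granularity, is left open here.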

Solution

No response

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@leaves12138 added the enhancement (New feature or request) label Feb 29, 2024
@zyl891229
Contributor

The index would work better at the row group (Parquet) or stripe (ORC) level.
Or could it be made configurable at either the file or the row group level?

@FangYongs
Contributor

Hi @leaves12138, what's the status of this feature?

3 participants