Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Improve catalog lock for paimon #2824

Open
1 of 2 tasks
FangYongs opened this issue Jan 31, 2024 · 7 comments
Open
1 of 2 tasks

[Feature] Improve catalog lock for paimon #2824

FangYongs opened this issue Jan 31, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@FangYongs
Copy link
Contributor

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Currently paimon only support hive lock for hive catalog, it should support unified lock for filesystem catalog

Solution

No response

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@chenxi0599
Copy link
Contributor

Great!I want this feature, too.

@legendtkl
Copy link
Contributor

+1 for this.

@JingsongLi
Copy link
Contributor

+1 thanks @FangYongs for driving!

@JingsongLi
Copy link
Contributor

JingsongLi commented Mar 27, 2024

Hi, after reviewing #3076

I feel more description should be provided here, otherwise it may lead to incorrect design by contributors.

  • What is the overall options API? Do option keys require a lock prefix?
  • How does FileSystemCatalog build Lock? How should JdbcLock handle without a jdbc lock context? Do we need to make some modifications to JDBCLock and HiveLock?

I suggest having a clear design before creating subtasks. What do you think? @FangYongs

@FangYongs
Copy link
Contributor Author

Thanks @JingsongLi , and I completely agree with your opinion! I will create a PIP for this feature and start a discussion thread later.

@sunxiaojian
Copy link
Contributor

sunxiaojian commented Mar 27, 2024

@FangYongs @JingsongLi

  1. I think that the parameters for LockContext to be open to the public should only be 'option', so there is no need to reference the connection implementation between jdbc and hive in the Filesystem Catalog, which is the future ClientPool

2.I think the parameters required for lock should be distinguished by adding a prefix of "lock." This will not cause any inconvenience to users. However, metastores like jdbc and hive, which come with their own Lock implementation, need to be compatible and do not require redeclaration of lock specific configuration parameters

  1. The core issue is the issue of connection. If we need to create a new connection every time and close it after use, it is inevitable to have too many connections, and it may not be possible to seize them during busy times. Therefore, it is recommended to use cache to cache and set the eviction time

If possible, I can refactor according to the above description first.

@rmotapar
Copy link

rmotapar commented Jul 2, 2024

+1 for this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

6 participants