-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[KYUUBI #6034] Kyuubi Server HA&ZK get server from serverHosts support more strategy #6213
[KYUUBI #6034] Kyuubi Server HA&ZK get server from serverHosts support more strategy #6213
Conversation
LGTM in general, can we add some tests? |
i'm not sure did this improvement could execute unit test? But i will try |
function level unit test is OK |
...i-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/strategy/zk/PollingChooseStrategy.java
Outdated
Show resolved
Hide resolved
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6213 +/- ##
======================================
Coverage 0.00% 0.00%
======================================
Files 684 687 +3
Lines 42354 42439 +85
Branches 5776 5793 +17
======================================
- Misses 42354 42439 +85 ☔ View full report in Codecov by Sentry. |
...i-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/strategy/zk/PollingChooseStrategy.java
Outdated
Show resolved
Hide resolved
...i-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/strategy/zk/PollingChooseStrategy.java
Outdated
Show resolved
Hide resolved
...i-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/strategy/zk/PollingChooseStrategy.java
Outdated
Show resolved
Hide resolved
d6c58ee
to
b83b2fd
Compare
@@ -79,6 +79,7 @@ public class JdbcConnectionParams { | |||
// Use ZooKeeper for indirection while using dynamic service discovery | |||
static final String SERVICE_DISCOVERY_MODE_ZOOKEEPER = "zooKeeper"; | |||
static final String ZOOKEEPER_NAMESPACE = "zooKeeperNamespace"; | |||
static final String ZOOKEEPER_STRATEGY = "zooKeeperStrategy"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static final String ZOOKEEPER_STRATEGY = "zooKeeperStrategy"; | |
static final String ZOOKEEPER_SELECT_POLICY = "zooKeeperSelectPolicy"; |
Would it be a better name for the parameter? cc @pan3793
The term select policy
is similar with existed config kyuubi.engine.pool.selectPolicy
of kyuubi server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"select policy" is a better phrase here.
technically, the mechanism could be promoted to another service discovery impls, so we should not include "zookeeper" in the parameter name.
so my suggested parameter name is "serverSelectPolicy", and also rename interface "ChooseServerStrategy" to "ServerSelectStrategy"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM. Agree with Pan's suggestion.
84c24e8
to
e194ea6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks.
…t more strategy # 🔍 Description ## Issue References 🔗 This pull request fixes #6034 ## Describe Your Solution 🔧 Currently, use beeline to connect kyuubiServer with HA mode, the strategy only support random, this will lead to a high load on the machine. So i make this pr to support choose strategy. [description] First, we need know, beeline connect kyuubiServer dependency on kyuubi-hive-jdbc, it is isolated from the kyuubi cluster, so the code only support random choose serverHost from zk node /${namespace}. Because kyuubi-hive-jdbc is a stateless module, only run once, cannot store var about get serverHost from zk node. [Solution] This pr, we could implement a interface named ChooseServerStrategy to choose serverHost. I implement two strategy 1. poll: it will create a zk node named ${namespace}-counter, when a beeline client want connect kyuubiServer, the node will increment 1, use this value to take the remainder from serverHosts, like counter % serverHost.size, so we could get a order serverHost 2. random: random get serverHost from serverHosts 3. User Definied Class: implemented the ChooseServerStrategy, then put the jar to beeline-jars, it can use your strategy to choose serverHost ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Test the Strategy in my test Cluster #### Behavior Without This Pull Request ⚰️ ![image](https://github.com/apache/kyuubi/assets/51512358/d65b14c1-1b02-4436-8843-27b2e55d27ce) ![image](https://github.com/apache/kyuubi/assets/51512358/0524a30c-c2c3-464e-8453-84f3f1a74fb1) ![image](https://github.com/apache/kyuubi/assets/51512358/12feb93e-b743-4a43-821d-454f3c1af336) #### Behavior With This Pull Request 🎉 [Use Case] 1. poll: `bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=poll?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true` 2. random: `bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=random?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true` or `bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true` 3. YourStrategy: `bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=xxx.xxx.xxx.XxxChooseServerStrategy?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true` [Result: The Cluster have two Server (221,233)] 1. poll: 1.1. zkNode: counterValue ![image](https://github.com/apache/kyuubi/assets/51512358/5cbd15f9-bba4-4b23-bbfb-d61ed46f931f) 1.2. result: ![image](https://github.com/apache/kyuubi/assets/51512358/5a867167-8b06-49ed-aa44-b70726f3ae97) ![image](https://github.com/apache/kyuubi/assets/51512358/404b05e8-c828-458c-a9c4-97a323bf6ce7) ![image](https://github.com/apache/kyuubi/assets/51512358/3182e92b-6976-4931-a899-5e0d89cd2ac2) ![image](https://github.com/apache/kyuubi/assets/51512358/a55450ff-49cf-4b4a-9b90-91dd02982aa5) 2. random: ![image](https://github.com/apache/kyuubi/assets/51512358/d65b14c1-1b02-4436-8843-27b2e55d27ce) ![image](https://github.com/apache/kyuubi/assets/51512358/0524a30c-c2c3-464e-8453-84f3f1a74fb1) ![image](https://github.com/apache/kyuubi/assets/51512358/12feb93e-b743-4a43-821d-454f3c1af336) 3. YourStrategy(the test case only get the first serverHost): ![image](https://github.com/apache/kyuubi/assets/51512358/2e6395c2-6496-4516-9cf6-90abc921de7f) ![image](https://github.com/apache/kyuubi/assets/51512358/72975513-48d2-4f41-8a95-95cde0302c5b) ![image](https://github.com/apache/kyuubi/assets/51512358/487951fd-de45-4e1c-861a-94e0e5564e37) #### Related Unit Tests There is no Unit Tests. --- # Checklist 📝 - [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) **Be nice. Be informative.** Closes #6213 from davidyuan1223/ha_zk_support_more_strategy. Closes #6034 961d3e9 [Bowen Liang] rename ServerStrategyFactory to ServerSelectStrategyFactory 353f940 [Bowen Liang] repeat 8822ad4 [Bowen Liang] repeat 6193394 [Bowen Liang] nit e94f9e9 [Bowen Liang] nit 40f427a [Bowen Liang] rename StrategyFactory to StrategyFactoryServerStrategyFactory 7668f99 [Bowen Liang] test name e194ea6 [Bowen Liang] remove ZooKeeperHiveClientException from method signature of chooseServer 265965e [Bowen Liang] polling b39c567 [Bowen Liang] style 1ab79b4 [Bowen Liang] strategyName 8f8ca28 [Bowen Liang] nit 228bf10 [Bowen Liang] rename parameter zooKeeperStrategy to serverSelectStrategy 125c823 [Bowen Liang] rename ChooseServerStrategy to ServerSelectStrategy b4aeb3d [Bowen Liang] repeat testing on pollingChooseStrategy 4655480 [davidyuan] update 09a84f1 [david yuan] remove the distirbuted lock 93f4a26 [davidyuan] remove reset 7b0c1b8 [davidyuan] fix var not valid and counter getAndIncrement c95382a [davidyuan] fix var not valid and counter getAndIncrement 9ed2cac [david yuan] remove test comment 8eddd76 [davidyuan] Add Strategy Unit Test Case and fix the polling strategy counter begin with 0 73952f8 [davidyuan] Kyuubi Server HA&ZK get server from serverHosts support more strategy 97b9597 [davidyuan] Kyuubi Server HA&ZK get server from serverHosts support more strategy ee5a9ad [davidyuan] Kyuubi Server HA&ZK get server from serverHosts support more strategy 6a04453 [davidyuan] Kyuubi Server HA&ZK get server from serverHosts support more strategy 1892f14 [davidyuan] add common method to get session level config 7c0c605 [yuanfuyuan] fix_4186 Lead-authored-by: Bowen Liang <[email protected]> Co-authored-by: davidyuan <[email protected]> Co-authored-by: davidyuan <[email protected]> Co-authored-by: david yuan <[email protected]> Co-authored-by: yuanfuyuan <[email protected]> Signed-off-by: Bowen Liang <[email protected]> (cherry picked from commit 8862767) Signed-off-by: Bowen Liang <[email protected]>
🔍 Description
Issue References 🔗
This pull request fixes #6034
Describe Your Solution 🔧
Currently, use beeline to connect kyuubiServer with HA mode, the strategy only support random, this will lead to a high load on the machine. So i make this pr to support choose strategy.
[description]
First, we need know, beeline connect kyuubiServer dependency on kyuubi-hive-jdbc, it is isolated from the kyuubi cluster, so the code only support random choose serverHost from zk node /${namespace}. Because kyuubi-hive-jdbc is a stateless module, only run once, cannot store var about get serverHost from zk node.
[Solution]
This pr, we could implement a interface named ChooseServerStrategy to choose serverHost. I implement two strategy
Types of changes 🔖
Test Plan 🧪
Test the Strategy in my test Cluster
Behavior Without This Pull Request ⚰️
Behavior With This Pull Request 🎉
[Use Case]
bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=poll?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true
bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=random?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true
orbin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true
bin/beeline -u 'jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi;zooKeeperStrategy=xxx.xxx.xxx.XxxChooseServerStrategy?spark.yarn.queue=root.kylin;spark.app.name=testspark;spark.shuffle.useOldFetchProtocol=true' -n mfw_hadoop --verbose=true --showNestedErrs=true
[Result: The Cluster have two Server (221,233)]
1.1. zkNode: counterValue
1.2. result:
random:
YourStrategy(the test case only get the first serverHost):
Related Unit Tests
There is no Unit Tests.
Checklist 📝
Be nice. Be informative.