
[SNOW-954150] Use map of clients with different configurations instead of one client for multiple connectors configurations #744

Merged: 53 commits merged into master from rcheng-clientfix on Dec 1, 2023

Conversation

@sfc-gh-rcheng (Collaborator) commented Nov 10, 2023

Summary

If two sink connectors run with the one-client optimization enabled but use different connection properties (such as the user's role), ingestion stops on the connector that was added last. The fix is to map each set of client properties to its own registered client within the one-client optimization (see Changes below).

Changes

  1. Register multiple clients in StreamingClientProvider with a LoadingCache backed by Caffeine (see the sketch after this list)
  2. Added a StreamingClientProperties object that is built from a subset of the given connector configuration. StreamingClientProvider uses it to determine client equality, i.e. whether a new client needs to be created
    1. Multiple connectors that have the same client properties (authentication, target database and schema, role, etc.) can use the same client. This means connectors with different configs such as buffer flush thresholds, target table name, or JMX settings can still share a client.
    2. This should not leak any important configurations across connectors.
  3. StreamingClientProvider does its best to provide valid clients by recreating a client when the registered one is invalid
  4. Refactored StreamingClientProviderTest into a UT that mocks StreamingClientHandler entirely, and split the rest off into a StreamingClientProviderIT
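
To make items 1 and 2 concrete, here is a minimal sketch of the caching approach, assuming Caffeine's LoadingCache API. The ClientProperties, StreamingClient, and ClientProvider names below are illustrative stand-ins for this example only, not the connector's actual StreamingClientProperties/StreamingClientProvider classes:

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import java.util.Map;
import java.util.Objects;

// Illustrative key type: equality covers only connection-level settings (auth, role,
// database, schema), so connectors that differ only in buffer or JMX settings share a key.
final class ClientProperties {
  private final Map<String, String> connectionProps;

  ClientProperties(Map<String, String> connectionProps) {
    this.connectionProps = connectionProps;
  }

  @Override
  public boolean equals(Object other) {
    return other instanceof ClientProperties
        && this.connectionProps.equals(((ClientProperties) other).connectionProps);
  }

  @Override
  public int hashCode() {
    return Objects.hash(this.connectionProps);
  }
}

// Illustrative client interface standing in for the real streaming ingest client.
interface StreamingClient {
  boolean isValid();

  void close();
}

// Illustrative provider: one client is created per distinct ClientProperties key,
// and repeated lookups with equal properties return the same cached client.
final class ClientProvider {
  private final LoadingCache<ClientProperties, StreamingClient> registeredClients =
      Caffeine.newBuilder().build(props -> createClient(props));

  // Returns the cached client for these properties, creating one only if none is registered.
  StreamingClient getClient(ClientProperties props) {
    return registeredClients.get(props);
  }

  private StreamingClient createClient(ClientProperties props) {
    // Placeholder: a real provider would build a streaming ingest client from props here.
    return new StreamingClient() {
      @Override public boolean isValid() { return true; }
      @Override public void close() {}
    };
  }
}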

Initial Issue Investigation

Setup: ConnectorA has the admin role and ConnectorB has the public role.

  1. ConnectorA (admin) is ingesting data successfully.
  2. ConnectorB (public) is added.
    a. If the one-client optimization is enabled, ConnectorB (public) tries to use ConnectorA (admin)'s client, which carries the admin role. As a result, ConnectorB (public) fails with 400 Unauthorized and does not begin ingestion.

Testing

  • Reproduced the issue, then confirmed the fix manually and with the KC release testing suite
    • Added Streaming, Schematization, and Snowpipe ingestion cases that run two sink connectors at the same time to the release test suite
  • Added a sink-service-level IT that tests multiple tpChannels with different role configurations

Caused by this jira: https://snowflakecomputing.atlassian.net/browse/SNOW-954150

// note this test relies on testrole_kafka and testrole_kafka_1 roles being granted to test_kafka
// user
@Test
public void testStreamingIngest_multipleChannel_distinctClients() throws Exception {
@sfc-gh-rcheng (Collaborator, Author):

Confirmed this test fails with the same error as the Jira when run on master, and passes with this PR's changes.

@sfc-gh-rcheng sfc-gh-rcheng changed the title Rcheng clientfix Use map of clients with different configurations instead of one client for multiple connectors configurations Nov 13, 2023
@sfc-gh-rcheng sfc-gh-rcheng marked this pull request as ready for review November 13, 2023 17:41
@sfc-gh-rcheng sfc-gh-rcheng changed the title Use map of clients with different configurations instead of one client for multiple connectors configurations [SNOW-954150] Use map of clients with different configurations instead of one client for multiple connectors configurations Nov 13, 2023
@sfc-gh-xhuang (Collaborator):

Is the role issue related to this?
#571
https://snowflakecomputing.atlassian.net/browse/SNOW-532834

@sfc-gh-rcheng (Collaborator, Author):

Is the role issue related to this? #571 https://snowflakecomputing.atlassian.net/browse/SNOW-532834

Unrelated, but it could factor into why this wasn't caught until now

@sfc-gh-japatel (Collaborator):

Overall LGTM; will wait for one more revision. Also, it might be worth looking into LoadingCache, since we are using a map and LoadingCache has loading, eviction techniques, etc. Not saying we should use either one, since both have pros and cons.

@sfc-gh-tzhang (Contributor):

The fix is to map client properties to existing client in the one client optimization.

Could you update the description to explain how the map is done?

@sfc-gh-tzhang (Contributor) left a review:

Left some comments, PTAL!

@sfc-gh-rcheng (Collaborator, Author):

We definitely need improved test infra for running multiple connectors and ingesting with them at the same time. Let's create a JIRA for the Warsaw team; it could be a good task.

Created the Jira!

@sfc-gh-rcheng (Collaborator, Author):

Updated the description with the changes; now rerunning the release testing framework for the e2e test with multiple connectors.

@sfc-gh-rcheng sfc-gh-rcheng marked this pull request as ready for review November 22, 2023 23:38
LOGGER.info("Initializing Streaming Client...");

// get streaming properties from config
Properties streamingClientProps = new Properties();
@sfc-gh-rcheng (Collaborator, Author) commented Nov 22, 2023:

This and getNewClientName are moved into StreamingClientProperties.java.

@@ -47,72 +52,137 @@ public static StreamingClientProvider getStreamingClientProviderInstance() {
return StreamingClientProviderSingleton.streamingClientProvider;
}

/** ONLY FOR TESTING - to get a provider with injected properties */
@sfc-gh-rcheng (Collaborator, Author):

Moved these methods to the bottom of the class for readability.


@After
public void tearDown() {
this.streamingClientHandler.closeClient(this.client1);
@sfc-gh-rcheng (Collaborator, Author):

Since this is a UT, it should not actually create the client, so I refactored this test to use mocks and pulled most of the complexity into the new IT. Caught by @sfc-gh-japatel.

@sfc-gh-tzhang (Contributor) left a review:

Left some minor comments, PTAL, otherwise LGTM, thanks!

* @return A formatted string with the loggable properties
*/
public String getLoggableClientProperties() {
return this.clientProperties == null || this.clientProperties.isEmpty()
@sfc-gh-tzhang (Contributor):

Can clientProperties be null or empty? I thought we have checks to make sure some of the configurations are required.

@sfc-gh-rcheng (Collaborator, Author):

it shouldn't ever be null or empty, but I added the check just in case

Comment on lines 118 to 119
"Streaming client optimization is enabled per worker node. Reusing valid clients when"
+ " possible");
@sfc-gh-tzhang (Contributor):

nit: should we move this log to after the client creation has succeeded? Similar to the logic below for when the one-client optimization is not enabled; then you can combine the two log lines.

@sfc-gh-rcheng (Collaborator, Author):

Good point. Moved the line to the end of the method so that we only have one log, and added a warn log for when the registered client is invalid, since ideally it should always be valid.

// invalidations are processed on the next get or in the background, so we still need to close
// the client here
this.registeredClients.invalidate(clientProperties);
this.streamingClientHandler.closeClient(registeredClient);
@sfc-gh-tzhang (Contributor):

Duplicate of line 162?

@sfc-gh-tzhang (Contributor):

Also do you need to check whether the client is still valid before closing it?

@sfc-gh-rcheng (Collaborator, Author):

We call close on the given client and on the client registered in the cache. Technically they should be the same client, but I prefer to call close on both just in case the given client is different or the registered cache was somehow corrupted.

The StreamingClientHandler checks whether the client is valid before calling close, so the extra close call is a no-op if the client is invalid.
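
For illustration, this close path could look roughly like the sketch below, reusing the illustrative ClientProperties and StreamingClient types from the sketch under Changes; the ClientCloser class and closeIfValid helper are hypothetical names standing in for the handler's validity-checked close, not the connector's actual code:

import com.github.benmanes.caffeine.cache.LoadingCache;

final class ClientCloser {
  // Removes the cache entry and closes both the caller-supplied client and the registered
  // one; the validity check makes any extra close call a no-op, mirroring the behavior
  // described above.
  static void closeClient(
      LoadingCache<ClientProperties, StreamingClient> registeredClients,
      ClientProperties props,
      StreamingClient givenClient) {
    StreamingClient registeredClient = registeredClients.getIfPresent(props);

    // Invalidation is processed on the next get or in the background, so the client
    // still has to be closed explicitly here.
    registeredClients.invalidate(props);

    closeIfValid(givenClient);
    if (registeredClient != null) {
      closeIfValid(registeredClient);
    }
  }

  private static void closeIfValid(StreamingClient client) {
    if (client != null && client.isValid()) {
      client.close();
    }
  }
}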

@@ -35,9 +28,6 @@
/** This class handles all calls to manage the streaming ingestion client */
public class StreamingClientHandler {
private static final KCLogger LOGGER = new KCLogger(StreamingClientHandler.class.getName());
private static final String STREAMING_CLIENT_PREFIX_NAME = "KC_CLIENT_";
private static final String TEST_CLIENT_NAME = "TEST_CLIENT";

private AtomicInteger createdClientId = new AtomicInteger(0);
@sfc-gh-rcheng (Collaborator, Author):

Added this back (it was removed in previous PR iterations) in case removing it causes concurrency issues with client naming; a sketch of the counter-based naming follows.
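
A minimal sketch of counter-based naming: only the KC_CLIENT_ prefix and the AtomicInteger come from the hunk above, while the ClientNaming class and the exact name format are assumptions for illustration:

import java.util.concurrent.atomic.AtomicInteger;

final class ClientNaming {
  private static final String STREAMING_CLIENT_PREFIX_NAME = "KC_CLIENT_";
  private final AtomicInteger createdClientId = new AtomicInteger(0);

  // getAndIncrement is atomic, so concurrent callers never receive the same id,
  // and therefore never produce the same client name.
  String nextClientName(String connectorName) {
    return STREAMING_CLIENT_PREFIX_NAME + connectorName + "_" + createdClientId.getAndIncrement();
  }
}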

@sfc-gh-rcheng sfc-gh-rcheng merged commit 869a90b into master Dec 1, 2023
30 checks passed
@sfc-gh-rcheng sfc-gh-rcheng deleted the rcheng-clientfix branch December 1, 2023 19:44
sfc-gh-rcheng added a commit that referenced this pull request Dec 6, 2023
…d of one client for multiple connectors configurations (#744)
sfc-gh-rcheng added a commit that referenced this pull request Dec 7, 2023
…d of one client for multiple connectors configurations (#744)
sfc-gh-rcheng added a commit that referenced this pull request Dec 8, 2023
…d of one client for multiple connectors configurations (#744)
EduardHantig pushed a commit to streamkap-com/snowflake-kafka-connector that referenced this pull request Feb 1, 2024
…d of one client for multiple connectors configurations (snowflakedb#744)
sudeshwasnik pushed a commit to confluentinc/snowflake-kafka-connector that referenced this pull request Feb 16, 2024
…d of one client for multiple connectors configurations (snowflakedb#744)