Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merging to release-5.3: [TT-12417] Do not delete keys on synchronization (#6642) #6667

Conversation

buger
Copy link
Member

@buger buger commented Oct 24, 2024

User description

TT-12417 Do not delete keys on synchronization (#6642)

TT-12417
Summary Api keys are lost in worker gateways when keyspace sync interrupted
Type Bug Bug
Status In Dev
Points N/A
Labels customer_bug, jira_escalated

Description

Avoiding key deletion when synchronizing. This will avoid having
inconsistent key data between master and slave Redis.

Related Issue

https://tyktech.atlassian.net/browse/TT-12417?atlOrigin=eyJpIjoiYWNiZTdlNmYwODY5NDI1ZDkzYmQ1MWFlZjM5NGQ3ZTgiLCJwIjoiaiJ9

Motivation and Context

https://tyktech.atlassian.net/browse/TT-12417?atlOrigin=eyJpIjoiYWNiZTdlNmYwODY5NDI1ZDkzYmQ1MWFlZjM5NGQ3ZTgiLCJwIjoiaiJ9

How This Has Been Tested

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing
    functionality to change)
  • Refactoring or add test (improvements in base code or adds test
    coverage to functionality)

Checklist

  • I ensured that the documentation is up to date
  • I explained why this PR updates go.mod in detail with reasoning
    why it's required
  • I would like a code coverage CI quality gate exception and have
    explained why

PR Type

Bug fix, Enhancement


Description

  • Introduced a singleton pattern for the SyncForcer to ensure a single instance is used throughout the application.
  • Added methods SetFirstConnection and GetIsFirstConnection to manage the initial connection state.
  • Enhanced error handling in RPCStorageHandler by initializing SyncForcer on unexpected errors.
  • Modified synchronization logic to avoid key deletion during synchronization, addressing the issue of lost API keys when keyspace sync is interrupted.

Changes walkthrough 📝

Relevant files
Enhancement
rpc_storage_handler.go
Initialize SyncForcer on RPC reload check errors                 

gateway/rpc_storage_handler.go

  • Added a new SyncForcer instance creation with SetFirstConnection set
    to true.
  • Improved error handling by initializing the SyncForcer on unexpected
    errors.
  • +2/-0     
    synchronization_forcer.go
    Implement singleton pattern and enhance SyncForcer logic 

    rpc/synchronization_forcer.go

  • Implemented singleton pattern for SyncForcer instance creation.
  • Added SetFirstConnection and GetIsFirstConnection methods.
  • Modified GroupLoginCallback to use isFirstConnection for force sync
    logic.
  • +33/-8   

    💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

    <details open>
    <summary><a href="https://tyktech.atlassian.net/browse/TT-12417"
    title="TT-12417" target="_blank">TT-12417</a></summary>
      <br />
      <table>
        <tr>
          <th>Summary</th>
    <td>Api keys are lost in worker gateways when keyspace sync
    interrupted</td>
        </tr>
        <tr>
          <th>Type</th>
          <td>
    <img alt="Bug"
    src="https://tyktech.atlassian.net/rest/api/2/universal_avatar/view/type/issuetype/avatar/10303?size=medium"
    />
            Bug
          </td>
        </tr>
        <tr>
          <th>Status</th>
          <td>In Dev</td>
        </tr>
        <tr>
          <th>Points</th>
          <td>N/A</td>
        </tr>
        <tr>
          <th>Labels</th>
    <td><a
    href="https://tyktech.atlassian.net/issues?jql=project%20%3D%20TT%20AND%20labels%20%3D%20customer_bug%20ORDER%20BY%20created%20DESC"
    title="customer_bug">customer_bug</a>, <a
    href="https://tyktech.atlassian.net/issues?jql=project%20%3D%20TT%20AND%20labels%20%3D%20jira_escalated%20ORDER%20BY%20created%20DESC"
    title="jira_escalated">jira_escalated</a></td>
        </tr>
      </table>
    </details>
    <!--
      do not remove this marker as it will break jira-lint's functionality.
      added_by_jira_lint
    -->
    
    ---
    
    <!-- Provide a general summary of your changes in the Title above -->
    
    Avoiding key deletion when synchronizing. This will avoid having
    inconsistent key data between master and slave Redis.
    <!-- Describe your changes in detail -->
    
    https://tyktech.atlassian.net/browse/TT-12417?atlOrigin=eyJpIjoiYWNiZTdlNmYwODY5NDI1ZDkzYmQ1MWFlZjM5NGQ3ZTgiLCJwIjoiaiJ9
    <!-- This project only accepts pull requests related to open issues. -->
    <!-- If suggesting a new feature or change, please discuss it in an
    issue first. -->
    <!-- If fixing a bug, there should be an issue describing it with steps
    to reproduce. -->
    <!-- OSS: Please link to the issue here. Tyk: please create/link the
    JIRA ticket. -->
    
    https://tyktech.atlassian.net/browse/TT-12417?atlOrigin=eyJpIjoiYWNiZTdlNmYwODY5NDI1ZDkzYmQ1MWFlZjM5NGQ3ZTgiLCJwIjoiaiJ9
    <!-- Why is this change required? What problem does it solve? -->
    
    <!-- Please describe in detail how you tested your changes -->
    <!-- Include details of your testing environment, and the tests -->
    <!-- you ran to see how your change affects other areas of the code,
    etc. -->
    <!-- This information is helpful for reviewers and QA. -->
    
    <!-- What types of changes does your code introduce? Put an `x` in all
    the boxes that apply: -->
    
    - [ ] Bug fix (non-breaking change which fixes an issue)
    - [ ] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing
    functionality to change)
    - [ ] Refactoring or add test (improvements in base code or adds test
    coverage to functionality)
    
    <!-- Go over all the following points, and put an `x` in all the boxes
    that apply -->
    <!-- If there are no documentation updates required, mark the item as
    checked. -->
    <!-- Raise up any additional concerns not covered by the checklist. -->
    
    - [ ] I ensured that the documentation is up to date
    - [ ] I explained why this PR updates go.mod in detail with reasoning
    why it's required
    - [ ] I would like a code coverage CI quality gate exception and have
    explained why
    
    (cherry picked from commit cea1df4)
    @mativm02 mativm02 marked this pull request as ready for review October 24, 2024 14:54
    Copy link
    Contributor

    github-actions bot commented Oct 24, 2024

    API Changes

    --- prev.txt	2024-10-24 14:55:14.112053711 +0000
    +++ current.txt	2024-10-24 14:55:11.080028306 +0000
    @@ -10802,11 +10802,15 @@
         NewSyncForcer returns a new syncforcer with a connected redis with a key
         prefix synchronizer-group- for group synchronization control.
     
    +func (sf *SyncronizerForcer) GetIsFirstConnection() bool
    +
     func (sf *SyncronizerForcer) GroupLoginCallback(userKey string, groupID string) interface{}
         GroupLoginCallback checks if the groupID key exists in the storage to turn
         on/off ForceSync param. If the the key doesn't exists in the storage,
         it creates it and set ForceSync to true
     
    +func (sf *SyncronizerForcer) SetFirstConnection(isFirstConnection bool)
    +
     # Package: ./signature_validator
     
     package signature_validator // import "github.com/TykTechnologies/tyk/signature_validator"

    Copy link
    Contributor

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    🎫 Ticket compliance analysis ✅

    6642 - Fully compliant

    Fully compliant requirements:

    • Ensure that API keys are not deleted during synchronization interruptions.
    • Maintain consistency of key data between master and slave Redis instances.
    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Error Handling
    The error handling logic might cause recursive calls without proper exit conditions, potentially leading to stack overflow.

    Singleton Implementation
    The singleton pattern implementation for SyncronizerForcer might not be thread-safe due to the check outside the synchronized block.

    Copy link
    Contributor

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    Best practice
    Handle errors from the store connection attempt to ensure robustness

    Consider handling the error returned by sf.store.Connect() to ensure that the
    connection is established successfully before proceeding.

    rpc/synchronization_forcer.go [27]

    -sf.store.Connect()
    +if err := sf.store.Connect(); err != nil {
    +  return nil, err
    +}
    Suggestion importance[1-10]: 9

    Why: Handling the error from sf.store.Connect() is crucial for ensuring the connection is established successfully. This suggestion improves robustness by preventing further operations if the connection fails.

    9
    Possible bug
    Prevent potential infinite recursion in the CheckForReload method

    Avoid potential infinite recursion by implementing a maximum recursion depth or a
    different mechanism to prevent CheckForReload from calling itself indefinitely.

    gateway/rpc_storage_handler.go [797-798]

    -if rpc.Login() {
    +if rpc.Login() && recursionDepth < maxRecursionDepth {
       r.CheckForReload(orgId)
     }
    Suggestion importance[1-10]: 8

    Why: The suggestion addresses a potential infinite recursion issue by proposing a mechanism to limit recursion depth, which is important for preventing stack overflow errors.

    8
    Performance
    Implement a backoff strategy to optimize resource usage during retries

    Consider adding a backoff strategy for the retry mechanism in CheckForReload to
    avoid potential rapid resource exhaustion.

    gateway/rpc_storage_handler.go [806]

    -time.Sleep(1 * time.Second)
    +time.Sleep(backoffDuration)
    Suggestion importance[1-10]: 7

    Why: Introducing a backoff strategy can help prevent rapid resource exhaustion during retries, enhancing the performance and stability of the retry mechanism.

    7

    Copy link

    Quality Gate Failed Quality Gate failed

    Failed conditions
    76.9% Coverage on New Code (required ≥ 80%)
    C Reliability Rating on New Code (required ≥ A)

    See analysis details on SonarCloud

    Catch issues before they fail your Quality Gate with our IDE extension SonarLint

    @mativm02 mativm02 merged commit c63b4cb into release-5.3 Oct 24, 2024
    16 of 18 checks passed
    @mativm02 mativm02 deleted the merge/release-5.3/cea1df441fbf763fb3fd81e2f5a0a38ccadecae6 branch October 24, 2024 15:09
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    2 participants