TT-10676 Upgrade Resurface Pump backend #773

Merged
merged 11 commits into from
Jan 26, 2024

Conversation

mativm02
Contributor

This is a copy of the PR created by @monrax: #731

Upgrade the Resurface backend for the Tyk Pump. This upgrade makes the data writing process asynchronous through the use of bounded channels. In addition, the Shutdown method has been defined to support graceful shutdown.

Description

  • Upgrade logger-go dependency to version 3.3.1, which includes improvements in goroutine management, as well as a new Stop method for graceful shutdown.
  • Add support for async data writing, by adding a bounded channel to buffer data records and process them concurrently in the background.
  • Add Shutdown method for graceful shutdown of ResurfacePump backend.

Related Issue

Issues #729 and #730

Motivation and Context

  • Version 3.2.1 of the logger-go dependency is susceptible to memory runaway issues under heavy load. Version 3.3.1 provides a fix for this problem, as well as a new Stop method for graceful shutdown of the data capture process.
  • With synchronous record processing, WriteData might take a long time to return (e.g. longer than the configured Redis TTL). This bottleneck can affect the entire data capture process.

How This Has Been Tested

  • Clone Tyk Pro demo repo
  • Add both Resurface and HTTP Bin services to the docker-compose.yml file:
  tyk-resurface:
    image: resurfaceio/resurface:3.5.4
    container_name: tyk-resurface
    ports:
      - "7700:7700"
      - "7701:7701"
    tmpfs:
      - /db
    networks:
      - tyk

  tyk-httpbin:
    image: kennethreitz/httpbin
    container_name: tyk-httpbin
    ports:
      - "80:80"
    networks:
      - tyk
  • Modify the pump.conf file to configure the Resurface pump backend without any timeout:
    "resurfaceio": {
      "type": "resurfaceio",
      "meta": {
        "capture_url": "http://tyk-resurface:7701/message",
        "rules": "include debug"
      }
    }
  • Import the following HTTP Bin Tyk API definition:
{
    "name": "httpbin",
    "slug": "httpbin",
    "api_id": "httpbin-api",
    "org_id": "test-org",
    "use_keyless": true,
    "version_data": {
        "not_versioned": true,
        "versions": {
            "Default": {
                "name": "Default",
                "expires": ""
            }
        }
    },
    "proxy": {
        "listen_path": "/http-bin/",
        "target_url": "http://httpbin"
    },
    "active": true,
    "enable_detailed_recording": true
}
  • Use a load testing application like Locust or bombardier to send several concurrent requests to the /image/png endpoint, aiming for >4000 RPS (configurations used: Locust with 2 workers, a spawn rate of 50, and 6000 users; bombardier with a concurrency level of 100 and 500k requests).
  • Wait for a few minutes. With the current Tyk Pump release, the number of purged records builds up until the pump can no longer keep up with the new records being generated, and the resulting gaps are visible in Resurface. No errors or gaps appear when the same test is run with the changes introduced by this PR.

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • Make sure you are requesting to pull a topic/feature/bugfix branch (right side). If pulling from your own
    fork, don't request your master!
  • Make sure you are making a pull request against the master branch (left side). Also, you should start
    your branch off our latest master.
  • My change requires a change to the documentation.
    • If you've changed APIs, describe what needs to be updated in the documentation.
  • I have updated the documentation accordingly.
  • Modules and vendor dependencies have been updated; run go mod tidy && go mod vendor
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • Check your code additions will not fail linting checks:
    • go fmt -s
    • go vet


sweep-ai bot commented Jan 22, 2024

Sweeping

Fixing PR: track the progress here.

I'm currently fixing this PR to address the following:

[Sweep GHA Fix] The GitHub Actions run failed with the following error logs:

The command:
Run $(go env GOPATH)/bin/golangci-lint run --out-format checkstyle --timeout=300s --max-issues-per-linter=0 --max-same-issues=0 --new-from-rev=origin/master ./... > golanglint.xml
yielded the following error:
##[error]Process completed with exit code 1.

Here are the logs:


[!CAUTION]

An error has occurred: 422 {"message": "Validation Failed", "errors": [{"resource": "PullRequest", "code": "custom", "message": "No commits between TT-10676 and sweep/sweep_gha_fix_the_github_actions_run_fai_3efa2"}], "documentation_url": "https://docs.github.com/rest/pulls/pulls#create-a-pull-request"} (tracking ID: 02f771dc1e)

@buger
Member

buger commented Jan 22, 2024

API tests result - postgres15-sha256 env: success
Branch used: refs/heads/master
Commit:
Triggered by: schedule (@ermirizio)
Execution page

@buger
Member

buger commented Jan 22, 2024

API tests result - mongo44-sha256 env: success
Branch used: refs/heads/master
Commit:
Triggered by: schedule (@ermirizio)
Execution page

	rp.log.Info(rp.GetName() + " Initialized")
	return nil
}

func (rp *ResurfacePump) initWorker() {
	rp.data = make(chan []interface{}, 5)
Contributor Author

@monrax could I ask again why the size of this channel is hardcoded to 5?

Contributor

Hi @mativm02, sorry for the late reply. Good catch! Ideally it would be 1 instead of 5, although after some load testing I found 5 to be a good queue depth. I do agree it would be better to make it configurable by defining a new parameter for this pump backend. Please let me know if this is a priority for y'all and I'll make sure to submit a new PR with this change. Thanks!
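One way the configurable queue depth suggested above could look, as a rough sketch only. The `sketchConf` type and the `buffer_size` key are assumptions made up for this example; no such parameter exists in this PR. The default of 5 mirrors the hardcoded value under review.

```go
package main

// defaultQueueDepth mirrors the value currently hardcoded in initWorker.
const defaultQueueDepth = 5

// sketchConf is a hypothetical pump configuration. capture_url and rules
// appear in the PR's pump.conf example; buffer_size is an assumed new key.
type sketchConf struct {
	CaptureURL string `json:"capture_url"`
	Rules      string `json:"rules"`
	BufferSize int    `json:"buffer_size"`
}

// queueDepth returns the configured depth, falling back to the default
// when the value is unset or not positive.
func queueDepth(c sketchConf) int {
	if c.BufferSize < 1 {
		return defaultQueueDepth
	}
	return c.BufferSize
}

// newDataChannel builds the bounded channel using the resolved depth.
func newDataChannel(c sketchConf) chan []interface{} {
	return make(chan []interface{}, queueDepth(c))
}
```

Falling back to the current value when the key is absent would keep existing configurations behaving as they do today.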

@mativm02 mativm02 enabled auto-merge (squash) January 26, 2024 16:37
@mativm02 mativm02 merged commit 8ca8646 into master Jan 26, 2024
22 checks passed
@mativm02 mativm02 deleted the TT-10676 branch January 26, 2024 16:52
@buger
Member

buger commented Jan 26, 2024

API tests result - postgres15-murmur64 env: success
Branch used: refs/heads/master
Commit:
Triggered by: schedule (@ermirizio)
Execution page

@buger
Member

buger commented Jan 26, 2024

API tests result - mongo44-murmur64 env: success
Branch used: refs/heads/master
Commit:
Triggered by: schedule (@ermirizio)
Execution page

@mativm02
Contributor Author

/release to release-1.9


tykbot bot commented Jan 29, 2024

Working on it! Note that it can take a few minutes.

tykbot bot pushed a commit that referenced this pull request Jan 29, 2024
* feat(resurface): upgrade ResurfacePump backend

* fix(resurface): flush messages before checks

* feat(resurface): upgrade ResurfacePump backend

* linting

* improving context handling

* linting

---------

Co-authored-by: Ramón Márquez <[email protected]>

(cherry picked from commit 8ca8646)

tykbot bot commented Jan 29, 2024

@mativm02 Successfully merged PR

buger added a commit that referenced this pull request Jan 29, 2024
TT-10676 Upgrade Resurface Pump backend (#773)

* feat(resurface): upgrade ResurfacePump backend

* fix(resurface): flush messages before checks

* feat(resurface): upgrade ResurfacePump backend

* linting

* improving context handling

* linting

---------

Co-authored-by: Ramón Márquez <[email protected]>