General slowness observed on anaconda.org (web, API, file serving) #899

Closed
2 tasks done
mbargull opened this issue Mar 27, 2024 · 13 comments
Labels: locked [bot] (locked due to inactivity), type::bug (describes erroneous operation, use severity::* to classify the type)

Comments

@mbargull
Member

Checklist

  • I added a descriptive title
  • I searched open reports and couldn't find a duplicate

What happened?

This is a very vague issue, but I have generally noticed very slow response times for anaconda.org as of late:

  1. on the web interface,
  2. via the API (this one I can pin down to happening at least from 2024-03-22 03:00+00 onward),
  3. for non-CDN-backed channels (i.e., not conda-forge/bioconda), e.g., conda-forge/label/... channels.

For the last part, one of the latest pieces of evidence is:

Download error (28) Timeout was reached [https://conda.anaconda.org/conda-forge/label/lief_rc/noarch/repodata.json.zst]
Operation too slow. Less than 30 bytes/sec transferred the last 60 seconds

observed at https://dev.azure.com/conda-forge/84710dde-1620-425b-80d0-4cf5baca359d/_build/results?buildId=904323&view=logs&j=a70f640f-cc53-5cd3-6cdc-236a1aa90802
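For reference, a minimal sketch of how that same repodata fetch could be timed outside of a CI run (assuming Python with requests is available; the URL is the one from the error above, and the timeouts are only loosely comparable to curl's low-speed window):

import time
import requests

# URL taken from the download error above; other non-CDN-backed label URLs
# should behave similarly.
url = "https://conda.anaconda.org/conda-forge/label/lief_rc/noarch/repodata.json.zst"

start = time.monotonic()
# requests' read timeout bounds the wait between received chunks, not the
# total transfer time, so a slow-but-alive transfer will still complete.
response = requests.get(url, timeout=(10, 60))
elapsed = time.monotonic() - start
print(f"status={response.status_code} bytes={len(response.content)} elapsed={elapsed:.1f}s")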

Conda Info

No response

Conda Config

No response

Conda list

No response

Additional Context

No response

@mbargull mbargull added the type::bug describes erroneous operation, use severity::* to classify the type label Mar 27, 2024
@github-project-automation github-project-automation bot moved this to 🆕 New in 🧭 Planning Mar 27, 2024
@mbargull
Member Author

I just got a plain

upstream connect error or disconnect/reset before headers. reset reason: connection termination

as a response to opening https://anaconda.org/main/repo in the web browser.
After that, it took a minute or so to load for cases where it was successful.
I also got a 520 once

Web server is returning an unknown error Error code 520
Visit [cloudflare.com](https://www.cloudflare.com/5xx-error-landing?utm_source=errorcode_520&utm_campaign=anaconda.org) for more information.
2024-03-27 15:40:14 UTC

@chenghlee
Contributor

Can you try this for some channel other than conda-forge, bioconda, or Anaconda? There's some Cloudflare wizardry for larger channels that splits requests bound for the CDN (e.g., the main label) from requests that are not (e.g., other labels).
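If it helps with debugging, here is a minimal sketch for checking what Cloudflare reports for a given request via the cf-cache-status response header (assuming Python with requests; the two URLs are just illustrative examples of a main-label URL vs. another label of the same channel, and DYNAMIC roughly means the request was passed through to the origin rather than served from the edge cache):

import requests

# Illustrative pair: a main-label URL vs. a non-main-label URL of the same channel.
urls = [
    "https://conda.anaconda.org/conda-forge/noarch/repodata.json",
    "https://conda.anaconda.org/conda-forge/label/lief_rc/noarch/repodata.json",
]

for url in urls:
    response = requests.head(url, allow_redirects=True, timeout=30)
    # Cloudflare sets cf-cache-status to HIT/MISS/DYNAMIC/etc.
    print(url, response.status_code, response.headers.get("cf-cache-status"))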

@mbargull
Member Author

Do you happen to know any non-CDN-backed channel that is somewhat large?
To me, it seems like some server-load issue is happening, since very small requests/channels seem to work fine.

@chenghlee
Contributor

Can you try https://anaconda.org/LiteX-Hub/?

@mbargull
Member Author

Got a

upstream connect error or disconnect/reset before headers. reset reason: connection termination

right away for https://anaconda.org/LiteX-Hub/.

@mbargull
Member Author

Downloading https://anaconda.org/LiteX-Hub/ via curl took about 31 seconds but worked.
After that, it is cached on the CDN for me and as such seems to be served reliably and faster (it can still take up to 10 seconds, though).
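For reference, a minimal sketch to quantify that page load time (assuming Python with requests is available; the repeat count is arbitrary and the URL is the page from above):

import time
import requests

url = "https://anaconda.org/LiteX-Hub/"  # page from the comment above

for attempt in range(5):  # arbitrary repeat count, just to see the variance
    start = time.monotonic()
    response = requests.get(url, timeout=120)
    elapsed = time.monotonic() - start
    print(f"attempt={attempt} status={response.status_code} elapsed={elapsed:.1f}s")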

@mbargull
Member Author

Just happened incidentally:

# conda-smithy rerender
Traceback (most recent call last):
  File "/home/mbargull/code/conda/conda-forge/envs/conda-smithy/bin/conda-smithy", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/mbargull/code/conda/conda-forge/envs/conda-smithy/lib/python3.11/site-packages/conda_smithy/cli.py", line 737, in main
    args.subcommand_func(args)
  File "/home/mbargull/code/conda/conda-forge/envs/conda-smithy/lib/python3.11/site-packages/conda_smithy/cli.py", line 584, in __call__
    self._call(args, tmpdir)
  File "/home/mbargull/code/conda/conda-forge/envs/conda-smithy/lib/python3.11/site-packages/conda_smithy/cli.py", line 589, in _call
    configure_feedstock.main(
  File "/home/mbargull/code/conda/conda-forge/envs/conda-smithy/lib/python3.11/site-packages/conda_smithy/configure_feedstock.py", line 2602, in main
    check_version_uptodate("conda-smithy", __version__, True)
  File "/home/mbargull/code/conda/conda-forge/envs/conda-smithy/lib/python3.11/site-packages/conda_smithy/configure_feedstock.py", line 2310, in check_version_uptodate
    most_recent_version = get_most_recent_version(name).version
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mbargull/code/conda/conda-forge/envs/conda-smithy/lib/python3.11/site-packages/conda_smithy/configure_feedstock.py", line 2299, in get_most_recent_version
    request.raise_for_status()
  File "/home/mbargull/code/conda/conda-forge/envs/conda-smithy/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 524 Server Error:  for url: https://api.anaconda.org/package/conda-forge/conda-smithy
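The 524 is Cloudflare giving up on a slow origin response. As a client-side stop-gap, transient 5xx responses like this could be retried with backoff; a minimal sketch assuming requests/urllib3 are available (this is not what conda-smithy does, just an illustration):

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry transient Cloudflare/origin errors (520/522/524 etc.) with exponential backoff.
retry = Retry(total=5, backoff_factor=2, status_forcelist=[502, 503, 520, 522, 524])
session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retry))

response = session.get(
    "https://api.anaconda.org/package/conda-forge/conda-smithy", timeout=60
)
response.raise_for_status()
print(response.status_code)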

@chenghlee
Contributor

Anaconda's infrastructure team has noticed a scaling issue with the .org back end; that might be related, and they're currently working on resolving it. I'll update this ticket once the changes are deployed, and we can recheck whether that fixes the issue.

@mbargull
Member Author

Thanks for keeping an eye out, and thanks to the infra team for working on it! 🛠️

@rymndhng

Can you try this for some channel other than conda-forge, bioconda, or Anaconda? There's some Cloudflare wizardry for larger channels that splits requests bound for the CDN (e.g., the main label) from requests that are not (e.g., other labels).

As a datapoint, one of my local tests on the comet_ml channel had this issue as well:

❯  curl -I -X GET https://conda.anaconda.org/comet_ml/noarch/repodata.json
HTTP/2 503 
date: Wed, 27 Mar 2024 16:51:42 GMT
content-type: text/html; charset=utf-8
cf-ray: 86b0e172de228417-YVR
cf-cache-status: DYNAMIC
cache-control: no-cache, max-age=0
content-disposition: inline; filename=db_connection_failure.html
expires: Wed, 27 Mar 2024 16:51:42 GMT
last-modified: Tue, 27 Feb 2024 15:49:08 GMT
x-envoy-upstream-service-time: 63125
set-cookie: __cf_bm=XrShwBedMNXr6JIgiG4V4_a.E8uNRzG76qurvQ8rqA0-1711558302-1.0.1.1-byOHt3W7jRO7wPo0lClotR.iFh8F5YVPHUG6kqWbN4Fm09VSFkW34lffnqjQ.aSJL7_BfpEr.RbKM_0UKXc8ueGpuWEqdok1gifLj8pYQWY; path=/; expires=Wed, 27-Mar-24 17:21:42 GMT; domain=.anaconda.org; HttpOnly; Secure; SameSite=None
server: cloudflare

@mariusvniekerk

This still seems to be happening sporadically, at the very least with the rapidsai channel when using mamba:

info     libmamba Transfer done for 'rapidsai/linux-64'
info     libmamba Transfer finalized, status: 200 [https://conda.anaconda.org/rapidsai/linux-64/repodata.json] 263365 bytes
info     libmamba Transfer done for 'conda-forge/noarch'
info     libmamba Transfer finalized, status: 200 [https://conda.anaconda.org/conda-forge/noarch/repodata.json] 16311864 bytes
info     libmamba Transfer done for 'conda-forge/linux-64'
info     libmamba Transfer finalized, status: 200 [https://conda.anaconda.org/conda-forge/linux-64/repodata.json] 39168964 bytes
info     libmamba Download error (28) Timeout was reached [https://conda.anaconda.org/rapidsai/noarch/repodata.json]
    Operation too slow. Less than 30 bytes/sec transferred the last 60 seconds
Download error (28) Timeout was reached [https://conda.anaconda.org/rapidsai/noarch/repodata.json]
Operation too slow. Less than 30 bytes/sec transferred the last 60 seconds

# >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<

    Traceback (most recent call last):
      File "/opt/mambaforge/lib/python3.10/site-packages/conda/exceptions.py", line 1132, in __call__
        return func(*args, **kwargs)
      File "/opt/mambaforge/lib/python3.10/site-packages/mamba/mamba.py", line 941, in exception_converter
        raise e
      File "/opt/mambaforge/lib/python3.10/site-packages/mamba/mamba.py", line 934, in exception_converter
        exit_code = _wrapped_main(*args, **kwargs)
      File "/opt/mambaforge/lib/python3.10/site-packages/mamba/mamba.py", line 892, in _wrapped_main
        result = do_call(parsed_args, p)
      File "/opt/mambaforge/lib/python3.10/site-packages/mamba/mamba.py", line 758, in do_call
        exit_code = create(args, parser)
      File "/opt/mambaforge/lib/python3.10/site-packages/mamba/mamba.py", line 632, in create
        return install(args, parser, "create")
      File "/opt/mambaforge/lib/python3.10/site-packages/mamba/mamba.py", line 499, in install
        index = load_channels(pool, channels, repos)
      File "/opt/mambaforge/lib/python3.10/site-packages/mamba/utils.py", line 129, in load_channels
        index = get_index(
      File "/opt/mambaforge/lib/python3.10/site-packages/mamba/utils.py", line 110, in get_index
        is_downloaded = dlist.download(api.MAMBA_DOWNLOAD_FAILFAST)
    RuntimeError: Download error (28) Timeout was reached [https://conda.anaconda.org/rapidsai/noarch/repodata.json]
    Operation too slow. Less than 30 bytes/sec transferred the last 60 seconds
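To narrow down whether the server actually stalls mid-transfer (which is what triggers that low-speed abort), here is a minimal sketch that streams the same repodata file and prints a running transfer rate (assuming Python with requests; the chunk size is arbitrary):

import time
import requests

url = "https://conda.anaconda.org/rapidsai/noarch/repodata.json"

start = time.monotonic()
received = 0
with requests.get(url, stream=True, timeout=(10, 120)) as response:
    response.raise_for_status()
    for chunk in response.iter_content(chunk_size=64 * 1024):
        received += len(chunk)
        elapsed = time.monotonic() - start
        # A long stall here corresponds to the "Less than 30 bytes/sec transferred
        # the last 60 seconds" abort that curl/libmamba reports above.
        print(f"{received} bytes in {elapsed:.1f}s ({received / max(elapsed, 0.001):.0f} B/s)")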

@jakirkham
Member

At least in the RAPIDS case, we managed to fix the issues in #906.

@mbargull
Copy link
Member Author

mbargull commented Jun 2, 2024

In the last few weeks I've only noticed some sporadic failures, again with (Cloudflare-specific) 520 or 524 HTTP status codes.
But not as prevalent as in March/April.
Let's close this for now.
Thanks for working on stabilizing and generally keeping things going!

@mbargull mbargull closed this as completed Jun 2, 2024
@github-project-automation github-project-automation bot moved this from 🆕 New to 🏁 Done in 🧭 Planning Jun 2, 2024
@github-actions github-actions bot added the locked [bot] locked due to inactivity label Nov 30, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 30, 2024