Bugfix release for distributed on Friday, October 30 #105

Closed
TomAugspurger opened this issue Oct 27, 2020 · 13 comments

Comments

@TomAugspurger
Member

Distributed master currently has some issues with Comms timeouts (dask/distributed#4176, and linked issues).

We'd like to issue a release of distributed after dask/distributed#4176. However, we're in the middle of our large HighLevelGraph refactor. Dask and Distributed master are both usable today, but we'd like to hold on a release of those until all the components are in place, which will likely take another couple weeks to a month.

So we plan to make a release of distributed alone, consisting of just distributed 2.30.0 + dask/distributed#4176, the fix for the comms issue. We plan to do this on the 30th.

@jakirkham
Member

cc @madsbk @rjzamora @quasiben

@pentschev
Member

Could we also try to have dask/distributed#4184 merged in too? That would help for cases where comms (mostly UCX) don't get closed correctly when workers' processes terminate too early.

@TomAugspurger
Member Author

@quasiben mentioned dask/distributed#4184 as a possible option for backporting. In the end I think he weakly recommended against it, both to lower the cost of the release and because RAPIDS (or UCX-Py?) was already running against distributed master. I may have missed part of the discussion on what all to backport, though.

@jrbourbeau
Member

I think @quasiben mentioned that RAPIDS folks were mostly working off distributed's master branch, so it was less important to include dask/distributed#4184 in a release as long as the PR is merged (@quasiben feel free to correct me if my understanding is incorrect)

@kkraus14
Member

While our CI / developers typically work off of master, our users typically use the latest release. Is that issue something that impacts end user usage?

@quasiben
Member

As @TomAugspurger said, I was trying to lower the cost for the devs who are responsible for doing the work by minimizing the number of PRs that went into the release. With that said, that PR is quite helpful for those using UCX. As a test, I tried cherry-picking both PRs and everything went smoothly:

git checkout 2.30.0
# https://github.com/dask/distributed/pull/4176/commits -- allow retries
git cherry-pick 6ceed9e37cc181c08782ed62412bbfb6061ef358
git cherry-pick 1065a0191afb2f42ce4dad98f4d7aab28d76eae1
git cherry-pick 39206a548d3073cf98d3cdf3dec16fa654797582
git cherry-pick 1e1f66e08a22384d8cecb7ab607c5fe5e9bd03e6
git cherry-pick 4db6ca02ca33e5f05e2a4b616cfb3f89c55f29a3

# https://github.com/dask/distributed/pull/4184/commits -- replace async
git cherry-pick e300b2df09622363da2a35436294bba62577f21c
git cherry-pick 8be71dfd9acc6e016cf6972dff7f00720ecf84fa
git cherry-pick fe1887b22d35e2f91a1ff37e71bc69d9de5307c0
git cherry-pick d61f92932a9aa775d436f05125f5096a8cd2cf1a
git cherry-pick f41bb3ea396f0a676f428ad062406e2ef3308f20
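
For readers unfamiliar with this kind of backport, here is a minimal offline sketch of the same idea: branch from a release tag and cherry-pick only the fix, leaving unrelated work on the main branch behind. It runs in a throwaway repository; the file names, commit messages, and the `backport-2.30.x` branch name are all made up for illustration and are not taken from dask/distributed.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# A "release" commit, tagged the way the thread's 2.30.0 tag is used.
echo "base" > comm.txt
git add comm.txt
git commit -q -m "Release"
git tag 2.30.0

# Unrelated in-progress work that should NOT be part of the bugfix release.
echo "refactor" > graph.txt
git add graph.txt
git commit -q -m "Refactor in progress"

# The fix we want to backport.
echo "retry on timeout" >> comm.txt
git commit -q -am "Allow retries"
fix_commit=$(git rev-parse HEAD)

# Branch from the release tag and apply only the fix.
# -x records the original commit hash in the new commit message.
git checkout -q -b backport-2.30.x 2.30.0
git cherry-pick -x "$fix_commit" >/dev/null

# The branch now holds the release plus the fix, without the refactor.
git log --format=%s
```

In a real backport the cherry-pick can conflict when the fix depends on changes that are not on the release branch, so a clean run like the one reported above is worth confirming rather than assuming.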

@jakirkham
Member

FWIW one can also generate patches from PRs (or really any GitHub comparison or commit). We use this in conda-forge frequently when applying patches to releases. So one can do something like this as well.

git checkout 2.30.0
curl -L https://github.com/dask/distributed/pull/4176.patch -o PR_4176.patch
curl -L https://github.com/dask/distributed/pull/4184.patch -o PR_4184.patch
git am PR_4176.patch
git am PR_4184.patch
rm PR_4176.patch PR_4184.patch
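
The patch-based workflow can also be tried without network access. The sketch below assumes that a file produced by `git format-patch` is an adequate stand-in for a downloaded `.patch` file, since both are mbox-style patches that `git am` can apply. The repository contents and the `patched-2.30.x` branch name are hypothetical.

```shell
#!/bin/sh
set -eu

# Throwaway working area; every name below is hypothetical.
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.name "Example Dev"
git config user.email "dev@example.com"

# A tagged "release" commit.
echo "base" > comm.txt
git add comm.txt
git commit -q -m "Release"
git tag 2.30.0

# A fix committed after the release.
echo "retry on timeout" >> comm.txt
git commit -q -am "Allow retries"

# Export the fix as an mbox-style patch (stand-in for a downloaded PR patch).
git format-patch -1 HEAD --stdout > "$work/fix.patch"

# Start a branch at the release tag and apply the patch with git am,
# which preserves the original author and commit message.
git checkout -q -b patched-2.30.x 2.30.0
git am -q "$work/fix.patch"

git log --format=%s
rm "$work/fix.patch"
```

One practical difference from cherry-picking is that the patch file can be stored alongside a packaging recipe and applied to a release tarball, with no need for the full upstream history.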

@jrbourbeau
Member

Brief update: we ended up not releasing last Friday due to some new (but hopefully unrelated) CI test failures. See dask/distributed#4204 (comment) for more context. Given our current state, I'm inclined not to release distributed==2.30.1 until we're able to confidently say the release would pass CI.

@jennakwon06

Hello all - I see that PR #4176 (dask/distributed#4176) is merged into master. May I get an update on when distributed 2.30.1 will be released on PyPI?

Thank you all!

@jrbourbeau
Member

We're currently waiting on CI builds to pass for the latest commit here. After that, we will release 2.30.1.

@jrbourbeau
Member

@jennakwon06 distributed==2.30.1 is on PyPI now and should be on conda-forge in the next couple of hours.

@fjetter
Member

fjetter commented Nov 4, 2020

Thanks a lot, this is really helpful!

@jrbourbeau
Member

Closing as 2.30.1 is on PyPI and conda-forge. Thanks all!
