Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IsolatedAsyncioTestCase does not call asyncio.set_event_loop before setUp anymore, asyncio.Runner+PidFdChildWatcher leaves zombie processes #95736

Closed
graingert opened this issue Aug 6, 2022 · 23 comments
Assignees
Labels
3.11 only security fixes 3.12 bugs and security fixes topic-asyncio type-bug An unexpected behavior, bug, or error

Comments

@graingert
Copy link
Contributor

graingert commented Aug 6, 2022

demo reproducer:

import unittest
import asyncio

class DemoTestCase1(unittest.IsolatedAsyncioTestCase):
    def setUp(self):
        self.loop = asyncio.get_event_loop_policy().get_event_loop()

    async def test_demo(self):
        print("hello!")


class DemoTestCase2(unittest.IsolatedAsyncioTestCase):
    def setUp(self):
        self.loop = asyncio.get_event_loop_policy().get_event_loop()

    async def test_demo(self):
        print("hello!")

if __name__ == "__main__":
    unittest.main()

see also #93896

 graingert@conscientious  testing310  ~/projects  python3.10 foo.py 
hello!
.hello!
.
----------------------------------------------------------------------
Ran 2 tests in 0.002s

OK
 graingert@conscientious  testing310  ~/projects  python3.11 foo.py 
hello!
.E
======================================================================
ERROR: test_demo (__main__.DemoTestCase2.test_demo)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.11/unittest/async_case.py", line 82, in _callSetUp
    self._asyncioTestContext.run(self.setUp)
  File "/home/graingert/projects/foo.py", line 14, in setUp
    self.loop = asyncio.get_event_loop_policy().get_event_loop()
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/events.py", line 677, in get_event_loop
    raise RuntimeError('There is no current event loop in thread %r.'
RuntimeError: There is no current event loop in thread 'MainThread'.

----------------------------------------------------------------------
Ran 2 tests in 0.002s

FAILED (errors=1)

it looks like this is because asyncio.Runner calls asyncio.set_event_loop( in Runner.run

@graingert graingert added the type-bug An unexpected behavior, bug, or error label Aug 6, 2022
@graingert graingert changed the title IsolatedAsyncioTestCase does not call asyncio.set_event_loop in setUp anymore in 3.11 IsolatedAsyncioTestCase does not call asyncio.set_event_loop in before setUp anymore in 3.11 Aug 6, 2022
@graingert graingert changed the title IsolatedAsyncioTestCase does not call asyncio.set_event_loop in before setUp anymore in 3.11 IsolatedAsyncioTestCase does not call asyncio.set_event_loop before setUp anymore in 3.11 Aug 6, 2022
@graingert graingert added topic-asyncio 3.11 only security fixes 3.12 bugs and security fixes labels Aug 6, 2022
@graingert
Copy link
Contributor Author

graingert commented Aug 6, 2022

it looks like asyncio.set_event_loop is also being called an additional time per call to asyncio.Runner.run which will call get_child_watcher().attach_loop(loop) every call which causes the PidfdChildWatcher to blow away its callbacks

@graingert
Copy link
Contributor Author

graingert commented Aug 6, 2022

yep that's exactly what's happening here:

import asyncio
import sys

asyncio.set_child_watcher(asyncio.PidfdChildWatcher())

tasks = set()
pids = set()


async def subprocess():
    proc = await asyncio.create_subprocess_exec(
        sys.executable,
        "-c",
        "import time; print('hello!'); import time; time.sleep(2); print('goodbye')",
    )
    pids.add(proc.pid)
    print(await proc.communicate())


async def main():
    task = asyncio.create_task(subprocess())
    tasks.add(task)
    await asyncio.sleep(1)


async def main2():
    try:
        async with asyncio.timeout(3):
           await asyncio.gather(*tasks)
    except TimeoutError:
        import psutil
        for pid in pids:
            print(psutil.Process(pid))
    else:
        print("success!")

if __name__ == "__main__":
    with asyncio.Runner() as runner:
        runner.run(main())
        runner.run(main2())

output

hello!
goodbye
psutil.Process(pid=406749, name='python', status='zombie', started='10:13:40')

@graingert graingert changed the title IsolatedAsyncioTestCase does not call asyncio.set_event_loop before setUp anymore in 3.11 IsolatedAsyncioTestCase does not call asyncio.set_event_loop before setUp anymore, asyncio.Runner+PidFdChildWatcher leaves zombie processes Aug 6, 2022
@graingert graingert changed the title IsolatedAsyncioTestCase does not call asyncio.set_event_loop before setUp anymore, asyncio.Runner+PidFdChildWatcher leaves zombie processes IsolatedAsyncioTestCase does not call asyncio.set_event_loop before setUp anymore, asyncio.Runner+PidFdChildWatcher leaves zombie processes Aug 6, 2022
@graingert
Copy link
Contributor Author

#95584 is vaguely related

@graingert
Copy link
Contributor Author

The issue here is that asyncio.Runner calls set_event_loop too late and too often

In addition child watchers that use attach_loop are prone to unusual behaviour when the main thread switches loop

@graingert
Copy link
Contributor Author

It looks like IsolatedAsyncioTestCase when the Pidfdchildwatcher is used will break a subprocess started in asyncSetUp

@graingert
Copy link
Contributor Author

graingert commented Aug 11, 2022

import asyncio
import sys
import unittest

import asyncio

asyncio.set_child_watcher(asyncio.PidfdChildWatcher())


def create_free_port():
    return 4  # chosen by a fair dice roll


class TestProc(unittest.IsolatedAsyncioTestCase):
    async def asyncSetUp(self):
        self.port = create_free_port()
        self.proc = await asyncio.create_subprocess_exec(
            sys.executable,
            "-c",  # more realistically this might be an http server or database
            f"import time; print('listening on {self.port}'); import time; time.sleep(2); print('goodbye')",
        )

    async def testProc(self):
        print(f"connecting to {self.port}")

    async def asyncTearDown(self):
        await self.proc.communicate()  # hangs forever on 3.11


if __name__ == "__main__":
    unittest.main()

on python3.10 this produces:

connecting to 4
listening on 4
goodbye
.
----------------------------------------------------------------------
Ran 1 test in 2.022s

OK

on python3.11 it hangs forever in await self.proc.communicate()

@graingert
Copy link
Contributor Author

graingert commented Aug 11, 2022

I think IsolatedAsyncioTestCase can be fixed by backing out the changes to use asyncio.Runner for now in 3.11 and then in 3.12, copy out all of unittest's TestCase verbatim, upgrade all the methods to async def, then use asyncio.run(self.run()) rather than trying to go back in and out of the the asyncio eventloop.

this would also allow people to use yield-safe dependent contextmanagers like asyncio.timeout and asyncio.TaskGroup in the enterAsyncContext() method

@kumaraditya303
Copy link
Contributor

kumaraditya303 commented Aug 11, 2022

Please create a separate issue for the PidfdChildWatcher issue to avoid confusion (done #95899) Will take a look on this now.

@gvanrossum
Copy link
Member

I think IsolatedAsyncioTestCase can be fixed by backing out the changes to use asyncio.Runner for now in 3.11 and then in 3.12, copy out all of unittest's TestCase verbatim, upgrade all the methods to async def, then use asyncio.run(self.run()) rather than trying to go back in and out of the the asyncio eventloop.

I'm not sure I follow. Are you saying that IsolatedAsyncioTestCase should just be a textual copy of TestCase with some small changes (def -> async def)? That seems brittle -- if anything gets added to TestCase we'd have to remember to repeat the change in IsolatedAsyncioTestCase.

this would also allow people to use yield-safe dependent contextmanagers like asyncio.timeout and asyncio.TaskGroup in the enterAsyncContext() method

What does "yield-safe dependent" mean? I'm guessing that ATM self.enterAsyncContext(timeout(2)) doesn't work? Why not? What goes wrong? (Asking you before I go on a wild goose chase trying to construct a test case.)

@graingert
Copy link
Contributor Author

graingert commented Aug 13, 2022

self.enterAsyncContext(timeout(2)) enter and exit gets run in two different tasks, and it always suppresses the exception thrown into it.

There's more to yield safety than just being in the same task and propagating exceptions, @njsmith has a draft PEP

@graingert
Copy link
Contributor Author

graingert commented Aug 13, 2022

That seems brittle -- if anything gets added to TestCase we'd have to remember to repeat the change in IsolatedAsyncioTestCase.

If this was something you wanted me to work on, I'd implement the same testing strategy as AsyncExitStack where a SyncExitStack is built out of AsyncExitStack and run through the same test suite as the real ExitStack see #95902 for a plan on how to fix that same issue with asynccontextmanager and contextmanager

@gvanrossum
Copy link
Member

Okay let's back out of the speculative projects and focus on this PR. ATM I have no idea how the one line Kumar added fixes the test, and it feels too much like "programming by random modification" to me to just approve because it makes the test pass. :-) I will get back to this.

@graingert
Copy link
Contributor Author

graingert commented Aug 16, 2022

gvanrossum pushed a commit that referenced this issue Aug 16, 2022
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Aug 16, 2022
…re calling setup functions (pythonGH-95898)

(cherry picked from commit 9d51599)

Co-authored-by: Kumar Aditya <[email protected]>
@tiran
Copy link
Member

tiran commented Aug 16, 2022

The changeset breaks wasm32-emscripten tests. The issue is TestSendfile in test_os. The entire test class is skipped because wasm32-emscripten does not have os.sendfile.

test_flags (test.test_os.TestSendfile.test_flags) ... skipped 'test needs os.sendfile()'
test_headers (test.test_os.TestSendfile.test_headers) ... skipped 'test needs os.sendfile()'
test_headers_overflow_32bits (test.test_os.TestSendfile.test_headers_overflow_32bits) ... skipped 'test needs os.sendfile()'
test_invalid_offset (test.test_os.TestSendfile.test_invalid_offset) ... skipped 'test needs os.sendfile()'
test_keywords (test.test_os.TestSendfile.test_keywords) ... skipped 'test needs os.sendfile()'
test_offset_overflow (test.test_os.TestSendfile.test_offset_overflow) ... skipped 'test needs os.sendfile()'
test_send_at_certain_offset (test.test_os.TestSendfile.test_send_at_certain_offset) ... skipped 'test needs os.sendfile()'
test_send_whole_file (test.test_os.TestSendfile.test_send_whole_file) ... skipped 'test needs os.sendfile()'
test_trailers (test.test_os.TestSendfile.test_trailers) ... skipped 'test needs os.sendfile()'
test_trailers_overflow_32bits (test.test_os.TestSendfile.test_trailers_overflow_32bits) ... skipped 'test needs os.sendfile()'

Despite the class level skip, the test runner runs some setup routine. The recent change caused the async test runner to set up an event loop. This does not work on wasm32-emscripten. There is no socketpair() implementation and you cannot create a listening socket.

Here is a traceback of a wasm32-emscripten run with an extra ``return NULL;insocket.listen()`. As you can see it is trying to set up the event loop although the entire test class is skipped.

test test_os crashed -- Traceback (most recent call last):
  File "/home/heimes/dev/python/cpython/Lib/test/libregrtest/runtest.py", line 360, in _runtest_inner
    refleak = _runtest_inner2(ns, test_name)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/heimes/dev/python/cpython/Lib/test/libregrtest/runtest.py", line 317, in _runtest_inner2
    test_runner()
  File "/home/heimes/dev/python/cpython/Lib/test/libregrtest/runtest.py", line 281, in _test_module
    support.run_unittest(tests)
  File "/home/heimes/dev/python/cpython/Lib/test/support/__init__.py", line 1216, in run_unittest
    _run_suite(suite)
  File "/home/heimes/dev/python/cpython/Lib/test/support/__init__.py", line 1090, in _run_suite
    result = runner.run(suite)
             ^^^^^^^^^^^^^^^^^
  File "/home/heimes/dev/python/cpython/Lib/unittest/runner.py", line 208, in run
    test(result)
  File "/home/heimes/dev/python/cpython/Lib/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/heimes/dev/python/cpython/Lib/unittest/suite.py", line 122, in run
    test(result)
  File "/home/heimes/dev/python/cpython/Lib/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/heimes/dev/python/cpython/Lib/unittest/suite.py", line 122, in run
    test(result)
  File "/home/heimes/dev/python/cpython/Lib/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/heimes/dev/python/cpython/Lib/unittest/suite.py", line 122, in run
    test(result)
  File "/home/heimes/dev/python/cpython/Lib/unittest/case.py", line 678, in __call__
    return self.run(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/heimes/dev/python/cpython/Lib/unittest/async_case.py", line 129, in run
    self._setupAsyncioRunner()
  File "/home/heimes/dev/python/cpython/Lib/unittest/async_case.py", line 122, in _setupAsyncioRunner
    runner.get_loop()
  File "/home/heimes/dev/python/cpython/Lib/asyncio/runners.py", line 82, in get_loop
    self._lazy_init()
  File "/home/heimes/dev/python/cpython/Lib/asyncio/runners.py", line 136, in _lazy_init
    self._loop = events.new_event_loop()
                 ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/heimes/dev/python/cpython/Lib/asyncio/events.py", line 805, in new_event_loop
    return get_event_loop_policy().new_event_loop()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/heimes/dev/python/cpython/Lib/asyncio/events.py", line 695, in new_event_loop
    return self._loop_factory()
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/heimes/dev/python/cpython/Lib/asyncio/unix_events.py", line 64, in __init__
    super().__init__(selector)
  File "/home/heimes/dev/python/cpython/Lib/asyncio/selector_events.py", line 56, in __init__
    self._make_self_pipe()
  File "/home/heimes/dev/python/cpython/Lib/asyncio/selector_events.py", line 107, in _make_self_pipe
    self._ssock, self._csock = socket.socketpair()
                               ^^^^^^^^^^^^^^^^^^^
  File "/home/heimes/dev/python/cpython/Lib/socket.py", line 634, in socketpair
    lsock.listen()
SystemError: <method 'listen' of '_socket.socket' objects> returned NULL without setting an exception

@gvanrossum
Copy link
Member

I propose to roll back the PR (#95898) since I have no idea how to deal with this.

@gvanrossum
Copy link
Member

FWIW the problem is that _setupAsyncioRunner() is called even when the test is supposed to be skipped.

I suppose a fix would be to test for the skip flags (see TestCase.run()) before calling it.

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Aug 16, 2022
It should be created before calling the setUp() method, but after
checking for skipping a test.
@gvanrossum
Copy link
Member

While the demo no longer fails, we need to keep this open until #96033 lands. Given all the complexity this issue has already experienced I don't think we can expect to backport this to 3.11. I guess we'll have to recommend that affected user tests in 3.11 can call self._asyncioRunner.get_loop() in setUp() -- not ideal because of the protected variable but okay if combined with a check for 3.11. We might backport to 3.11.1, once 3.11.0 has been released.

@graingert
Copy link
Contributor Author

I guess we'll have to recommend that affected user tests in 3.11 can call self._asyncioRunner.get_loop() in setUp()

The fix I'm thinking of for aiohttp is to move to asyncio.get_running_loop() in the asyncSetUp

https://github.com/aio-libs/aiohttp/pull/6877/files#diff-70599d14cae2351e35e46867bce26e325e84f3b84ce218718239c4bfeac4dcf5R437

@gvanrossum
Copy link
Member

I guess we'll have to recommend that affected user tests in 3.11 can call self._asyncioRunner.get_loop() in setUp()

The fix I'm thinking of for aiohttp is to move to asyncio.get_running_loop() in the asyncSetUp

https://github.com/aio-libs/aiohttp/pull/6877/files#diff-70599d14cae2351e35e46867bce26e325e84f3b84ce218718239c4bfeac4dcf5R437

It looks like you're also moving from setUp to asyncSetUp -- doesn't the latter always run after the runner has been initialized?

@graingert
Copy link
Contributor Author

graingert commented Aug 17, 2022

Yes, I chose to do this because it's in an asyncio async function so it's guaranteed the loop is available. Here aiohttp is effectively passing on the bug to it's downstream users.

Therefore maybe what aiohttp should do is:

  • not land this workaround
  • wait for the patch in CPython
  • deprecate this TestCase.loop property and tell people to always use get_running_loop themselves

I think it was also discovered that this whole aiohttp TestCase class was possibly supposed to have been deprecated years ago so maybe aiohttp should just do that

miss-islington pushed a commit that referenced this issue Aug 17, 2022
It should be created before calling the setUp() method, but after
checking for skipping a test.

Automerge-Triggered-By: GH:tiran
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Aug 17, 2022
…ythonGH-96033)

It should be created before calling the setUp() method, but after
checking for skipping a test.

Automerge-Triggered-By: GH:tiran
(cherry picked from commit 3651710)

Co-authored-by: Serhiy Storchaka <[email protected]>
kumaraditya303 added a commit to kumaraditya303/cpython that referenced this issue Aug 17, 2022
…er before calling setup functions (pythonGH-95898)

(cherry picked from commit 9d51599)

Co-authored-by: Kumar Aditya <[email protected]>
kumaraditya303 added a commit to kumaraditya303/cpython that referenced this issue Aug 17, 2022
…er before calling setup functions (pythonGH-95898)

(cherry picked from commit 9d51599)

Co-authored-by: Kumar Aditya <[email protected]>
Co-authored-by: Serhiy Storchaka [email protected]
kumaraditya303 added a commit to kumaraditya303/cpython that referenced this issue Aug 17, 2022
…er before calling setup functions (pythonGH-95898)

(cherry picked from commit 9d51599)

Co-authored-by: Kumar Aditya <[email protected]>
Co-authored-by: Serhiy Storchaka [email protected]
kumaraditya303 added a commit to kumaraditya303/cpython that referenced this issue Aug 17, 2022
…er before calling setup functions (pythonGH-95898)

(cherry picked from commit 9d51599)

Co-authored-by: Kumar Aditya <[email protected]>
Co-authored-by: Serhiy Storchaka <[email protected]>
kumaraditya303 added a commit to kumaraditya303/cpython that referenced this issue Aug 17, 2022
…er before calling setup functions (pythonGH-95898)

(cherry picked from commit 9d51599)

Co-authored-by: Kumar Aditya <[email protected]>
Co-authored-by: Serhiy Storchaka <[email protected]>
@gvanrossum
Copy link
Member

@graingert I hope you will make aiohttp compatible with 3.11 regardless of whether this fix gets merged into 3.11. As a base dependency of many other packages it would block 3.11 support for those other packages.

@graingert
Copy link
Contributor Author

graingert commented Aug 17, 2022

@gvanrossum your question has, as always, helped me crystallize the trade off I chose by accident, and some other options that are available to aiohttp and I'm going to try to enumerate them so that aiohttp can make the right choice here

@gvanrossum
Copy link
Member

Whoops, reopening since the 3.11 backport didn’t get merged yet.

@gvanrossum gvanrossum reopened this Aug 18, 2022
pablogsal pushed a commit that referenced this issue Aug 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11 only security fixes 3.12 bugs and security fixes topic-asyncio type-bug An unexpected behavior, bug, or error
Projects
Status: Todo
Development

No branches or pull requests

4 participants