asyncio.Event for graceful/early termination #97
In the spirit of composability, I came up with this wrapper to handle early termination using a stop event:

```python
import asyncio
import typing as t

import aiostream as aio  # assuming aiostream is imported as `aio`

TGenValue = t.TypeVar("TGenValue")  # assumed type variable for the stream values

_STOP_SENTINEL: t.Any = object()


async def async_gen_loop(
    *streams: aio.core.Stream[TGenValue], stop_event: asyncio.Event | None = None
) -> t.AsyncIterator[TGenValue]:
    """Async generator loop wrapper with stop event."""
    stop_event = stop_event or asyncio.Event()

    @aio.operator
    async def stop_event_gen():
        # Yield the sentinel as soon as the stop event is set
        await stop_event.wait()
        yield _STOP_SENTINEL

    # Append a sentinel after the merged streams are exhausted, so the loop
    # also terminates when all sources complete normally
    stream = aio.stream.merge(*streams)
    chained_stream = aio.stream.chain(stream, aio.stream.just(_STOP_SENTINEL))
    merged_stream = aio.stream.merge(chained_stream, stop_event_gen())
    async with merged_stream.stream() as iterator:
        async for value in iterator:
            if value is _STOP_SENTINEL:
                break
            yield value
```

Unit tests:

```python
class TestAsyncGens:
    @pytest.mark.parametrize("stop", [False, True])
    async def test_async_gen_loop(self, stop: bool):
        stop_event = asyncio.Event()
        interval_stream = aio.stream.count() | aio.pipe.take(2)
        async_loop = util.async_gen_loop(interval_stream, stop_event=stop_event)
        items = []
        async for item in async_loop:
            items.append(item)
            if stop:
                stop_event.set()
        assert items == ([0] if stop else [0, 1]), f"{items}"

    async def test_async_gen_loop_multiple(self):
        odd_numbers = aio.stream.range(1, 10, 2)
        even_numbers = aio.stream.range(2, 10, 2)
        async_loop = util.async_gen_loop(odd_numbers, even_numbers)
        items = await aio.stream.list(async_loop)
        assert set(items) == set(range(1, 10)), f"{items}"
```
Thanks @mbbyn for creating this issue :)
Interesting. So basically you would need a way to insert a breakpoint in the processing pipeline where the producer can be stopped or cancelled, but the consumer would still be able to run to completion with the items produced before the stop signal. For instance, with an operation pipeline like the one in your example, a stop signal would cancel the task running the producer. In this case, I would write the operator like this:

```python
import asyncio
from typing import AsyncIterable, AsyncIterator, TypeVar

from aiostream import stream, pipe, pipable_operator, streamcontext, aiter_utils

T = TypeVar("T")


@pipable_operator
async def stop_when(source: AsyncIterable[T], event: asyncio.Event) -> AsyncIterator[T]:
    async with streamcontext(source) as streamer:
        while True:
            if event.is_set():
                return
            try:
                # Race the next item against the stop event
                task: asyncio.Task[T] = asyncio.create_task(aiter_utils.anext(streamer))
                event_task = asyncio.create_task(event.wait())
                (done, _) = await asyncio.wait(
                    [task, event_task], return_when=asyncio.FIRST_COMPLETED
                )
                if task in done:
                    yield task.result()
                else:
                    task.cancel()
                    return
            except StopAsyncIteration:
                return
```

Here's a test that demonstrates its behavior:
```python
@pytest.mark.asyncio
async def test_stop_when():
    stop_event = asyncio.Event()

    async def process(item: int) -> int:
        await asyncio.sleep(0.4)
        return item

    xs = (
        stream.count(interval=0.1)
        | stop_when.pipe(stop_event)
        | pipe.map(process)
    )
    items = []
    async with xs.stream() as streamer:
        async for item in streamer:
            items.append(item)
            if item == 10:
                stop_event.set()
    assert items == list(range(14))
```

Note how a few extra items still make it through after the stop event is set, since they were already in flight when the producer was stopped. Did I get this right?
Very nice, I think you reframed the idea to better fit the library's conventions. I hadn't thought that far, TBH. I was also surprised when running your sample code that it worked the way it did, producing 14 elements. I was using the library assuming there is back-pressure support, where the producer would only generate an item after the whole pipe has executed (i.e. sequentially). With that said, maybe there is space for two ideas here.
But as someone who knows the importance of keeping OSS libraries focused and consistent, the first option seems to better fit the library and would work for my scenario as well. I just need to convince myself why your unit test passes, and think about how the streams work accordingly.
This is a good assumption in general. I would go as far as calling what you describe sequential execution, in the sense that by default, there is no concurrency involved over a single pipeline (even though two pipelines could run concurrently, as they are async-friendly). On the other hand, I would call back-pressure a mechanism that limits the production of items when the consuming process is slower, despite both producer and consumer running concurrently (similar to a linux pipe).

So now, what about aiostream? By default, and contrary to linux pipes, chaining aiostream operators does not imply that operators run concurrently. As you correctly assumed, combining operators is similar to nesting async generators. However, some aiostream operators do add concurrency to the pipeline execution. This is the case for the merge-based operators (e.g. `merge`, `flatmap`) and the map-based operators (e.g. `amap`, `map`). For the merge-based operators, all sources are awaited concurrently. For the map-based operators, up to `task_limit` coroutines can run concurrently, with no limit by default.

Since the sample code uses `pipe.map` without a `task_limit`, the `process` coroutines run concurrently: with a 0.1 s production interval and a 0.4 s processing time, a few items are still being processed when the stop event is set at item 10, which is why 14 elements come out in total. Now you might think that the fact that some operators do add concurrency to the pipeline and some don't is confusing, and that's my opinion as well. I had a plan to write a sequential version of those operators.
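To illustrate what I mean by nesting async generators, here is a minimal sketch in plain asyncio (no aiostream involved), showing that in a sequential pipeline the producer only advances when the consumer requests the next item:

```python
import asyncio
from typing import AsyncIterator


async def produce() -> AsyncIterator[int]:
    for i in range(3):
        print(f"producing {i}")
        yield i


async def double(source: AsyncIterator[int]) -> AsyncIterator[int]:
    # Nesting generators: each item flows through the whole chain
    # before the producer is resumed for the next one.
    async for item in source:
        yield item * 2


async def main() -> None:
    async for item in double(produce()):
        print(f"consumed {item}")


asyncio.run(main())
```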
That's an interesting use case too :) Here's a possible implementation for it:

```python
from typing import cast


@pipable_operator
def shortest(source: AsyncIterable[T], *more_sources: AsyncIterable[T]) -> AsyncIterator[T]:
    # A unique sentinel, compared by identity, so the sources may yield any value
    sentinel = object()
    # Append the sentinel to each source to detect its termination
    new_sources = [
        stream.chain.raw(source, stream.just.raw(sentinel))
        for source in [source, *more_sources]
    ]
    merged = stream.merge.raw(*new_sources)
    # Stop as soon as the first sentinel shows up, i.e. when the
    # shortest source is exhausted
    result = stream.takewhile.raw(merged, lambda x: x is not sentinel)
    return cast(AsyncIterator[T], result)
```
```python
@pytest.mark.asyncio
async def test_shortest():
    items = []
    xs = stream.range(0, 5, interval=0.1)
    ys = stream.range(10, 15, interval=0.2)
    zs = shortest(xs, ys)
    async with zs.stream() as streamer:
        async for item in streamer:
            items.append(item)
    assert items == [0, 10, 1, 2, 11, 3, 4]
```

Wow, that was a long post, let me know what you think about all that :)
Oh, and as you noticed, `stop_when` can also be expressed in terms of `shortest`:

```python
@pipable_operator
def stop_when(source: AsyncIterable[T], event: asyncio.Event) -> AsyncIterator[T]:
    stop = stream.call.raw(event.wait)
    filtered = stream.filter.raw(stop, lambda x: False)
    return shortest.raw(source, filtered)
```
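The trick is that `filtered` never produces any item, yet it terminates as soon as the event is set: `stream.call.raw(event.wait)` yields exactly once the event fires, the always-false predicate drops that value, and `shortest` then cuts off the source at that exact point.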
Makes perfect sense, appreciate the elaborate rundown. The nuances of reactive streams are always tricky, leading to hidden assumptions that are not so obvious to the uninitiated (👋) (by hidden, I mean not present in the typing system nor in the naming conventions, etc.). I'm glad you already have it in mind. The fact that the proposed use cases could be implemented on the fly is a testament to the versatile design. With this realization, I think I figured out how to implement a use case I had in mind but slated for later because I thought it wouldn't fit the design well: scaling an async generator using an executor.
Oh, you mean something like `run_in_executor`? It can easily be turned into an operator as well:

```python
import asyncio
from functools import partial
from typing import AsyncIterator, AsyncIterable, TypeVar

from aiostream import pipable_operator, stream, pipe
from aiostream.stream.combine import SmapCallable

T = TypeVar("T")
U = TypeVar("U")


@pipable_operator
def executor_map(
    source: AsyncIterable[T],
    corofn: SmapCallable[T, U],
    *more_sources: AsyncIterable[T],
    ordered: bool = True,
    task_limit: int | None = None,
) -> AsyncIterator[U]:
    loop = asyncio.get_event_loop()

    # Wrap the synchronous callable so it runs in the default executor
    async def wrapped(*args: T) -> U:
        return await loop.run_in_executor(None, partial(corofn, *args))

    return stream.amap.raw(
        source, wrapped, *more_sources, ordered=ordered, task_limit=task_limit
    )
```

Here's the corresponding test:
```python
import time
import random

import pytest


@pytest.mark.asyncio
async def test_executor_map():
    def target(x: int, *_) -> int:
        time.sleep(random.random())
        return x

    xs = (
        stream.iterate(range(10))
        | executor_map.pipe(target, task_limit=5)
        | pipe.list()
    )
    assert await xs == list(range(10))
```

Now that I think about it, this nicely completes the map operator matrix: `smap` runs a synchronous function sequentially, `amap` runs an asynchronous function concurrently, and `executor_map` now covers running a synchronous function concurrently.
Maybe all those could be exposed using a single operator.
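As a side note, a variant of the sketch above could take an explicit executor instead of relying on the loop's default one. This reuses the imports from the previous snippet; the `executor` argument and the `map_in_executor` name are hypothetical, not part of aiostream:

```python
from concurrent.futures import Executor, ThreadPoolExecutor


@pipable_operator
def map_in_executor(
    source: AsyncIterable[T],
    corofn: SmapCallable[T, U],
    *more_sources: AsyncIterable[T],
    executor: Executor | None = None,  # hypothetical: a dedicated pool
    ordered: bool = True,
    task_limit: int | None = None,
) -> AsyncIterator[U]:
    loop = asyncio.get_event_loop()

    async def wrapped(*args: T) -> U:
        # None falls back to the loop's default executor
        return await loop.run_in_executor(executor, partial(corofn, *args))

    return stream.amap.raw(
        source, wrapped, *more_sources, ordered=ordered, task_limit=task_limit
    )


# Usage sketch: run blocking calls in a dedicated thread pool
# pool = ThreadPoolExecutor(max_workers=4)
# xs = stream.iterate(range(10)) | map_in_executor.pipe(target, executor=pool)
```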
Thanks for building this awesome library!

I was wondering if you'd be open to using `asyncio.Event` in combination with `asyncio.wait_for` to replace `asyncio.sleep`?

My use case is that I would like to be able to gracefully stop async operations at the iterator level only, never affecting the body. By using an event, I can set it and be sure that the loop will terminate at the exact spot I expect it to. The alternative is to use cancel, which might affect another async operation running in the body of the loop.
e.g.
By wrapping `bg_task` in an `asyncio.Task`, I would be able to set the `stop_event` and be sure the long-running task wouldn't be affected. Perhaps I could use `asyncio.shield`, but this is a simple example, and I feel `stop_event` would be useful in general.
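For reference, something like this is what I have in mind; `sleep_or_stop` is a made-up helper name, not an aiostream API:

```python
import asyncio


async def sleep_or_stop(delay: float, stop_event: asyncio.Event) -> bool:
    """Sleep for `delay` seconds, waking up early if `stop_event` is set.

    Returns True if the stop event fired, False if the delay elapsed.
    """
    try:
        await asyncio.wait_for(stop_event.wait(), timeout=delay)
        return True
    except asyncio.TimeoutError:
        return False
```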