Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0.5.0 (#84) made backwards incompatible changes #95

Closed
AndreasBackx opened this issue Apr 24, 2024 · 6 comments
Closed

0.5.0 (#84) made backwards incompatible changes #95

AndreasBackx opened this issue Apr 24, 2024 · 6 comments

Comments

@AndreasBackx
Copy link

First off, the typing changes are amazing to have and I don't want this issue to take away from that.

I was migrating from 0.4.4 to 0.5.2 which should be backwards compatible with older code according to semver and to the changelogs that were published in the releases. Though it made the following backwards-incompatible change that broke some of our tests:

Before:

@operator(pipable=True)
def merge(*sources):

After:

@pipable_operator
def merge(
    source: AsyncIterable[T], *more_sources: AsyncIterable[T]
) -> AsyncIterator[T]:

This makes calling stream.merge() without sources not possible anymore. I assume this change was made as a result of creating @pipable_operator. aiostream.merge is also not the only function with the issue, though is being used here as an example.

The ideal solution is that we could change the signature back to:

@pipable_operator
def merge(
    *sources: AsyncIterable[T]
) -> AsyncIterator[T]:

This would make this backwards compatible again and additionally support calling it with no arguments. Calling with no arguments can occur when using aiostream.merge(*some_list). It would be nicer to be able to make aiostream.merge(*[]) be a no-op instead of having to wrap this with if foo and possibly needing to juggle with having to create an empty AsyncIterator to make the return type the same.

The release info for 0.5.0 should, if possible, be updated highlighting this backwards incompatible change in 0.5.0-0.5.2. You might want to pull those versions from pypi, though that decision is up to you.

However, this "ideal solution" might not be possible as there might be a reason why this is not possible. That might be fine, though could a new major (1.x.x) release be created highlighting the backwards incompatible change and a note in the release information? (Again possibly pull 0.5.0-0.5.2 from pypi, but up to you, I feel in this case it would make more sense to do so to prevent people from moving from 0.4.x to 0.5.x and have something break.)

I am patching our code, though this might prevent others from having issues in the future. Please let me know if anything is unclear.

@vxgmichel
Copy link
Owner

Thanks a lot @AndreasBackx for the report, that was really helpful :)

The release info for 0.5.0 should, if possible, be updated highlighting this backwards incompatible change in 0.5.0-0.5.2.

Done, let me know if the workaround mentioned here works for you: release v0.5.0.

You might want to pull those versions from pypi, though that decision is up to you.

Hmm maybe, I'll look into that once the problem is fixed.

The ideal solution is that we could change the signature back to: [...]

I agree, here's an attempt at fixing it: #96

The problem with this PR is that it breaks type inference for the zip and merge operators. I suspect this due to mypy not handling too well the mix of generic callback protocols with variadic arguments. Another explanation would be that I just got it wrong.

At that point I'm also considering the solution of not using meta-programming (i.e the operator decorator) for those 3 operators and simply unroll the logic manually by writing the dedicated operator classes.

Let me know if you have any opinion on that matter.

@AndreasBackx
Copy link
Author

Thank you for getting back!

Done, let me know if the workaround mentioned here works for you: release v0.5.0.

Thanks, that's good for now. The pulling versions and possibly issuing a 1.x.x tag can be reviewed once there is a fix.

The problem with this PR is that it breaks type inference for the zip and merge operators. I suspect this due to mypy not handling too well the mix of generic callback protocols with variadic arguments. Another explanation would be that I just got it wrong.

You seem to run mypy 1.6.1 that is shipping with Microsoft's VS Code extension (as that's the only way I get the same single error as you). If you update to 1.9 or the recently released 1.10, you'll see many more errors. 😅 Most seem to be related to the Never typing change. It might be worth looking into that generally as my assumption is that some were there before the PR.

At that point I'm also considering the solution of not using meta-programming (i.e the operator decorator) for those 3 operators and simply unroll the logic manually by writing the dedicated operator classes.

The only thing I can say on the matter now, is that I find the operator code very fragile-looking and hard to read. There are a lot of things being done manually in there that imo should be avoided. Though, I'm not here to criticise how you decided to implement aiostream.

@vxgmichel
Copy link
Owner

vxgmichel commented May 8, 2024

You seem to run mypy 1.6.1 that is shipping with Microsoft's VS Code extension (as that's the only way I get the same single error as you). If you update to 1.9 or the recently released 1.10, you'll see many more errors. 😅 Most seem to be related to the Never typing change. It might be worth looking into that generally as my assumption is that some were there before the PR.

Oh right, the pre-commit configuration was running mypy 1.6.1.

I opened an issue about the mypy 1.7 incompatibility here: #105

Sadly there's not much I can do for the moment, but the good news is that we're still compatible with the latest pyright (which is used by pylance). By the way, pyright is now checked in the pre-commit configuration: #99

The only thing I can say on the matter now, is that I find the operator code very fragile-looking and hard to read. There are a lot of things being done manually in there that imo should be avoided.

You're right, I tried improve the readability of the operator code in #106. I think it's much better now, although we still have to rely on meta-programming to remove boiler plate code from the operator declaration (without it, you would need a class declaration with about 10 to 20 lines of code for each operator).

Also I came up with a better way to implement the sources_operator: #108

It does not have the issue I mentioned earlier about breaking type inference. The only downside is that it adds a Any to the pipe method signature:

aiostream/aiostream/core.py

Lines 289 to 292 in ba773f9

@staticmethod
def pipe(
*args: P.args, **kwargs: P.kwargs
) -> Callable[[AsyncIterable[Any]], Stream[T]]: ...

This Any should be P.args but there's no proper way to type this correctly at the moment.
Anyway I'm pretty happy with it now, so feel free to leave a review if you want to. Otherwise I'll go ahead and merge it, and then release aiostream 0.6.0 :)

@AndreasBackx
Copy link
Author

I tried to spend some time reviewing your code and trying to figure out whether we can get rid of the inspect uses in many locations. Also figuring out whether relying on typing without the need to do so much code generation is possible. I however realise that this is going to cost a few hours to actually understand, which I unfortunately do not have. I would try and see if this is possible, you might already know whether or not it is.

There might be some startup time performance to be gained due to no longer needing to do a large amount of meta programming, and doing this for each and every decorator invocation. Instead reusing a few functions?

Have you ever done any measurements on the import time of aiostream? I know this might not be something that'll be much of an issue for you, though we've seen import time when Python projects in monorepos get out of control.


Regardless, if the user facing side is well-typed, I think that's the priority alongside with hopefully making 0.6.x backwards-compatible with <0.5.0. I am no big aiostream user so cannot really tell from seeing the PRs.

I'd love to help you out with some typing, though currently I'd need to spend too much time trying to understand the meta programming and I unfortunately don't have that time outside of work. Please let me know if there are any other questions you might have though.

@vxgmichel
Copy link
Owner

I tried to spend some time reviewing your code and trying to figure out whether we can get rid of the inspect uses in many locations.

The use for inspect in this code somewhat optional. However, it has two important uses:

  • Allow for inspection tools to report the right signature for the exposed methods (instead of a generic *args/**kwargs signature). This is useful for IPython introspection (e.g. try run stream.take? in an IPython console) and documentation auto-generation (using sphinx autofunction).
  • Detecting invalid patterns at declaration time (e.g decorating a method, or having the first argument being or not being variadic depending on the operator kind). IMO detecting those errors as soon as possible really helps with debugging.

Also figuring out whether relying on typing without the need to do so much code generation is possible.

Some tool will actually rely on typing rather than introspection, as it is the case for pylance. For instance, when opening a parenthesis as in stream.take(..., pylance will use the type of stream.take, and solve for the param spec P to properly display the correct prototype with the docstring of the function that solves P.

My point is that some tools use introspection and some use typing, so it's good to properly support both.

I however realise that this is going to cost a few hours to actually understand, which I unfortunately do not have.

Sure, I understand that there's a limit to the time you're willing to put into this, no problem at all :)

There might be some startup time performance to be gained due to no longer needing to do a large amount of meta programming, and doing this for each and every decorator invocation. Instead reusing a few functions?

In my opinion, the amount of meta-programming is not too important here. Each operator function is merely transformed in a callable singleton with a couple of extra methods for convenience. For instance, here's what pipable_operator looks like once you get rid of all the instrospection and extra checking features:

def pipable_operator(
    func: Callable[Concatenate[AsyncIterable[X], P], AsyncIterator[T]],
) -> PipableOperator[X, P, T]:
    
    class PipableOperatorImplementation:

        original = staticmethod(func)

        @staticmethod
        def raw(
            arg: AsyncIterable[X], /, *args: P.args, **kwargs: P.kwargs
        ) -> AsyncIterator[T]:
            assert_async_iterable(arg)
            return func(arg, *args, **kwargs)

        def __call__(
            self, arg: AsyncIterable[X], /, *args: P.args, **kwargs: P.kwargs
        ) -> Stream[T]:
            factory = functools.partial(self.raw, arg, *args, **kwargs)
            return Stream(factory)

        @staticmethod
        def pipe(
            *args: P.args,
            **kwargs: P.kwargs,
        ) -> Callable[[AsyncIterable[X]], Stream[T]]:
            return lambda source: operator_instance(source, *args, **kwargs)


    return PipableOperatorImplementation()

Have you ever done any measurements on the import time of aiostream? I know this might not be something that'll be much of an issue for you, though we've seen import time when Python projects in monorepos get out of control.

Good point, I've never tested that before. Although in this case, there doesn't seem to be any major overhead:

$ python -m timeit -n 1 -r 1 "import aiostream.stream"
1 loop, best of 1: 28.9 msec per loop
$ python -m timeit -n 1 -r 1 "import asyncio"         
1 loop, best of 1: 23.2 msec per loop

Regardless, if the user facing side is well-typed, I think that's the priority alongside with hopefully making 0.6.x backwards-compatible with <0.5.0.

Agreed! Thank you for your input, I'll go ahead and merge #108 then :)

vxgmichel added a commit that referenced this issue May 9, 2024
@vxgmichel
Copy link
Owner

Fixed in v0.6.0 🎉

Thanks again for the report :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants