Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Stdlib] Add PythonObject.__contains__ #3101

Closed
wants to merge 4 commits into from

Conversation

rd4com
Copy link
Contributor

@rd4com rd4com commented Jun 22, 2024

Simple and small implementation,

Improving the 🐍 Python experience !

Example usage:

x = PythonObject([1,2,3])
if 1 in x: 
   print("1 in x")

That method should be replaced by a c-python function if possible.

@rd4com rd4com requested a review from a team as a code owner June 22, 2024 04:38
@rd4com rd4com force-pushed the pythonobject_dunder_contains branch from b8315e3 to 8571772 Compare June 22, 2024 04:43
@rd4com rd4com requested a review from a team as a code owner June 22, 2024 05:04
@rd4com rd4com force-pushed the pythonobject_dunder_contains branch 2 times, most recently from 5a8c2ad to b4fc045 Compare June 26, 2024 16:14
Copy link
Collaborator

@JoeLoser JoeLoser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Do you mind adding some unit tests please?

@JoeLoser JoeLoser self-assigned this Jun 26, 2024
@rd4com rd4com force-pushed the pythonobject_dunder_contains branch from b4fc045 to 8d364f9 Compare June 28, 2024 22:33
@rd4com
Copy link
Contributor Author

rd4com commented Jun 28, 2024

🆕 unit tests

👍 yep it is important, should there be more ?

Python throws an error if the object is not iterable,

But mojo gets an internal error: PyObject_GetAttrString failed from it.

Current PR have a debug_assert when we have the error and returns False.

Is it possible to make __contains__ re-raise the same error as python would have thrown ?

@rd4com rd4com requested a review from JoeLoser June 28, 2024 22:49
@JoeLoser
Copy link
Collaborator

@rd4com sorry this one fell off my queue, I will look at this on Monday and your comment about the re-raising an error and see what we can do about that.

@JoeLoser
Copy link
Collaborator

!sync

@rd4com rd4com force-pushed the pythonobject_dunder_contains branch from 8d364f9 to 3c96d04 Compare July 16, 2024 03:51
@rd4com
Copy link
Contributor Author

rd4com commented Jul 16, 2024

Edit: see next comments, this one was for previous commit

Yes, it works better, but it is not yet the same error as python:

Unhandled exception caught during execution: 'int' object has no attribute '__contains__'

python

TypeError: argument of type 'int' is not iterable

It could be because we are using __getattr__, see:

Documentation: datamodel

We might need to use something like object.__contains__,

Any idea ?

@rd4com
Copy link
Contributor Author

rd4com commented Jul 16, 2024

edit: implemented in next commits, see next comments

Ok, according to python specification/docs Documentation: membership-test-details :

We could use a new PyObject_HasAttr("__contains__") ,
if it has, we can use it to determine the return value,
else, we can iterate self.__iter__() in the function to determine the return value.
(which would give a chance to Cpython to raise it's iterable error)

CPython also says that it would also try __getitem__() if the object is not iterable.
(but when, since int do not have __iter__, and return an iterable error anyway)

I'll join cpython chat channel if they have one to ask.

The question would be: How to run an operator membership test from cypthon (in, not in)

In addition, in a perfect world, we'd need to return an PythonObject of class bool.

What do you think ?

@rd4com rd4com force-pushed the pythonobject_dunder_contains branch from 3c96d04 to b81e384 Compare July 16, 2024 15:23
@rd4com
Copy link
Contributor Author

rd4com commented Jul 16, 2024

Nice 👍 , we partially got this with last commit, it behaves more like python:

in python

class MyClass:
    x = 1
print(123 in MyClass())

TypeError: argument of type 'MyClass' is not iterable

in python

class MyClass:
    x = 1
    def __contains__(self, rhs):
        return True
print(123 in MyClass())

True

in mojo

from python import Python
def main():
    m = Python.import_module("myclass")
    x = m.MyClass()
    print(1 in x)

True

if we remove the __contains__ dunder in myclass.py

Unhandled exception caught during execution: 'MyClass' object is not iterable

 

The difficulty is now to implement the last step of the "in" membership operator test.
(which is using __getitem__)

  1. Any suggestion on how to create a test that uses the class created above in mojo?
    (to test classes with import_module or not in test_python_object.mojo)
  2. Should we return a PythonObject of class bool?

stdlib/src/python/_cpython.mojo Outdated Show resolved Hide resolved
@@ -1102,6 +1102,28 @@ struct PythonObject(
"""
return self._call_zero_arg_method("__invert__")

fn __contains__(self, rhs: PythonObject) raises -> Bool:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, in one of your comments, you're asking about the return type here, correct? E.g. Bool vs PythonObject? I'm fine with Bool as-is. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the return type of the function 👍

It should be good, Bool can probably implicitly convert to PythonObject if user do chaining.

(but it is something to keep in mind for the future)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, I had the composability question in mind, and I think we're fine with Bool for now.

@JoeLoser
Copy link
Collaborator

Nice 👍 , we partially got this with last commit, it behaves more like python:

in python

class MyClass:
    x = 1
print(123 in MyClass())

TypeError: argument of type 'MyClass' is not iterable

in python

class MyClass:
    x = 1
    def __contains__(self, rhs):
        return True
print(123 in MyClass())

True

in mojo

from python import Python
def main():
    m = Python.import_module("myclass")
    x = m.MyClass()
    print(1 in x)

True

if we remove the __contains__ dunder in myclass.py

Unhandled exception caught during execution: 'MyClass' object is not iterable

 

The difficulty is now to implement the last step of the "in" membership operator test. (which is using __getitem__)

  1. Any suggestion on how to create a test that uses the class created above in mojo?
    (to test classes with import_module or not in test_python_object.mojo)
  2. Should we return a PythonObject of class bool?

For (1), are you asking about a user-defined type that implements __iter__ but not __contains__? Or are you asking about a type that defines neither __iter__ or __contains, but does define __getitem__? This last one is the case of

Lastly, the old-style iteration protocol is tried: if a class defines getitem(), x in y is True if and only if there is a non-negative integer index i such that x is y[i] or x == y[i], and no lower integer index raises the IndexError exception. (If any other exception is raised, it is as if in raised that exception).

from https://docs.python.org/3/reference/expressions.html#membership-test-details

@rd4com
Copy link
Contributor Author

rd4com commented Jul 17, 2024

Yes, we need to test the whole function (combinations of instances that: __getitem__ __contains__, __iter__).
For that, we need a way to create an instance of a class as PythonObject:
Python.import_module test_classes_for_contains for example.

(In a way that does not pollute the python environment for other tests of PythonObject)

@JoeLoser
Copy link
Collaborator

JoeLoser commented Jul 17, 2024

Yes, we need to test the whole function (combinations of instances that: __getitem__ __contains__, __iter__). For that, we need a way to create an instance of a class as PythonObject: Python.import_module test_classes_for_contains for example.

(In a way that does not pollute the python environment for other tests of PythonObject)

One naive approach is just to have the different instances we want to test in their own test file with their own Python environment/CPython interpreter running. It's far from great, but an option.

Could we leverage context managers and have a context managed CPython interpreter environment for each of these to avoid test pollution?

@rd4com
Copy link
Contributor Author

rd4com commented Jul 17, 2024

edit: next commit have tests to assert the environement is clean between test_*.mojo

 

That is a great idea, context managers are good for that 👍

The difficulty is that Python() returns _get_global_python_itf().cpython():

from python import Python
def main():
    A = Python()
    B = Python()
    A.eval("x = 123")

    print(A.evaluate("x")) 
    print(B.evaluate("x"))

The PythonObject returned by import_module is there:

from python import  Python
def main():
    A = Python()
    B = Python()
    m = A.import_module("myclass")
    print(A.import_module("sys").modules.keys()) #myclass
    print(B.import_module("sys").modules.keys()) #myclass

So if we import for testing it would contaminate the environment,
maybe multiples test files is the good option,
what do you think ?

(we should test assert that the environment stays clean with multiple test files too)

Edit: Assertion done, see next commit

@rd4com rd4com force-pushed the pythonobject_dunder_contains branch from b81e384 to a8c0453 Compare July 22, 2024 20:50
@rd4com
Copy link
Contributor Author

rd4com commented Jul 22, 2024

Hello, two more tests to test the testing environment:

  • stdlib/test/python/test_testing_evironement_is_clean_A.mojo
    • sets a global variable
    • import my_module.py
    • assert that my_module is in import_module("sys").modules.keys()
  • stdlib/test/python/test_testing_evironement_is_clean_B.mojo
    • assert that the global variable raises (not exist)
    • assert that my_module is not in import_module("sys").modules.keys()

For this to work, we need to be sure that the tests are done in that order. (A then B)

What do you think ?

 

In addition:

  • The test for the dunder is moved into test_python_object_dunder_contains.mojo
  • It imports module_for_test_python_object_dunder_contains.py
    • Class_no_iterable_no_contains
    • Class_no_iterable_but_contains
    • Class_iterable_no_contains

@rd4com rd4com force-pushed the pythonobject_dunder_contains branch from a8c0453 to 50f4cfd Compare July 22, 2024 21:40
@JoeLoser
Copy link
Collaborator

Hello, two more tests to test the testing environment:

  • stdlib/test/python/test_testing_evironement_is_clean_A.mojo

    • sets a global variable
    • import my_module.py
    • assert that my_module is in import_module("sys").modules.keys()
  • stdlib/test/python/test_testing_evironement_is_clean_B.mojo

    • assert that the global variable raises (not exist)
    • assert that my_module is not in import_module("sys").modules.keys()

For this to work, we need to be sure that the tests are done in that order. (A then B)

What do you think ?

 

In addition:

  • The test for the dunder is moved into test_python_object_dunder_contains.mojo

  • It imports module_for_test_python_object_dunder_contains.py

    • Class_no_iterable_no_contains
    • Class_no_iterable_but_contains
    • Class_iterable_no_contains

I don't know if we have a way to force the testing order of A and then B — either in lit, or how we run the tests internally in our Bazel environment (CC: @keith here). Perhaps just omit those tests for now but keep the others so we can rebase and land this?

@JoeLoser JoeLoser added the waiting for response Needs action/response from contributor before a PR can proceed label Nov 22, 2024
@rd4com rd4com force-pushed the pythonobject_dunder_contains branch from 50f4cfd to 9a8c755 Compare November 22, 2024 14:06
@rd4com
Copy link
Contributor Author

rd4com commented Nov 22, 2024

Hello,
just implemented the changes 👍 :

  • ✅ reverted the A B
  • ✅ rebase

@rd4com rd4com requested a review from JoeLoser November 22, 2024 14:34
@JoeLoser JoeLoser removed the waiting for response Needs action/response from contributor before a PR can proceed label Nov 22, 2024
@JoeLoser
Copy link
Collaborator

!sync

@modularbot modularbot added the imported-internally Signals that a given pull request has been imported internally. label Nov 22, 2024
@keith
Copy link
Contributor

keith commented Nov 22, 2024

CC: @keith here). Perhaps just omit those tests for now but keep the others so we can rebase and land this?

probably not something we can enforce, probably best if we can find another way to validate this ordering. or split these tests into separate lit targets so they don't pollute the same global python environment

@modularbot
Copy link
Collaborator

✅🟣 This contribution has been merged 🟣✅

Your pull request has been merged to the internal upstream Mojo sources. It will be reflected here in the Mojo repository on the nightly branch during the next Mojo nightly release, typically within the next 24-48 hours.

We use Copybara to merge external contributions, click here to learn more.

@modularbot modularbot added the merged-internally Indicates that this pull request has been merged internally label Dec 4, 2024
modularbot pushed a commit that referenced this pull request Dec 6, 2024
[External] [Stdlib] Add `PythonObject.__contains__`

Simple and small implementation,

Improving the `🐍 Python` experience !

Example usage:

```mojo
x = PythonObject([1,2,3])
if 1 in x:
   print("1 in x")
```

That method should be replaced by a `c-python` function if possible.

Co-authored-by: rd4com <[email protected]>
Closes #3101
MODULAR_ORIG_COMMIT_REV_ID: d08c1f856feda307b8da58ec0cf3442553b3acc7
@modularbot
Copy link
Collaborator

Landed in 996fc97! Thank you for your contribution 🎉

@modularbot modularbot added the merged-externally Merged externally in public mojo repo label Dec 6, 2024
@modularbot modularbot closed this Dec 6, 2024
modularbot pushed a commit that referenced this pull request Dec 17, 2024
[External] [Stdlib] Add `PythonObject.__contains__`

Simple and small implementation,

Improving the `🐍 Python` experience !

Example usage:

```mojo
x = PythonObject([1,2,3])
if 1 in x:
   print("1 in x")
```

That method should be replaced by a `c-python` function if possible.

Co-authored-by: rd4com <[email protected]>
Closes #3101
MODULAR_ORIG_COMMIT_REV_ID: d08c1f856feda307b8da58ec0cf3442553b3acc7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
imported-internally Signals that a given pull request has been imported internally. merged-externally Merged externally in public mojo repo merged-internally Indicates that this pull request has been merged internally
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants