Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subclassing schema #115

Open
SoundsSerious opened this issue Sep 9, 2022 · 5 comments
Open

subclassing schema #115

SoundsSerious opened this issue Sep 9, 2022 · 5 comments

Comments

@SoundsSerious
Copy link
Contributor

SoundsSerious commented Sep 9, 2022

Hey Again!

Found a bug (of my own making!), release 0.7.0 has this code:

This was the modifed code previously:

@classmethod
def schema(cls,only:typing.List[str]=None):
    '''schema caches Marshmellow Schemas on this class to preserve memory'''
    if not hasattr(cls,'_cls_schema'):
        objects_dict = cls.get_objects_dict()
        cls._cls_schema = type(
            cls.__name__ + "Schema", (ObjectSchema,), objects_dict
        )

I'm for now hacking this in my own application code with the use of hidden class based variables, with dynamic lookup

@classmethod
def columns(cls) -> typing.List[str]:
    
    f = lambda : "__{}_col_cache".format(cls.__name__)
    print(cls.__name__)
    if not hasattr(cls, f() ):
        cls._cls_schema = None
        del cls._cls_schema
        schema = cls.schema()
        out = set(schema.fields.keys()).union(set(["pik"]))
        if "_key" in out:
            out.remove("_key")
        setattr(cls,f(),list(out)) 
    return getattr(cls,f())

I can hopefully create a PR for this change soon.

@SoundsSerious
Copy link
Contributor Author

SoundsSerious commented Sep 9, 2022

It doesn't appear there are any other places that use the cache variables so should be ok?

@SoundsSerious
Copy link
Contributor Author

Found an issue, that was resolved with the following:

    @classmethod
    def schema(cls, only: typing.List[str] = None):
        """schema caches Marshmellow Schemas on this class to preserve memory"""

        f = lambda cl: "__{}__cls_schema".format(cl.__name__)
        g = lambda cl: "__{}__cls_schema_cache".format(cl.__name__)

        if not hasattr(cls, f(cls)):
            objects_dict = cls.get_objects_dict()
            setattr(
                cls,
                f(cls),
                type(cls.__name__ + "Schema", (ObjectSchema,), objects_dict),
            )

        # Extra fields related schema configuration
        unknown = EXCLUDE
        if cls._allow_extra_fields is True:
            unknown = INCLUDE

        if not hasattr(cls, g(cls)):
            # print(f'making {cls.__name__} schema with only=None')
            SC = getattr(cls, f(cls))()
            SC.unknown = unknown
            SC.object_class = cls
            setattr(cls, g(cls), {None: SC})

        if only is not None:
            # create a unique schema for this input by using a hashable set
            if isinstance(only, (list, tuple)):
                onlyKey = str(
                    set(only)
                )  # garuntees order / uniquness and hashability
            else:
                onlyKey = only

            if onlyKey not in g(cls):
                SC = getattr(cls, f(cls))(only=only)
                SC.unknown = unknown
                SC.object_class = cls
                getattr(cls, g(cls))[
                    onlyKey
                ] = SC  # asssign the unique schema for only

            # return the unique schema with onlykey
            return getattr(cls, g(cls))[onlyKey]
        else:

            return getattr(cls, g(cls))[None
            ```

@overbost
Copy link
Contributor

overbost commented Feb 6, 2023

This could be a solution for the bug i reported here

But if you inherit a lot of times the last class will have waste data inside:

class A(collection):
    pass
class B(a):
   pass
class C(b):
   pass

C have this properties:
__A__cls_schema
__A__cls_schema_cached
__B__cls__schema
__B__cls_schema_cached
__C__cls_schema_cached
__C__cls_schema_cached

i suggest to cache the name of the class and check if che cached name is the real name of the class

@classmethod
def schema(cls,only:typing.List[str]=None):
    '''schema caches Marshmellow Schemas on this class to preserve memory'''
    if not hasattr(cls,'_cls_schema') or 
        not hasattr(cls, 'cls_name') or 
        cls.__name__ != cls.cls_name:

        objects_dict = cls.get_objects_dict()
        cls._cls_schema = type(
            cls.__name__ + "Schema", (ObjectSchema,), objects_dict
        )
        cls._cls_name = cls.__name__

@SoundsSerious
Copy link
Contributor Author

@overbost thanks for the feedback here, this is an awesome library but there do seem to be a few bugs in the meta. Mostly this relates to marshmallow for example.

I generally agree its a good idea to make this class specific. In the solution above, I used the python behavior of appending a class name to a double underscore to facilitate this. The code above specifically sets a variable per class name on the class using lazy insanitation.

I don't believe this will be set multiple times on inherited classes, however I would love to know if this is wrong :)

I just ran a simple test case taking a class I previously called schema() on and then using it as a base class so the __<cls> variable exists when subclassing. I then ran schema() on that base class and only see one set of __<cls>__... that is named for the intended class. This shows that the class variables do not transfer.

I have been meaning to create a PR for this but have been too busy. I would be happy to review and test a PR if you were willing to put one together!

@SoundsSerious
Copy link
Contributor Author

SoundsSerious commented Feb 8, 2023

A note on subclassing.

If you want to make a base class it is best to use CollectionBase to define this.

When you are ready to create an actual ORM class you would inherit Collection as well.

So your example would look like this, certainly not ideal but it works. There are similar issues in SQLAlchemy as i recall, and I think any meta-programming approach could easily have these issues as well.

class BaseClass(CollectionBase):
    pass
class ABase(BaseClass):
    pass
class BBase(ABase):
   pass
class CBase(BBase):
   pass
   
class A(ABase,Collection):
   pass
class B(BBase,Collection):
   pass
class C(CBase,Collection):
   pass

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants