Replies: 5 comments 5 replies
-
I don't have a strong opinion. I favor being explicit, but I recognize that requiring type designators seems inelegant for simple type-inference use cases. Re: Pydantic: Pydantic uses its own version of a type designator (a discriminator field) but can work without one. Without a type designator, Pydantic will naively parse the object as the specified base class, or as the first matching class out of a union of options. You can change that behavior with custom, fairly complex inference code. Here is a nice write-up: https://blog.devgenius.io/deserialize-child-classes-with-pydantic-that-gonna-work-784230e1cf83 . However, I think that even with this inference code, there are edge cases where disambiguation is simply not possible. For example, if neither …
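For concreteness, here is a minimal sketch of what Pydantic's own "type designator" mechanism looks like, using Pydantic v2 discriminated unions. The class and slot names (`Person`, `Organization`, `person_id`, `organization_id`) are borrowed from the examples elsewhere in this thread; this is not LinkML-generated code, just an illustration of the pattern.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter

class Person(BaseModel):
    # the Literal-typed "type" field acts as the explicit type designator
    type: Literal["Person"] = "Person"
    person_id: str

class Organization(BaseModel):
    type: Literal["Organization"] = "Organization"
    organization_id: str

# Discriminated ("tagged") union: Pydantic dispatches on the "type" key
# instead of trying each union member in declaration order.
Entity = Annotated[Union[Person, Organization], Field(discriminator="type")]

entity = TypeAdapter(Entity).validate_python({"type": "Person", "person_id": "P1"})
assert isinstance(entity, Person)
```

Without the `discriminator`, Pydantic falls back to the first-match behavior described above, which is where the ambiguous cases arise.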
-
Somewhat related: how would you add type designators to your example? I can't seem to get it to work... Does it work only when not inlined as a list?

Schema:

```yaml
id: https://w3id.org/linkml/compliance/type-inference
name: type_inference
prefixes:
  linkml: https://w3id.org/linkml/
  compliance: https://w3id.org/linkml/compliance/
default_range: string
imports:
  - linkml:types
classes:
  Container:
    attributes:
      entities:
        range: Entity
        multivalued: true
        inlined_as_list: true  # <-- I added this
  Entity:
    attributes:
      name:
      type:  # <-- I added this
        designates_type: true  # <-- I added this
  Organization:
    is_a: Entity
    attributes:
      organization_id:
  Person:
    is_a: Entity
    attributes:
      person_id:
```

Data:

```yaml
entities:
  - person_id: P1
    type: Person  # <-- I added this
  - organization_id: P1
    type: Organization  # <-- I added this
```

Output:
-
Let me try to describe our use case (since I'm one of the motivators for this idea). We are defining a schema to model devices. The goal of such a schema is to provide the vocabulary that our software will understand and support. Support for additional devices can be added through extensions. Those extensions can have their own vocabulary derived (…). Our software only cares about our classes and slots and those derived (…).

Let me try to illustrate it with an example (derived from the examples you use in your tests). Given a base schema:

And a derived schema:

And these data:

Our software will know about … Hopefully this makes it clearer what I'm trying to accomplish, so you can better understand my use cases.
-
I would expect the validator not to complain with the newly added flag `--include-range-class-descendants`, contributed by me and merged just yesterday (no release includes it yet). It works like the same flag for `gen-json-schema`.
________________________________
From: Ryan Ly
Sent: Friday, July 21, 2023 10:18:18 PM
To: linkml/linkml
Subject: Re: [linkml/linkml] Allow for type inference even in the absence of type generators (Discussion #1548)
+1 for arbitrary user-defined schema extensions. That would be necessary for use in the NWB project.
Regarding type inference in this case, note that Pydantic currently supports parsing the list of event data as instances of the Event class, and not their original subclass (e.g., ExaminationEvent). However, Pydantic does not support dumping of the list of event data to contain the extra fields added by subclasses (without a fair amount of additional code). Example:
```python
from pydantic import BaseModel

class BasePet(BaseModel):
    legs: int

class Cat(BasePet):
    meows: float

class Dog(BasePet):
    barks: float

class Container(BaseModel):
    pets: list[BasePet]

container = Container(pets=[Cat(legs=4, meows=2.718), Dog(legs=3, barks=3.14)])

# dumping the model results in loss of fields from subclasses;
# parsing this JSON will result in BasePet objects
print(container.model_dump_json())

# extra fields are OK and ignored during validation;
# parsing this JSON will result in BasePet objects
print(Container.model_validate_json('{"pets": [{"legs": 4, "meows": 1}, {"legs": 3, "barks": 2}]}'))
```

Output:

```
{"pets":[{"legs":4},{"legs":3}]}
pets=[BasePet(legs=4), BasePet(legs=3)]
```
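As an aside, the "fair amount of additional code" for dumping can be reduced in Pydantic v2 with `SerializeAsAny`, which switches that field to duck-typed serialization so subclass fields survive the dump. A minimal sketch, reusing the `BasePet`/`Cat` classes from the example above (this addresses only dumping; parsing still produces `BasePet` objects unless a discriminator is added):

```python
from pydantic import BaseModel, SerializeAsAny

class BasePet(BaseModel):
    legs: int

class Cat(BasePet):
    meows: float

class Container(BaseModel):
    # SerializeAsAny serializes each item by its runtime type, so
    # subclass-only fields are kept in model_dump_json()
    pets: list[SerializeAsAny[BasePet]]

container = Container(pets=[Cat(legs=4, meows=2.718)])
print(container.model_dump_json())  # {"pets":[{"legs":4,"meows":2.718}]}
```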
I think the LinkML validator would complain that there are additional properties, so LinkML would need to allow that, perhaps with a new key "allow_additional" or "allow_subclasses":

```yaml
Person:
  is_a: Entity
  attributes:
    name:
    events:
      range: Event
      multivalued: true
      allow_additional: true
```

That seems reasonable to me. But I think I still prefer being explicit with a type designator.
-
It would be good to get some fully worked examples around inlining. I've been trying to get the schema and data on https://linkml.io/linkml/schemas/inlining.html#example to validate and cannot.

I believe this to be the schema:

```yaml
id: https://w3id.org/linkml/examples/organism
name: organism-test-model
prefixes:
  linkml: https://w3id.org/linkml/
imports:
  - linkml:types
classes:
  Organism:
    attributes:
      id:
        identifier: true
      name:
        range: string
      has_subtypes:
        range: Organism
        multivalued: true
        inlined: true
```

and this to be the data:

```yaml
id: NCBITaxon:40674
name: mammals
has_subtypes:
  - id: NCBITaxon:9443
    name: primates
    has_subtypes:
      - id: NCBITaxon:9606
        name: humans
      - id: NCBITaxon:9682
        name: cats
```

and the command to be `linkml-validate -s organism-model.yaml organism-data-inline.yaml`, but this is the error that I receive:
-
Given a schema:
Should the following data be considered valid?
Currently it is not. The schema author has the ability to add a type_designator to allow for dynamic designation of type. UPDATE: the generated Pydantic currently follows this pattern; thanks to @rly for the link.
However, there is an argument that this should not be necessary. There is only one valid interpretation of the data. Reasoning can be applied at run time to determine the correct interpretation. For an example of a framework that excels at this kind of thing, see cuelang.
Note the above example is trivial. We can imagine scenarios involving chaining through a series of nested objects looking at range constraints, rules, other kinds of constraints, to arrive at a single correct interpretation.
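To make the run-time reasoning concrete, here is a hypothetical sketch (not a LinkML API) of the simplest form of such inference: match an untyped object's keys against each candidate class's declared slots and accept only an unambiguous match. The slot sets are assumptions modeled on the Person/Organization example in this thread.

```python
# Hypothetical: slots each candidate class declares (assumed for illustration)
CLASS_SLOTS = {
    "Person": {"name", "person_id"},
    "Organization": {"name", "organization_id"},
}

def infer_type(obj: dict) -> str:
    """Return the single class whose slots cover all keys of obj."""
    matches = [cls for cls, slots in CLASS_SLOTS.items() if set(obj) <= slots]
    if len(matches) != 1:
        # zero matches or more than one: no unambiguous interpretation exists
        raise ValueError(f"cannot disambiguate, candidates: {matches}")
    return matches[0]

print(infer_type({"person_id": "P1"}))        # Person
print(infer_type({"organization_id": "O1"}))  # Organization
```

Note that `{"name": "x"}` matches both classes and raises, which is exactly the ambiguity that a real reasoner would try to resolve by chaining through nested objects, ranges, and rules.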
Arguments against:

- … person_id …
- Adding clever magic can obfuscate validation

Perhaps the most LinkML-esque approach here is to be pluralistic: add a schema metaslot `uses_dynamic_type_inference`. If this is True and you try to generate an artefact that does not support this (e.g. current Pydantic classes), a warning or error is raised. But we do not block people who want to try to implement these semantics.