Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make asdf standard 1.6.0 stable (default) #1744

Merged
merged 8 commits into from
Nov 18, 2024

Conversation

braingram
Copy link
Contributor

@braingram braingram commented Jan 30, 2024

Description

I milestoned this PR for asdf 4.0 because it's a rather major change.

Making 1.6.0 stable will require:
Phase 0 (at any time):

Phase 1:

Phase 2:

Phase 3:

Unfortunately at this point most of the schema repo CIs will be broken (as will asdf-astropy CI) due to the intertwined nature of these packages.

Phase 4:

Phase 5:

  • this PR

Much of the above is due to updates to the ndarray schema:

All schemas that $ref ndarray (like quantity) then need a version bump and so on...

This is further complicated by quantity-1.1.0 currently existing in 2 released packages:

(because of an incomplete effort to split unit fits table and time out of the core).
The approach taken here is to decomission asdf-unit-schemas. asdf-standard will continue to provide updates to the unit (and other non-core) schemas. This seems sensible as these schemas are highly interdependent. More details can be found in: asdf-format/asdf-standard#422

One question that occupied a lot of my thought was "should we change some of the $refs to tags?" On one hand this could make migrations like this easier (if every ndarray $ref was instead a wildcard tag (ndarray-1.*) most of these schemas would not need to be updated. However, this links the tag to the schema which has a few downsides:

  • as tags depend on the extension version, we have effectively added a second version to the schema (a tag might only appear say in asdf-standard 1.6.0 which means that a schema that contains a tag link to that tag will only be valid for asdf-standard 1.6.0).
  • tag links make "duck typing" with a schema very difficult if not impossible.

For example unit/quantity contains a $ref link to unit/unit. This means that asdf-astropy can use a differently tagged unit (astropy/unint for non-vo units) and still produce a valid unit/quantity (see the wfi schema in rad as an example). If instead unit/quantity used a tag link to unit/unit, this same "duck typing" would not work with a differently tagged unit. asdf-astropy would have to instead:

  • use a different schema for the unit/unit tag (this would break the tag-schema mapping defined in the standard unit manifest making any file produced with this approach likely to be incompatible with any other implementation or with other libraries that might choose to implement units)
  • define a new astropy/quantity for non-vo units (and so on up the tree of schema references...)
  • add the astropy/unit tag to the quantity schema (further linking tags and schemas and requiring addition standard schema updates for "downstream" libraries).

At the moment I am of the mind that keeping the schemas as separate from the tags as possible is the better option (so $ref instead of tag). This allows the schemas to function even if they are treated as normal "jsonschema"s. Additionally the tag validator behavior seems loosely defined in the standard where it states "Implementation of this validator is optional and depends on details of the YAML parser." For similar reasons the above asdf-transform-schemas PR did not rely on the feature in asdf to use multiple schemas in a tag definition (to allow the many transforms that $ref the transform schema to instead include them in the manifest).

Both stdatamodels (datamodels) and rad use metaschemas based off of asdf-schema-1.0.0 (which is updated to 1.1.0 in asdf-format/asdf-standard#422). As neither of these packages version schemas updating the metaschema version will force these packages to use exclusively the 1.6.0 standard (which will almost certainly cause issues if old versions of asdf/asdf-standard are used). Instead, I suggest we not update the metaschemas (and keep them using the old asdf-schema-1.0.0 metaschema). The only downside is the lack of float16 support in the datatype keyword validator. This seems like an acceptable limitation for the time being. Once asdf standard 1.6.0 is stable and the asdf version that sets it as the default agrees with the minimum required version for each of those packages the metaschemas can be updated.

Closes: #1866

Checklist:

  • pre-commit checks ran successfully
  • tests ran successfully
  • for a public change, a changelog entry was added
  • for a public change, documentation was updated
  • for any new features, unit tests were added

@braingram braingram force-pushed the standard_1p6 branch 2 times, most recently from 3271e74 to 70b4888 Compare February 7, 2024 21:58
@braingram braingram changed the title TEST: make 1.6.0 stable Make asdf standard 1.6.0 stable (default) Feb 8, 2024
@braingram braingram added this to the 4.0.0 milestone Mar 13, 2024
@braingram braingram force-pushed the standard_1p6 branch 2 times, most recently from d70aa97 to b579429 Compare March 18, 2024 22:22
@braingram braingram closed this Mar 27, 2024
@braingram braingram reopened this Mar 27, 2024
@braingram braingram force-pushed the standard_1p6 branch 2 times, most recently from 613bef2 to f08b3f2 Compare March 27, 2024 16:53
@braingram
Copy link
Contributor Author

braingram commented Aug 2, 2024

Reserving a comment for regression tests.

Roman 1 expected and unrelated failure:
https://github.com/spacetelescope/RegressionTests/actions/runs/11562256445
JWST all pass:
https://github.com/spacetelescope/RegressionTests/actions/runs/11562239771

@braingram braingram force-pushed the standard_1p6 branch 2 times, most recently from 0534965 to 810e601 Compare September 19, 2024 16:03
@braingram braingram marked this pull request as ready for review October 28, 2024 20:34
@braingram braingram requested a review from a team as a code owner October 28, 2024 20:34
@braingram
Copy link
Contributor Author

The weldx downstream failure is real. I rebased the open PR and added a comment that the development branch will soon be moving to asdf 4.0. We will need to keep in mind that the asdf downstream test will fail after this PR is merged.

@braingram
Copy link
Contributor Author

braingram commented Nov 8, 2024

#1866 to track adding this change to "what's new"

EDIT: pushed commit that adds this to docs

@perrygreenfield
Copy link
Contributor

Both stdatamodels (datamodels) and rad use metaschemas based off of asdf-schema-1.0.0 (which is updated to 1.1.0 in asdf-format/asdf-standard#422). As neither of these packages version schemas updating the metaschema version will force these packages to use exclusively the 1.6.0 standard (which will almost certainly cause issues if old versions of asdf/asdf-standard are used). Instead, I suggest we not update the metaschemas (and keep them using the old asdf-schema-1.0.0 metaschema). The only downside is the lack of float16 support in the datatype keyword validator. This seems like an acceptable limitation for the time being. Once asdf standard 1.6.0 is stable and the asdf version that sets it as the default agrees with the minimum required version for each of those packages the metaschemas can be updated.

Does this mean that use of float16 in datatype won't pass validation until 1.6.0 becomes default? I want to make sure I understand the consequence (not that I disagree with it).

@braingram
Copy link
Contributor Author

Both stdatamodels (datamodels) and rad use metaschemas based off of asdf-schema-1.0.0 (which is updated to 1.1.0 in asdf-format/asdf-standard#422). As neither of these packages version schemas updating the metaschema version will force these packages to use exclusively the 1.6.0 standard (which will almost certainly cause issues if old versions of asdf/asdf-standard are used). Instead, I suggest we not update the metaschemas (and keep them using the old asdf-schema-1.0.0 metaschema). The only downside is the lack of float16 support in the datatype keyword validator. This seems like an acceptable limitation for the time being. Once asdf standard 1.6.0 is stable and the asdf version that sets it as the default agrees with the minimum required version for each of those packages the metaschemas can be updated.

Does this mean that use of float16 in datatype won't pass validation until 1.6.0 becomes default? I want to make sure I understand the consequence (not that I disagree with it).

float16 support is available in 1.6.0. We could start using it in (for example) romancal once 1.6.0 is "stable" (which I'm considering the same as making it the default). The metaschema issues won't prevent using float16 arrays in (for example) roman datamodels (schemas aren't checked against metaschemas as part of validation).

However the metaschema issue will mean that adding a datatype validator to the float16 array would cause the call to check_schema that pytest-asdf runs to fail.

To provide an example. Let's say we want to change ImageModel.data to use float16. We could change this line:
https://github.com/spacetelescope/rad/blob/f2d7af769ec0fd77283cdcbe020cca96f026a5a4/src/rad/resources/schemas/wfi_image-1.0.0.yaml#L45
to:

datatype: float16

and save files without issue (no validation errors as long as we assign a float16 array to data). However the following would fail (and is equivalent to a check that pytest-asdf runs on schema files):

>> asdf.schema.check_schema(asdf.schema.load_schema("asdf://stsci.edu/datamodels/roman/schemas/wfi_image-1.0.0"))
ValidationError: 'float16' is not valid under any of the given schemas

due to the wfi_image-1.0.0 using the asdf://stsci.edu/datamodels/roman/schemas/rad_schema-1.0.0 metaschema which refs http://stsci.edu/schemas/asdf/asdf-schema-1.0.0 which refs http://stsci.edu/schemas/asdf/core/ndarray-1.0.0#/definitions/datatype which doesn't define float16 as a valid datatype.

Note that this only prevents adding something like:

datatype: float16

to the schemas that extend asdf-schema-1.0.0. Files can contain float16 arrays (as long as they use ASDF 1.6.0).

@perrygreenfield
Copy link
Contributor

So this only affects validation of schemas against the metaschema I take it. OK, that isn't so serious.

Copy link
Contributor

@perrygreenfield perrygreenfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@braingram braingram enabled auto-merge November 18, 2024 17:19
@braingram braingram merged commit b67f11b into asdf-format:main Nov 18, 2024
47 of 49 checks passed
@braingram braingram deleted the standard_1p6 branch November 18, 2024 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add ASDF standard 1.6.0 default switch to "what's new"
2 participants