Roadmap
This document describes proposed changes. Please see the changelog for changes in previous releases and in the current development version. If you see changes you would like to discuss, please open a discussion (or issue) and, when it is resolved, update this document.
- Organize unit tests in a structure that matches the code (either before or after the API clean-up in 3.x.x)
- Release asdf-compression adding new block compression algorithms (zstd, lz4)
- Move pytest plugin to a different repository (or remove it entirely)
- Store block index as binary object (to speed up parsing of files with a large number of blocks)
- Use block offsets relative to start of blocks (instead of start of file)
- Store asdf metadata (history, extensions, etc) in a separate yaml document to free up namespace for user tree keys
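The two block-index proposals above (a binary index, and offsets relative to the start of the blocks) can be illustrated with a minimal sketch. The layout used here (a little-endian uint64 count followed by one uint64 per block) and the function names are assumptions made for illustration; they are not part of the ASDF standard or of asdf.

```python
import struct


def pack_block_index(absolute_offsets, blocks_start):
    """Encode block offsets relative to blocks_start as little-endian uint64 values.

    Hypothetical layout: an entry count followed by one offset per block.
    """
    relative = [offset - blocks_start for offset in absolute_offsets]
    if any(r < 0 for r in relative):
        raise ValueError("block offset precedes the start of the block section")
    return struct.pack(f"<Q{len(relative)}Q", len(relative), *relative)


def unpack_block_index(data, blocks_start):
    """Decode the index and convert relative offsets back to file offsets."""
    (count,) = struct.unpack_from("<Q", data, 0)
    relative = struct.unpack_from(f"<{count}Q", data, 8)
    return [blocks_start + r for r in relative]


offsets = [1024, 5120, 9216]
encoded = pack_block_index(offsets, blocks_start=1024)
decoded = unpack_block_index(encoded, blocks_start=1024)
```

Because the stored values are relative, the same index stays valid if the YAML portion before the blocks grows or shrinks; only `blocks_start` changes.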
By default, do not memmap arrays.
Remove features deprecated in 3.x.x.
Switch to ASDF standard 1.6.0 as default.
Disable saving "base array" by default (hopefully with a deprecation in 3.x.x).
Remove the pytest-asdf plugin
This plugin does not provide much utility (it reads examples from the schema) and is intimately tied to pytest (which requires that pytest be installed and used to use these tests). It should be possible to replace the functionality with one or more functions in asdf.testing.helpers to allow the plugin to be deprecated. See related issues/PRs:
- https://github.com/asdf-format/asdf/issues/924
- https://github.com/asdf-format/asdf/issues/791
- https://github.com/asdf-format/asdf/issues/790
- https://github.com/asdf-format/asdf/issues/749
- https://github.com/asdf-format/asdf/issues/689
- https://github.com/asdf-format/asdf/pull/1756
There are a number of improvements that could be made to the public API including:
- prefix non-public modules, etc with an underscore to make it more explicit that they are private
- lay out a plan for top-level imports and deprecate anything we don't want to include at the top-level (e.g. Stream, IntegerType)
- evaluate what parts of the public API are critical to asdf functionality and what should be provided by a different library (e.g. asdf.util.get_class_name)
- add custom exceptions to replace generic exceptions and to replace exceptions that are 'passed through' from dependencies (e.g. pyyaml.RepresenterError)
- should generic_io be private? Can it be removed/replaced?
- are there redundant options/configuration settings (see: https://github.com/asdf-format/asdf/pull/1477 https://github.com/asdf-format/asdf/pull/1476)
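As a rough sketch of the custom-exception idea in the list above, a library can catch an exception raised by a dependency and re-raise its own type while keeping the original as the cause. The names `AsdfSerializationError` and `dump_node` are hypothetical, and `json.dumps` stands in for a dependency such as pyyaml.

```python
import json


class AsdfSerializationError(Exception):
    """Hypothetical library-specific error raised when a node cannot be written."""


def dump_node(node):
    """Serialize a node, translating the dependency's error into a library error."""
    try:
        # json.dumps is only a stand-in for a third-party serializer.
        return json.dumps(node)
    except TypeError as err:
        message = f"cannot serialize object of type {type(node).__name__}"
        # "from err" keeps the original exception available as __cause__.
        raise AsdfSerializationError(message) from err
```

Downstream code can then catch a single documented exception type without depending on which serializer is used internally.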
For example:
- can external blocks now be implemented as an extension?
- can external array reference be implemented as an extension?
- lz4 compression, this can be moved to an extension
As AsdfConfig provides a centralized and flexible way to define various asdf options, we should investigate moving more options into AsdfConfig. This will likely result in too many options, so we should consider ways to organize, nest, or otherwise make these options easy to understand and use.
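One possible way to keep a growing set of options manageable is to group them into nested objects. The following is a minimal sketch under that assumption; the class names, groupings, and option names are hypothetical and do not reflect the current AsdfConfig layout.

```python
from dataclasses import dataclass, field


@dataclass
class ArrayOptions:
    # Hypothetical options related to array storage and loading.
    memmap: bool = False
    lazy_load: bool = True


@dataclass
class ValidationOptions:
    # Hypothetical options related to schema validation.
    validate_on_read: bool = True


@dataclass
class Config:
    # default_factory gives every Config its own independent option groups.
    array: ArrayOptions = field(default_factory=ArrayOptions)
    validation: ValidationOptions = field(default_factory=ValidationOptions)


config = Config()
config.array.memmap = True
```

Grouping keeps related options discoverable (for example through tab completion on `config.array`) without flattening everything into one long namespace.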
Consider some new features! These include:
- super-dictionary access to the ASDF tree with per-node lazy-loading
- partial block reading (for local and cloud-based non-chunked files)
- migrate to a new jsonschema library (perhaps jsonschema-rs)
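To make the per-node lazy-loading idea from the list above concrete, here is a minimal sketch of a read-only mapping that only loads a node the first time it is accessed. `LazyTree` and the loader callables are hypothetical; a real implementation would read from the file rather than call in-memory functions.

```python
from collections.abc import Mapping


class LazyTree(Mapping):
    """Read-only mapping whose values are loaded on first access and then cached."""

    def __init__(self, loaders):
        self._loaders = dict(loaders)  # key -> zero-argument callable
        self._cache = {}

    def __getitem__(self, key):
        if key not in self._cache:
            # Raises KeyError for unknown keys, as a mapping should.
            self._cache[key] = self._loaders[key]()
        return self._cache[key]

    def __iter__(self):
        return iter(self._loaders)

    def __len__(self):
        return len(self._loaders)


calls = []


def load_data():
    calls.append("data")  # record that the (pretend) expensive load ran
    return [1, 2, 3]


tree = LazyTree({"data": load_data, "meta": lambda: {"author": "someone"}})
```

Iterating over the keys does not trigger any load; only indexing a specific key does.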
As this is a major version change, asdf 3.0 removes several deprecated features:
- legacy extension API based on AsdfType
- AsdfInFits
- other deprecated features
In addition to the above removals, asdf 3.0 adds a few new features to the public API:
- Converter block access
- Converter deferral
- Array storage option control to AsdfConfig
Internally, asdf 3.0 will include a major rewrite of ASDF block management code that is necessary to move NDArrayType to a Converter. This rewrite fixes many bugs in ASDF block reading and writing.
Additionally, asdf 2.15.1 vendorized jsonschema 4.17.3 internally to deal with jsonschema 4.18 dropping support for features required by asdf. To ease the transition for downstream packages, asdf 2.15.1 kept jsonschema as a dependency and attempts to use some exceptions from jsonschema (so that downstream code that catches these errors continues to function). asdf 3.0 will drop jsonschema as a dependency.
We also want to strongly consider mentioning in the 3.0 docs that we will be (or are at least considering) disabling memmapping as the default option when files are opened in version 4.0.
This release will remove the experimental subclass attribute serialization feature, add support for ASDF Standard 1.6.0, add a global configuration feature, and add new APIs for extending ASDF.
The experimental subclass attribute serialization feature will be removed (and its supporting schema dropped from ASDF Standard 1.6.0).
The proposed roadmap for ASDF Standard 1.6.0 entails the following new requirements:
- Schema defaults must not be added to or removed from the tree when working with an ASDF Standard 1.6.0 file. https://github.com/asdf-format/asdf/pull/860
- Refuse to write complex YAML keys such as maps or lists. https://github.com/asdf-format/asdf/pull/866
- Support for tag URI schemes beyond tag:. https://github.com/asdf-format/asdf/pull/854, https://github.com/asdf-format/asdf/pull/855
- Support for serializing null values. https://github.com/asdf-format/asdf/pull/863
- Support for tags whose URI prefix has changed (the current ExtensionType API requires that tags supported by the same class differ only in version). https://github.com/asdf-format/asdf/pull/853
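The requirement to refuse complex YAML keys can be sketched as a check that runs over the tree before writing. This is an illustration only: `check_keys` and `SCALAR_KEY_TYPES` are hypothetical names, and the sketch assumes that any key that is not a plain scalar (for example a tuple standing in for a YAML sequence key) counts as complex.

```python
# Key types treated as simple scalars in this sketch (an assumption).
SCALAR_KEY_TYPES = (str, int, float, bool, type(None))


def check_keys(node, path="root"):
    """Raise ValueError if any mapping in the tree uses a non-scalar key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if not isinstance(key, SCALAR_KEY_TYPES):
                raise ValueError(f"complex key {key!r} at {path} cannot be written")
            check_keys(value, f"{path}[{key!r}]")
    elif isinstance(node, (list, tuple)):
        for index, item in enumerate(node):
            check_keys(item, f"{path}[{index}]")
```

For example, `check_keys({"a": {(1, 2): "x"}})` raises a `ValueError` that names the offending key and its location, while a tree with only string keys passes silently.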
Introduce a configuration mechanism that will allow certain AsdfFile options (such as validate_on_read) to be set globally.
- https://github.com/asdf-format/asdf/pull/819
- https://github.com/asdf-format/asdf/pull/839
- https://github.com/asdf-format/asdf/pull/844
- https://github.com/asdf-format/asdf/pull/847
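A global configuration mechanism of this kind could look roughly like the sketch below: a process-wide configuration object, a context manager for temporary overrides, and a per-call argument that falls back to the global value. All names here (`Config`, `get_config`, `config_context`, `open_file`, and the boolean option named `validate_on_read`) are written for illustration and are not a description of the asdf implementation.

```python
import contextlib
import copy


class Config:
    def __init__(self):
        self.validate_on_read = True  # illustrative boolean option


# The last entry is the active configuration; the first is the global default.
_config_stack = [Config()]


def get_config():
    """Return the currently active configuration."""
    return _config_stack[-1]


@contextlib.contextmanager
def config_context():
    """Temporarily work on a copy of the active configuration."""
    _config_stack.append(copy.copy(_config_stack[-1]))
    try:
        yield _config_stack[-1]
    finally:
        _config_stack.pop()


def open_file(path, validate_on_read=None):
    """Stand-in for opening a file: an explicit argument wins over the global option."""
    if validate_on_read is None:
        validate_on_read = get_config().validate_on_read
    return {"path": path, "validated": validate_on_read}
```

Changes made inside `with config_context() as cfg:` are discarded when the block exits, so library code can adjust options without leaking them to callers.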
The current ExtensionType API is complicated and difficult to reason about. We'll introduce a new simplified API for handling custom tags which will also be sufficiently flexible to handle the new requirements of ASDF Standard 1.6.0.
The current AsdfExtension API does not include any kind of extension identifier, which means we end up describing the extension by Python class name in the ASDF file's metadata, which is not a portable solution. There is also no convenient way for an extension to express that there are different versions of itself, what the default version should be, and what versions of what tags are permissible under that version.
We'll introduce a new extension API with properties that supply this missing information (and also provide a list of tag handlers associated with that extension).
- https://github.com/asdf-format/asdf/pull/850
- https://github.com/asdf-format/asdf/pull/851
- https://github.com/asdf-format/asdf/pull/853
- https://github.com/asdf-format/asdf/pull/857
- https://github.com/asdf-format/asdf/pull/874
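As a hedged illustration of what an extension with an explicit identifier and a list of tag handlers might look like, consider the sketch below. The class names, attribute names (`extension_uri`, `tag_handlers`, `to_tree`, `from_tree`), and the example.com URIs are all hypothetical; they are not the API that asdf ships.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float


class PointHandler:
    """Hypothetical tag handler: converts Point objects to and from plain trees."""

    # Several tag versions can be listed for the same handler.
    tags = [
        "asdf://example.com/tags/point-1.0.0",
        "asdf://example.com/tags/point-1.1.0",
    ]
    types = [Point]

    def to_tree(self, obj):
        return {"x": obj.x, "y": obj.y}

    def from_tree(self, node):
        return Point(node["x"], node["y"])


class ExampleExtension:
    """Hypothetical extension identified by a URI rather than a Python class name."""

    extension_uri = "asdf://example.com/extensions/example-1.0.0"
    tag_handlers = [PointHandler()]

    @property
    def tags(self):
        # All tags permitted under this version of the extension.
        return [tag for handler in self.tag_handlers for tag in handler.tags]


extension = ExampleExtension()
handler = extension.tag_handlers[0]
node = handler.to_tree(Point(1.0, 2.0))
```

Recording `extension_uri` in file metadata, instead of a class name, lets another implementation recognize which extension and version wrote the file.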
The current AsdfExtension API for retrieving schemas has some drawbacks:
- It is not possible to list the schemas provided by the extension. The code that maps schema URIs to file paths will happily map to filenames that don't exist.
- The schema content must be provided as a URL, which is an obstacle to storing schemas as package resources or writing schemas in the REPL during development.
We will introduce a new API for mapping schema URIs to schema content.
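One way to address both drawbacks is to expose schemas as a mapping from URI to content, which can be listed and does not require the content to live at a URL. The sketch below assumes this design; `SchemaContentMapping` and the example.com URI are hypothetical.

```python
from collections.abc import Mapping


class SchemaContentMapping(Mapping):
    """Maps schema URIs directly to schema content (as bytes)."""

    def __init__(self, schemas):
        # Accept str or bytes so schemas can be written inline, e.g. in the REPL.
        self._schemas = {
            uri: content.encode("utf-8") if isinstance(content, str) else bytes(content)
            for uri, content in schemas.items()
        }

    def __getitem__(self, uri):
        return self._schemas[uri]

    def __iter__(self):
        return iter(self._schemas)

    def __len__(self):
        return len(self._schemas)


POINT_SCHEMA_URI = "asdf://example.com/schemas/point-1.0.0"
schemas = SchemaContentMapping(
    {POINT_SCHEMA_URI: "type: object\nproperties:\n  x: {type: number}\n"}
)
```

Because the object is a mapping, `list(schemas)` enumerates exactly the schemas that exist, and looking up a URI that is not provided raises `KeyError` instead of resolving to a missing file.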