Paths as URIs #243
What should we do with the paths in kerchunk references? Are they always meant as absolute? I guess we should assume they are absolute, unless they have
They are always meant "as interpreted by the target filesystem". The nature of that filesystem might be implied by the protocol of a path alone, but commonly additional arguments are also required. This means that relative paths do work if the target happens to be the local filesystem (file://), but of the other filesystems, I think only ssh supports this concept at all. I would not expect this to be meaningful in basically any practical case. Note that the dir:// filesystem adds prefixes to URLs for any filesystem, if that's useful at all.
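To make the ambiguity concrete, here is a minimal sketch using only the Python standard library. It assumes that a path with no scheme is meant for the local filesystem, which is exactly the kind of guess the comment above warns about. The helper name `describe_path` is hypothetical and is not part of kerchunk or fsspec.

```python
from urllib.parse import urlsplit


def describe_path(path: str) -> tuple[str, bool]:
    """Return (protocol, is_absolute) for a reference path.

    Hypothetical helper for illustration only.
    """
    scheme = urlsplit(path).scheme
    if scheme:
        # An explicit scheme names the target filesystem, and the
        # path is treated as absolute on that filesystem.
        return scheme, True
    # No scheme: the protocol is not recorded in the path itself, so we
    # ASSUME the local filesystem. Only a leading "/" makes it absolute.
    return "file", path.startswith("/")


print(describe_path("s3://bucket/test.nc"))
print(describe_path("/directory/test.nc"))
print(describe_path("test.nc"))
```

Note that this simple scheme check would misread a Windows path such as `C:\data\test.nc` as having the scheme `c`, so a real implementation would need extra care there.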
(I am happy to require absolute paths even if it makes some tests slightly more verbose)
Thanks @martindurant !
But is the nature of the filesystem explicitly recorded in the kerchunk references format anywhere? Obviously if the prefix is explicit (e.g.
Would this approach work then?
This might be helpful if the above approach doesn't work.
No. The original intention was to have these in the "templates", but in practice, the remote_protocol, remote_options and fss arguments to ReferenceFileSystem are used (and often encoded in Intake prescriptions) in cases of ambiguity.
I think that would have the same problem as
Turns out I didn't need a wrapper; I can get away with just more if...else logic for each of the possible types of paths. It's slightly less neat but it avoids the dependency issues.
This now works! It now builds on #323 instead of #318, which allowed me to test handling of relative paths for dmrpp. The typing CI is failing, but only with errors inside the hdf reader (cc @sharkinsspatial), in lines I didn't touch, which I don't understand, so I'm going to punt on that so I can merge this (because it blocks earth-mover/icechunk#402).
This PR closes #242 at the data model level: all paths are coerced to absolute URIs (i.e. file:///directory/test.nc or s3://bucket/test.nc) as they go into the Manifest.
As this forbids constructing manifests using relative paths, it requires minor changes to many tests (e.g. test.nc -> /test.nc). It also will require slightly more invasive changes to any tests that involve kerchunk references.
docs/releases.rst
New functions/methods are listed in api.rst
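The coercion described in the PR description can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the PR: the function name `to_absolute_uri` is hypothetical, and it assumes that scheme-less absolute paths refer to the local filesystem. It uses the same if/else style mentioned earlier in the thread.

```python
from urllib.parse import urlsplit


def to_absolute_uri(path: str) -> str:
    """Coerce a path to an absolute URI, or raise if it is relative.

    Hypothetical sketch; the real implementation may differ.
    """
    if not path:
        raise ValueError("path must not be empty")
    if urlsplit(path).scheme:
        # Already a URI with an explicit scheme, e.g. s3://bucket/test.nc
        return path
    if path.startswith("/"):
        # Absolute local path: ASSUMED to mean the local filesystem.
        return "file://" + path
    # Relative paths are rejected, matching "forbids constructing
    # manifests using relative paths" in the description.
    raise ValueError(f"relative paths are not allowed: {path!r}")


print(to_absolute_uri("/directory/test.nc"))
print(to_absolute_uri("s3://bucket/test.nc"))
```

Under this sketch a test that previously passed `test.nc` would have to pass `/test.nc` (or a full URI) instead, which is the kind of test change the description mentions.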
Sub-tasks:
- .rename_paths method automatically
- fs_root option internally
- virtual_backend_kwargs to all backends (see Add virtual_backend_kwargs argument to open_virtual_dataset #315)
- Deprecation warning for reader_options (rename to fsspec_kwargs?) (also see Add virtual_backend_kwargs argument to open_virtual_dataset #315)
- fs_root for kerchunk and dmrpp readers as an option to virtual_backend_kwargs (requires Add virtual_backend_kwargs argument to open_virtual_dataset #315)
- dmrpp EDIT: turns out I don't think it's necessary, at least not to get tests to pass
- fs_root kerchunk reader (requires Refactor kerchunk reader tests to call open_virtual_dataset #317)
- For dmrpp reader (requires Add dmrpp relative path integration test #318)
- fs_root for kerchunk and dmrpp readers
- fs_root automatically in other readers
- Filenames containing trailing '/#\d+/' are not supported: earth-mover/icechunk#279 and ValueError when decoding virtual reference: Filenames containing trailing '/#\d+/' are not supported earth-mover/icechunk#402)
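The sub-tasks name an fs_root option for the kerchunk and dmrpp readers without spelling out its behaviour. One plausible reading, and it is only an assumption here, is that relative paths found in references are resolved against fs_root before being coerced to absolute URIs. A minimal sketch of that idea, with a hypothetical function name:

```python
from urllib.parse import urlsplit


def resolve_with_fs_root(path: str, fs_root: str) -> str:
    """Resolve a relative path against fs_root (ASSUMED semantics).

    Paths that already carry a scheme, or that start with "/", are
    returned unchanged. Hypothetical helper for illustration only.
    """
    if urlsplit(path).scheme or path.startswith("/"):
        return path
    # Join with exactly one "/" between the root and the relative path.
    if fs_root.endswith("/"):
        return fs_root + path
    return fs_root + "/" + path


print(resolve_with_fs_root("test.nc", "file:///directory"))
print(resolve_with_fs_root("data/test.nc", "s3://bucket/prefix/"))
```

This sketch does not normalise segments such as `..`, and it does not validate fs_root itself. Both would need handling in anything beyond an illustration.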