Add ability to detect if cache is dirty and should be uploaded even if file was not opened by cloudpathlib #73
Comments
So I think pyvips.read_image(str(Path(S3Path("s3://...")))) could be slightly more concisely written as pyvips.read_image(os.fspath(S3Path("s3://..."))). We could also add a method/property like
I like I.e., do we want to implement and support something like:
|
read_only_local_path
)
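For readers unfamiliar with the protocol behind the os.fspath suggestion, here is a minimal sketch of how an object that implements __fspath__ can be handed to code expecting a local path. CachedRemotePath and its placeholder "download" are hypothetical stand-ins written for illustration; they are not cloudpathlib's classes or behavior.

```python
import os
import tempfile
from pathlib import Path


class CachedRemotePath:
    """Hypothetical stand-in for a cloud path backed by a local cache file."""

    def __init__(self, remote_uri: str, cache_dir: str) -> None:
        self.remote_uri = remote_uri
        # Map the remote URI to a file name inside the cache directory.
        self._local = Path(cache_dir) / remote_uri.rsplit("/", 1)[-1]

    def _ensure_cached(self) -> None:
        # A real implementation would download here; this sketch writes a placeholder.
        if not self._local.exists():
            self._local.write_bytes(b"placeholder contents")

    def __fspath__(self) -> str:
        # Invoked by os.fspath(), open(), and pathlib.Path().
        self._ensure_cached()
        return str(self._local)


with tempfile.TemporaryDirectory() as cache_dir:
    remote = CachedRemotePath("s3://bucket/image.png", cache_dir)

    # Libraries that only accept strings can be given os.fspath(remote).
    local_path = os.fspath(remote)

    # Built-ins that accept os.PathLike objects can take the object directly.
    with open(remote, "rb") as f:
        data = f.read()
```

Anything read through such a path is effectively read-only from the cloud's point of view: nothing in this sketch uploads changes made to the local file.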
Is this already supported well? I think this issue can be closed: https://cloudpathlib.drivendata.org/caching/#Handling-conflicts
If people don't want to use the force_overwrite_to_cloud kwarg in

    s3p = S3Path("s3://...")
    try:
        pyvips.read_image(s3p.fspath)
    except OverwriteNewerCloudError:
        s3p._refresh_cache()
        pyvips.read_image(s3p.fspath)
This issue is somewhat mitigated by the work that has gone into #140. This issue tracks the generic case of:

We could potentially handle this by tracking modified times and uploading the local file in the

This may be a won't fix, since I'm not sure there's a reliable and efficient way to do this with general logic. In that case, we could close this with an update to the docs. That said, I don't think we actually have this handled ATM.
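The modified-time idea mentioned above could look roughly like the following sketch. CacheTracker and its methods are hypothetical names invented for this example and are not part of cloudpathlib. The sketch records each cached file's mtime at download time and later reports files whose mtime has changed, which is the signal that something outside the library wrote to the cache.

```python
import os
import tempfile
from pathlib import Path


class CacheTracker:
    """Hypothetical helper: remembers each cached file's mtime at download time."""

    def __init__(self) -> None:
        self._downloaded_mtime_ns: dict[str, int] = {}

    def record_download(self, local_path: Path) -> None:
        # Call right after a file lands in the cache.
        self._downloaded_mtime_ns[str(local_path)] = local_path.stat().st_mtime_ns

    def is_dirty(self, local_path: Path) -> bool:
        recorded = self._downloaded_mtime_ns.get(str(local_path))
        if recorded is None or not local_path.exists():
            return False  # untracked or deleted files are not upload candidates
        return local_path.stat().st_mtime_ns != recorded

    def dirty_files(self) -> list[str]:
        # Candidates for upload, e.g. when a client is closed.
        return [p for p in self._downloaded_mtime_ns if self.is_dirty(Path(p))]


with tempfile.TemporaryDirectory() as cache_dir:
    cached = Path(cache_dir) / "data.txt"
    cached.write_text("original")

    tracker = CacheTracker()
    tracker.record_download(cached)
    dirty_before = tracker.is_dirty(cached)

    # Simulate another library writing to the cached file directly.
    cached.write_text("changed by another library")
    # Bump mtime explicitly so the demo does not depend on timestamp resolution.
    st = cached.stat()
    os.utime(cached, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
    dirty_after = tracker.is_dirty(cached)
```

This only detects that a file changed; whether to upload it automatically, and when, is the open design question in this thread.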
We all want to live in a world where every Python library handles PathLike objects. This is not that world.
Many libraries need a path to a local file—especially as a string—in order to read that file. We should expose a supported and documented way to get a path to the version in the cache.
This came up for me in working with pyvips, where I had to do something like this:

This is not ideal. One option (as noted here: #72) is to override __fspath__ to do the caching and return the local path. Then something like this would work:

Another option (or in addition) is to add a property like read_only_local_path_string (🤣 at name).

The big caveat is that we replace the .close method on the buffer if you open for write through CloudPath so that we know you intend to change the file: https://github.com/drivendataorg/cloudpathlib/blob/master/cloudpathlib/cloudpath.py#L322-L339

Anything you do to the local path is pretty much read-only since we won't automatically upload. (At least, not without a big change: for example, overriding __del__ on the CloudPath to do the upload if local is newer, or tracking all of the files DL'd by a Client on the Client and having it check modified times on those and upload. In general I am a little worried about automatically assuming a user wants changed files to be uploaded to overwrite things on the cloud.)
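To illustrate the .close caveat, here is a rough sketch of the general technique of patching close on a file object opened for writing so that an upload hook runs afterwards. It is not the code at the link above; open_with_upload_on_close and the upload callback are hypothetical names chosen for this example.

```python
import tempfile
from pathlib import Path
from typing import Callable


def open_with_upload_on_close(local_path: Path, mode: str, upload: Callable[[Path], None]):
    """Open a local cache file; if it is opened for writing, call `upload` after close."""
    buffer = open(local_path, mode)
    if not any(flag in mode for flag in "wax+"):
        return buffer  # read-only handles are returned unmodified

    original_close = buffer.close

    def close_and_upload() -> None:
        if buffer.closed:
            return  # guard against close() being called more than once
        original_close()
        upload(local_path)

    # Shadow the method on this one instance; `with` blocks call it on exit.
    buffer.close = close_and_upload
    return buffer


uploaded: list[Path] = []  # stands in for a real upload to cloud storage

with tempfile.TemporaryDirectory() as cache_dir:
    cached = Path(cache_dir) / "notes.txt"

    with open_with_upload_on_close(cached, "w", uploaded.append) as f:
        f.write("new contents")

    with open_with_upload_on_close(cached, "r", uploaded.append) as f:
        text = f.read()
```

The hook only fires for handles opened through this function, which is why a file modified by another library via its local path would go unnoticed without some additional dirty check.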