feat: unify data access #226
base: develop
That's great work. One aspect that is not covered yet is error handling: the interfaces do not expose exceptions. This could be added once we have more experience.
I have some comments, mainly about class responsibilities. Sometimes the class does things that are usually done by the client, and that makes the `AccessManager` less generic.
Pulls a file from a storage to write it in the local storage.
If the input storage is local, then it is a copy. Otherwise it is a download.
"""
storage = AccessManager.resolve_storage(href)
We should make sure that `dst` is within the FileStorage prefixes (see the TODO about adding authorised paths).
So we also need to force the `AccessManager` to be initialized with exactly one FileStorage, don't we? I'll add the authorized paths in a new commit. There is already a check, performed by the `AbstractStorage`, that the file is in local storage.
No, we do not need exactly one FileStorage. We could have several, each with its own list of prefixes. A copy to a `dst` is allowed if `dst` falls under one of the prefixes of a writable FileStorage (writable could be a property of the Storage).
it's just an idea.
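The prefix idea above could be sketched roughly as follows. This is only an illustration under assumptions: the `FileStorage` dataclass shown here, its `prefixes` and `writable` fields, and the `owns` and `is_write_allowed` helpers are hypothetical names, not the PR's actual API.

```python
from __future__ import annotations

import os
from dataclasses import dataclass, field


@dataclass
class FileStorage:
    """Hypothetical stand-in for the PR's FileStorage (assumed fields)."""
    prefixes: list[str] = field(default_factory=list)
    writable: bool = False

    def owns(self, path: str) -> bool:
        # Compare whole path components so "/data/output" does not match "/data/out".
        target = os.path.abspath(path)
        for prefix in self.prefixes:
            root = os.path.abspath(prefix)
            if os.path.commonpath([root, target]) == root:
                return True
        return False


def is_write_allowed(dst: str, storages: list[FileStorage]) -> bool:
    """A copy to dst is allowed if at least one writable storage owns dst."""
    return any(s.writable and s.owns(dst) for s in storages)
```

With this shape, several FileStorages can coexist, and only the writable ones authorise a destination.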
return storage.get_storage_parameters()

@staticmethod
def pull(href: str, dst: str, is_dst_dir: bool):
What is `is_dst_dir` for? Is it really up to the client to tell whether `dst` is a directory or not?
I wanted to factor out the code that builds the filename, and I need to know whether `dst` is a directory to build the right filename. I will create a utility function for that and remove the argument from the method.
Ok.
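A utility of the kind agreed on above might look like the sketch below. The name `build_destination` and the rule used to detect a directory (an existing directory, or a trailing path separator) are assumptions made for illustration, not what the PR implements.

```python
import os
from urllib.parse import urlparse


def build_destination(href: str, dst: str) -> str:
    """Return the full target path for a file pulled from href into dst."""
    # Take the file name from the path part of href, so URLs with a scheme work too.
    file_name = os.path.basename(urlparse(href).path)
    # Treat dst as a directory if it already is one, or if it ends with a separator.
    if os.path.isdir(dst) or dst.endswith(os.sep):
        return os.path.join(dst, file_name)
    # Otherwise dst is taken to be the full target file path.
    return dst
```

With such a helper, `pull` no longer needs the `is_dst_dir` argument, since the distinction is derived from `dst` itself.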
if scheme != "" and scheme != "file":
    raise ValueError("Destination must be on the local filesystem")

def prepare_for_local_process(self, href: str) -> str:
I don't get the difference with `pull`, except for the `tempfile.gettempdir()` call, which could be done by the client. From what I see, this method taints the `AbstractStorage` interface with processing concerns that are not necessary.
Would you rather see a utility function that transparently retrieves the data from a remote storage when it is needed for processing? Or just call `storage.pull` and maybe add a check in the FileStorage so that nothing is done if src == dst? Then users could simply pull the file when needed, and it would stay transparent.
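The second option discussed above (a pull that does nothing when source and destination are the same, with the client choosing the working directory) could be sketched like this. `pull_local` and `client_prepare` are hypothetical names, and the sketch only covers the local-to-local case.

```python
import os
import shutil
import tempfile


def pull_local(src: str, dst: str) -> str:
    """Copy src to dst on the local filesystem; do nothing if they are the same path."""
    if os.path.abspath(src) == os.path.abspath(dst):
        # Already in place: nothing to copy.
        return dst
    # Create the destination folder if it does not exist yet.
    os.makedirs(os.path.dirname(os.path.abspath(dst)), exist_ok=True)
    shutil.copyfile(src, dst)
    return dst


def client_prepare(href: str) -> str:
    """Client-side: pick the working directory, then simply pull."""
    target = os.path.join(tempfile.gettempdir(), os.path.basename(href))
    return pull_local(href, target)
```

The temporary-directory choice lives in the client here, so the storage interface keeps a single `pull`-style entry point.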
file_name = os.path.basename(href)
# Get direct parent folder of href_file to zip
dir_name = os.path.dirname(href)
target_file_name = os.path.splitext(file_name)[0] + datetime.now().strftime("%d-%m-%Y-%H-%M-%S")
I am not sure that it is up to the `AccessManager` to decide how the file should be named. I believe that this is up to the client. It might create code duplication, but it makes responsibilities clearer.
Okay, I will add `target_file_name` as an input to the method.
Ok, thanks
But do you also agree?
Yes, it makes more sense like that!
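One possible split of responsibilities after this change, as a rough sketch: the client builds the name and the zipping code only receives it. `timestamped_name`, `zip_parent_dir` and the `out_dir` parameter are hypothetical and only illustrate the idea; the timestamp format is the one from the diff.

```python
import os
import shutil
from datetime import datetime


def timestamped_name(href: str, now: datetime) -> str:
    """Client-side naming: base name of href (without extension) plus a timestamp."""
    stem = os.path.splitext(os.path.basename(href))[0]
    return stem + now.strftime("%d-%m-%Y-%H-%M-%S")


def zip_parent_dir(href: str, target_file_name: str, out_dir: str) -> str:
    """Zip the direct parent folder of href into out_dir/<target_file_name>.zip.

    out_dir is assumed to be outside the folder being zipped.
    """
    base_name = os.path.join(out_dir, target_file_name)
    # shutil.make_archive appends the ".zip" extension and returns the archive path.
    return shutil.make_archive(base_name, "zip", root_dir=os.path.dirname(href))
```

A client would then call something like `zip_parent_dir(href, timestamped_name(href, datetime.now()), out_dir)`, keeping the naming policy out of the access layer.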
Quality Gate passed
No description provided.