Allow for appending to existing hsds "files" using hsload #86
Interesting -- I hadn't considered an append version of hsload! I'd suggest adding an "--append" command line flag so that users don't inadvertently append to a domain when they didn't mean to. What about root attributes (see: line 626 in utillib.py)? If there's a collision with an existing attribute, then you should skip it and output a warning message like you do with links.
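To make the two suggestions above concrete, here is a minimal sketch, not the actual hsload or utillib.py code. The function names `parse_args` and `copy_root_attrs` are hypothetical, and plain dicts stand in for the attribute collections of the source file and the target domain.

```python
import logging


def parse_args(argv):
    """Hypothetical argv scan: return (append, positional_args).

    Appending is opt-in, so a plain invocation never touches an
    existing domain by accident.
    """
    append = False
    positional = []
    for arg in argv:
        if arg == "--append":
            append = True
        else:
            positional.append(arg)
    return append, positional


def copy_root_attrs(src_attrs, dst_attrs):
    """Copy attributes into dst_attrs, skipping names that already exist.

    Returns the list of skipped names so the caller can report them.
    """
    skipped = []
    for name, value in src_attrs.items():
        if name in dst_attrs:
            # Collision: keep the existing value and warn, as is done for links.
            logging.warning("attribute %r already exists, skipping", name)
            skipped.append(name)
            continue
        dst_attrs[name] = value
    return skipped
```

With this shape, an existing attribute is never overwritten, and the warning tells the user which names were left alone.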
I have my own version of this for some of the WTK data. It's stored across multiple (11) .h5 files. Each file has a set of unique datasets plus redundant meta and time_index datasets. In HSDS I just loaded them into a single "file", which is very useful. Great comments, will do on the --append flag. I did a pretty significant refactor locally that I can push up; it fixes some potential pitfalls, like re-copying existing data.
Ok - I just checked in some minor changes so you might want to merge those first.
load_file -> check if obj exists before creating
consolidate object helpers into a single function to allow skipping existing objects
Added the append option (I'm used to using click, so please make sure I integrated it correctly with sys.argv...) and changed the object_helper approach to do a single cycle through the objects and skip existing objects (needed during append).
I had forgotten about this! I'll take a look tomorrow.
It kept on getting pushed down my todo list... so no rush!
There are a few edge cases where creating the links and copying the objects would cause problems, e.g. when you have multiple links pointing to the same object. Would it cause problems to go back to the multi-pass approach?
What's the multi-pass approach?
It's like this in the current code:
I guess I still don't follow. I made two structural changes:
where
This section of the new object_helper function is just a combination of the old object_create_helper, object_link_helper, object_copy_helper, and object_attribute_helper functions.
I'll need to do some testing to convince myself the approaches are equivalent.
Hey @MRossol - could you merge in the latest changes from master and re-submit?
@MRossol - Do you get this error when doing a regular (non-append) hsload?
@jreadey All bugs fixed:
Thanks @MRossol! Changes merged to master.
I've belatedly recalled why the multi-pass approach is needed. Say you have a dataset, dset1, with attribute a1 that contains a reference to another dataset, dset2. If dset1 is created before dset2, the creation of a1 will fail because dset2 doesn't exist yet. Sounds bizarre, but this type of structure is commonly used for dimension lists...
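The ordering problem described above can be illustrated with a small self-contained sketch. This is not the hsload implementation: `Ref`, `MissingTarget`, `set_attr`, `single_pass`, and `multi_pass` are all hypothetical names, and plain dicts stand in for the source file and the target domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Stand-in for an object reference stored in an attribute."""
    target: str


class MissingTarget(Exception):
    """Raised when a reference points at an object not yet created."""


def set_attr(dst, obj_name, attr_name, value):
    # A reference can only be written once its target exists in dst.
    if isinstance(value, Ref) and value.target not in dst:
        raise MissingTarget(f"{obj_name}.{attr_name} -> {value.target}")
    dst[obj_name][attr_name] = value


def single_pass(src):
    """Create each object and write its attributes in the same loop."""
    dst = {}
    for name, attrs in src.items():
        dst[name] = {}
        for attr_name, value in attrs.items():
            set_attr(dst, name, attr_name, value)
    return dst


def multi_pass(src):
    """Pass 1 creates every object; pass 2 writes the attributes."""
    dst = {}
    for name in src:
        dst[name] = {}
    for name, attrs in src.items():
        for attr_name, value in attrs.items():
            set_attr(dst, name, attr_name, value)
    return dst


# dset1 is visited first, but its attribute a1 refers to dset2.
src = {"dset1": {"a1": Ref("dset2")}, "dset2": {}}
```

Here `single_pass(src)` raises `MissingTarget` because dset2 has not been created when a1 is written, while `multi_pass(src)` succeeds since every object exists before any attribute is copied.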
Change mode of fin from "x" to "a"
in obj_creation_handler check if obj exists before creating