
Allow Amending NWB Files #164

Merged
merged 18 commits into from
Oct 4, 2019
Conversation

lawrence-mbf
Collaborator

@lawrence-mbf lawrence-mbf commented Aug 9, 2019

Fixes #109. Allows opening a preexisting NWB file and efficiently writing it back out without rewriting all the datasets.

@bendichter
Contributor

Hey, is this ready to test?

@lawrence-mbf
Collaborator Author

Yes

@bendichter
Contributor

Running the ecephys tutorial and then nwbExport(nwb, 'ecephys_tutorial.nwb') yields this error:

Error using hdf5lib2
The HDF5 library encountered an unknown error.

Error in H5D.write (line 100)
H5ML.hdf5lib2('H5Dwrite', varargin{:});

Error in io.writeDataset (line 12)
H5D.write(did, tid, sid, sid, 'H5P_DEFAULT', data);

Error in types.core.NWBFile/export (line 521)
                io.writeDataset(fid, [fullpath '/file_create_date'], class(obj.file_create_date), obj.file_create_date, true);

Error in nwbfile/export (line 58)
                refs = export@types.core.NWBFile(obj, output_file_id, '/', {});

Error in nwbExport (line 35)
    export(nwb(i), fn);

@lawrence-mbf
Collaborator Author

So this looks like an error due to increasing the size of a dataset. This is due to file_create_date increasing in size every time the nwb file is exported. Unfortunately, the only way to allow dynamically resizing a dataset that's already written is to guarantee that it was saved in chunked mode (which we don't use at the moment). Considering chunking mode is optional, do you happen to know if this constraint also exists in pynwb if the dataset layout is not chunked?
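The resizing constraint described here lives at the HDF5 level rather than in MatNWB itself. A minimal sketch in Python with h5py (used only because it exposes the same HDF5 storage layouts; the file and dataset names are made up for illustration) shows that a contiguous dataset cannot grow after creation, while a chunked dataset with an unlimited maxshape can:

```python
import h5py
import numpy as np

with h5py.File("layout_demo.h5", "w") as f:
    # Contiguous (default) layout: shape is fixed at creation time.
    contiguous = f.create_dataset("fixed", data=np.arange(3))

    # Chunked layout with an unlimited maxshape: resizable after creation.
    chunked = f.create_dataset(
        "growable", data=np.arange(3), maxshape=(None,), chunks=True
    )

    try:
        contiguous.resize((4,))
    except TypeError as err:
        # h5py refuses to resize a non-chunked dataset.
        print("contiguous:", err)

    chunked.resize((4,))
    chunked[3] = 99
    print("chunked:", chunked[:])
```

This is why the dataset would need to have been written in chunked mode in the first place for the append to succeed.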

@bendichter
Contributor

Yeah, good point. It looks like pynwb stores file_create_date as contiguous.
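One way to confirm a dataset's layout is to read its chunk shape, which is unset for contiguous storage. A small self-contained sketch with h5py (the file and dataset here are stand-ins built on the spot, not an actual pynwb export):

```python
import h5py

# Build a stand-in file whose file_create_date uses the default
# (contiguous) layout, then inspect it the way one would inspect
# a real NWB file.
with h5py.File("inspect_demo.h5", "w") as f:
    f.create_dataset("file_create_date", data=[b"2019-08-09"])

with h5py.File("inspect_demo.h5", "r") as f:
    chunks = f["file_create_date"].chunks
    # .chunks is None for contiguous storage, a tuple for chunked.
    print("contiguous" if chunks is None else f"chunked {chunks}")
```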

@oruebel
Contributor

oruebel commented Aug 9, 2019

The chunked vs. contiguous store for file_create_date is really an HDF5 issue, i.e., if HDF5 can't expand it in MATLAB then I'd expect PyNWB to see the same problem.

@lawrence-mbf
Collaborator Author

It looks like this particular feature doesn't quite exist in pynwb yet: NeurodataWithoutBorders/pynwb#990.

Is there another way to append data to a previously written dataset?

@lawrence-mbf
Collaborator Author

I've added a solution: for the special property file_create_date, and only file_create_date, we force chunking. This allows the regular modification behavior. For pynwb-created files, we print a warning and skip writing the dataset, which matches pynwb's behavior for file_create_date when the user doesn't modify it themselves.
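The forced-chunking approach can be sketched with h5py (a stand-in file and made-up timestamps, not the actual MatNWB implementation): the first export creates file_create_date as a chunked, growable dataset, and a later re-export appends a timestamp instead of rewriting the whole dataset.

```python
import h5py
import numpy as np

path = "amend_demo.h5"  # stand-in for an NWB file
str_dtype = h5py.string_dtype()

# First export: force chunking on file_create_date so it can grow later.
with h5py.File(path, "w") as f:
    f.create_dataset(
        "file_create_date",
        data=["2019-08-09T00:00:00"],
        dtype=str_dtype,
        maxshape=(None,),
        chunks=True,
    )

# Re-export: append a new timestamp instead of rewriting the dataset.
with h5py.File(path, "a") as f:
    dset = f["file_create_date"]
    dset.resize((dset.shape[0] + 1,))
    dset[-1] = "2019-10-04T00:00:00"
    print(list(dset.asstr()[:]))
```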

@bendichter
Contributor

That sounds like a good solution to me

@bendichter
Contributor

@ln-vidrio could you write a test for this?

@lawrence-mbf
Collaborator Author

@bendichter sure, I'll take a look at it.

@bendichter
Contributor

@ln-vidrio Thanks for working on this! It looks like there are a bunch of merge conflicts now. Would you mind sorting through that?

@bendichter
Contributor

ecephys:

Undefined function or variable 'fid'.

Error in NwbFile/export (line 54)
                obj.embedSpecifications(fid);

Error in nwbExport (line 35)
    export(nwb(i), filename);

Error in ecephys (line 278)
nwbExport(nwb, 'ecephys_tutorial.nwb')

@lawrence-mbf
Collaborator Author

Yeah, it's not done yet. I'll let you know when I get the tests running.

@lawrence-mbf
Collaborator Author

@bendichter This should be okay now, I think. Let me know if there are any issues. I've also added a test type for testing amendment alongside any of the other system tests.

@bendichter
Contributor

@ln-vidrio

I ran into a bug with links:

>> ecephys

...

Error using hdf5lib2
The HDF5 library encountered an error and produced the following
stack trace information:

    H5L_link_cb           name already exists
    H5G_traverse_real     traversal operator failed
    H5G_traverse          internal path traversal failed
    H5L_create_real       can't insert link
    H5L_create_ud         unable to register new name for object
    H5Lcreate_external    unable to create link

Error in H5L.create_external (line 34)
H5ML.hdf5lib2('H5Lcreate_external', varargin{:});

Error in types.untyped.ExternalLink/export (line 71)
            H5L.create_external(obj.filename, obj.path, fid, fullpath, plist, plist);

Error in types.core.NWBFile/export (line 690)
            refs = obj.general_subject.export(fid, [fullpath '/general/subject'], refs);

Error in NwbFile/export (line 55)
                refs = export@types.core.NWBFile(obj, output_file_id, '/', {});

Error in nwbExport (line 35)
    export(nwb(i), filename);

Error in ecephys (line 350)
nwbExport(nwb3, 'link_test.nwb')

@lawrence-mbf
Collaborator Author

Works for me now.

@bendichter bendichter self-requested a review October 4, 2019 14:39
Contributor

@bendichter bendichter left a comment


Works for me as well. I think this is ready.

@lawrence-mbf lawrence-mbf merged commit 3353784 into master Oct 4, 2019
Successfully merging this pull request may close these issues.

How to append a new dataset to a file