a collection of .nwb files of small sizes with various "features" #1087
It will be hard to support all flavors of NWB 1.0 as the flavors are just too different, and most files I have encountered are also only NWB 1.0 in spirit.
For testing you could just strip out the bulk data, i.e., replace the large arrays that store the majority of the data with smaller versions.
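A minimal sketch of such a bulk-stripping utility, assuming plain h5py access to the HDF5 file (the function name and the size threshold are hypothetical, and link/reference-typed datasets are ignored for simplicity):

```python
import h5py


def strip_bulk_data(src_path, dst_path, max_elements=10):
    """Copy src_path to dst_path, preserving the group/attribute
    hierarchy but truncating any dataset longer than max_elements
    along its first axis, so the file stays small."""
    with h5py.File(src_path, "r") as src, h5py.File(dst_path, "w") as dst:

        def copy_attrs(s, d):
            for key, value in s.attrs.items():
                d.attrs[key] = value

        # Root-level attributes (e.g. nwb_version) are not visited below.
        copy_attrs(src, dst)

        def visit(name, obj):
            if isinstance(obj, h5py.Group):
                grp = dst.create_group(name)
                copy_attrs(obj, grp)
            elif isinstance(obj, h5py.Dataset):
                if obj.shape and obj.shape[0] > max_elements:
                    data = obj[:max_elements]  # keep only a small slice
                else:
                    data = obj[()]  # scalar or already-small dataset
                ds = dst.create_dataset(name, data=data)
                copy_attrs(obj, ds)

        src.visititems(visit)
```

This keeps the full hierarchy and all attributes intact, which is usually what schema-level regression tests care about, while dropping the bulk of the measurement data.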
Currently we create test files from the integration test suite. I think a good way may be to have tutorials that show best practices on how to generate good NWB files and then use those for testing.
We have created something like this already. Maybe it will fit your needs. Near the bottom of https://www.nwb.org/example-datasets/ you'll find a download link.
The download link leads to a Google Drive folder with NWB files generated by previous versions of pynwb. The idea is to cache them in the CI and use them to ensure that we do not break backwards compatibility, though I admit I never followed up to see whether that was implemented in the CI tests. We could add an NWB 1.0 file and make sure that we catch it and throw an informative error message (#1086). Would these work for your purposes?
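Catching an NWB 1.0 file and raising an informative error could be sketched roughly like this, assuming (hedged) that NWB 2.x files carry the version as the root attribute `nwb_version` while NWB 1.0.x files stored it as a root-level dataset `/nwb_version` with values like `"NWB-1.0.6"`; the function names are hypothetical:

```python
import h5py


def get_nwb_version(path):
    """Best-effort read of the NWB version string from an HDF5 file.
    Tries the root attribute first (NWB 2.x convention), then a
    root-level dataset (NWB 1.0.x convention)."""
    with h5py.File(path, "r") as f:
        version = f.attrs.get("nwb_version")
        if version is None and "nwb_version" in f:
            version = f["nwb_version"][()]
    if isinstance(version, bytes):
        version = version.decode()
    return version


def check_supported(path):
    """Raise an informative error for NWB 1.x files instead of a
    cryptic failure deep inside the reader."""
    version = str(get_nwb_version(path))
    # Crude check: 1.x version strings look like "1.0.5" or "NWB-1.0.6".
    if version.startswith("1.") or version.startswith("NWB-1"):
        raise ValueError(
            f"File {path!r} reports NWB version {version!r}; "
            "NWB 1.x files are not supported. Please convert to NWB 2.x."
        )
    return version
```

A cached collection of old files in CI could then simply assert that `check_supported` passes for 2.x files and raises for the 1.0 sample.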
Yes! Thank you @bendichter - that sounds like exactly what I desire! Adding an NWB 1 file there to use in a test would be great!
And thank you @oruebel for your feedback! I might get back to the desire of having a utility to strip bulk data from files - it might be useful to be able to share minimized versions of files found in the wild for troubleshooting purposes.
FWIW, I posted that Google Drive folder directly to a git repo at https://github.com/dandi-datasets/nwb_test_data (the top-level .zip is still under annex since it is larger than 100kb; I decided not to bother providing its content extracted since I'm not sure that would be of benefit).
ATM there is already a good wide range of .nwb files to be found in the wild. They differ in NWB version, in the data types they contain, etc. For demonstration and testing purposes it would be nice to collate a collection of sample .nwb files, including ones representing NWB 1.0 flavors (possibly with conversion scripts from 1.0 to 2.0) as well as modern NWB files.
Unfortunately, all of the ones in the wild are quite large, so it is not feasible to come up with a collection suitable for use in regression tests etc.
I wondered whether it would be feasible to "minimize" existing files. E.g., given an .nwb file (possibly 1.0), would there be some legitimate way to minimize it (strip multiple sessions/subjects down to a single one, reduce the number of measurements to a single one, etc.) while preserving at least the top levels of the hierarchy?
If not, how could such a collection of representative .nwb files be established?