Smaller variant of multicol files for 3D grids (and beyond) #348
Comments
A semi-standard format for a 3D image is OpenDX itself (or at least the subset that is used by APBS, VMD, NAMD, PyMOL, etc.). Like "multicol", it contains a header and is text-based, which makes it easy to inspect. I guess the only limitation is that it cannot encode a "periodic" flag? Another option is the MATLAB format, which is binary but is just as widely supported, including by open-source code (Octave, SciPy, ...). Because it can hold multiple variables, it has the advantage of storing several fields together, e.g. mean gradients alongside the histogram, the periodic flag, etc. It is probably possible to just reuse the open-source implementations. |
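For reference, here is a minimal sketch of what writing a 3D scalar grid in that OpenDX subset could look like. The function name `write_dx`, its arguments, and the field name in the footer are hypothetical and not part of any existing code; it assumes an orthogonal grid with the last index varying fastest, and that the caller has already shifted the origin to wherever the first data point sits (e.g. the first bin center).

```python
import io


def write_dx(stream, data, origin, widths, sizes):
    """Write a flat 3D scalar array in the OpenDX subset read by APBS/VMD-style tools.

    data   : flat sequence of floats, last index (z) varying fastest (assumption)
    origin : position of the first grid point
    widths : grid spacing along each axis
    sizes  : number of points along each axis
    """
    nx, ny, nz = sizes
    if len(data) != nx * ny * nz:
        raise ValueError("data length does not match grid sizes")
    stream.write(f"object 1 class gridpositions counts {nx} {ny} {nz}\n")
    stream.write("origin {:g} {:g} {:g}\n".format(*origin))
    # One "delta" line per axis; an orthogonal grid has a single nonzero entry.
    for axis in range(3):
        delta = [widths[axis] if i == axis else 0.0 for i in range(3)]
        stream.write("delta {:g} {:g} {:g}\n".format(*delta))
    stream.write(f"object 2 class gridconnections counts {nx} {ny} {nz}\n")
    stream.write(
        f"object 3 class array type double rank 0 items {len(data)} data follows\n"
    )
    # Values are conventionally written three per line.
    for i in range(0, len(data), 3):
        stream.write(" ".join(f"{v:.6e}" for v in data[i:i + 3]) + "\n")
    # Footer tying the three objects together; the field name is arbitrary.
    stream.write('object "grid data" class field\n')
    stream.write('component "positions" value 1\n')
    stream.write('component "connections" value 2\n')
    stream.write('component "data" value 3\n')


buf = io.StringIO()
write_dx(buf, [float(i) for i in range(8)], (0.0, 0.0, 0.0), (1.0, 2.0, 0.5), (2, 2, 2))
print(buf.getvalue())
```

Note that nothing in this header can carry a periodicity flag, which is the limitation mentioned above.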
Reading more about the format from the SciPy page and other open source libraries like this one: My original idea was to register the grids currently allocated in the various biases as elements in To reduce the space used by high-dimensional grids, I would also suggest in the doc to use |
What about just using an internal format to record the boundaries and widths, and then providing a post-processing tool to convert it to the current grid format? |
That's along the lines of what I meant in the previous comments: the state file has all that information encoded. The scripting interface could easily provide access to the grid array, its boundaries, widths and periodic flags associated with it. Pros: leverages existing code. Cons: requires running the module for post-processing, which currently means using an updated VMD and a molecular structure file for the system. If these requirements are too heavy, a post-processing script could be used as well. But then, the same script would need to replicate some of the internal functionality (particularly, the multicolumn grid I/O code). |
@giacomofiorin I think NAMD with an updated Colvars should be enough to read the state file and write the grid. Users can run a 0-step simulation with dummy PSF and PDB files; only the Colvars input and state files need to come from an actual simulation. It's just like merging ABF windows with |
Here is a plan:
|
@jhenin I pinged @fabsugar in person and he agrees with switching format for dimensions 3 and above. One issue I didn't think about immediately when you wrote your last comment is how would you define the behavior for Are you okay with limiting use of the DX format for output/visualization only and perhaps begin using the state file format for input? |
Yes, I agree that we don't want to add a DX parser in there. The state file format seems ok, but we'll need the same flexibility offered by multicol: specifically the option of reading multiple grids, potentially with mismatched parameters, into the same bias. |
I checked, in the Lines 959 to 966 in 7f850e0
that mimics a previous check first introduced in 2009 (before the start of the Git history), when the state file began containing the grid_parameters block to support metadynamics grid rebinning.
It's safe to say that nowadays the So I think you can safely adapt the existing logic that Line 85 in 7f850e0
(To be precise, compared to the multicolumn format the state file lacks the periodicity flag, but that's not used to do the remapping anyway). |
Multicol files in their current state are great in 1D and 2D because they are read seamlessly by many plotting programs. In 3D they can get quite bulky, and plotting becomes less straightforward anyway. To reduce their size and the time spent reading and writing them, I see two options: using DX files, which have their own constraints, or a variant of multicol that doesn't have the x values (which can be completely generated based on the data in the header). Thoughts?
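To illustrate the second option, here is a minimal sketch of how a reader could regenerate the x values from header data alone. The names `grid_coordinates` and `expand_to_multicol` are hypothetical; the sketch assumes the header provides a lower boundary, a width, and a number of points for each variable, that coordinates are bin centers, and that values are stored with the last variable varying fastest.

```python
import itertools


def grid_coordinates(lower, widths, sizes):
    """Return an iterator over bin-center coordinates, last variable fastest.

    lower, widths, sizes: one entry per variable, as read from the header
    (assumed layout, not the actual multicol header syntax).
    """
    axes = [
        [lo + (i + 0.5) * w for i in range(n)]
        for lo, w, n in zip(lower, widths, sizes)
    ]
    return itertools.product(*axes)


def expand_to_multicol(lower, widths, sizes, values):
    """Pair each stored value with its regenerated coordinates."""
    return [
        coords + (v,)
        for coords, v in zip(grid_coordinates(lower, widths, sizes), values)
    ]


# A 2x2 grid stored as four bare values expands back to full rows.
rows = expand_to_multicol((0.0, -1.0), (1.0, 0.5), (2, 2), [10.0, 11.0, 12.0, 13.0])
for row in rows:
    print(" ".join(f"{x:g}" for x in row))
```

With this layout a 3D file drops three of its four columns, and the full multicol rows can still be rebuilt on demand for plotting.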