Use of symbols in base classes #1508

rayosborn opened this issue Nov 4, 2024 · 0 comments

The symbols tag in NXDL files is used to define variable values that should be shared by multiple fields within the group. Most commonly, these symbols refer to the field rank and dimension sizes. This is potentially a very powerful tool for validating NeXus files against the standard but, in practice, a validator that strictly enforces these symbol equivalences will often flag files as violating the standard when they do not. For example, in the NXdetector NXDL file, the following symbols are defined:

    <symbol name="nP"><doc>number of scan points (only present in scanning measurements)</doc></symbol>
    <symbol name="i"><doc>number of detector pixels in the first (slowest) direction</doc></symbol>
    <symbol name="j"><doc>number of detector pixels in the second (faster) direction</doc></symbol>
    <symbol name="k"><doc>number of detector pixels in the third (if necessary, fastest) direction</doc></symbol>
    <symbol name="tof"><doc>number of bins in the time-of-flight histogram</doc></symbol>

The data field has the following dimensions:

    <dimensions rank="4">
      <dim index="1" value="nP" />
      <dim index="2" value="i" />
      <dim index="3" value="j" />
      <dim index="4" value="tof" />
    </dimensions>

Strict enforcement of these rules would imply that any NeXus file whose detector data is not a four-dimensional array, with the first dimension corresponding to the scan number and the last to the time-of-flight values, violates the standard.
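To make the problem concrete, here is a minimal sketch of what a strict rank check amounts to. The dimension list is copied from the NXdetector excerpt above; the function name and its logic are hypothetical and are not taken from any existing validator.

```python
# Declared dimensions of NXdetector/data, copied from the NXDL excerpt above.
NXDETECTOR_DATA_DIMS = ["nP", "i", "j", "tof"]


def strict_rank_violations(shape, dims=NXDETECTOR_DATA_DIMS):
    """Hypothetical strict check: report a violation whenever the rank of
    the stored array differs from the rank declared in the NXDL file."""
    if len(shape) != len(dims):
        return [
            f"rank {len(shape)} does not match declared rank {len(dims)} {dims}"
        ]
    return []


# A plain 2D area-detector image is flagged, although it is a perfectly
# ordinary NeXus data array.
image_report = strict_rank_violations((512, 512))

# Only a scanned time-of-flight measurement on a 2D detector passes.
scan_tof_report = strict_rank_violations((10, 512, 512, 100))
```

Under this reading, a single image, a scan without time-of-flight, and a time-of-flight measurement without a scan would all be reported as errors.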

This is obviously not what is intended. The question is what to do about it. It could be argued that these are for guidance only, and the documentation tag indeed says that they are merely "illustrative." However, there may be cases where we really do want to specify the rank of the data, in which case the question is how to distinguish illustrative symbols from required ones.

I don't have a specific proposal to solve this issue, although I think it might be possible to make this more general. Normally, the aim is to ensure that a number of fields have the same shape, e.g., data and data_errors. In these cases, it might be better to stop specifying actual dimensions and instead only have symbols denoting rank and shape. It would then be easy for a validator to check that those fields match each other. I believe it would be useful for the NIAC to discuss this and perhaps establish a group to come up with potential solutions.
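As a rough illustration of the kind of check this would allow, the sketch below assumes each field is tagged with a single symbol standing for its whole shape, and only verifies that fields sharing a symbol agree with each other. The symbol name `dataShape`, the function name, and the input format are all assumptions made for the example, not part of NXDL or of any existing validator.

```python
def check_shared_shapes(shape_symbols, shapes):
    """Report fields that share a shape symbol but differ in shape.

    shape_symbols: field name -> shape symbol (hypothetical, e.g. "dataShape")
    shapes:        field name -> actual shape found in the file
    """
    first_seen = {}  # symbol -> (field, shape) that first used the symbol
    problems = []
    for field, symbol in shape_symbols.items():
        if field not in shapes:
            continue  # fields absent from the file are not checked
        shape = tuple(shapes[field])
        if symbol not in first_seen:
            first_seen[symbol] = (field, shape)
        elif first_seen[symbol][1] != shape:
            ref_field, ref_shape = first_seen[symbol]
            problems.append(
                f"{field} has shape {shape} but {ref_field} has shape "
                f"{ref_shape} (shared symbol {symbol!r})"
            )
    return problems


declared = {"data": "dataShape", "data_errors": "dataShape"}

# Any rank is accepted as long as the fields agree with each other.
matching = check_shared_shapes(
    declared, {"data": (512, 512), "data_errors": (512, 512)}
)

# A mismatch between the two fields is reported.
mismatched = check_shared_shapes(
    declared, {"data": (10, 512, 512), "data_errors": (512, 512)}
)
```

A check of this form never rejects a file for having two-dimensional rather than four-dimensional data; it only reports fields that are supposed to match and do not.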
