Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specification of the NX_class attribute in the NXroot NXDL file #1493

Open
rayosborn opened this issue Oct 5, 2024 · 9 comments
Open

Specification of the NX_class attribute in the NXroot NXDL file #1493

rayosborn opened this issue Oct 5, 2024 · 9 comments

Comments

@rayosborn
Copy link
Contributor

NXroot is the only base class whose NXDL file explicitly added the NX_class attribute as a group attribute, with 'NXroot' as the sole enumerated value. The addition of the NX_class attribute to the HDF5 group attributes is obviously required by all base classes in order to define their class, but this is an implementation detail that is handled by the applications reading the file. The attribute is not even referenced in any of the other base classes or application definitions, apart from an anomalous comment in NXsubentry, which should also be changed for consistency.

I suspect that it was added because the NXroot class is not an HDf5 group, but the file-level object. Its presence here implies that the NX_class attribute is required as a file-level attribute by all NeXus files, but I don't believe this is (or should be) correct. In the nexusformat API, the root-level group is automatically assigned to the NXroot class if the attribute is not specified, but it is not added when writing a file, since its presence is completely redundant.

@prjemian
Copy link
Contributor

prjemian commented Oct 7, 2024

it was added because the NXroot class is not an HDf5 group, but the file-level object.

See #149, which states

To avoid invalidating existing NeXus data files, this attribute will be added as optional to NXroot.

@prjemian
Copy link
Contributor

prjemian commented Oct 7, 2024

If ever it was changed from optional, I believe that would have been a mistake. The source does not say the attribute is required and this is indicated directly in the published documentation:

<attribute name="NX_class">
<doc>
The root of any NeXus data file is an ``NXroot`` class
(no other choice is allowed for a valid NeXus data file).
This attribute cements that definition.
</doc>
<enumeration>
<item value="NXroot"></item>
</enumeration>
</attribute>

image

@prjemian
Copy link
Contributor

prjemian commented Oct 7, 2024

I'm removing the bug label. The documentation is clear this is optional.

@prjemian prjemian removed the bug label Oct 7, 2024
@rayosborn
Copy link
Contributor Author

There is nothing in the NXDL file to state that it is optional. That means that a strict validator will deem every NeXus file without it to violate the standard. At the very least, an optional tag needs to be added. I still think it's a mistake though.

@prjemian
Copy link
Contributor

prjemian commented Oct 7, 2024

Base class content is not optional?

@rayosborn
Copy link
Contributor Author

In principle, they are all optional, but when a new user reads the documentation, they are likely to assume it's required, ironically reinforcing the confusion that triggered this change in the first place. It was caused by people writing their own applications directly in h5py or the HDF5 API and tripping up because they didn't find the 'NX_class' attribute at the file level. In my view, the fix to #149 made this worse.

In #149, @stuartcampbell wrote:

From a discussion in the code repository, there is an inconsistency with the definition of NXroot. It seems that NeXus data files need an attribute at the root level declaring the root node is NXroot.

To avoid invalidating existing NeXus data files, this attribute will be added as optional to NXroot.

As @FreddieAkeroyd wrote in the mailing list, the only reason the NXroot class was added was that XML doesn't have the concept of a file-level object, so the only way of adding file attributes was to add a fictional NXroot group. When Stuart wrote that NeXus data files "need" an attribute at the root level, I think he meant that some people writing their own applications erroneously believed it was necessary, although he then said in the next paragraph that it was optional. The problem is that people writing their own applications won't realize it's optional and the applications will fail when opening files that don't contain the attribute.

At the very, very least, it should be explicitly stated in the documentation that the attribute is not required. We should even explicitly add that programs reading NeXus files should not require it.

@tacaswell
Copy link
Contributor

Technically hdf5 also does not let you put attributes on the file. See https://davis.lbl.gov/Manuals/HDF5-1.8.7/UG/13_Attributes.html in 8.3 it says "Creates a dataset as an attribute of another group, dataset, or committed datatype." and in h5py we implement file-level attributes as a short-cut to the attributes on the / group (https://github.com/h5py/h5py/blob/5c4871e37a08d7b030dc6c18df21f45cd714e6c6/h5py/_hl/files.py#L283-L290 which

    @property
    def attrs(self):
        """ Attributes attached to this object """
        # hdf5 complains that a file identifier is an invalid location for an
        # attribute. Instead of self, pass the root group to AttributeManager:
        from . import attrs
        with phil:
            return attrs.AttributeManager(self['/'])

@rayosborn
Copy link
Contributor Author

I did not know that. Interesting.

@PeterC-DLS
Copy link
Contributor

Note, we at DLS have always been writing NeXus files without this root level attribute.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants