Correct assignment of head #761

beckobert · 2024-12-20T16:25:30Z

There seems to be a problem when using preprocessed datasets in combination with multiheads.
When a structure is read from an HDF5Dataset, it is first loaded into a Configuration. The head is not specified when the Configuration is initialized, so it falls back to "Default". The correct head saved in the HDF5Dataset is currently applied only if configuration.head is None, which is never the case.

This pull request should fix that by always setting the head to the value saved in the HDF5Dataset, falling back to "Default" if none is specified (in line with how heads are set when the configuration is turned into AtomicData).
In principle, this assignment could also be moved into the initialization of the Configuration.
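
To illustrate the difference, here is a minimal sketch of the old and new assignment logic. It is not the actual code from the repository: the `Configuration` class is a simplified stand-in, and `assign_head_old`, `assign_head_new`, and `stored_head` are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_HEAD = "Default"


@dataclass
class Configuration:
    # Simplified stand-in: the head falls back to "Default" when not specified.
    head: Optional[str] = DEFAULT_HEAD


def assign_head_old(config: Configuration, stored_head: Optional[str]) -> Configuration:
    # Old behaviour (as described above): the stored head is applied only
    # when config.head is None, which never happens because of the default.
    if config.head is None:
        config.head = stored_head
    return config


def assign_head_new(config: Configuration, stored_head: Optional[str]) -> Configuration:
    # New behaviour: always take the stored head; fall back to "Default"
    # only when the dataset did not store one.
    config.head = stored_head if stored_head is not None else DEFAULT_HEAD
    return config
```

With a stored head of, say, "pbe", the old variant leaves the configuration at "Default", while the new variant sets it to "pbe".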

There is also a problem, independent of multiheads, with preprocessed test sets that are preprocessed with multiple processes. Contrary to what the documentation says and what run_train.py expects, they were not saved in their own directory, but in the same directory under different file names.
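
A rough sketch of the intended layout, with one directory per test set and one file per process, might look like the following. The directory and file naming here is an assumption made for illustration and may not match the exact names used by the preprocessing script; `shard_path`, `write_shards`, and `find_test_sets` are hypothetical helpers.

```python
from pathlib import Path
from typing import Dict, List


def shard_path(root: Path, test_name: str, process: int) -> Path:
    # Assumed layout: <root>/<test_name>/<test_name>_<process>.h5
    return root / test_name / f"{test_name}_{process}.h5"


def write_shards(root: Path, test_name: str, num_processes: int) -> None:
    # Create one (empty placeholder) file per process inside the
    # test set's own directory.
    for process in range(num_processes):
        path = shard_path(root, test_name, process)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()


def find_test_sets(root: Path) -> Dict[str, List[str]]:
    # Each subdirectory is treated as one test set; its .h5 files are
    # the shards written by the individual processes.
    return {
        directory.name: sorted(p.name for p in directory.glob("*.h5"))
        for directory in sorted(root.iterdir())
        if directory.is_dir()
    }
```

With this layout, a reader can discover the test sets by listing the subdirectories, instead of having to infer them from file names in a shared directory.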

beckobert added 3 commits December 20, 2024 16:22

Correct assignment of head

157c7b3

fix preprocessed test sets

321e0a6

import glob correctly

15c2231
