-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes to general sorting algorithms for hoomd and lammps #762
Fixes to general sorting algorithms for hoomd and lammps #762
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #762 +/- ##
==========================================
- Coverage 91.89% 91.71% -0.18%
==========================================
Files 66 66
Lines 6733 6780 +47
==========================================
+ Hits 6187 6218 +31
- Misses 546 562 +16
☔ View full report in Codecov by Sentry. |
[site.atom_type.atomclass for site in dihedral.connection_members] | ||
) | ||
unique_dihedrals[unique_members] = dihedral | ||
unique_dtypes = [ |
Check notice
Code scanning / CodeQL
Unused local variable Note library
# TODO: The range of ks is mismatched (GMSO go from k0 to k5) | ||
# May need to do a check that k0 == k5 == 0 or raise a warning | ||
container.params["-".join(dtype.member_types)] = { | ||
member_classes = sort_by_classes(dtype) |
Check notice
Code scanning / CodeQL
Unused local variable Note library
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments about the code. One high-level comment I have (probably need a broader discussion later too), so it seems right now, we are going with atom class for everything (storing bond and bond params in the snapshot and forcefield), which would work for most of our current use case, i.e., when connection types are defined between atom classes. This, however, may throw an issue if there is a connection that is defined between atom types. The latter case is rarer but, IMO, it might happen and should be supported. In other word, I think we should go with atom types for everything instead of atomclass. This will make thing a bit bulkier, but it will allow us to cover those edge case.
else: | ||
raise TypeError("Provided connection_type not supported.") | ||
|
||
|
||
def sort_connection_members(connection, sort_by="name"): | ||
"""Sort connection_members of connection.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to add a blurp here about what options sort_by
can take
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
elif isinstance(potential, ImproperType): | ||
return ( | ||
potential.member_classes[0], | ||
*potential.member_classes[1:], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*potential.member_classes[1:], | |
*[sorted(potential.member_classes[1:]), |
elif isinstance(potential, ImproperType): | ||
return ( | ||
potential.member_types[0], | ||
*potential.member_types[1:], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*potential.member_types[1:], | |
*sorted(potential.member_types[1:]), |
elif len(namesList) == 4 and improperBool: | ||
return tuple( | ||
namesList[0], | ||
sorted(*namesList[1:]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorted(*namesList[1:]), | |
*sorted(namesList[1:]), |
elif isinstance(connection, gmso.Improper): | ||
site1, site2, site3, site4 = connection.connection_members | ||
site2, site3, site4 = sorted([site2, site3, site4], key=sorting_key) | ||
return [site1, site2, site3, site4] | ||
else: | ||
raise TypeError("Provided connection not supported.") | ||
|
||
|
||
def sort_by_classes(potential): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might consider using a more descriptive name for this method, since sort_by_classes
is kinda vague (regarding the desired input). Maybe something like sorted_potential_member_classes
(sounds a bit clumsy but I will keep brainstorming some other option).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this method is similar for sort_by_types
, so probably we need something that would distinct these two (sort_by_types
and sort_by_classes
) with the sort_connection_strings
.
other names I can think sort_potential_member_types
and sort_potential_member_classes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed by adding more detailed doc strings. I don't think every function needs to be that detailed. potential
is the only argument, so we don't need it to also be part of the function name. I added details about what potential is and how it's sorted.
if ( # TODO: why are these skipped? | ||
isinstance(mb_force, hoomd.md.long_range.pppm.Coulomb) | ||
or isinstance(mb_force, hoomd.md.pair.pair.LJ) | ||
or isinstance(mb_force, hoomd.md.special_pair.LJ) | ||
or isinstance(mb_force, hoomd.md.pair.pair.Ewald) | ||
or isinstance(mb_force, hoomd.md.special_pair.Coulomb) | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if ( # TODO: why are these skipped? | |
isinstance(mb_force, hoomd.md.long_range.pppm.Coulomb) | |
or isinstance(mb_force, hoomd.md.pair.pair.LJ) | |
or isinstance(mb_force, hoomd.md.special_pair.LJ) | |
or isinstance(mb_force, hoomd.md.pair.pair.Ewald) | |
or isinstance(mb_force, hoomd.md.special_pair.Coulomb) | |
): | |
if ( # TODO: why are these skipped? | |
isinstance(mb_force, [ | |
hoomd.md.long_range.pppm.Coulomb, | |
hoomd.md.pair.pair.LJ, | |
hoomd.md.special_pair.LJ, | |
hoomd.md.pair.pair.Ewald, | |
hoomd.md.special_pair.Coulomb, | |
] | |
): |
continue | ||
keys = mb_force.params.param_dict.keys() | ||
gmso_keys = gmso_force.params.param_dict.keys() | ||
print("\n\n", keys, gmso_keys, gmso_force, "\n\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print("\n\n", keys, gmso_keys, gmso_force, "\n\n") |
@@ -1413,3 +1430,26 @@ def _convert_params_units( | |||
potential.parameters = converted_params | |||
converted_potentials.append(potential) | |||
return converted_potentials | |||
|
|||
|
|||
def _convert_single_param_units( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this belong to utils/units.py
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. It should get moved into using the new UnitsClasses, but maybe we can do that in a later PR.
So I think it's a necessity to use atomclass instead of atomtype. This is related to #765, which is to try to get hoomd to support GAFF. In fact, that PR is updated with the changes to this PR, because both are necessary for GAFF, so in the end I'll probably address these comments/fixes directly in there and close this PR. But I'm getting away from the point. Here's the situation that arises. A topology has 5 atomtypes, each with unique names, bonded in a line to form 2 unique dihedrals. When writing out the snapshot and the forcefield, the important attribute is the string that is used as the key for these lookup tables. If we use atomtypes, then it'll look like: Molecule representation -> atom1 - atom2 - atom3 - atom4 - atom5 in forces But if the classes of dihedrals are all the same, then we're duplicating dihedrals. Now the situation you're describing, where the dihedrals are defined by their atomtypes, not atomclasses. In this scenario, the atomtypes = atomclasses. The strange situation would be: for dihedral in top.dihedrals:
unique_members = tuple(
[site.atom_type.atomclass for site in dihedral.connection_members]
)
unique_dihedrals[unique_members] = dihedral |
I am not too worry about the duplication issue (since that will be more or less constrained by how many unique molecules in a systems and how many unique dihedrals in those molecules). There will be duplications, but I don't think that will be to that much of a degree. Consider that against the trade-off of supporting the 10% non-common case, that might be worth it. What I mean about dihedral define between atomtypes, I mean it's look like: This topic has been brought up a while ago, but in the strange case that you brought up, the dihedral define by type will take precedent to that defined by atomclass. |
I think the duplication issue will get out of hand. For instance, I have simulations with up to 20 different atomtypes in them. But maybe only 10 or so atomclasses. I have the feeling that will result in a large number of dihedrals between the two cases. There's a reason something large like the OPLS forcefield defines the dihedrals by classes, not by types. I will put together an example of this and we can count the number of dihedrals, but run time will be affected by the number of dihedrals in hoomd no doubt about it. I've never actually seen a forcefield with hybrid type and class definitions. It seems a bit silly: if you have a unique site that needs it's own unique dihedral, then give it an atomclass that's the same as its atomtype. We certainly do not have any tested examples of such a forcefield if it is something we would want to support anyways. |
I also am totally spitballing with that 10% uncommon case. I've never actually seen that done. So who knows exactly what that number is. |
if the dihedral is type1="aa" and is given no class1, then it's class should also be "aa". |
I am thinking more about the case where the bond/angle is defined on atomclass and the dihedral on atomtype (for some reason, like you try to develop a forcefield I try to have it very specific to the atom type level). In that case, the atom type still need its original atomclass info. |
Having just checked this, it resulted in 23 unique dihedrals by types and 14 by classes. |
I can see this. We need tests for it though, which we don't currently have. I think we could have a check at the beginning of the writer that checks to see if all the classes capture all of the unique dihedral expressions. If not, then we populate the snapshot and forces by types, but that shouldn't be the default case. I think this PR should get merged tomorrow for CECAM, and I don't have the time to add that feature in. So potentially we can just raise this as a bug in the issues, and address it when we have the time/ a reasonable test case. |
included in #765 |
HOOMD functionality right now leads to bugs in writing out the forcefields.
There are two functions that need to be in sync
gmso/external/convert_hoomd.py to_hoomd_snapshot
gmso/external/convert_hoomd.py to_hoomd_forcefield
These are not compatible, as discussed in #758.
Additionally, the sorting algorithms in gmso.utils.sorting.py have some bugs with dihedrals, which need to be flipped if the two central atoms are equivalent, but the first and fourth atoms should be in reversed order.
This PR will restructure some of the sorting algorithms and migrate some of the ones that were used in formats/lammpsdata.py from gmso/core/views.py to gmso/utils/sorting.py
TODO List