Merge pull request #529 from openforcefield/mol_fix_and_xyz
Mol fix and xyz
Josh Horton authored Mar 3, 2020
2 parents 0f7fd72 + 617d00f commit 0e5ff41
Showing 6 changed files with 477 additions and 83 deletions.
9 changes: 8 additions & 1 deletion docs/releasehistory.rst
@@ -54,6 +54,10 @@ New features
tests will fail when using this format due to a loss of information. We have also added support for fixed
hydrogen layer nonstandard InChI which can help in the case of tautomers, but overall creating molecules from InChI should be
avoided.
- `PR #529 <https://github.com/openforcefield/openforcefield/pull/529>`_: Adds the ability to write out to XYZ files via
:py:meth:`Molecule.to_file <openforcefield.topology.Molecule.to_file>`. Both single-frame and multiframe XYZ files are supported.
Note that reading from XYZ files will not be supported due to the lack of connectivity information.
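
The point about connectivity can be illustrated with a minimal sketch of an XYZ parser. It is not part of this commit or of the toolkit; `parse_xyz` and the water geometry are hypothetical and for illustration only. Everything such a parser can recover is a comment line, element symbols, and coordinates. Bond orders, formal charges, and stereochemistry would have to be guessed, which is why reading is left unsupported.

```python
from typing import List, Tuple

Coord = Tuple[float, float, float]
Frame = Tuple[str, List[str], List[Coord]]


def parse_xyz(text: str) -> List[Frame]:
    """Parse single- or multi-frame XYZ text (hypothetical helper).

    Returns one (comment, symbols, coordinates) tuple per frame.
    Note that no bond information is available anywhere in the format.
    """
    lines = text.splitlines()
    frames: List[Frame] = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # skip blank lines between frames
            continue
        n_atoms = int(lines[i])          # first line of a frame: atom count
        comment = lines[i + 1].strip()   # second line: free-text comment
        rows = lines[i + 2 : i + 2 + n_atoms]
        if len(rows) != n_atoms:
            raise ValueError("truncated XYZ frame")
        symbols: List[str] = []
        coords: List[Coord] = []
        for row in rows:
            symbol, x, y, z = row.split()[:4]
            symbols.append(symbol)
            coords.append((float(x), float(y), float(z)))
        frames.append((comment, symbols, coords))
        i += 2 + n_atoms
    return frames


# Illustrative geometry, not taken from the commit.
water = "3\nH2O\nO 0.000 0.000 0.117\nH 0.000 0.757 -0.469\nH 0.000 -0.757 -0.469\n"
comment, symbols, coords = parse_xyz(water)[0]
print(comment, symbols, len(coords))
```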


Behavior changed
""""""""""""""""
@@ -122,7 +126,7 @@ Tests added
tests which add coverage for :py:meth:`Molecule.to_inchi <openforcefield.topology.Molecule.to_inchi>` and
:py:meth:`Molecule.from_inchi <openforcefield.topology.Molecule.from_inchi>`. Also added coverage for bad inputs and
:py:meth:`Molecule.to_inchikey <openforcefield.topology.Molecule.to_inchikey>`.

- `PR #529 <https://github.com/openforcefield/openforcefield/pull/529>`_: Added coverage tests for writing to XYZ files.

Bugfixes
""""""""
@@ -154,6 +158,9 @@ Bugfixes
- `Issue #491 <https://github.com/openforcefield/openforcefield/issues/491>`_: We can now parse large molecules without hitting a match limit cap.
- `Issue #474 <https://github.com/openforcefield/openforcefield/issues/474>`_: We can now convert molecules to InChI and
InChIKey and from InChI.
- `Issue #523 <https://github.com/openforcefield/openforcefield/issues/523>`_: The
:py:meth:`Molecule.to_file <openforcefield.topology.Molecule.to_file>` method can now correctly write to `MOL` files, in
line with the supported file type list.

Example added
"""""""""""""
231 changes: 231 additions & 0 deletions openforcefield/data/molecules/butane_multi.sdf
@@ -0,0 +1,231 @@

-OEChem-02262011303D

14 13 0 0 0 0 0 0 0999 V2000
1.8902 0.0426 0.2431 C 0 0 0 0 0 0 0 0 0 0 0 0
0.5463 0.1116 -0.4361 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.5015 -0.0789 0.6687 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.9030 -0.0290 0.1079 C 0 0 0 0 0 0 0 0 0 0 0 0
2.1028 -1.0325 0.4479 H 0 0 0 0 0 0 0 0 0 0 0 0
2.7074 0.4597 -0.3756 H 0 0 0 0 0 0 0 0 0 0 0 0
1.8601 0.5345 1.2423 H 0 0 0 0 0 0 0 0 0 0 0 0
0.4286 -0.6991 -1.1842 H 0 0 0 0 0 0 0 0 0 0 0 0
0.3731 1.1127 -0.9172 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.4197 0.7360 1.4121 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.3403 -1.0819 1.0992 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.2795 1.0011 -0.0281 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.8988 -0.5731 -0.8579 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.5658 -0.5036 0.8628 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
2 3 1 0 0 0 0
3 4 1 0 0 0 0
1 5 1 0 0 0 0
1 6 1 0 0 0 0
1 7 1 0 0 0 0
2 8 1 0 0 0 0
2 9 1 0 0 0 0
3 10 1 0 0 0 0
3 11 1 0 0 0 0
4 12 1 0 0 0 0
4 13 1 0 0 0 0
4 14 1 0 0 0 0
M END
$$$$

-OEChem-02262011303D

14 13 0 0 0 0 0 0 0999 V2000
1.8976 -0.0233 0.2846 C 0 0 0 0 0 0 0 0 0 0 0 0
0.5356 -0.2022 -0.3019 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.5459 0.1748 0.6846 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.9140 0.0021 0.1152 C 0 0 0 0 0 0 0 0 0 0 0 0
2.5160 -0.9024 -0.0541 H 0 0 0 0 0 0 0 0 0 0 0 0
2.3995 0.8748 -0.1536 H 0 0 0 0 0 0 0 0 0 0 0 0
1.9280 -0.0066 1.3720 H 0 0 0 0 0 0 0 0 0 0 0 0
0.4186 0.4937 -1.1808 H 0 0 0 0 0 0 0 0 0 0 0 0
0.3396 -1.2342 -0.6770 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.4260 -0.3861 1.6367 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.3499 1.2523 0.9530 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.3305 1.0167 -0.1233 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.8986 -0.5403 -0.8508 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.5699 -0.5194 0.8655 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
2 3 1 0 0 0 0
3 4 1 0 0 0 0
1 5 1 0 0 0 0
1 6 1 0 0 0 0
1 7 1 0 0 0 0
2 8 1 0 0 0 0
2 9 1 0 0 0 0
3 10 1 0 0 0 0
3 11 1 0 0 0 0
4 12 1 0 0 0 0
4 13 1 0 0 0 0
4 14 1 0 0 0 0
M END
$$$$

-OEChem-02262011303D

14 13 0 0 0 0 0 0 0999 V2000
-1.8794 -0.1793 -0.2565 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.5059 0.4467 -0.1843 C 0 0 0 0 0 0 0 0 0 0 0 0
0.5528 -0.5505 0.1934 C 0 0 0 0 0 0 0 0 0 0 0 0
1.8813 0.1834 0.2353 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.9486 -1.0073 -0.9718 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.1711 -0.4779 0.7672 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.5892 0.6224 -0.5674 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.2267 0.9036 -1.1640 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.5660 1.2716 0.5559 H 0 0 0 0 0 0 0 0 0 0 0 0
0.3620 -1.0419 1.1457 H 0 0 0 0 0 0 0 0 0 0 0 0
0.5736 -1.3428 -0.5940 H 0 0 0 0 0 0 0 0 0 0 0 0
2.2832 0.2684 -0.7762 H 0 0 0 0 0 0 0 0 0 0 0 0
2.5692 -0.2979 0.9574 H 0 0 0 0 0 0 0 0 0 0 0 0
1.6648 1.2015 0.6590 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
2 3 1 0 0 0 0
3 4 1 0 0 0 0
1 5 1 0 0 0 0
1 6 1 0 0 0 0
1 7 1 0 0 0 0
2 8 1 0 0 0 0
2 9 1 0 0 0 0
3 10 1 0 0 0 0
3 11 1 0 0 0 0
4 12 1 0 0 0 0
4 13 1 0 0 0 0
4 14 1 0 0 0 0
M END
$$$$

-OEChem-02262011303D

14 13 0 0 0 0 0 0 0999 V2000
-1.5206 -0.0165 0.2787 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.5930 -0.5666 -0.7456 C 0 0 0 0 0 0 0 0 0 0 0 0
0.6896 0.2389 -0.8863 C 0 0 0 0 0 0 0 0 0 0 0 0
1.4615 0.2842 0.4020 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.0827 0.8399 0.8599 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.8355 -0.7861 1.0127 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.4764 0.3282 -0.2111 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.0715 -0.6921 -1.7386 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.3153 -1.6167 -0.4302 H 0 0 0 0 0 0 0 0 0 0 0 0
0.4322 1.2426 -1.2808 H 0 0 0 0 0 0 0 0 0 0 0 0
1.2882 -0.2664 -1.6780 H 0 0 0 0 0 0 0 0 0 0 0 0
2.4839 -0.1714 0.2725 H 0 0 0 0 0 0 0 0 0 0 0 0
1.6381 1.3724 0.6329 H 0 0 0 0 0 0 0 0 0 0 0 0
0.9015 -0.1903 1.2220 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
2 3 1 0 0 0 0
3 4 1 0 0 0 0
1 5 1 0 0 0 0
1 6 1 0 0 0 0
1 7 1 0 0 0 0
2 8 1 0 0 0 0
2 9 1 0 0 0 0
3 10 1 0 0 0 0
3 11 1 0 0 0 0
4 12 1 0 0 0 0
4 13 1 0 0 0 0
4 14 1 0 0 0 0
M END
$$$$

-OEChem-02262011303D

14 13 0 0 0 0 0 0 0999 V2000
-1.4890 -0.2619 0.4871 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.6807 0.5742 -0.4700 C 0 0 0 0 0 0 0 0 0 0 0 0
0.6712 -0.0156 -0.7659 C 0 0 0 0 0 0 0 0 0 0 0 0
1.5152 -0.1656 0.4762 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.6319 0.3374 1.4229 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.0576 -1.2449 0.6737 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.5296 -0.4380 0.1041 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.6151 1.6267 -0.1275 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.2340 0.6077 -1.4394 H 0 0 0 0 0 0 0 0 0 0 0 0
1.1948 0.6857 -1.4402 H 0 0 0 0 0 0 0 0 0 0 0 0
0.5332 -1.0133 -1.2101 H 0 0 0 0 0 0 0 0 0 0 0 0
1.1011 -0.8916 1.1972 H 0 0 0 0 0 0 0 0 0 0 0 0
1.7287 0.8056 0.9622 H 0 0 0 0 0 0 0 0 0 0 0 0
2.4938 -0.6064 0.1295 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
2 3 1 0 0 0 0
3 4 1 0 0 0 0
1 5 1 0 0 0 0
1 6 1 0 0 0 0
1 7 1 0 0 0 0
2 8 1 0 0 0 0
2 9 1 0 0 0 0
3 10 1 0 0 0 0
3 11 1 0 0 0 0
4 12 1 0 0 0 0
4 13 1 0 0 0 0
4 14 1 0 0 0 0
M END
$$$$

-OEChem-02262011303D

14 13 0 0 0 0 0 0 0999 V2000
-1.4941 -0.2249 -0.0958 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.6159 0.5972 0.8214 C 0 0 0 0 0 0 0 0 0 0 0 0
0.7187 -0.0695 1.0581 C 0 0 0 0 0 0 0 0 0 0 0 0
1.4324 -0.2419 -0.2503 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.4692 0.3382 -0.1659 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.7299 -1.2190 0.3361 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.0562 -0.3936 -1.0798 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.4381 1.5706 0.3236 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.1716 0.7752 1.7593 H 0 0 0 0 0 0 0 0 0 0 0 0
0.6114 -1.0068 1.6223 H 0 0 0 0 0 0 0 0 0 0 0 0
1.3136 0.6230 1.6947 H 0 0 0 0 0 0 0 0 0 0 0 0
1.1682 -1.1986 -0.7603 H 0 0 0 0 0 0 0 0 0 0 0 0
1.1881 0.6394 -0.8861 H 0 0 0 0 0 0 0 0 0 0 0 0
2.5428 -0.1892 -0.0971 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
2 3 1 0 0 0 0
3 4 1 0 0 0 0
1 5 1 0 0 0 0
1 6 1 0 0 0 0
1 7 1 0 0 0 0
2 8 1 0 0 0 0
2 9 1 0 0 0 0
3 10 1 0 0 0 0
3 11 1 0 0 0 0
4 12 1 0 0 0 0
4 13 1 0 0 0 0
4 14 1 0 0 0 0
M END
$$$$

-OEChem-02262011303D

14 13 0 0 0 0 0 0 0999 V2000
-1.8827 -0.0372 0.1937 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.4869 0.4974 -0.0318 C 0 0 0 0 0 0 0 0 0 0 0 0
0.5277 -0.6105 -0.1738 C 0 0 0 0 0 0 0 0 0 0 0 0
1.8825 0.0227 -0.3945 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.4888 0.8394 0.5564 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.9048 -0.8867 0.8869 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.3523 -0.3312 -0.7699 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.2414 1.1364 0.8414 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.4983 1.1660 -0.9168 H 0 0 0 0 0 0 0 0 0 0 0 0
0.5780 -1.1342 0.8166 H 0 0 0 0 0 0 0 0 0 0 0 0
0.2791 -1.3307 -0.9487 H 0 0 0 0 0 0 0 0 0 0 0 0
2.2793 0.4796 0.5265 H 0 0 0 0 0 0 0 0 0 0 0 0
2.5920 -0.6671 -0.8937 H 0 0 0 0 0 0 0 0 0 0 0 0
1.7166 0.8560 -1.1272 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
2 3 1 0 0 0 0
3 4 1 0 0 0 0
1 5 1 0 0 0 0
1 6 1 0 0 0 0
1 7 1 0 0 0 0
2 8 1 0 0 0 0
2 9 1 0 0 0 0
3 10 1 0 0 0 0
3 11 1 0 0 0 0
4 12 1 0 0 0 0
4 13 1 0 0 0 0
4 14 1 0 0 0 0
M END
$$$$
87 changes: 87 additions & 0 deletions openforcefield/tests/test_molecule.py
@@ -521,6 +521,93 @@ def test_to_from_topology(self, molecule):
        molecule_copy = Molecule.from_topology(topology)
        assert molecule == molecule_copy

    def test_to_multiframe_xyz(self):
        """Test writing out a molecule with multiple conformations to an xyz file"""

        # load in an SDF of butane with multiple conformers in it
        molecules = Molecule.from_file(get_data_file_path('molecules/butane_multi.sdf'), 'sdf')
        # now we want to combine the conformers into one molecule
        butane = molecules[0]
        for mol in molecules[1:]:
            butane.add_conformer(mol._conformers[0])

        # make sure we have the 7 conformers
        assert butane.n_conformers == 7
        with NamedTemporaryFile(suffix='.xyz') as iofile:
            # try to write out the xyz file
            butane.to_file(iofile.name, 'xyz')

            # now let's check what's in the file
            with open(iofile.name) as xyz_data:
                data = xyz_data.readlines()
                # make sure we have the correct number of lines written
                # (7 frames x (14 atoms + 2 header lines) = 112)
                assert len(data) == 112
                # make sure all headers and frame data were written
                assert data.count('14\n') == 7
                for i in range(1, 8):
                    assert f'C4H10 Frame {i}\n' in data

                # now make sure the first line of the coordinates is correct in every frame
                coords = ['C 1.8902000189 0.0425999984 0.2431000024\n',
                          'C 1.8976000547 -0.0232999995 0.2845999897\n',
                          'C -1.8794000149 -0.1792999953 -0.2565000057\n',
                          'C -1.5205999613 -0.0164999999 0.2786999941\n',
                          'C -1.4889999628 -0.2619000077 0.4871000051\n',
                          'C -1.4940999746 -0.2249000072 -0.0957999974\n',
                          'C -1.8826999664 -0.0372000001 0.1937000006\n']
                for coord in coords:
                    assert coord in data

    def test_to_single_xyz(self):
        """Test writing to a single frame xyz file"""

        # load a molecule with a single conformation
        toluene = Molecule.from_file(get_data_file_path('molecules/toluene.sdf'), 'sdf')
        # make sure it has one conformer
        assert toluene.n_conformers == 1

        with NamedTemporaryFile(suffix='.xyz') as iofile:
            # try to write out the xyz file
            toluene.to_file(iofile.name, 'xyz')

            # now let's check the file contents
            with open(iofile.name) as xyz_data:
                data = xyz_data.readlines()
                # make sure we have the correct number of lines written
                # (15 atoms + 2 header lines = 17)
                assert len(data) == 17
                # make sure all headers and frame data were written
                assert data.count('15\n') == 1
                assert data.count('C7H8\n') == 1
                # now check that we can find the first and last coords
                coords = ['C 0.0000000000 0.0000000000 0.0000000000\n',
                          'H -0.0000000000 3.7604000568 0.0000000000\n']
                for coord in coords:
                    assert coord in data

    def test_to_xyz_no_conformers(self):
        """Test writing a molecule out when it has no conformers; here all coords should be 0."""

        # here we want to make a molecule with no coordinates
        ethanol = create_ethanol()
        assert ethanol.n_conformers == 0

        with NamedTemporaryFile(suffix='.xyz') as iofile:
            # try to write out the xyz file
            ethanol.to_file(iofile.name, 'xyz')

            # now let's check the file contents
            with open(iofile.name) as xyz_data:
                data = xyz_data.readlines()
                # make sure we have the correct number of lines written
                # (9 atoms + 2 header lines = 11)
                assert len(data) == 11
                # make sure all headers and frame data were written
                assert data.count('9\n') == 1
                assert data.count('C2H6O\n') == 1
                # now check that all coords are 0
                coords = ['0.0000000000', '0.0000000000', '0.0000000000']
                for atom_coords in data[2:]:
                    assert atom_coords.split()[1:] == coords

    # TODO: Should there be an equivalent toolkit test and leave this as an integration test?
    @pytest.mark.parametrize('molecule', mini_drug_bank())
    @pytest.mark.parametrize('format', [
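
The file layout that the three new tests check can be summarised with a small standalone sketch. This is not the toolkit's implementation, which is not shown in this part of the diff. `write_xyz` is a hypothetical helper, and the single-space column separator is an assumption. The frame titles, the ten-decimal formatting, and the all-zero fallback for molecules without conformers are taken from the test assertions above.

```python
from typing import List, Optional, Sequence, Tuple

Coord = Tuple[float, float, float]


def write_xyz(symbols: Sequence[str], formula: str,
              conformers: Optional[Sequence[Sequence[Coord]]] = None) -> str:
    """Render XYZ text for one or more conformers (hypothetical helper).

    Each frame is: atom count, title line, then one line per atom.
    With no conformers, a single frame of zero coordinates is written.
    """
    if not conformers:
        conformers = [[(0.0, 0.0, 0.0)] * len(symbols)]
    multi_frame = len(conformers) > 1
    lines: List[str] = []
    for index, coords in enumerate(conformers, start=1):
        if len(coords) != len(symbols):
            raise ValueError("conformer size does not match atom count")
        lines.append(f"{len(symbols)}\n")
        # Multi-frame files label each frame; single frames use the bare formula.
        title = f"{formula} Frame {index}" if multi_frame else formula
        lines.append(f"{title}\n")
        for symbol, (x, y, z) in zip(symbols, coords):
            # Column spacing here is an assumption, not the toolkit's exact layout.
            lines.append(f"{symbol} {x:.10f} {y:.10f} {z:.10f}\n")
    return "".join(lines)


# Seven placeholder frames of a 14-atom molecule give 7 * (14 + 2) = 112 lines.
butane_symbols = ["C"] * 4 + ["H"] * 10
placeholder_frames = [[(0.0, 0.0, 0.0)] * 14 for _ in range(7)]
butane_lines = write_xyz(butane_symbols, "C4H10", placeholder_frames).splitlines(keepends=True)

# No conformers: 9 atoms + 2 header lines = 11 lines, all coordinates zero.
ethanol_lines = write_xyz(["C", "C", "O"] + ["H"] * 6, "C2H6O").splitlines(keepends=True)
print(len(butane_lines), len(ethanol_lines))
```

The line counts asserted in the tests follow directly from this layout: every frame contributes two header lines plus one line per atom.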