diff --git a/README.md b/README.md index d20672c..0416f32 100644 --- a/README.md +++ b/README.md @@ -83,11 +83,15 @@ $$\psi = \theta - 180^o$$ Periodic-dihedral: -$$U_{Periodic} = K_0 * (1 + cos(n_0*\theta - 90^o))$$ +$$U_{Periodic} = K_0 * (1 + cos(n_0*\theta - d_0))$$ -$$+ K_1 * (1 + cos(n_1*\theta - 180^o)) + K_2 * (1 + cos(n_2*\theta))$$ +$$+ K_1 * (1 + cos(n_1*\theta - d_1)) + K_2 * (1 + cos(n_2*\theta - d_2))$$ -$$+ K_3 * (1 + cos(n_3*\theta - 180^o)) + K_4 * (1 + cos(n_4*\theta))$$ +$$+ K_3 * (1 + cos(n_3*\theta - d_3)) + K_4 * (1 + cos(n_4*\theta) - d_4)$$ + +$$where: n_0 = 0 ; n_1 = 1 ; n_2 = 2 ; n_3 = 3 ; n_4 = 4 $$ + +$$d_0 = 90^o ; d_1 = 180^o ; d_2 = 0^o ; d_3 = 180^o ; d_4 = 0^o ## Examples Some basic workflows that use this package and cover several force field (FF) types are [here](https://github.com/GOMC-WSU/GOMC_Examples/tree/main/MoSDeF-dihedral-fit). A summary of the FFs covered in the examples are listed below. diff --git a/docs/index.rst b/docs/index.rst index 0976c30..e974a37 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -63,15 +63,21 @@ Ryckaert-Bellemans (RB)-torsions: Periodic-dihedral: .. math:: - U_{Periodic} = K_0 * (1 + cos(n_0*\theta - 90^o)) + U_{Periodic} = K_0 * (1 + cos(n_0*\theta - d_0)) + +.. math:: + + K_1 * (1 + cos(n_1*\theta - d_1)) + + K_2 * (1 + cos(n_2*\theta - d_2)) + +.. math:: + + K_3 * (1 + cos(n_3*\theta - d_3)) + + K_4 * (1 + cos(n_4*\theta - d_4)) .. math:: - + K_1 * (1 + cos(n_1*\theta - 180^o)) - + K_2 * (1 + cos(n_2*\theta)) + where: n_0 = 0 ; n_1 = 1 ; n_2 = 2 ; n_3 = 3 ; n_4 = 4 .. math:: - + K_3 * (1 + cos(n_3*\theta - 180^o)) - + K_4 * (1 + cos(n_4*\theta)) + d_0 = 90^o ; d_1 = 180^o ; d_2 = 0^o ; d_3 = 180^o ; d_4 = 0^o **MoSDeF-dihedral-fit Highlights**: #. With a **Gaussian 16** log file and a few user inputs, the user can easily fit a dihedral. diff --git a/docs/overview/general_info.rst b/docs/overview/general_info.rst index 95035a7..f438d7c 100644 --- a/docs/overview/general_info.rst +++ b/docs/overview/general_info.rst @@ -66,15 +66,21 @@ Ryckaert-Bellemans (RB)-torsions: Periodic-dihedral: .. math:: - U_{Periodic} = K_0 * (1 + cos(n_0*\theta - 90^o)) + U_{Periodic} = K_0 * (1 + cos(n_0*\theta - d_0)) .. math:: - + K_1 * (1 + cos(n_1*\theta - 180^o)) - + K_2 * (1 + cos(n_2*\theta)) + + K_1 * (1 + cos(n_1*\theta - d_1)) + + K_2 * (1 + cos(n_2*\theta - d_2)) .. math:: - + K_3 * (1 + cos(n_3*\theta - 180^o)) - + K_4 * (1 + cos(n_4*\theta)) + + K_3 * (1 + cos(n_3*\theta - d_3)) + + K_4 * (1 + cos(n_4*\theta - d_4)) + +.. math:: + where: n_0 = 0 ; n_1 = 1 ; n_2 = 2 ; n_3 = 3 ; n_4 = 4 + +.. math:: + d_0 = 90^o ; d_1 = 180^o ; d_2 = 0^o ; d_3 = 180^o ; d_4 = 0^o MoSDeF-dihedral-fit is a part of the MoSDeF ecosystem ----------------------------------------------------- diff --git a/docs/reference/contributing.rst b/docs/reference/contributing.rst index 02763e0..2e4eda2 100644 --- a/docs/reference/contributing.rst +++ b/docs/reference/contributing.rst @@ -6,11 +6,11 @@ We welcome contributions to our project, whether it's adding a feature, fixing a 1. **Open an Issue on GitHub**: Start by opening an issue on GitHub to describe the change you want to make. This allows for discussion and ensures everyone is aligned before significant work is undertaken. For bug fixes, we will confirm the behavior of the bug and validate the proposed fix. For new features, we will assess the feature's relevance and discuss possible designs. If you can provide code a short code section to show how the bug occurs or an example of the feature that you want, it will assist us in finding the bug or knowing what should be added. For added security measures, we prefer that you provide the short code sections in the text of the GitHub Issue and not by uploaded files or compressed folders/directories. -2. **Create a Pull Request (PR)**: Once there is consensus, [create a pull request](https://help.github.com/en/articles/about-pull-requests) with your code changes. For larger features, you can create a pull request before completing the implementation to receive early feedback. Indicate that it is a work in progress by including "WIP" at the start of the PR title. +2. **Create a Pull Request (PR)**: Once there is consensus, `create a pull request `_ with your code changes. For larger features, you can create a pull request before completing the implementation to receive early feedback. Indicate that it is a work in progress by including "WIP" at the start of the PR title. 3. **Documentation and Unit Tests**: For new features, please provide sufficient documentation and unit tests: - - **Documentation**: Include a summary of the method, explanations of all input parameters, and the expected output. A minimal example can be found [here](https://github.com/GOMC-WSU/MoSDeF-dihedral-fit/blob/main/mosdef_dihedral_fit/utils/io.py). - - **Unit Tests**: Unit tests for the all possible options that a user can select are mandatory, as this ensures that all the code is covered and checked regularly for errors. We use [pytest](https://docs.pytest.org/en/latest/), which can be run by executing `pytest` from the root directory of the package. These tests are automatically run as part of our CI workflow each time a change is added. + * **Documentation**: Include a summary of the method, explanations of all input parameters, and the expected output. + * **Unit Tests**: Unit tests for the all possible options that a user can select are mandatory, as this ensures that all the code is covered and checked regularly for errors. We use `pytest `_, which can be run by executing `pytest` from the root directory of the package. These tests are automatically run as part of our CI workflow each time a change is added. 4. **Review and Merge**: Once the PR is marked as ready and all checks have passed, core developers/maintainers will review it and may suggest changes. After everyone is satisfied with the code, the pull request will be merged. diff --git a/paper/paper.md b/paper/paper.md index 143413c..b1f37a7 100644 --- a/paper/paper.md +++ b/paper/paper.md @@ -60,24 +60,25 @@ bibliography: paper.bib # Summary -Molecular Mechanics (MM) simulations (e.g., molecular dynamics and Monte Carlo) provide a third method of scientific discovery, adding to the traditional theoretical and experimental scientific methods [@Mielke:2019; @Siegfried:2014]. Experimental methods measure the data under set conditions (e.g., temperature and pressure), whereas the traditional theoretical methods are based on analytical equations, and sometimes their constants are fitted to experimental data (e.g., equations of state). The MM simulations are deterministic and stochastic, and their models can be optimized to match experimental data, similar to analytical theory-based methods [@Allen:2017; @Frekel:2002; @Jorgensen:1996; @Martin:1998; @Weiner:1984; @Weiner:1986; @Potoff:2009; @Hemmen:2015; @Errington:1999]. In larger, more complex systems, the stochastic simulation's molecules can jump large energy barriers that deterministic simulations may not be able to overcome in a reasonable timeframe, even with modern computing capabilities [@Allen:2017; @Frekel:2002]. However, deterministic and stochastic systems that provide adequate sampling for calculating a given property can provide critical insights into the system's phase space, which are not obtainable via traditional theoretical and experimental methods. Additionally, molecular simulations provide critical insights from visualizations and by obtaining chemical or material properties that do not currently exist, are not easily attainable (e.g., too expensive or dangerous) by traditional theoretical and experimental methods [@Hollingsworth:2018; @Hirst:2014], or require hard-to-achieve conditions, such as very high pressures and temperatures [@Yu:2023; @Koneru:2022; @Swai:2020; @Kumar:2022; @Louie:2021]. However, these MM models require force field parameters to be determined ideally from Quantum Mechanics (QM) simulations or other methods, including the vibrational spectrum and machine learning methods [@Kania:2021; @Friederich:2018; @Vermeyen:2023; @Mayne:2013; @Schmid:2011; @Vanommeslaeghe:2014], where the MM proper dihedrals (i.e., dihedrals) are challenging to obtain if they don't currently exist for the chosen force field or they do not properly scale-up to larger molecules or moiety combinations from their separately derived parameters using small molecules [@Kania:2021; @Mayne:2013]. While the same QM simulations can be used to fit the dihedrals in all of the force field types, these MM dihedrals are also not easily transferable between different force fields due to the differing parameters and formulas (e.g., combining rules and 1-4 scaling factors) used in each force field [@Huang:2013; @Vanommeslaeghe:2010; @Vanommeslaeghe:2014; @Chen:2015]. +Molecular Mechanics (MM) simulations (e.g., molecular dynamics and Monte Carlo) provide a third method of scientific discovery, adding to the traditional theoretical and experimental scientific methods [@Mielke:2019; @Siegfried:2014]. Experimental methods measure the data under set conditions (e.g., temperature and pressure), whereas the traditional theoretical methods are based on analytical equations, and sometimes their constants are fitted to experimental data. The MM simulations are deterministic and stochastic, and their models, commonly known as "force fields", can be optimized to match experimental data, similar to analytical theory-based methods [@Allen:2017; @Frekel:2002; @Jorgensen:1996; @Martin:1998; @Weiner:1984; @Weiner:1986; @Potoff:2009; @Hemmen:2015; @Errington:1999]. In larger, more complex systems, the stochastic simulation's molecules can jump large energy barriers that deterministic simulations may not be able to overcome in a reasonable timeframe, even with modern computing capabilities [@Allen:2017; @Frekel:2002]. However, deterministic and stochastic systems that provide adequate sampling for calculating a given property can provide critical insights into the system's phase space, which are not obtainable via traditional theoretical and experimental methods. Additionally, molecular simulations provide critical insights from visualizations and by obtaining chemical or material properties that do not currently exist, are not easily attainable (e.g., too expensive or dangerous) by traditional theoretical and experimental methods [@Hollingsworth:2018; @Hirst:2014], or require hard-to-achieve conditions, such as very high pressures and temperatures [@Yu:2023; @Koneru:2022; @Swai:2020; @Kumar:2022; @Louie:2021]. However, the force field parameters are ideally determined from Quantum Mechanics (QM) simulations or other methods, including the vibrational spectrum and machine learning methods [@Kania:2021; @Friederich:2018; @Vermeyen:2023; @Mayne:2013; @Schmid:2011; @Vanommeslaeghe:2014]. The MM proper dihedrals (i.e., dihedrals) are challenging to obtain if they don't currently exist for the chosen force field, inaccurately scale-up in larger molecules, or misbehave with other moiety combinations, provided some were separately derived using small molecules [@Kania:2021; @Mayne:2013]. While the same QM simulations can fit the dihedrals in most force field types, these dihedrals aren't easily transferable between force fields due to the differing parameters and formulas, including the combining rules and 1-4 scaling factors. [@Huang:2013; @Vanommeslaeghe:2010; @Vanommeslaeghe:2014; @Chen:2015]. - -The `MoSDeF-Dihedral-Fit` [@Crawford:2023b] library lets users quickly calculate the MM proper dihedrals (dihedrals) directly from the QM simulations for several force fields (OPLS, TraPPE, AMBER, Mie, and Exp6) [@Jorgensen:1996; @Martin:1998; @Weiner:1984; @Weiner:1986; @Potoff:2009; @Hemmen:2015; @Errington:1999]. The user simply has to generate or use an existing Molecular Simulation Design Framework (MoSDeF) force field XML file [@Cummings:2021; @Summers:2020; @GMSO:2019; @forcefield-utilities:2022], provide Gaussian 16 or Gaussian-style Quantum Mechanics (QM) simulation files that cover the dihedral rotation (typically, 0-360 degrees), and provide the molecular structure information in a mol2 format [@Gaussian16:2016]. The `MoSDeF-Dihedral-Fit` software uses the QM and MM data to fit the dihedral for the specific force field, fitting the constants for the OPLS dihedral equation form with the correct combining rules and 1-4 scaling factors, as specified in the MoSDeF XML force field file. This software also accounts for multiple instances of the dihedral and the molecular symmetry in the molecule, and automatically removes all the unusable cosine power series values due to this symmetry. The user can set other dihedral energies in the molecule to zero, allowing for a more flexible and accurate dihedral fit; this allows the multiple dihedral's conformational energies to be calculated from a single dihedral angle, a strategy that was used in some of the original OPLS dihedral fits. The `MoSDeF-Dihedral-Fit` software analytically calculates the Ryckaert-Bellemans (RB)-torsions and the periodic dihedral from the OPLS dihedral. If another form of the dihedral equation that is not currently supported is needed, the software outputs the raw data points to enable users to fit any other dihedral form. Therefore, the `MoSDeF-Dihedral-Fit` software allows the fitting of any dihedral form, provided the force fields and software it utilizes are supported by MoSDeF and MoSDeF-GOMC (which uses GPU Optimized Monte Carlo - GOMC) [@Crawford:2023a; @Crawford:2022; @Crawford:2023b; @Nejahi:2019; @Nejahi:2021], vmd-python [@vmd-python:2016] (a derivative or modified version of the original VMD software [@Humphrey:1996; @Stone:2001]), and the QM data is provided as a Gaussian output file, or a generalized Gaussian-style output form [@Gaussian16:2016]. +The `MoSDeF-Dihedral-Fit` [@Crawford:2023b] library lets users quickly calculate the MM dihedrals directly from the QM simulations for several force fields (OPLS, TraPPE, AMBER, Mie, and Exp6) [@Jorgensen:1996; @Martin:1998; @Weiner:1984; @Weiner:1986; @Potoff:2009; @Hemmen:2015; @Errington:1999]. The user simply has to generate or use an existing Molecular Simulation Design Framework (MoSDeF) force field XML file [@Cummings:2021; @Summers:2020; @GMSO:2019; @forcefield-utilities:2022], provide Gaussian 16 log or Gaussian-style QM simulation files that cover the dihedral rotation (typically, 0-360 degrees), and provide the molecular structure information in a mol2 format [@Gaussian16:2016]. The `MoSDeF-Dihedral-Fit` software uses the QM and MM data to produce the dihedral for the specific force field, fitting the constants for the OPLS dihedral form and then analytically converting them to the Ryckaert-Bellemans (RB)-torsions and the periodic dihedrals. The software outputs the calculated MM dihedral points, enabling users to fit unsupported dihedral forms, provided the force fields are supported by the MoSDeF, GPU Optimized Monte Carlo (GOMC),and MoSDeF-GOMC [@Crawford:2023a; @Crawford:2022; @Crawford:2023b; @Nejahi:2019; @Nejahi:2021], and vmd-python [@vmd-python:2016] software (a derivative of the VMD software [@Humphrey:1996; @Stone:2001]). # Statement of need -Many different types of Molecular Mechanics (MM) simulation models exist, which are also known as "force fields". While many of these force field parameters can be transferred between force fields, such as bonds, angles, and improper dihedrals (impropers), the proper dihedrals (dihedrals) can not be easily transferred due to the different combining rules (arithmetic and geometric) and 1-4 scaling factors (i.e., scaling factors between the 1st and 4th atoms) that were used in the development of the original parameters [@Berthelot:1898; @Good:1970; @Lorentz:1881]. The accuracy of these dihedral parameters for each force field is critical for molecular simulations to obtain the correct molecular conformations and configurations, which are absolutely required for understanding and analyzing the system's microstructure and physical properties (e.g., free energies, viscosities, adsorption loading, diffusion constants, and many more). +While many of these Molecular Mechanics (MM) force field parameters can be transferred between force fields, such as bonds, angles, and improper dihedrals (impropers), the proper dihedrals (dihedrals) can not be easily transferred due to the different combining rules (arithmetic and geometric) and 1-4 scaling factors (i.e., between the 1st and 4th bonded atoms) that were used in the development of the original parameters [@Berthelot:1898; @Good:1970; @Lorentz:1881]. The accuracy of these dihedral parameters is critical in obtaining the correct molecular conformations and configurations, which are absolutely required for understanding and analyzing the system's microstructure and physical properties (e.g., free energies, viscosities, adsorption loading, diffusion constants, and many more). + +While some dihedral fitting software currently exists, they only fit the CHARMM-style force fields [@Mayne:2013], or fit the dihedral constants to the final MM and QM energies, which need to be calculated by other means [@Guvench:2008]. Therefore, the molecular simulation community needs a generalized software package that imports QM and MM files, automatically reads and organizes the QM data, calculates the MM energies, auto-corrects the dihedral fit to account for multiple instances of the dihedral, and automatically removes the unusable cosine power series combinations due to this symmetry. The `MoSDeF-dihedral-fit` software accomplishes all this and automatically accounts for any of the common combining rules and the 1-4 scaling factors specified via the MoSDeF XML (i.e., force field) files [@Cummings:2021; @Summers:2020; @GMSO:2019; @forcefield-utilities:2022]. By allowing the user to set any other dihedral in the molecule to zero, this software avoids forcing one dihedral fit to correct the inaccurate forces of another dihedral, resulting in a problematic or bad cosine series fit; thus, providing a more flexible and accurate fit by combining multiple dihedral conformational energies in a single dihedral, a strategy used in the original and modern OPLS force fields [Jorgensen:1996: @Chao:2021]. For example, a carboxylic acid with an alkyl tail has two dihedrals in the same rotation cycle; the C-C-C-O: (O: = oxygen without hydrogen) dihedral is set to zero while the C-C-O-H dihedral is fit [@Jorgensen:1996; @Chao:2021; @Ganesh:2004]. The `MoSDeF-dihedral-fit` [@Crawford:2023b] API fills the missing gap by providing a generalized and easy solution to fitting dihedrals in their commonly used forms and outputting the MM dihedral data points so users can fit other custom dihedral forms. -While some dihedral fitting software currently exists, they only fit the CHARMM-style force fields [@Mayne:2013], or fit the dihedral constants to the final MM and QM energies, which need to be calculated by other means [@Guvench:2008]. Therefore, the molecular simulation community needs a generalized software package that imports QM and MM files, automatically reads and organizes the QM data, calculates the MM energies, and automatically fits the dihedral. Additionally, the molecular simulation community needs software that fits the dihedral to any force field style and auto-corrects the fit to account for multiple instances of the dihedral and molecular symmetry, since fitting these dihedrals is a high barrier to simulating new chemistry and materials if these parameters do not exist for the chosen force field. The `MoSDeF-dihedral-fit` software accomplishes all this and automatically accounts for any of the common combining rules, allows the zeroing out any other dihedrals in the molecule (i.e., allowing a better fit and user control by setting the other dihedral energies to zero), and accounts for any 1-4 scaling factors specified via the MoSDeF XML files [@Cummings:2021; @Summers:2020; @GMSO:2019; @forcefield-utilities:2022], which contain the force fields. For example, two dihedrals in the same rotation cycle that both start at the carbon chain and end with a carboxylic acid demonstrate that setting one dihedral to zero is a typical practice, as the C-C-C-O: (O: = oxygen without hydrogen) dihedral is set to zero while the C-C-C-O (O = oxygen with hydrogen) is the only fit or non-zero dihedral [@Jorgensen:1996; @Chao:2021; @Ganesh:2004]. Setting one dihedral to zero avoids potentially incorrect forces and a problematic or bad cosine series dihedral fit; otherwise, it would require fitting the first dihedral with a keytone or alcohol and then fitting the remaining dihedral in the carboxylic acid. The `MoSDeF-dihedral-fit` [@Crawford:2023b] API fills the missing gap by providing a generalized and easy solution to fitting dihedrals for any dihedral form that is allowable in the Molecular Simulation Design Framework (MoSDeF) and MoSDeF-GOMC (uses the GPU Optimized Monte Carlo - GOMC MM engine) software [@Crawford:2023a; @Crawford:2022; @Crawford:2023b; @Nejahi:2019; @Nejahi:2021]. +Commonly used force field dihedrals, such as OPLS, were originally fit when the QM simulations used to fit the dihedrals were computationally prohibitive. Due to these limitations, scientists assumed that the dihedral fits were transferable with all the atom classes in the dihedral fit; however, this is not always an accurate assumption. Some of the dihedrals were only fit to the first minimum and not the entire dihedral landscape, which can lead to errors in the predicted molecular conformations. These prior assumptions in the dihedral fits may also lead to problems in reproducibility in modified force fields. Today, with more advanced hardware and software, QM simulations can be conducted with more complex molecules, allowing for higher quality and customized dihedral fits. The `MoSDeF-dihedral-fit` software will enable scientists to create a generalized, or molecule-specific dihedral parameters, quickly, accurately and reproducibly. -Commonly used force field dihedrals, such as OPLS, were originally fit when the QM simulations used to fit the dihedrals were computationally prohibitive. Due to these limitations, scientists assumed that the dihedral fits were transferable with all the atom classes in the dihedral fit; however, this is not always an accurate assumption. Some of the dihedrals were only fit to the first minimum and not the entire dihedral landscape, which can lead to errors in the predicted molecular conformations. These prior assumptions in the dihedral fits may also lead to problems in reproducibility in modified force fields. Today, with more advanced hardware and software, QM simulations can be conducted with more complex molecules, allowing for higher quality and customized dihedral fits. The `MoSDeF-dihedral-fit` software will enable scientists to create a generalized, or molecule-specific dihedral parameters, quickly, accurately and reproducibly for any force field. # Acknowledgements This research was partially supported by the National Science Foundation (grants OAC-1835713, OAC-1835874, and CBET 2052438). Atomfold LLC also donated research and development time and computational resources for this research and software. Wayne State University Grid provided some of the computational resources used in this work. + # Mathematics Proper dihedral (dihedral) forms. @@ -102,8 +103,12 @@ $$\psi = \theta - 180^o$$ Periodic-dihedral: -$$U_{Periodic} = K_0 * (1 + cos(n_0*\theta - 90^o))$$ +$$U_{Periodic} = K_0 * (1 + cos(n_0*\theta - d_0))$$ + +$$+ K_1 * (1 + cos(n_1*\theta - d_1)) + K_2 * (1 + cos(n_2*\theta - d_2))$$ + +$$+ K_3 * (1 + cos(n_3*\theta - d_3)) + K_4 * (1 + cos(n_4*\theta) - d_4)$$ -$$+ K_1 * (1 + cos(n_1*\theta - 180^o)) + K_2 * (1 + cos(n_2*\theta))$$ +$$where: n_0 = 0 ; n_1 = 1 ; n_2 = 2 ; n_3 = 3 ; n_4 = 4 $$ -$$+ K_3 * (1 + cos(n_3*\theta - 180^o)) + K_4 * (1 + cos(n_4*\theta))$$ +$$d_0 = 90^o ; d_1 = 180^o ; d_2 = 0^o ; d_3 = 180^o ; d_4 = 0^o $$