The Nonpher Python package contains functions for generating hard-to-synthesize (HS) structures as well as several functions for calculating molecular complexity. Nonpher uses the molecular morphing algorithm implemented in the Molpher-lib [https://github.com/lich-uct/molpher-lib] library. In molecular morphing, new structures are generated iteratively by simple structural changes, such as the addition or removal of an atom or a bond. In Nonpher, molecular morphing was tuned so that it yields structures that are hard to synthesize without being overly complex. The HS structures generated by Nonpher can be used as negative examples for training machine learning classifiers. The molecular morphing approach is described in Hoksza D. et al., J. Cheminform. 2014 Mar 21;6(1):7 [https://dx.doi.org/10.1186/1758-2946-6-7] and Nonpher in Voršilák M. and Svozil D., J. Cheminform. 2017 Mar 20;9(1):20 [https://dx.doi.org/10.1186/s13321-017-0206-2].
- Linux 64-bit (at the moment Molpher-lib is compiled only for 64-bit Linux systems; Nonpher itself is otherwise platform independent)
- RDKit
- Molpher-lib >=0.0.0b2
For the conda installation, because of some incompatibilities between packages, Molpher-lib is pinned to 0.0.0b2, RDKit to 2018.3.1 and libboost to 1.65.1. With a newer or development version of Molpher-lib, these requirements are less strict.
Nonpher is distributed as a conda package. At the moment, this is the preferred way to install and use the library. All you need is the full Anaconda[https://www.anaconda.com/] distribution or its lightweight variant, Miniconda[https://docs.conda.io/en/latest/miniconda.html]. Conda is essentially a Python distribution, package manager and virtual environment tool in one, and it makes setting up a development environment for any project very easy. After installing Anaconda/Miniconda (and preparing an environment), you can run the following in the Linux terminal:
conda install -c rdkit -c lich nonpher
Once you have installed RDKit[https://www.rdkit.org/] and Molpher-lib[https://github.com/lich-uct/molpher-lib], you can download/clone Nonpher and install it with the following command:
python setup.py install
To generate HS structures, Nonpher requires the starting molecules to have ions and charges removed. The input to Nonpher is a CSV file in which each line consists of the input compound ID and the input compound SMILES. The output CSV file contains the input compound ID, the input compound SMILES and the output HS compound SMILES. The script is run with the following command:
nonpher [-h] [-H] [INPUT_FILE [OUTPUT_FILE]]
where the -H parameter instructs the script to skip the first line (header) of the input CSV file.
Alternatively, from the directory where you downloaded Nonpher [NONPHER_REPOSITORY/nonpher], you can run the script directly:
python nonpher.py [-h] [-H] [INPUT_FILE [OUTPUT_FILE]]
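Because the script expects an ID/SMILES CSV with ions and charges already removed, it can help to screen the starting compounds before writing the input file. The following is a minimal sketch, not part of Nonpher: the helper names, the comma delimiter and the header column names are assumptions, and the charge check is only a textual heuristic, not a replacement for proper standardization with a cheminformatics toolkit.

```python
import csv
import io
import re

# Heuristic only: a '+' or '-' inside square brackets marks a formally
# charged atom, e.g. [O-] or [NH3+]. A '-' outside brackets is a bond.
_CHARGED_ATOM = re.compile(r"\[[^\]]*[+-][^\]]*\]")


def looks_neutral_single_fragment(smiles):
    """Return True if the SMILES shows no obvious counter-ion or charge."""
    # A '.' separates disconnected fragments such as salts.
    return "." not in smiles and _CHARGED_ATOM.search(smiles) is None


def write_nonpher_input(compounds, handle, with_header=False):
    """Write (compound_id, smiles) pairs as CSV rows.

    Compounds that look charged or multi-fragment are left out;
    their IDs are returned so they can be cleaned up separately.
    """
    writer = csv.writer(handle)  # assumption: comma-delimited input
    if with_header:
        # Hypothetical column names; run the script with -H to skip this row.
        writer.writerow(["id", "smiles"])
    skipped = []
    for compound_id, smiles in compounds:
        if looks_neutral_single_fragment(smiles):
            writer.writerow([compound_id, smiles])
        else:
            skipped.append(compound_id)
    return skipped


compounds = [
    ("aspirin", "O=C(C)Oc1ccccc1C(=O)O"),
    ("sodium_acetate", "CC(=O)[O-].[Na+]"),
    ("glycine_zwitterion", "[NH3+]CC(=O)[O-]"),
]
buffer = io.StringIO()
skipped_ids = write_nonpher_input(compounds, buffer)
print(buffer.getvalue(), end="")
print("skipped:", skipped_ids)
```

When writing to a real file instead of an in-memory buffer, open it with `open(path, "w", newline="")` as the `csv` module recommends, then pass that file as INPUT_FILE.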
from nonpher import nonpher

# Start from aspirin (given as SMILES) and morph it toward a hard-to-synthesize structure
morph = nonpher.complex_nonpher("O=C(C)Oc1ccccc1C(=O)O")