Merge pull request #9 from PhasesResearchLab/main

Update RS branch
PhasesResearchLab · Oct 25, 2023 · f84ec3b
2 parents ea29afd + 9d09a00
Showing 24 changed files with 253 additions and 50 deletions.
diff --git a/.github/workflows/fullTest.yml b/.github/workflows/fullTest.yml
@@ -1,6 +1,6 @@
 name: Full Test
 
-on: [push, pull_request, release]
+on: [push, pull_request]
 
 jobs:
   testPython309:

diff --git a/.github/workflows/partialTest.yml b/.github/workflows/partialTest.yml
@@ -1,6 +1,6 @@
 name: Core Functions
 
-on: [push, pull_request, release]
+on: [push, pull_request]
 
 jobs:
   coreTest:

diff --git a/.github/workflows/publishPyPI.yml b/.github/workflows/publishPyPI.yml
@@ -0,0 +1,33 @@
+name: Upload to PyPI
+
+on:
+  push:
+    tags:
+      - '**'
+
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    steps:
+    - uses: actions/checkout@v3
+
+    - name: Set up Python
+      uses: actions/setup-python@v4
+      with:
+        python-version: '3.10'
+        cache: 'pip'
+        cache-dependency-path: 'pyproject.toml'
+
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install build
+        
+    - name: Build package
+      run: python -m build
+
+    - name: Publish package
+      uses: pypa/[email protected]
+      with:
+        user: __token__
+        password: ${{ secrets.PYPI_API_TOKEN }}
diff --git a/.github/workflows/weeklyTesting.yml b/.github/workflows/weeklyTesting.yml
@@ -50,7 +50,7 @@ jobs:
           python -m pip install --upgrade pip
           python -m pip install --upgrade setuptools
           python -m pip install wheel flask pytest
-          python -m pip install -e .
+          python -m pip install -e ".[dev]"
       - name: Download Models
         run: python -c "import pysipfenn; c = pysipfenn.Calculator(); c.downloadModels(); c.loadModels();"
 

diff --git a/README.md b/README.md
@@ -104,7 +104,7 @@ one of the required versions of Python (3.9+) is used and there are no dependenc
 installed on your system (see instructions at https://docs.conda.io/en/latest/miniconda.html), you can create a 
 new environment with:
 
-    conda create -n pysipfenn-workshop python=3.9 jupyter
+    conda create -n pysipfenn-workshop python=3.10 jupyter
     conda activate pysipfenn-workshop
 
 And then simply install pySIPFENN from PyPI with

diff --git a/docs/docs_requirements.txt b/docs/docs_requirements.txt
@@ -1,4 +1,6 @@
-sphinx-github-changelog
+sphinx-github-changelog>=1.2.1
 sphinx-autodoc-typehints
-sphinx-rtd-theme
-myst-nb
+sphinx-rtd-theme>=1.3.0
+myst-nb>=0.17.2
+sphinx>5.0.0
+pytest
diff --git a/docs/exportingmodels.md b/docs/exportingmodels.md
@@ -81,7 +81,7 @@ and you should see new files like `MyModelNameGoesHere_simplified_fp16.onnx` in
 
 ## PyTorch
 
-This is the simplest of the export methods because, as mentioned in [ONNXExporter](#onnxexporter) section, pySIPFENN 
+This is the simplest of the export methods because, as mentioned in [ONNXExporter](#ONNXExporter) section, pySIPFENN 
 models are already stored as PyTorch models; therefore, no conversion is needed. You can use it by simply calling 
 
     from pysipfenn import PyTorchExporter
@@ -106,7 +106,7 @@ other platforms as well, such as Linux or Windows, through [coremltools](https:/
 from Apple used by this exporter.
 
 Note that under the hood, CoreML uses the float16 precision, so the model predictions will numerically match those
-exported with [ONNXExporter](#onnxexporter) in float16 precision rather than the default pySIPFENN models. This can
+exported with [ONNXExporter](#ONNXExporter) in float16 precision rather than the default pySIPFENN models. This can
 be useful if you want to use the models on devices with limited memory, such as mobile phones or embedded devices, and 
 generally should not significantly affect the accuracy of the predictions.
 

diff --git a/docs/faq.md b/docs/faq.md
@@ -88,6 +88,14 @@ bandwidth. In such a case, we recommend trying again in a day or two.
 If your download is slow or fails during normal time periods, please let us know, and we
 will try to help you by providing the files directly.
 
+## Known problems with easy solutions
+
+### Torch and ONNX Issues
+
+- Some users have recently reported an `ImportError: cannot import name 'COMMON_SAFE_ASCII_CHARACTERS' from 'charset_normalizer.constant'` error when using a recent (mid-2023) `torch` on macOS with M1/M2 Macs. It appears to be a dependency issue between `onnx2torch` and `torchvision`, and it can be quickly solved by installing the missing `chardet` package with pip:
+
+        pip install chardet
+
 ## More Complex Issues
 
 ### Out-Of-Memory Error / Models Cannot Load

diff --git a/docs/index.rst b/docs/index.rst
@@ -7,7 +7,7 @@
 pySIPFENN
 =========
 
-|GitHub top language| |PyPI - Python Version| |PyPI Version| |PyPI Downloads| |GitHub license|
+|GitHub top language| |PyPI - Python Version| |GitHub license| |PyPI Version| |PyPI Downloads|
 
 |Commit Build Status| |Build Status|  |Coverage Status|
 
@@ -43,9 +43,9 @@ pySIPFENN
     :alt: Coverage Status
     :target: https://codecov.io/gh/PhasesResearchLab/pySIPFENN
 
-.. |GitHub license| image:: https://img.shields.io/github/license/PhasesResearchLab/pySIPFENN
+.. |GitHub license| image:: https://img.shields.io/badge/License-LGPL_v3-blue.svg
     :alt: GitHub license
-    :target: https://github.com/PhasesResearchLab/pySIPFENN
+    :target: https://www.gnu.org/licenses/lgpl-3.0
 
 .. |GitHub last commit| image:: https://img.shields.io/github/last-commit/PhasesResearchLab/pySIPFENN?label=Last%20Commit
     :alt: GitHub last commit (by committer)

diff --git a/docs/install.md b/docs/install.md
@@ -9,7 +9,7 @@ one of the required versions of Python (3.9+) is used and there are no dependenc
 installed on your system (see [Miniconda install instructions](https://docs.conda.io/en/latest/miniconda.html)), you can create a 
 new environment with:
 
-    conda create -n pysipfenn python=3.9 jupyter numpy
+    conda create -n pysipfenn python=3.10 jupyter numpy
     conda activate pysipfenn
 
 And then simply install pySIPFENN from PyPI with

diff --git a/docs/source/pysipfenn.rst b/docs/source/pysipfenn.rst
@@ -10,4 +10,5 @@ Subpackages
    pysipfenn.core
    pysipfenn.descriptorDefinitions
    pysipfenn.modelsSIPFENN
+   pysipfenn.tests
 
diff --git a/docs/source/pysipfenn.tests.rst b/docs/source/pysipfenn.tests.rst
@@ -0,0 +1,67 @@
+pySIPFENN Tests
+===============
+
+Core pySIPFENN Functionalities
+------------------------------
+
+.. automodule:: pysipfenn.tests.test_pysipfenn
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+KS2022 Featurization Correctness
+--------------------------------
+
+.. automodule:: pysipfenn.tests.test_KS2022
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+KS2022 Dilute-Optimized Featurization Correctness
+-------------------------------------------------
+
+.. automodule:: pysipfenn.tests.test_KS2022_dilute
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+Ward2017 Featurization Correctness
+----------------------------------
+
+.. automodule:: pysipfenn.tests.test_Ward2017
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+Auto Runtime of All ONNX Models with Ward2017
+---------------------------------------------
+
+.. automodule:: pysipfenn.tests.test_AllCompatibleONNX_Ward2017
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+Accuracy of NN9 20 24 Predictions Against Reference
+---------------------------------------------------
+
+.. automodule:: pysipfenn.tests.test_Krajewski2020_NN9NN20NN24_ONNX
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+Model Exporters Runtime
+-----------------------
+
+.. automodule:: pysipfenn.tests.test_ModelExporters
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+Defining Custom Models
+----------------------
+
+.. automodule:: pysipfenn.tests.test_customModel
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
diff --git a/pyproject.toml b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "pysipfenn"
-version = "0.13.0"
+version = "0.13.1"
 authors = [
   { name="Adam Krajewski", email="[email protected]" },
     { name="Jonathan Siegel", email="[email protected]" },

diff --git a/pysipfenn/__init__.py b/pysipfenn/__init__.py
@@ -1,2 +1,3 @@
+print('Importing from top pySIPFENN namespace...')
 from pysipfenn.core.pysipfenn import *
-from pysipfenn.core.modelExporters import *
+from pysipfenn.core.modelExporters import *
diff --git a/pysipfenn/core/modelExporters.py b/pysipfenn/core/modelExporters.py
@@ -9,10 +9,8 @@
     from onnxconverter_common import float16
     from onnxsim import simplify
 except ModuleNotFoundError as e:
-    print(f'Could not import {e.name}.\n')
-    print('Dependencies for exporting to CoreML, Torch, and ONNX are not installed by default with pySIPFENN. You need '
-          'to install pySIPFENN in "dev" mode like: pip install -e "pysipfenn[dev]", or like pip install -e ".[dev]" if'
-          'you are cloned it. See pysipfenn.org for more details.')
+    print('Note: Export Dependencies are not installed by default. If you need them, you have to install pySIPFENN in '
+          '"dev" mode like: pip install -e "pysipfenn[dev]", or like pip install -e ".[dev]" (see pysipfenn.org)')
 
 
 class ONNXExporter:
@@ -280,4 +278,4 @@ def exportAll(self):
         """Export all loaded models to CoreML format with the export function."""
         for model in tqdm(self.calculator.loadedModels):
             self.export(model)
-        print('*****  Done exporting all models!  *****')
+        print('*****  Done exporting all models!  *****')
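
Note on the import guard changed in `modelExporters.py`: the `try`/`except ModuleNotFoundError` block treats the export dependencies as optional and only prints a hint when they are missing. A minimal standalone sketch of that pattern is shown below. `try_import` is a hypothetical helper that is not part of pySIPFENN, and the module names are placeholders chosen only for illustration.

```python
import importlib


def try_import(name):
    """Return the imported module, or None if it is not installed.

    Hypothetical helper for illustration; pySIPFENN uses a plain
    try/except block around its export-related imports instead.
    """
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError as e:
        # e.name holds the name of the module that could not be found
        print(f"Note: optional dependency {e.name!r} is not installed.")
        return None


# A standard-library module is always importable
json_mod = try_import("json")

# A placeholder name that should not exist on any system
missing = try_import("a_module_that_does_not_exist_12345")
```

Returning `None` instead of raising lets the rest of the package import cleanly, so code that needs the optional module can check for it at the point of use.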
diff --git a/pysipfenn/core/pysipfenn.py b/pysipfenn/core/pysipfenn.py
@@ -24,7 +24,7 @@
 
 # - add new ones here if extending the code
 
-__version__ = '0.13.0'
+__version__ = '0.13.1'
 __authors__ = [["Adam Krajewski", "[email protected]"],
                ["Jonathan Siegel", "[email protected]"]]
 __name__ = 'pysipfenn'

diff --git a/pysipfenn/tests/test_AllCompatibleONNX_Ward2017.py b/pysipfenn/tests/test_AllCompatibleONNX_Ward2017.py
@@ -8,8 +8,12 @@
 IN_GITHUB_ACTIONS = os.getenv("GITHUB_ACTIONS") == "true" and os.getenv("MODELS_FETCHED") != "true"
 
 class TestAllCompatibleONNX_Ward2017(unittest.TestCase):
+    '''_Requires the models to be downloaded first._ It then tests the **runtime** of the pySIPFENN on all POSCAR
+    files in the exampleInputFiles directory and persistence of the results in a CSV file.
+    '''
     @pytest.mark.skipif(IN_GITHUB_ACTIONS, reason="Test depends on the ONNX network files")
     def test_runtime(self):
+        '''Runs the test.'''
         c = pysipfenn.Calculator()
         with resources.files('pysipfenn').joinpath('tests/testCaseFiles/exampleInputFiles/') as exampleInputsDir:
             c.runFromDirectory(directory=exampleInputsDir, descriptor='Ward2017')
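
Note on the `IN_GITHUB_ACTIONS` flag used by the `skipif` marker above: the test is skipped only when it runs on GitHub Actions and the models have not been fetched. The sketch below restates that condition as a function over an environment mapping so the three cases can be checked directly. `should_skip_model_tests` is a hypothetical name used only here.

```python
import os


def should_skip_model_tests(env=os.environ):
    """Mirror the IN_GITHUB_ACTIONS expression from the test module.

    Skip only when running on GitHub Actions (GITHUB_ACTIONS == "true")
    and the models were not fetched (MODELS_FETCHED != "true").
    """
    return env.get("GITHUB_ACTIONS") == "true" and env.get("MODELS_FETCHED") != "true"


# CI run without downloaded models: tests depending on ONNX files are skipped
ci_without_models = should_skip_model_tests({"GITHUB_ACTIONS": "true"})

# CI run where a previous step fetched the models: tests run
ci_with_models = should_skip_model_tests(
    {"GITHUB_ACTIONS": "true", "MODELS_FETCHED": "true"}
)

# Local run: tests always run
local_run = should_skip_model_tests({})
```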

diff --git a/pysipfenn/tests/test_KS2022.py b/pysipfenn/tests/test_KS2022.py
@@ -9,8 +9,19 @@
 
 from pysipfenn.descriptorDefinitions import KS2022
 
+
 class TestKS2022(unittest.TestCase):
+    '''Tests the correctness of the KS2022 descriptor generation function by comparing the results to the reference data
+    for the first 25 structures in the exampleInputFiles directory, stored in the exampleInputFilesDescriptorTable.csv.
+    That file is also used to test the correctness of Ward2017, which is a superset of KS2022.
+    '''
     def setUp(self):
+        '''Reads the reference data from the exampleInputFilesDescriptorTable.csv file and the labels from the first
+        row of that file. Then it reads the first 25 structures from the exampleInputFiles directory and generates the
+        descriptors for them. The results are stored in the functionOutput list. It defines the emptyLabelsIndx list
+        that contains the indices of the labels that are not used in the KS2022 (vs Ward2017) descriptor generation. It
+        also persists the test results in the KS2022_TestResult.csv file.
+        '''
         with resources.files('pysipfenn'). \
                 joinpath('tests/testCaseFiles/exampleInputFilesDescriptorTable.csv').open('r', newline='') as f:
             reader = csv.reader(f)
@@ -35,7 +46,12 @@ def setUp(self):
         self.functionOutput = [KS2022.generate_descriptor(s).tolist() for s in tqdm(testStructures[:25])]
         with resources.files('pysipfenn').joinpath('tests/KS2022_TestResult.csv').open('w+', newline='') as f:
             f.writelines([f'{v}\n' for v in self.functionOutput[0]])
+
     def test_resutls(self):
+        '''Compares the results of the KS2022 descriptor generation function to the reference data on a field-by-field
+        basis by calculating the relative difference between the two and requiring it to be less than 1% for all fields
+        except 0-valued fields, where the absolute difference is required to be less than 1e-6.
+        '''
         for fo, trd, name in zip(self.functionOutput, self.testReferenceData, self.exampleInputFiles):
             for eli in self.emptyLabelsIndx:
                 trd.pop(eli)
@@ -50,12 +66,17 @@ def test_resutls(self):
 
 class TestKS2022Profiling(unittest.TestCase):
     '''Test the KS2022 descriptor generation by profiling the execution time of the descriptor generation function
-        for two example structures in serial and parallel (8 workers) mode.'''
+    for two example structures (JVASP-10001 and diluteNiAlloy).
+    '''
     def test_serial(self):
+        '''Test the serial execution of the descriptor generation function 4 times for each of the two examples.'''
         KS2022.profile(test='JVASP-10001', nRuns=4)
         KS2022.profile(test='diluteNiAlloy', nRuns=4)
 
     def test_parallel(self):
+        '''Test the parallel execution of the descriptor generation function 24 times for each of the two examples,
+        using up to 8 workers to speed up the execution.
+        '''
         KS2022.profileParallel(test='JVASP-10001', nRuns=24)
         KS2022.profileParallel(test='diluteNiAlloy', nRuns=24)
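
Note on the comparison rule described in the `test_resutls` docstring above (relative difference below 1%, or absolute difference below 1e-6 for 0-valued fields): a minimal sketch of such a check follows. `fields_match` and its parameter names are hypothetical, and the actual test body may implement the rule differently.

```python
def fields_match(computed, reference, rel_tol=0.01, abs_tol=1e-6):
    """Compare one descriptor field against its reference value.

    Hypothetical helper: for a 0-valued reference the relative difference
    is undefined, so an absolute tolerance is used instead.
    """
    if reference == 0:
        return abs(computed) < abs_tol
    return abs(computed - reference) / abs(reference) < rel_tol


# Within 1% of a non-zero reference
close_enough = fields_match(1.005, 1.0)

# 2% off a non-zero reference
too_far = fields_match(1.02, 1.0)

# 0-valued reference: only tiny absolute deviations are accepted
zero_ok = fields_match(1e-7, 0.0)
zero_bad = fields_match(1e-3, 0.0)
```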
 

diff --git a/pysipfenn/tests/test_KS2022_dilute.py b/pysipfenn/tests/test_KS2022_dilute.py
@@ -15,10 +15,11 @@ class TestKS2022(unittest.TestCase):
     def setUp(self):
         '''Import the labels expected for the KS2022 dilute descriptor (same as KS2022) and initialize 4 test materials
         (mp-13, mp-27, mp-165, mp-1211280) to be used in the tests. The 4 test cases should be sufficient to test the
-        dilute descriptor as general KS2022 is tested more extensively and problems should propagate to the dilute
+        dilute descriptor as general KS2022 is tested more extensively, and problems should propagate to the dilute
         featurizer. To create the dilute structures, 2x2x2 supercells of the test materials are created and the
         atom at site 0 is replaced with aluminum. Results for the first test case, comparing general KS2022, explicit
-        base, and implicit (pure) base, are persisted in the KS2022_dilute_TestReslt.csv'''
+        base, and implicit (pure) base, are persisted in the KS2022_dilute_TestReslt.csv
+        '''
 
         with resources.files('pysipfenn'). \
                 joinpath('descriptorDefinitions/labels_KS2022_dilute.csv').open('r', newline='') as f:
@@ -67,7 +68,10 @@ def test_resutls_assumePure(self):
 
     def test_resutls_explicitBase(self):
         '''Compare the KS2022_dilute featurizer results with general KS2022 using explicit base structures, i.e.
-        structures from before the dilute element was added.'''
+        structures from before the dilute element was added. Calculates the relative difference between the two and
+        requires it to be less than 1% for all fields except 0-valued fields, where the absolute difference is
+        required to be less than 1e-6.
+        '''
         for fo, trd, name in zip(self.functionOutput_explicitBase, self.testReferenceData, self.testMaterialsLabels):
             for p_fo, p_trd, l in zip(fo, trd, self.labels):
                 if p_trd>0.01 and p_fo>0.01:
@@ -80,10 +84,15 @@ def test_resutls_explicitBase(self):
 
 class TestKS2022_diluteProfiling(unittest.TestCase):
     '''Test the dilute version of KS2022 descriptor generation by profiling the execution time of the descriptor generation function
-        for one example structures in serial and parallel (8 workers) mode.'''
+    for one example dilute structure.
+    '''
     def test_serial(self):
+        '''Test the serial execution of the descriptor generation function 10 times.'''
         KS2022_dilute.profile(test='diluteNiAlloy', nRuns=10)
     def test_parallel(self):
+        '''Test the parallel execution of the descriptor generation function 64 times, using up to 8 workers to
+        speed up the execution.
+        '''
         KS2022_dilute.profileParallel(test='diluteNiAlloy', nRuns=64)
 
 

diff --git a/pysipfenn/tests/test_Krajewski2020_NN9NN20NN24_ONNX.py b/pysipfenn/tests/test_Krajewski2020_NN9NN20NN24_ONNX.py
@@ -17,8 +17,14 @@
     testStructure = Structure.from_file(f'{exampleInputsDir}/{testFile}')
 
 class TestKrajewski2020ModelsFromONNX(unittest.TestCase):
+    '''_Requires the NN9/20/24 models to be downloaded first._ It takes the 0-Cr8Fe18Ni4.POSCAR file from the
+    exampleInputFiles directory and calculates the energy with the NN9/20/24 models. The results are then compared to
+    the reference results obtained by authors using pySIPFENN (MxNet->ONNX->PyTorch) and SIPFENN (directly in MxNet)
+    to the 6th decimal place (0.001 meV/atom).
+    '''
     @pytest.mark.skipif(IN_GITHUB_ACTIONS, reason="Test depends on the ONNX network files")
     def test_resutls(self):
+        '''Runs the test.'''
         c = pysipfenn.Calculator()
         c.calculate_Ward2017(structList=[testStructure])
         c.makePredictions(models=c.loadedModels, toRun=toTest, dataInList=c.descriptorData)
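
Note on the "6th decimal place" comparison described in the class docstring above: one way to express it with `unittest` is `assertAlmostEqual(..., places=6)`, which checks that the difference rounds to zero at six decimal places. The sketch below assumes this approach; the numbers are made-up placeholders, not pySIPFENN reference results, and the class name is hypothetical.

```python
import unittest


class TestAgainstReference(unittest.TestCase):
    """Hypothetical test case comparing predictions to stored references."""

    def test_predictions(self):
        # Placeholder values for illustration only, not real model outputs
        reference = [0.1234567, -0.7654321]
        predicted = [0.12345674, -0.76543213]
        for p, r in zip(predicted, reference):
            # Passes when round(p - r, 6) == 0
            self.assertAlmostEqual(p, r, places=6)


# Run the case programmatically so the outcome can be inspected
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAgainstReference)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```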