Clean up rng #285

Merged: 15 commits, Jan 31, 2024
Changes from all commits
21 changes: 11 additions & 10 deletions README.md
@@ -11,16 +11,17 @@ For examples and more information see [documentation](https://libatoms.github.io

# Recent changes

Renames:
v0.2.0:

- `generic.run()` -> `generic.calculate()`
- `wfl.map.run()` -> `wfl.map.map()`
- `wfl.generate.md.sample()` -> `wfl.generate.md.md()`
- `wfl.generate.optimize.run()` -> `wfl.generate.optimize.optimize()`
- `wfl.generate.buildcell.run()` -> `wfl.generate.buildcell.buildcell()`
- `wfl.generate.minimahopping.run()` -> `wfl.generate.minimahopping.minimahopping()`
- `phonopy.run()` -> `phonopy.phonopy()`
- `smiles.run()` -> `smiles.smiles()`
- `wfl.descriptors.quippy.calc()` -> `wfl.descriptors.quippy.calculate()`
- Change all wfl operations to use an explicit random number generator ([pull 285](https://github.com/libAtoms/workflow/pull/285)). This improves the reproducibility of scripts and reduces the chance that, on rerun, cached jobs go unrecognized because of an uncontrolled change in random seed (as in [issue 283](https://github.com/libAtoms/workflow/issues/283) and [issue 284](https://github.com/libAtoms/workflow/issues/284)). Note that this change breaks backward compatibility, because many functions now _require_ an `rng` argument, for example
```python
rng = np.random.default_rng(1)
md_configs = md.md(..., rng=rng, ...)
```
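  Tools that accept only an integer seed rather than a `Generator` object (such as `smiles.smiles(..., randomSeed=...)` or the `rnd_seed` passed to GAP fitting in the example notebook changed below) can draw that seed from the same generator, so a single seed still controls all randomness in the script. A minimal sketch, assuming a NumPy `Generator` as above:
  ```python
  import numpy as np

  # one generator controls all randomness in the script
  rng = np.random.default_rng(1)

  # draw a 32-bit integer seed for tools that take an integer seed
  # rather than a Generator object (e.g. randomSeed or rnd_seed)
  derived_seed = int(rng.integers(np.iinfo(np.int32).max, dtype=np.int32))
  ```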

v0.1.0:

- make it possible to fire off several remote autoparallelized ops without waiting for their jobs to finish
- multi-pass calculation in `Vasp`, to allow for things like GGA followed by HSE
- MACE fitting, including remote jobs
- various bug fixes
4 changes: 2 additions & 2 deletions complete_pytest.tin
@@ -5,7 +5,7 @@ module load compiler/gnu python/system python_extras/quippy lapack/mkl
module load python_extras/torch/cpu

if [ -z "$WFL_PYTEST_EXPYRE_INFO" ]; then
echo "To override partition used, set WFL_PYTEST_EXPYRE_INFO='{"resources" : {"partitions": "DESIRED_PARTITION"}}'" 1>&2
echo "To override partition used, set WFL_PYTEST_EXPYRE_INFO='{\"resources\" : {\"partitions\": \"DESIRED_PARTITION\"}}'" 1>&2
fi

if [ ! -z $WFL_PYTHONPATH_EXTRA ]; then
@@ -66,7 +66,7 @@ echo "summary line $l"
# ===== 152 passed, 17 skipped, 3 xpassed, 78 warnings in 4430.81s (1:13:50) =====
lp=$( echo $l | sed -E -e 's/ in .*//' -e 's/\s*,\s*/\n/g' )

declare -A expected_n=( ["passed"]="160" ["skipped"]="21" ["warnings"]=828 ["xfailed"]=2 ["xpassed"]=1 )
declare -A expected_n=( ["passed"]="163" ["skipped"]="21" ["warnings"]=801 ["xfailed"]=2 ["xpassed"]=1 )
IFS=$'\n'
for out in $lp; do
out_n=$(echo $out | sed -e 's/^=* //' -e 's/ .*//' -e 's/,//')
54 changes: 22 additions & 32 deletions docs/source/examples.daisy_chain_mlip_fitting.ipynb
@@ -1,7 +1,6 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -22,7 +21,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -88,19 +86,14 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"metadata": {},
"outputs": [],
"source": [
"# set random seed, so that MD runs, etc are reproducible and we can check for RMSEs. \n",
"# this cell is hidden from tutorials. \n",
"random_seed = 20230301\n",
"np.random.seed(random_seed)"
"rng = np.random.default_rng(20230301)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -127,7 +120,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -157,7 +149,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -196,15 +187,16 @@
"metadata": {},
"outputs": [],
"source": [
"# regenerate smiles_confgis with a random seed set for testing purposes\n",
"# regenerate smiles_configs with a random seed set for testing purposes\n",
"# this cell is hidden from docs. \n",
"\n",
"outputs = OutputSpec(\"1.ch.rdkit.xyz\", overwrite=True)\n",
"smiles_configs = smiles.smiles(all_smiles, outputs=outputs, randomSeed=random_seed)"
"# set seed for smiles generation to a value from our (reproducible) random number generator\n",
"smiles_configs = smiles.smiles(all_smiles, outputs=outputs, randomSeed=int(rng.integers(np.iinfo(np.int32).max,\n",
" dtype=np.int32)))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -252,7 +244,7 @@
"# add random seet for testing purposes\n",
"# this cell is hidden from tutorials. \n",
"\n",
"md_params[\"autopara_rng_seed\"] = random_seed"
"md_params[\"rng\"] = rng"
]
},
{
@@ -272,7 +264,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -301,7 +292,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -353,7 +343,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -373,7 +362,7 @@
" inputs=md_soap_global,\n",
" outputs=outputs,\n",
" num=100, # target number of structures to pick\n",
" at_descs_info_key=\"SOAP\")\n",
" at_descs_info_key=\"SOAP\", rng=rng)\n",
" \n",
"\n",
"train_fname = \"6.1.train.xyz\"\n",
@@ -402,7 +391,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -482,7 +470,9 @@
"# set to None for github testing purposes\n",
"# This cell is hidden from being rendered in the docs. \n",
"# remote_info = None\n",
"gap_params[\"rnd_seed\"] = random_seed "
"\n",
"# set seed for gap fitting to a value from our (reproducible) random number generator\n",
"gap_params[\"rnd_seed\"] = rng.integers(2 ** 31) "
]
},
{
@@ -501,7 +491,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -528,11 +517,14 @@
" num_cores = 2,\n",
" partitions = \"standard\")\n",
"\n",
"# note - set OMP_NUM_THREADS to prevent GAP evaluation from using OpenMP, \n",
"# as it clashes with multiprocessing.pool, which wfl uses\n",
"remote_info = RemoteInfo(\n",
" sys_name = \"github\",\n",
" job_name = \"gap-eval\",\n",
" resources = resources,\n",
" check_interval = 10, \n",
" env_vars = [\"OMP_NUM_THREADS=1\"],\n",
" input_files = [\"gap.xml*\"])\n",
"\n",
"gap_calc_autopara_info = AutoparaInfo(\n",
@@ -549,7 +541,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -629,13 +620,13 @@
"\n",
"ref_errors = {\n",
" 'atomization_energy/atom': {\n",
" 'train': {'RMSE': 0.01784008096943666, 'MAE': 0.014956982529185385, 'count': 50}, \n",
" '_ALL_': {'RMSE': 0.01810274269833562, 'MAE': 0.015205129586306461, 'count': 100}, \n",
" 'test': {'RMSE': 0.018361647458989924, 'MAE': 0.015453276643427535, 'count': 50}}, \n",
" 'train': {'RMSE': 0.026086070005729697, 'MAE': 0.02188233965637977, 'count': 50},\n",
" '_ALL_': {'RMSE': 0.02631013293566519, 'MAE': 0.02217457525203221, 'count': 100},\n",
" 'test': {'RMSE': 0.026532303741682858, 'MAE': 0.02246681084768465, 'count': 50}},\n",
" 'forces/comp': {\n",
" 'train': {'RMSE': 0.6064186755127647, 'MAE': 0.440771517768513, 'count': 5037}, \n",
" '_ALL_': {'RMSE': 0.6150461768441347, 'MAE': 0.44579797424459006, 'count': 10074}, \n",
" 'test': {'RMSE': 0.6235543194385854, 'MAE': 0.450824430720667, 'count': 5037}}\n",
" 'train': {'RMSE': 0.6700484048236104, 'MAE': 0.48912067870708253, 'count': 5097},\n",
" '_ALL_': {'RMSE': 0.674843619212047, 'MAE': 0.49305958821652923, 'count': 10188},\n",
" 'test': {'RMSE': 0.6796105918104188, 'MAE': 0.49700313992928696, 'count': 5091}}\n",
"}\n",
"\n",
"print(errors)\n",
@@ -659,7 +650,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "dev",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -673,9 +664,8 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
"version": "3.9.16"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "793ab331f77558158f2e16fabf356357fde3f61b8f3bb6d95e9b59dbfcb88650"
6 changes: 4 additions & 2 deletions docs/source/examples.select_fps.ipynb
@@ -29,8 +29,10 @@
"metadata": {},
"outputs": [],
"source": [
"import wfl, pathlib\n",
"import numpy as np\n",
"import yaml\n",
"import pathlib\n",
"import wfl\n",
"from wfl.configset import ConfigSet, OutputSpec\n",
"from wfl.descriptors.quippy import calculate as calc_descriptors\n",
"from wfl.select.by_descriptor import greedy_fps_conf_global\n",
@@ -53,7 +55,7 @@
"# Step 2: Sampling\n",
"fps_out = OutputSpec(files=work_dir/\"out_fps.xyz\")\n",
"nsamples = 8\n",
"selected_configs = greedy_fps_conf_global(inputs=md_desc, outputs=fps_out, num=nsamples, at_descs_info_key='desc', keep_descriptor_info=False)"
"selected_configs = greedy_fps_conf_global(inputs=md_desc, outputs=fps_out, num=nsamples, at_descs_info_key='desc', keep_descriptor_info=False, rng=np.random.default_rng())"
]
},
{
7 changes: 4 additions & 3 deletions examples/iterative_gap_fit/batch_gap_fit.py
@@ -76,7 +76,7 @@ def get_ref_error(in_file, out_file, gap_file, **kwargs):
return error


def run_md(atoms, out_file, gap_file, **kwargs):
def run_md(atoms, out_file, gap_file, rng, **kwargs):
"""
Generates new configs via the wfl.generate_configs.md sample function.

Expand All @@ -91,7 +91,7 @@ def run_md(atoms, out_file, gap_file, **kwargs):
in_config = ConfigSet(atoms)
out_config = OutputSpec(files=out_file)
calculator = (Potential, None, {'param_filename': gap_file})
sample_md(in_config, out_config, calculator=calculator, **kwargs)
sample_md(in_config, out_config, calculator=calculator, rng=rng, **kwargs)
return None


@@ -305,6 +305,7 @@ def get_file_names(GAP_dir, MD_dir, fit_idx, calc='md'):

def main(max_count=5, verbose=False):
workdir = os.path.join(os.path.dirname(__file__))
rng = np.random.default_rng(1)

### GAP parameters
gap_params = os.path.join(workdir, 'multistage_gap_params.json')
@@ -367,7 +368,7 @@ def main(max_count=5, verbose=False):

if calc == 'md':
# Run an MD to create new structures
run_md(md_configs, files["calc_out"], files["gap"], **md_params)
run_md(md_configs, files["calc_out"], files["gap"], rng=rng, **md_params)
elif calc == 'optimize':
# Run an ase relaxation, to create new structures.
run_optimize(md_configs, files["calc_out"], files["gap"], **optimize_params)
2 changes: 1 addition & 1 deletion setup.py
@@ -2,7 +2,7 @@

setuptools.setup(
name="wfl",
version="0.1.0b",
version="0.2.0",
packages=setuptools.find_packages(exclude=["tests"]),
install_requires=["click>=7.0", "numpy", "ase>=3.21", "pyyaml", "spglib", "docstring_parser",
"expyre-wfl @ https://github.com/libAtoms/ExPyRe/tarball/main",
6 changes: 3 additions & 3 deletions tests/assets/cli_rss/job.test_cli_rss_create_ref.slurm
@@ -6,7 +6,7 @@
#SBATCH --exclusive
#SBATCH --output=test_cli_rss_create_ref.stdout
#SBATCH --error=test_cli_rss_create_ref.stderr
#SBATCH --time=2:00:00
#SBATCH --time=6:00:00

pwd

@@ -21,8 +21,8 @@ export OMP_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1
export MKL_NUM_THREADS=1

export VASP_COMMAND=vasp.serial
export VASP_COMMAND_GAMMA=vasp.gamma_serial
export ASE_VASP_COMMAND=vasp.serial
export ASE_VASP_COMMAND_GAMMA=vasp.gamma.serial
export VASP_PP_PATH=$VASP_PATH/pot/rev_54/PBE
export GRIF_BUILDCELL_CMD=$HOME/src/work/AIRSS/airss-0.9.1/src/buildcell/src/buildcell
