
Build an iterative phonon flow #306

Open · wants to merge 54 commits into base: main
Conversation

JaGeo (Collaborator) commented Dec 21, 2024

Very first draft for an iterative phonon flow. There are still many, many things to solve before this can work.

  • make sure that a new random seed is used for each step (and that the order of the settings is considered in the seed computation)

  • reuse the previous database

  • make sure that the phonon benchmark runs are reused as well

  • expose all relevant outputs to allow for an iterative procedure

  • random_seed + len(workflow_maker.volume_custom_scale_factors)

  • fix previous unit tests

  • fix reading MACE models

  • improve distort_type_1

  • make sure random-seed creation is handled correctly when very similar structures are provided in "structures"; currently, this is not correctly considered (it needs to be handled in the iterative flow as well)

  • fix write_benchmark_metrics

  • document that only a displacement of 0.01 is used in the benchmark

  • do some testing on the cluster to confirm the options and capabilities

These points will be moved into a subsequent issue.

  • add better tests for random structure generation and random seeds!
  • Further feature idea: only add new RSS structures where the benchmark result is not yet good enough
  • Clean up test data: I made the tests stricter and had to copy the previous test data to a new folder. We could try to clean it up; however, this should not lead to less strict tests. We need a test that ensures that the random seed is updated.
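The seed bookkeeping described in the checklist could be sketched roughly as follows (a minimal illustration only, assuming the flow advances the seed by the number of scale factors per iteration, mirroring the `random_seed + len(workflow_maker.volume_custom_scale_factors)` item above; the loop structure and function name are hypothetical):

```python
# Sketch: advance the random seed deterministically per iteration so that
# repeated runs are reproducible while each iteration samples new structures.
# next_seed() is a hypothetical helper; only volume_custom_scale_factors
# is named in the checklist above.

def next_seed(random_seed: int, volume_custom_scale_factors: list[float]) -> int:
    # One seed increment per generated scale factor, mirroring
    # random_seed + len(workflow_maker.volume_custom_scale_factors)
    return random_seed + len(volume_custom_scale_factors)

seed = 42
scale_factors = [0.95, 1.0, 1.05]
seeds = []
for _ in range(3):  # three iterations of the (hypothetical) iterative flow
    seeds.append(seed)
    seed = next_seed(seed, scale_factors)

print(seeds)  # [42, 45, 48]
```

A deterministic update like this also makes the requested "random seed is updated" test straightforward to write.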

JaGeo (Collaborator, Author) commented Dec 22, 2024

Disclaimer: This is a small programming project that I might work on for fun in the next weeks.

JaGeo marked this pull request as draft on Dec 22, 2024, 18:05
JaGeo changed the title from "Build an iterative phonon flow" to "WIP: Build an iterative phonon flow" on Dec 22, 2024
JaGeo marked this pull request as ready for review on Dec 22, 2024, 23:01
JaGeo changed the title from "WIP: Build an iterative phonon flow" to "Build an iterative phonon flow" on Dec 22, 2024
if volume_custom_scale_factors is None:
    if volume_scale_factor_range is None:
        volume_scale_factor_range = [0.90, 1.1]

JaGeo (review comment):
add warning that default scale factor range is used
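The suggested warning could look like the following (a sketch only; the surrounding Maker logic is simplified and the function wrapper is hypothetical):

```python
import warnings

def resolve_scale_factors(volume_custom_scale_factors=None,
                          volume_scale_factor_range=None):
    # Mirrors the snippet under review: fall back to a default range and,
    # as suggested in the review comment, warn the user that it is used.
    if volume_custom_scale_factors is None and volume_scale_factor_range is None:
        volume_scale_factor_range = [0.90, 1.1]
        warnings.warn(
            "No volume scale factors provided; using default "
            f"volume_scale_factor_range={volume_scale_factor_range}.",
            stacklevel=2,
        )
    return volume_custom_scale_factors, volume_scale_factor_range
```

`stacklevel=2` makes the warning point at the caller rather than the helper, which is usually what users want to see.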

Comment on lines +316 to +326:

new_atoms_list = [
    atoms
    for atoms in atoms_list
    if atoms.info["config_type"] != "IsolatedAtom"
]

ase.io.write(destination_file_path, new_atoms_list, append=True)

logging.info(
    f"File {self.pre_xyz_files[0]} has been copied to {destination_file_path}"
)
JaGeo (review comment):
Is this a good universal solution? Maybe I can reduce it to just one set of IsolatedAtom energies to avoid failures.
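One way to realize the "just one set of IsolatedAtom energies" idea would be to keep only the first isolated-atom entry per species when merging databases (a sketch with plain dicts standing in for `ase.Atoms` objects; the real code reads `atoms.info["config_type"]`, and the helper name is hypothetical):

```python
def dedupe_isolated_atoms(atoms_list):
    # Keep every non-isolated configuration, but only the first
    # IsolatedAtom entry per chemical species, so that isolated-atom
    # energies are not duplicated when databases are merged repeatedly.
    seen_species = set()
    kept = []
    for atoms in atoms_list:
        if atoms["info"].get("config_type") != "IsolatedAtom":
            kept.append(atoms)
        elif atoms["symbol"] not in seen_species:
            seen_species.add(atoms["symbol"])
            kept.append(atoms)
    return kept

db = [
    {"symbol": "Si", "info": {"config_type": "IsolatedAtom"}},
    {"symbol": "Si", "info": {"config_type": "phonon"}},
    {"symbol": "Si", "info": {"config_type": "IsolatedAtom"}},  # duplicate
]
print(len(dedupe_isolated_atoms(db)))  # 2
```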

Comment on lines -318 to -334 (removed):

if self.pre_database_dir and os.path.exists(self.pre_database_dir):
    if len(self.pre_xyz_files) == 2:
        files_new = ["train.extxyz", "test.extxyz"]
        for file_name, file_new in zip(self.pre_xyz_files, files_new):
            with (
                open(
                    os.path.join(self.pre_database_dir, file_name)
                ) as pre_xyz_file,
                open(file_new, "a") as xyz_file,
            ):
                xyz_file.write(pre_xyz_file.read())
            logging.info(f"File {file_name} has been copied to {file_new}")

    elif len(self.pre_xyz_files) > 2:
        raise ValueError(
            "Please provide a train and a test extxyz file (two files in total) for the pre_xyz_files."
        )
JaGeo (review comment):
This did not keep the train/test split ratio constant.
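To keep the split ratio constant when reusing a previous database, one could re-split the merged data instead of appending the old files wholesale (a sketch using plain lists; the actual flow works on train.extxyz/test.extxyz files, and the function name and fraction are illustrative assumptions):

```python
import random

def resplit(train, test, train_fraction=0.9, seed=0):
    # Merge old train/test data and draw a fresh split with a fixed
    # fraction, so the train/test ratio stays constant across iterations.
    merged = list(train) + list(test)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(merged)
    cut = int(round(train_fraction * len(merged)))
    return merged[:cut], merged[cut:]

old_train = [f"s{i}" for i in range(18)]
old_test = [f"t{i}" for i in range(2)]
new_train, new_test = resplit(old_train, old_test)
print(len(new_train), len(new_test))  # 18 2
```

Whether structures may migrate between train and test across iterations is a design choice; if test structures must stay held out, the shuffle would have to be restricted to new data only.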

Comment on lines 127 to 128:

print(train_data_path)
print(test_data_path)
JaGeo (review comment):
Remove these debug print statements.

@@ -1770,6 +1781,7 @@ def run_gap(num_processes_fit: int, parameters) -> None:
    open("std_gap_out.log", "w", encoding="utf-8") as file_std,
    open("std_gap_err.log", "w", encoding="utf-8") as file_err,
):
    print(*parameters)
JaGeo (review comment):
Remove this debug print.

JaGeo (Collaborator, Author) commented Dec 23, 2024

default_hyperparameters = load_mlip_hyperparameter_defaults(
This does not work correctly if the Maker is not used directly but is instead run later on the cluster. We have to think about alternative ways to load the default config.

See #305, but we might need to investigate whether the post-init is really sufficient for this case as well.
