Validation loss vs training loss #708

alfonso2166 · 2024-11-21T16:46:30Z

alfonso2166
Nov 21, 2024

Hi all,
My training runs without problems. However, I was wondering whether it makes sense to plot the validation and training loss as a function of epoch. I did that for my training runs while varying max_L and r_max, and got almost identical results across those settings (plot attached).

Mace_Training.pdf
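For readers who want to reproduce a loss-versus-epoch plot like the one described above, here is a minimal sketch. It assumes that each line of the `*_train.txt` file is a JSON record with `mode` (`"opt"` for training, `"eval"` for validation), `epoch`, and `loss` fields; that layout is an assumption and should be checked against your own files. The helper names `parse_loss_log` and `plot_losses` are hypothetical, and the sample records are made up for illustration.

```python
import json
from collections import defaultdict


def parse_loss_log(lines):
    """Return {mode: {epoch: mean loss}} from JSON-lines records.

    Assumed record layout (not confirmed by the thread):
    {"mode": "opt" | "eval", "epoch": int, "loss": float, ...}
    """
    collected = defaultdict(lambda: defaultdict(list))
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON record
        if not isinstance(record, dict):
            continue
        mode = record.get("mode")
        epoch = record.get("epoch")
        loss = record.get("loss")
        if mode is None or epoch is None or loss is None:
            continue  # skip records without a usable epoch or loss
        collected[mode][epoch].append(float(loss))
    # Average all entries that share a mode and an epoch.
    return {
        mode: {ep: sum(vals) / len(vals) for ep, vals in sorted(per_epoch.items())}
        for mode, per_epoch in collected.items()
    }


def plot_losses(curves, out_path="loss_vs_epoch.png"):
    """Plot one curve per mode on a logarithmic loss axis."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for mode, per_epoch in curves.items():
        ax.plot(list(per_epoch.keys()), list(per_epoch.values()), label=mode)
    ax.set_xlabel("epoch")
    ax.set_ylabel("loss")
    ax.set_yscale("log")
    ax.legend()
    fig.savefig(out_path)


# Made-up records, only to show the expected input shape.
sample_lines = [
    '{"mode": "opt", "epoch": 0, "loss": 2.0}',
    '{"mode": "opt", "epoch": 0, "loss": 4.0}',
    '{"mode": "eval", "epoch": 0, "loss": 1.0}',
    '{"mode": "opt", "epoch": 1, "loss": 1.5}',
    '{"mode": "eval", "epoch": 1, "loss": 0.8}',
    "this line is not JSON and is ignored",
]
curves = parse_loss_log(sample_lines)
print(curves)

# With a real file:
# with open("L0_R4_mace01_run-123_train.txt") as fh:
#     plot_losses(parse_loss_log(fh))
```

Plotting the runs for different max_L and r_max on the same axes makes it easier to see whether the curves really coincide or only look similar on a linear scale.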

Specs of my training are below:

model: "MACE"
max_L: 0
r_max: 4.0
name: "mace01"
model_dir: "MACE_models"
log_dir: "MACE_models"
checkpoints_dir: "MACE_models"
results_dir: "MACE_models"
compute_stress: True
compute_forces: True
train_file: "train.xyz"
valid_fraction: 0.10
test_file: "test.xyz"
num_interactions: 2
correlation: 3
num_channels: 64
energy_key: "energy_xtb"
forces_key: "forces_xtb"
stress_key: "REF_stress"
error_table: "PerAtomRMSEstressvirials"
loss: "weighted"
forces_weight: 1.0
energy_weight: 1.0
stress_weight: 1.0
config_type_weights: '{"Default":1.0}'
device: cuda
batch_size: 10
max_num_epochs: 100
swa: False
seed: 123

All the best,
Alfonso

ilyes319 · 2024-11-25T14:14:03Z

ilyes319
Nov 25, 2024
Maintainer

Hello @alfonso2166 ,

The effect of different cutoffs and model sizes is very system dependent. Some systems are well described by small models with small cutoffs. Could you tell us a bit more about what you are training on (system and level of theory)? Also, if you share your log files for the different runs, it would help me understand better what is happening.


alfonso2166 · 2024-12-08T04:38:36Z

alfonso2166
Dec 8, 2024
Author

Hi Ilyes,

Thanks for your response. I am training a model for water molecules confined between hexagonal boron nitride sheets. My data set was built with QBox using SCAN, with supercells containing 25 to 100 water molecules. I have attached some log and train files as requested.

From my plots, it looks like the validation loss is ~10 times the training loss. I was wondering whether this might be related to using 10% of my training set for validation.
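One way to put a number on the gap described above is to compute the ratio of validation to training loss at each epoch. The sketch below is only an illustration: the helper name `loss_ratio` is hypothetical, and the loss values are made up, not taken from the attached logs.

```python
def loss_ratio(train, valid):
    """Return {epoch: valid_loss / train_loss} for epochs present in both.

    `train` and `valid` map epoch -> loss. Epochs with a non-positive
    training loss are skipped to avoid dividing by zero.
    """
    common = sorted(set(train) & set(valid))
    return {ep: valid[ep] / train[ep] for ep in common if train[ep] > 0}


# Made-up numbers for illustration only.
train_loss = {0: 0.5, 1: 0.25, 2: 0.125}
valid_loss = {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0}  # epoch 3 has no training entry

ratios = loss_ratio(train_loss, valid_loss)
for epoch, ratio in ratios.items():
    print(f"epoch {epoch}: validation/training = {ratio:.1f}")
```

As a rough rule of thumb, a ratio that keeps growing with epoch is usually read as overfitting, while a ratio that stays roughly constant points more toward a systematic difference between the two subsets or between how the two losses are accumulated. Which case applies here cannot be decided without looking at the logs.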

Mace_Training.pdf

All the best,
Alfonso

L0_R4_mace01_run-123_train.txt
L0_R4_mace01_run-123.log
L0_R6_mace01_run-123_train.txt
L0_R6_mace01_run-123.log
L0_R7_mace01_run-123_train.txt
L0_R7_mace01_run-123.log

