How to avoid the unknown configurations while fitting. #359

Open
darjaved opened this issue Feb 26, 2024 · 4 comments

Comments

@darjaved

I enumerated configurations with Pymatgen. When I imported my DFT data, CASM created some additional configurations that have no DFT data (no energies or converged structures). These unknown configurations are still in my CASM project. My train file contains only the structures I have calculated, i.e. those with DFT data. When I run the command "casm-learn -s fit.json --checkhull > fit_log.txt", the output also lists those extra configurations, marked as "unknown".

-- Check: individual 0 --
Index: Selected #Selected CV RMS wRMS Estimator FeatureSelection Note

0: 1111111111111111111111011111111 30           0.026055716  0.010945006  0.013644882  Lasso                    SelectFromModel

Writing: /data/javeedd/BCC/CE/cluster_expansions/clex.formation_energy/calctype.default/ref.default/bset.default/eci.__tmp/eci.json

DFT ground states:
name comp(a) configname dft_hull_dist formation_energy clex_hull_dist clex(formation_energy) clex_dft_hull_dist
SCEL1_1_1_1_0_0_0/0 0.000 SCEL1_1_1_1_0_0_0/0 0.0 0.000000 0.000000 0.000942 0.0
SCEL12_6_2_1_1_5_2/2 0.250 SCEL12_6_2_1_1_5_2/2 0.0 -0.153580 0.000000 -0.146531 0.0
SCEL12_6_1_2_0_5_4/1 0.500 SCEL12_6_1_2_0_5_4/1 0.0 -0.225089 0.028147 -0.209743 0.0
SCEL8_2_2_2_0_0_0/2 0.625 SCEL8_2_2_2_0_0_0/2 0.0 -0.190727 0.031752 -0.160895 0.0
SCEL1_1_1_1_0_0_0/1 1.000 SCEL1_1_1_1_0_0_0/1 0.0 0.000000 0.000000 0.002189 0.0

Predicted ground states:
name comp(a) configname dft_hull_dist formation_energy clex_hull_dist clex(formation_energy) clex_dft_hull_dist
SCEL1_1_1_1_0_0_0/0 0.000000 SCEL1_1_1_1_0_0_0/0 0.00000000 0.00000000 0.0 0.000942 0.000000
SCEL12_6_2_1_1_5_2/2 0.250000 SCEL12_6_2_1_1_5_2/2 0.00000000 -0.15358013 0.0 -0.146531 0.000000
SCEL6_3_1_2_0_2_2/0 0.333333 SCEL6_3_1_2_0_2_2/0 unknown unknown 0.0 -0.195637 -0.028036
SCEL2_2_1_1_0_1_1/0 0.500000 SCEL2_2_1_1_0_1_1/0 unknown unknown 0.0 -0.237890 -0.028147
SCEL12_6_2_1_1_4_0/5 0.583333 SCEL12_6_2_1_1_4_0/5 0.00320163 -0.19897974 0.0 -0.208706 -0.031529
SCEL4_2_2_1_1_0_0/0 0.750000 SCEL4_2_2_1_1_0_0/0 unknown unknown 0.0 -0.144467 -0.037934
SCEL1_1_1_1_0_0_0/1 1.000000 SCEL1_1_1_1_0_0_0/1 0.00000000 0.00000000 0.0 0.002189 0.000000
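
To see at a glance which predicted ground states lack DFT data, the table above can be filtered with a short script. This is only a minimal sketch, assuming the whitespace-separated column layout shown in the log; `unknown_configs` is a hypothetical helper and not part of CASM.

```python
def unknown_configs(table_text):
    """Return (configname, comp(a), clex energy) for rows whose DFT columns read 'unknown'."""
    # Split every non-empty line into whitespace-separated fields.
    lines = [ln.split() for ln in table_text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    # Map each column name from the header to its position.
    col = {name: i for i, name in enumerate(header)}
    missing = []
    for row in rows:
        if row[col["dft_hull_dist"]] == "unknown":
            missing.append((
                row[col["configname"]],
                float(row[col["comp(a)"]]),
                float(row[col["clex(formation_energy)"]]),
            ))
    return missing


# A few rows copied from the "Predicted ground states" table in the log.
predicted = """\
name comp(a) configname dft_hull_dist formation_energy clex_hull_dist clex(formation_energy) clex_dft_hull_dist
SCEL1_1_1_1_0_0_0/0 0.000000 SCEL1_1_1_1_0_0_0/0 0.00000000 0.00000000 0.0 0.000942 0.000000
SCEL6_3_1_2_0_2_2/0 0.333333 SCEL6_3_1_2_0_2_2/0 unknown unknown 0.0 -0.195637 -0.028036
SCEL2_2_1_1_0_1_1/0 0.500000 SCEL2_2_1_1_0_1_1/0 unknown unknown 0.0 -0.237890 -0.028147
"""

for name, comp, energy in unknown_configs(predicted):
    print(name, comp, energy)
```

The configurations it reports are the ones the cluster expansion places on the predicted hull without a DFT energy to compare against.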

@xivh
Contributor

xivh commented Feb 28, 2024

Are they duplicate (non-primitive) structures, or are they unique? If they are unique, you probably want to keep them.

I don't use casm learn, but one thing you could try is to create a new selection file that excludes them, using the --subset option of casm select. Then read that selection file back in with casm select -c selection.json --set selected. The configurations should hopefully no longer show up (you can check with casm query).
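
The idea behind the selection approach, independent of the exact CASM commands, is to mark only the configurations that have DFT data as selected. Below is a rough sketch of that filtering step. It assumes the selection is available as a list of records with `name` and `selected` keys, which is an assumption made for illustration; the actual layout of a CASM selection file may differ. `deselect_uncalculated` is a hypothetical helper.

```python
def deselect_uncalculated(records, calculated_names):
    """Return copies of the records, selected only if the configuration has DFT data."""
    # dict(rec, selected=...) copies the record and overrides its 'selected' flag.
    return [dict(rec, selected=rec["name"] in calculated_names) for rec in records]


# Assumed record shape; names are taken from the log in the first post.
records = [
    {"name": "SCEL12_6_2_1_1_5_2/2", "selected": True},
    {"name": "SCEL6_3_1_2_0_2_2/0", "selected": True},
    {"name": "SCEL2_2_1_1_0_1_1/0", "selected": True},
]
# Configurations that appear in the train file, i.e. those with DFT energies.
calculated = {"SCEL12_6_2_1_1_5_2/2"}

filtered = deselect_uncalculated(records, calculated)
for rec in filtered:
    print(rec["name"], rec["selected"])
```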

@darjaved
Author

How should I keep them? I don't have DFT data for them. I think they were created from my calculated configurations during the CASM import.

@darjaved
Author

darjaved commented Mar 1, 2024

I think this is the same issue as the one discussed here:

#293 (comment)

@xivh
Contributor

xivh commented Mar 1, 2024

Try importing with the setting {"mapping": {"primitive_only": true}}.
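
For reference, that setting can be written to a JSON file like this. It is only a sketch: the file name `import_settings.json` is a placeholder chosen for illustration, and how the file is passed to the import step should be checked against the help output of your CASM version.

```python
import json
from pathlib import Path

# The mapping option suggested above; everything else is left at its defaults.
settings = {"mapping": {"primitive_only": True}}

# Placeholder file name, not prescribed by CASM.
path = Path("import_settings.json")
path.write_text(json.dumps(settings, indent=2))

print(path.read_text())
```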

You can check if they are primitive by querying is_primitive.
