Objectives #29

Closed
t-kimber opened this issue Jun 9, 2021 · 3 comments

t-kimber commented Jun 9, 2021

Objectives

maxsmi code:

  • Provides confidence in the prediction using ensemble learning
  • Provides augmentation strategies to be used e.g. in benchmarks
  • Provides a guide to best practices in SMILES augmentation
  • Provides a systematic way of generating SMILES augmentation
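One way to read the first objective, confidence from ensemble learning, is to predict on several augmented SMILES of the same molecule and summarise the spread. The sketch below assumes that reading; `aggregate_predictions` and the `(molecule_id, prediction)` record layout are hypothetical names for illustration, not the maxsmi API.

```python
from collections import defaultdict
from statistics import mean, pstdev


def aggregate_predictions(records):
    """Group per-SMILES predictions by molecule and summarise them.

    `records` is an iterable of (molecule_id, prediction) pairs, one pair
    per augmented SMILES string (hypothetical layout, assumed here).
    Returns {molecule_id: (mean_prediction, population_std)}.
    """
    grouped = defaultdict(list)
    for molecule_id, prediction in records:
        grouped[molecule_id].append(prediction)
    # The mean serves as the ensemble prediction; the standard deviation
    # across SMILES variants is a simple proxy for confidence.
    return {
        molecule_id: (mean(preds), pstdev(preds))
        for molecule_id, preds in grouped.items()
    }
```

A small standard deviation means the model gives similar answers for different spellings of the same molecule; a large one flags predictions to treat with caution.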

maxsmi research:

  • Does augmentation improve performance independently of the model?
  • Does performance increase strictly with augmentation, or does it reach a plateau?
  • What is the best augmentation strategy?
  • Can the model learn the inherent symmetry of the molecules using controlled duplication?
  • (Show that, among string encodings of molecules, SMILES augmentation works better than SMILES, SELFIES and DeepSMILES.)
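The question about controlled duplication comes down to whether repeated SMILES strings are kept or dropped when sampling. A minimal sketch of the two extremes follows, assuming a random SMILES generator is available as a zero-argument callable. The function names and the toy sampler are hypothetical and stand in for a real generator; they are not taken from maxsmi.

```python
import random


def augment_with_duplication(sample, n):
    """Draw n random SMILES and keep every draw, repeats included."""
    return [sample() for _ in range(n)]


def augment_without_duplication(sample, n, max_tries=1000):
    """Draw until n distinct SMILES are collected or the try budget ends.

    Small or highly symmetric molecules have few distinct SMILES, so
    fewer than n strings may be returned.
    """
    unique, seen = [], set()
    tries = 0
    while len(unique) < n and tries < max_tries:
        smiles = sample()
        tries += 1
        if smiles not in seen:
            seen.add(smiles)
            unique.append(smiles)
    return unique


# Toy stand-in for a random SMILES generator: four spellings of ethanol.
ETHANOL_VARIANTS = ["CCO", "OCC", "C(C)O", "C(O)C"]
_rng = random.Random(0)


def toy_sampler():
    return _rng.choice(ETHANOL_VARIANTS)
```

With duplication, the number of repeats of each string reflects how often the generator produces it; without duplication, every molecule contributes each distinct string at most once.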

Datasets

| Size / Dataset | Data before processing | Data after processing | Train set (before augmenting), 80% | Test set (before augmenting), 20% |
| --- | --- | --- | --- | --- |
| ESOL | 1128 | 1128 | 902 | 226 |
| ESOL_small | 1128 | 1068 | 854 | 214 |
| free solv | 642 | 642 | 513 | 129 |
| lipophilicity | 4200 | 4199 | 3359 | 840 |
| ChEMBL28 | 6026 | 5849 | 4679 | 1170 |
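The train and test counts above are consistent with rounding the 20% test share up to a whole molecule and assigning the rest to training. This is an observation from the numbers, not a statement about how maxsmi performs the split; `split_sizes` is a hypothetical helper that reproduces the arithmetic.

```python
def split_sizes(n_molecules, test_percent=20):
    """Return (n_train, n_test) for a split made before augmentation.

    The test size is rounded up with integer ceiling division, and the
    remaining molecules go to the train set (assumed rounding rule).
    """
    n_test = (n_molecules * test_percent + 99) // 100
    n_train = n_molecules - n_test
    return n_train, n_test


# Processed dataset sizes taken from the table above.
processed_sizes = {
    "ESOL": 1128,
    "ESOL_small": 1068,
    "free solv": 642,
    "lipophilicity": 4199,
    "ChEMBL28": 5849,
}

for name, size in processed_sizes.items():
    print(name, split_sizes(size))
```

Splitting on molecules before augmenting keeps all SMILES variants of one molecule on the same side of the split, so no molecule appears in both train and test sets.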

Results

Issue #28 was created to answer the questions from the objectives.

Note:

  • Delaney corresponds to ESOL
  • SAMPL corresponds to free solv

MoleculeNet results:

t-kimber pinned this issue Jun 11, 2021
t-kimber added the "good first issue" label Jun 11, 2021
t-kimber self-assigned this Jun 11, 2021

MoleculeNet RMSE results
Screenshot 2021-06-11 at 10 51 28

CNF RMSE results
Screenshot 2021-06-11 at 11 03 20
