MolDS is a collection of curated and standardized Molecular DataSets for benchmarking machine learning methods. A standardized split is provided for every dataset.
- All datasets are curated and standardized with the same procedure.
- We provide standardized data splits (see the summary for details).
- We provide tools for curating and standardizing new datasets.
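
To illustrate what a reproducible split can look like, here is a minimal sketch using only the Python standard library. It is not the splitting procedure used for these datasets; the function name `split_dataset`, the 80/10/10 fractions, and the fixed seed are all assumptions made for the example.

```python
import random


def split_dataset(records, frac_train=0.8, frac_valid=0.1, seed=0):
    """Split records into train/valid/test lists, reproducibly.

    Hypothetical helper: the fractions and seed are illustrative defaults,
    not the values used by the datasets in this repository.
    """
    if frac_train < 0 or frac_valid < 0 or frac_train + frac_valid > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")

    # Shuffle indices with a private, seeded generator so that the same
    # input always produces the same split.
    indices = list(range(len(records)))
    random.Random(seed).shuffle(indices)

    n_train = int(frac_train * len(records))
    n_valid = int(frac_valid * len(records))

    train = [records[i] for i in indices[:n_train]]
    valid = [records[i] for i in indices[n_train:n_train + n_valid]]
    test = [records[i] for i in indices[n_train + n_valid:]]
    return train, valid, test


# Example with a handful of SMILES strings (illustrative data only).
smiles = ["C", "CC", "CCC", "CCO", "CCN", "c1ccccc1", "CC(=O)O", "CCCl", "CO", "CN"]
train, valid, test = split_dataset(smiles)
print(len(train), len(valid), len(test))
```

Fixing the seed and storing the resulting assignment alongside the data is one way to make sure every method is benchmarked on identical train, validation, and test sets.
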
Prerequisites:
conda install -c conda-forge rdkit
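
After installing, a quick check like the following can confirm that the package is visible to the active Python environment. This is only a sketch; the helper name `is_installed` is hypothetical and not part of this repository.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


# Reports whether RDKit can be imported from the current environment.
print("rdkit available:", is_installed("rdkit"))
```

If this prints `False`, the conda environment that holds RDKit is probably not the one currently activated.
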