Some motifs in generated vocabulary are not parseable for rdkit #45

suyufeng · 2022-11-11T19:40:33Z

I am trying to build a customized language model. I found that the pattern "C1=CC=CCNCCcc[cH:1]CC=CCCCC=CCCC=CCCCCC=C1", generated by "get_vocab.py", is not parseable by RDKit.

As a result, when I run "preprocess.py", it reports an error at hgraph2graph/hgraph/vocab.py line 65, in count_inters:
inters = [a for a in mol.GetAtoms() if a.GetAtomMapNum() > 0]
AttributeError: 'NoneType' object has no attribute 'GetAtoms'

This happens because, inside the function vocab.py::count_inters, the code tries to convert the SMILES string to a mol, and Chem.MolFromSmiles returns None when it cannot parse the string:
line 64: mol = Chem.MolFromSmiles(s)

I would appreciate it if someone could provide a solution.
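
As a possible starting point, here is a minimal sketch of how the unparseable motifs could be separated out before "preprocess.py" reaches count_inters. The helper name `split_parseable` and its `parse` parameter are hypothetical and not part of hgraph2graph; the only RDKit behaviour it relies on is that `Chem.MolFromSmiles` returns None for a string it cannot parse. It does not fix whatever causes "get_vocab.py" to emit such motifs.

```python
def split_parseable(smiles_list, parse=None):
    """Hypothetical helper: split strings into (parseable, rejected) lists.

    `parse` is any callable that returns None for a string it cannot parse.
    It defaults to RDKit's Chem.MolFromSmiles.
    """
    if parse is None:
        # Imported lazily so the helper can also be exercised with a stub parser.
        from rdkit import Chem
        parse = Chem.MolFromSmiles  # returns None on a parse failure

    good, bad = [], []
    for s in smiles_list:
        if parse(s) is not None:
            good.append(s)
        else:
            bad.append(s)
    return good, bad
```

Calling `split_parseable(motifs)` on the list of motif SMILES strings taken from the generated vocabulary would return the motifs RDKit accepts and the ones it rejects, so the rejected ones can be inspected or skipped instead of triggering the AttributeError above.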
