JSON Decode Error when using similarity search #55

Open
Cajac102 opened this issue Jul 28, 2021 · 2 comments

Comments

@Cajac102

Hey,

I am trying to search PubChem for similar compounds with this call:

similars = pcp.get_compounds(smile, 'smiles', searchtype='similarity', threshold=0.7, as_dataframe=True)

This works well for some SMILES, for example for "Cc1noc(C)c1Br".
But for others, e.g. "Cn1c(=O)c2nc(Cl)[nH]c2n(C)c1=O", I get the following error:

Traceback (most recent call last):
  File "/home/caro/leval/.snakemake/scripts/tmpqq4csqb1.find_pubchem_hits.py", line 37, in <module>
    similars = pcp.get_compounds("Cn1c(=O)c2nc(Cl)[nH]c2n(C)c1=O", 'smiles', searchtype='similarity', threshold=similarity_threshold, as_dataframe=True)
  File "/home/caro/leval/.snakemake/conda/db9d54b7c1d0500c41e4539e39469ab2/lib/python3.9/site-packages/pubchempy.py", line 321, in get_compounds
    results = get_json(identifier, namespace, searchtype=searchtype, **kwargs)
  File "/home/caro/leval/.snakemake/conda/db9d54b7c1d0500c41e4539e39469ab2/lib/python3.9/site-packages/pubchempy.py", line 299, in get_json
    return json.loads(get(identifier, namespace, domain, operation, 'JSON', searchtype, **kwargs).decode())
  File "/home/caro/leval/.snakemake/conda/db9d54b7c1d0500c41e4539e39469ab2/lib/python3.9/site-packages/pubchempy.py", line 288, in get
    status = json.loads(response.decode())
  File "/home/caro/leval/.snakemake/conda/db9d54b7c1d0500c41e4539e39469ab2/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/home/caro/leval/.snakemake/conda/db9d54b7c1d0500c41e4539e39469ab2/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/caro/leval/.snakemake/conda/db9d54b7c1d0500c41e4539e39469ab2/lib/python3.9/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 322816 column 7 (char 7196244)

If I replace the double quotation marks around the SMILES with single ones, I get:

json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 215438 column 3 (char 4812373)
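Single and double quotes produce the same Python string, so the differing error positions in the two runs hint that the response body itself may vary between requests. If that is the case (an assumption, not something confirmed here), retrying the call when parsing fails might be a workaround. Below is a minimal sketch; `call_with_json_retry` is a hypothetical helper and not part of PubChemPy.

```python
import json
import time


def call_with_json_retry(fetch, attempts=3, delay=1.0):
    """Call fetch() and retry if the response fails to parse as JSON.

    `fetch` is any zero-argument callable. Assumption: the decode error
    is transient, so a later attempt may succeed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except json.JSONDecodeError as err:
            last_error = err
            # Wait before the next attempt, but not after the final one.
            if attempt < attempts - 1:
                time.sleep(delay)
    # Every attempt failed: re-raise the most recent decode error.
    raise last_error


# Hypothetical usage with the call from this issue:
# import pubchempy as pcp
# similars = call_with_json_retry(
#     lambda: pcp.get_compounds(smile, 'smiles', searchtype='similarity',
#                               threshold=0.7, as_dataframe=True))
```

If every attempt fails at a similar position, the problem is probably not transient and the retry will not help.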

I would be glad if you could help me here!

Cheers,
Caro

@nbehrnd

nbehrnd commented Aug 11, 2021

Could you share a minimal working example (MWE) that reproduces this problem? With a minimal

import pubchempy as pcp


def retrieve_similar(structure=""):
    """Retrieve PubChem entries of similar structure."""
    similars = pcp.get_compounds(structure,
                                 'smiles',
                                 searchtype='similarity',
                                 threshold=0.7,
                                 as_dataframe=True)
    print(similars)


# the example working fine
retrieve_similar("Cc1noc(C)c1Br")

(or retrieve_similar("Cn1c(=O)c2nc(Cl)[nH]c2n(C)c1=O"), respectively), the output for both looks to me like a successful query of the database (Python 3.9.2, PubChemPy 1.0.4). For documentation, the archive below includes a Jupyter notebook with a single run of the code.

test_case_similarity.zip
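Since the interpreter and library versions are part of the comparison here, a short report of both can be attached to an MWE. The sketch below is one way to collect them; `environment_report` is a hypothetical helper, the distribution name "PubChemPy" is assumed to match the installed package, and `importlib.metadata` requires Python 3.8 or later.

```python
import platform
from importlib import metadata


def environment_report(packages=("PubChemPy",)):
    """Return the Python version and the versions of the named packages."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed under this name.
            report[name] = "not installed"
    return report


print(environment_report())
```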

@Cajac102
Author

Thanks!
With your example,

retrieve_similar("Cn1c(=O)c2nc(Cl)[nH]c2n(C)c1=O")

works perfectly for me too.
However,

ligand_smile = "Cn1c(=O)c2nc(Cl)[nH]c2n(C)c1=O"
retrieve_similar(ligand_smile)

throws a JSON decode error again.

I found two fixes.

  1. Explicitly casting it to a string beforehand makes it work again:
    retrieve_similar(str(ligand_smile))

This confuses me because type(ligand_smile) and type("Cn1c(=O)c2nc(Cl)[nH]c2n(C)c1=O") both give me <class 'str'>.

  2. I used Python 3.7.10 and PubChemPy 1.0.4. Upgrading to Python 3.9 also fixed the problem.
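Because type() reports str for both values, one further check is whether the variable and the literal are really the same text, for example whether one carries an invisible or non-ASCII character. The sketch below is a diagnostic under that assumption only and does not claim to identify the actual cause; `describe_string` is a hypothetical helper.

```python
import unicodedata


def describe_string(value):
    """Summarise a string so that hidden differences become visible."""
    return {
        "type": type(value).__name__,
        "length": len(value),
        "repr": repr(value),
        # Characters that are not printable ASCII, as
        # (index, code point, Unicode name) tuples.
        "unusual": [
            (index, hex(ord(ch)), unicodedata.name(ch, "UNNAMED"))
            for index, ch in enumerate(value)
            if not (ch.isascii() and ch.isprintable())
        ],
    }


ligand_smile = "Cn1c(=O)c2nc(Cl)[nH]c2n(C)c1=O"
literal = "Cn1c(=O)c2nc(Cl)[nH]c2n(C)c1=O"
print(describe_string(ligand_smile))
print(ligand_smile == literal)
```

If the two summaries match and the comparison prints True, the strings are identical and the difference must lie elsewhere, for example in the response returned for that particular request.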
