Making sense of PDB files generated by DiffDock-PP #10

PatWalters · 2023-05-09T14:01:42Z

Thank you for providing an example showing how to run DiffDock-PP inference with PDB files. I was able to run the example you provided, but I'm confused by the output. The PDB files generated by the inference script don't look like protein structures. As an example, consider the reference receptor for the 1A2K example. If I look at only the alpha carbons in 1A2K_r_b.pdb, I see this, which looks like a normal protein structure.

However, the resulting protein and ligand PDB files look nothing like proteins. Is this a bug, or am I missing something?

aretasg-alchemab · 2023-05-26T13:50:06Z

I would be interested in an answer too. I am confused as to why a "rigid" docking tool attempts to alter the receptor conformation in any way.

grahamtholt · 2023-06-11T20:21:26Z

I believe that this is an issue with the format of the output PDB files. It does not appear to me that conformations are being altered. You can confirm this by viewing the "spheres" representation in PyMOL, which shows the C_alpha atoms in the correct places.

I have found that modifying the output PDB files in the following manner allows PyMOL to correctly display the C_alpha trace:

- Remove the extra chain code character present in column 23
- Add residue numbers in columns 23-26

For more info see the ATOM record guidelines found here, among other places.

My guess is that this is just a minor bug in the PDB write code.
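
For anyone who wants to apply the two edits above without extra dependencies, here is a minimal sketch. It assumes the output holds one ATOM record per residue (a C_alpha-only file), which is my reading of the files and not something the repo documents. The function name `renumber_ca_records` and the sequential numbering from 1 are hypothetical choices for illustration, so the numbers will not match the original PDB entry.

```python
def renumber_ca_records(lines, start=1):
    """Overwrite columns 23-26 (resSeq) of every ATOM record.

    Assumption: one ATOM record per residue (C_alpha-only output), so each
    record simply gets the next residue number. Files with more than 9999
    residues would overflow the 4-character field and are not handled.
    """
    fixed = []
    resnum = start
    for line in lines:
        if line.startswith("ATOM"):
            # Pad short lines so the fixed-width slices below are safe.
            body = line.rstrip("\n").ljust(54)
            # PDB columns are 1-based: columns 23-26 are the slice [22:26].
            # Writing the number here also replaces the stray character
            # in column 23.
            body = body[:22] + f"{resnum:>4d}" + body[26:]
            fixed.append(body + "\n")
            resnum += 1
        else:
            # Leave non-ATOM records (TER, END, REMARK, ...) untouched.
            fixed.append(line)
    return fixed
```

To use it, read the file with `readlines()`, pass the list through the function, and write the result back with `writelines()`. The chain ID in column 22 and the coordinates in columns 31-54 are left unchanged.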

PatWalters · 2023-06-12T01:17:42Z

Thank you, @grahamtholt, that was incredibly helpful. I found that I could fix the PDB files with three lines of code using ProDy.

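import prody

def fix_pdb(infile_name, outfile_name):
    # Read the DiffDock-PP output and write it straight back out;
    # ProDy regenerates the ATOM records when it writes the file.
    prot = prody.parsePDB(infile_name)
    prody.writePDB(outfile_name, prot)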

You can also add the line below to make ProDy less chatty.

prody.confProDy(verbosity='critical')

Sue-Fwl · 2024-10-26T18:25:55Z

Hello @PatWalters,
Could you please share how you parsed the output list of HeteroData objects? I couldn't find many clues in the repo.
When I ran the example, the pickle file contained a list of HeteroData objects, and I couldn't find enough data in them to write a PDB file.

HeteroData( name='1A2K', center=[1, 3], receptor={ pos=[248, 3], x=[248, 1281], }, ligand={ pos=[196, 3], x=[196, 1281], }, (receptor, contact, receptor)={ edge_index=[2, 4960] }, (ligand, contact, ligand)={ edge_index=[2, 3920] } ), inf)

This was referenced Aug 3, 2023

Interpreting output as a docked protein complex #17

Open

Interpreting DiffDock-PP output as a docked protein complex gcorso/DiffDock#147

Open
