Update to prepare data using RNA graphs generated by RNAglib #5

Open
wants to merge 2 commits into
base: master

Conversation

jgcarvajalp

No description provided.

@cgoliver
Owner

Thank you for this, @jgcarvajalp! Would you be able to include a short step-by-step guide with this PR describing the steps now needed to prepare the data?

@jgcarvajalp
Author

jgcarvajalp commented Nov 23, 2022

  1. Download the CIF files.
  2. Run python binding_pocket_analyse.py:
if __name__ == "__main__":
    pdb_path = os.path.join("../data/cif_files") # path to cif files
    process_all(pdb_path, "../data/lig_dict.p")
  3. Run python binding_pocket_filter.py:
if __name__ == "__main__":
    d = pickle.load(open('../data/lig_dict.p', 'rb'))
    c = 10
    conc = .6
    ligs = get_valids(d, c, conc, min_size=4)
    pickle.dump(ligs, open("../data/lig_dict_filter.p", "wb"))
  4. Run python build_dataset.py:
  • Line 188: update the path to the folder that contains the CIF files
  • Line 194: update the path to the folder that contains the RNA graphs
  • Check that the PDB ID of the RNA graph files is uppercase
if __name__ == "__main__":
    get_binding_site_graphs_all('../data/lig_dict_filter.p', '../data/pockets_nx', non_binding=False)
  5. Run python annotator.py:
annotate_all(parallel=False, graph_path="../data/pockets_nx", dump_path="../data/annotated/pockets_nx_annotated",
        ablate=False, mode='fp')
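
The scripts listed above have to run in a fixed order, because each one reads what the previous one wrote. A minimal sketch of a driver for that order follows, assuming the four scripts sit in the current working directory and their `__main__` blocks have already been edited as shown. The `run_pipeline` helper and its `runner` parameter are hypothetical and not part of the repository.

```python
import subprocess
import sys

# Script names are taken from the steps above, in the order they must run.
# The paths in the comments are the ones used in the __main__ blocks above.
PIPELINE = [
    "binding_pocket_analyse.py",  # dumps ../data/lig_dict.p
    "binding_pocket_filter.py",   # dumps ../data/lig_dict_filter.p
    "build_dataset.py",           # called with ../data/pockets_nx
    "annotator.py",               # reads graph_path=../data/pockets_nx
]


def run_pipeline(scripts=PIPELINE, runner=subprocess.run):
    """Run each script in order and stop at the first failure.

    `runner` is a hypothetical, injectable parameter so that the call
    order can be inspected without executing anything.
    """
    for script in scripts:
        # check=True raises CalledProcessError when a script exits with a
        # non-zero status, so later steps never run on missing inputs.
        runner([sys.executable, script], check=True)
```

Calling `run_pipeline()` from the folder that holds the scripts would run all four in sequence. Downloading the CIF files and generating the RNA graphs are not covered by this sketch.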

@cgoliver
Owner

@jgcarvajalp I am assuming here that the user should run RNAglib separately in order to create the folder that stores the graphs?

See: "Line 194: update the path to the folder that contains the RNA graphs"

@jgcarvajalp
Author

@cgoliver Yes. The user should generate the graphs separately.
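
Since the graphs are generated separately, the uppercase PDB ID check mentioned for build_dataset.py can be done on the graph folder before running that step. The sketch below assumes the PDB ID is the file stem (for example `1abc.json`), which this thread does not confirm. `uppercase_pdb_ids` is a hypothetical helper, not part of the repository or of RNAglib.

```python
from pathlib import Path


def uppercase_pdb_ids(graph_dir, dry_run=True):
    """Rename graph files so the stem (assumed to be the PDB ID) is uppercase.

    Returns the list of (old_name, new_name) pairs that need renaming.
    With dry_run=True nothing on disk is changed.
    """
    renames = []
    # sorted() materializes the listing first, so renaming while looping
    # does not disturb the iteration.
    for path in sorted(Path(graph_dir).iterdir()):
        if not path.is_file():
            continue
        target = path.with_name(path.stem.upper() + path.suffix)
        if target.name != path.name:
            renames.append((path.name, target.name))
            if not dry_run:
                path.rename(target)
    return renames
```

The default `dry_run=True` only reports what would change, which makes it easy to review the list before renaming anything.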

@JasonJiangs

Hello, thank you for summarizing this data preparation pipeline. When I followed it, I ran into some issues.

First, in the second step, the resulting lig_dict.p contains only Mg ions as ligands, so I reversed the conditional statement at line 150 in binding_pocket_analyse.py.
Then I tested the pipeline with some standard CIF files, but at step 4 it returns a graph with 0 nodes and 0 edges at Line 114, which leads to a None return value at Line 210 of the function get_pocket_graph().
For Line 194, I generated the RNA graphs with the function fr3d_to_graph from rnaglib, exactly as mentioned in the README.

Have you encountered similar issues? Please let me know if I have done anything wrong during the process. Thank you very much for any advice and help!

Best,
Jason
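
When debugging a report like the one above, it may help to look at what `lig_dict.p` actually holds before changing any condition in the scripts. The sketch below makes no assumption about the layout of the pickled object, since its structure is not described in this thread. `summarize_pickle` is a hypothetical helper and should only be pointed at pickle files you created yourself, because unpickling can execute arbitrary code.

```python
import pickle


def summarize_pickle(path, sample=5):
    """Load a pickle and return its top-level type, size and a few keys."""
    with open(path, "rb") as handle:
        obj = pickle.load(handle)
    summary = {"type": type(obj).__name__}
    if hasattr(obj, "__len__"):
        summary["size"] = len(obj)
    if isinstance(obj, dict):
        # Only the first few keys, to keep the output readable.
        summary["sample_keys"] = list(obj)[:sample]
    return summary
```

For example, `summarize_pickle("../data/lig_dict.p")` and `summarize_pickle("../data/lig_dict_filter.p")` could be compared to see whether the filtering step removed every entry.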
