Add chemical clustering code #4

Open: wants to merge 10 commits into base: main

@apayne97 (Collaborator) commented Apr 15, 2024

Intro

The idea is simple: use a version of hierarchical clustering in which, instead of relying on a distance metric computed once up front, a maximum common substructure (MCSS) algorithm finds the number of atoms in the common substructure, and that count serves as the similarity.

In a sense, I'm trying to match a medicinal chemist's intuition of what the scaffolds are for a set of heterogeneous hits you might get from a screen. I've found that doing this with fingerprint distances and similar metrics doesn't really work the way I'd like.

I think this is similar to how MedChemica's MCPairs algorithm works, so I might end up just recommending that instead, but I think this will be a good exercise and potentially useful to a few different projects I'm working on.
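
A minimal sketch of the idea described above, under assumptions that this PR description does not confirm: each cluster keeps a representative substructure, the most similar pair of clusters is merged, the merged cluster's representative becomes their common substructure, and similarities are recomputed against that new representative on the next pass. The names `mcs_cluster`, `find_mcs`, `size`, and `min_size` are hypothetical. To keep the example runnable without a cheminformatics toolkit, molecules are modelled as sets of fragment labels and the "common substructure" is a set intersection; a real implementation would swap in a proper MCSS call (RDKit's `rdFMCS.FindMCS` is one option).

```python
from itertools import combinations


def mcs_cluster(mols, find_mcs, size, min_size):
    """Greedy agglomerative clustering driven by common-substructure size.

    mols:     list of molecule-like objects
    find_mcs: callable(a, b) -> common substructure of a and b (assumed interface)
    size:     callable(substructure) -> number of atoms in it
    min_size: stop merging once the best common substructure is smaller than this
    Returns a list of (representative substructure, member indices).
    """
    # Every molecule starts as its own cluster, represented by itself.
    clusters = [(mol, [i]) for i, mol in enumerate(mols)]

    while len(clusters) > 1:
        best = None
        # Recompute the common substructure for every pair of current
        # representatives, rather than reusing a precomputed distance matrix.
        for a, b in combinations(range(len(clusters)), 2):
            common = find_mcs(clusters[a][0], clusters[b][0])
            n_atoms = size(common)
            if best is None or n_atoms > best[0]:
                best = (n_atoms, a, b, common)

        n_atoms, a, b, common = best
        if n_atoms < min_size:
            break  # nothing left shares a large enough substructure

        # The merged cluster is represented by the shared substructure.
        merged = (common, clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)

    return clusters


# Toy stand-in data: fragment labels instead of real molecules (assumption).
toy_mols = [
    frozenset({"benzene", "amide", "F"}),
    frozenset({"benzene", "amide", "Cl"}),
    frozenset({"pyridine", "sulfonamide", "Me"}),
    frozenset({"pyridine", "sulfonamide"}),
]

clusters = mcs_cluster(
    toy_mols,
    find_mcs=lambda a, b: a & b,  # toy "MCSS": shared fragment labels
    size=len,
    min_size=2,
)

for scaffold, members in clusters:
    print(sorted(scaffold), members)
```

Each pass here costs one MCSS call per pair of clusters, which is why a timeout or a pre-filter matters once real MCSS searches replace the toy intersection.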

To-Do

  • add a way to traverse backward through the clusters to collect all the molecules that belong to each cluster
  • add the final while loop
  • add a filtering step to avoid calculating MCSSs that will time out ("truncate") anyway
  • add Bemis-Murcko scaffold clustering
  • add tests and a good test set of molecules that isn't from a proprietary (?) molecule screen
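
For the first to-do item, one possible way to walk back through the clusters is sketched here. It assumes each merge is stored as a node pointing at the clusters it was built from; the PR does not describe its data structure, so `ClusterNode`, `collect_molecules`, and the molecule IDs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClusterNode:
    """One node of the merge tree (hypothetical structure)."""
    label: str
    children: List["ClusterNode"] = field(default_factory=list)
    molecule: Optional[str] = None  # set only on leaves (original molecules)


def collect_molecules(node: ClusterNode) -> List[str]:
    """Walk down from a cluster and return every molecule merged into it."""
    found = []
    stack = [node]
    while stack:
        current = stack.pop()
        if current.molecule is not None:
            found.append(current.molecule)
        # Push children reversed so they are visited left to right.
        stack.extend(reversed(current.children))
    return found


# Placeholder molecule IDs, not real data.
leaf_a = ClusterNode("a", molecule="mol_A")
leaf_b = ClusterNode("b", molecule="mol_B")
leaf_c = ClusterNode("c", molecule="mol_C")
inner = ClusterNode("cluster_1", children=[leaf_a, leaf_b])
root = ClusterNode("cluster_2", children=[inner, leaf_c])

print(collect_molecules(inner))
print(collect_molecules(root))
```

An explicit stack is used instead of recursion so that a long chain of merges cannot hit Python's recursion limit.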

@codecov-commenter

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

Thanks for integrating Codecov - We've got you covered ☂️
