Questions about benchmark evaluation #14

FarmerTao · 2025-01-25T02:17:34Z

Great work on the structure motif search algorithm! I have some questions regarding the benchmark process. In the provided answer.tsv file, only the UniProt ID and description are included, but there is no information about the exact location of the motif (e.g., specific residue positions).

Given this, how can we determine whether the algorithm correctly matches the motif at the right positions? Is the assumption here that every match is considered a good match, and therefore, we only need to focus on whether a match occurs or not?

Any clarification or additional insights would be greatly appreciated!

khb7840 · 2025-01-26T07:52:09Z

Thank you for your feedback!
The benchmark module primarily assesses whether folddisco can retrieve proteins known to have specific motifs (like zinc fingers or serine peptidases) in their structures. While I’ve mostly done position-based checks using external tools, incorporating such checks into folddisco would be useful.
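
To illustrate what an ID-level check of this kind looks like, here is a minimal sketch that compares a set of retrieved protein IDs against the IDs listed in an answer file. The function name `retrieval_metrics` and the example IDs are hypothetical, and this is not folddisco's benchmark code; it only shows that a hit is counted per protein, without regard to residue positions.

```python
def retrieval_metrics(retrieved_ids, answer_ids):
    """Compute ID-level precision and recall (hypothetical helper).

    A protein counts as a true positive if its ID appears in both
    the retrieved set and the answer set. Residue positions are
    not considered at all.
    """
    retrieved = set(retrieved_ids)
    answers = set(answer_ids)
    true_positives = retrieved & answers
    # Guard against empty inputs to avoid division by zero.
    precision = len(true_positives) / len(retrieved) if retrieved else 0.0
    recall = len(true_positives) / len(answers) if answers else 0.0
    return precision, recall


# Placeholder IDs, not real benchmark data.
precision, recall = retrieval_metrics(
    ["ID_A", "ID_B", "ID_C"],
    ["ID_A", "ID_B", "ID_D", "ID_E"],
)
```

Under this scheme a match at the wrong residues would still count as a hit, which is why a separate position-based check is needed to confirm motif placement.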

To judge whether a match is correct, folddisco currently evaluates match quality through node_count (the number of nodes covered by matched pairwise features) and RMSD. Because folddisco allows partial motif matches, a higher node_count and a lower RMSD generally indicate a better match.
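
As a rough sketch of how these two values could be used together, the snippet below sorts matches by node_count (descending) and breaks ties by RMSD (ascending). The `Match` record, its field names, and the tie-breaking rule are assumptions made for illustration; they are not folddisco's output format or its actual ranking logic.

```python
from typing import NamedTuple


class Match(NamedTuple):
    # Hypothetical record; field names are illustrative only.
    target_id: str
    node_count: int  # nodes covered by matched pairwise features
    rmsd: float      # lower means closer superposition


def rank_matches(matches):
    """Order matches so higher node_count comes first, then lower RMSD."""
    return sorted(matches, key=lambda m: (-m.node_count, m.rmsd))


# Made-up values for demonstration, not benchmark results.
matches = [
    Match("target_1", node_count=2, rmsd=0.3),
    Match("target_2", node_count=3, rmsd=0.8),
    Match("target_3", node_count=3, rmsd=0.4),
]
ranked = rank_matches(matches)
```

Sorting on node_count first reflects that a partial match covering fewer motif residues can have a deceptively low RMSD, so RMSD is only compared between matches of equal coverage in this sketch.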

khb7840 added the enhancement (New feature or request) and question (Further information is requested) labels · Jan 26, 2025
