Desired functionality
`hpo` has a struct that caches the results of term-term similarity calculations. This cache should be safe to share across threads so that similarity can be calculated in parallel.
Constraints
With an ontology of ~13,000 terms, the total number of possible term pairs (n = 13,000, k = 2) is
n! / (k! * (n - k)!) = 13,000! / (2! * (13,000 - 2)!) = 84,493,500
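The count above can be checked without evaluating any factorials, because for k = 2 the formula reduces to n * (n - 1) / 2. A minimal sketch; `pair_count` is a hypothetical helper for illustration, not part of `hpo`:

```rust
/// Number of unordered pairs of distinct terms among `n` terms.
/// For k = 2, n! / (2! * (n - 2)!) simplifies to n * (n - 1) / 2.
fn pair_count(n: u64) -> u64 {
    // saturating_sub avoids an underflow panic for n = 0
    n * n.saturating_sub(1) / 2
}
```

Calling `pair_count(13_000)` gives 84,493,500, matching the figure above.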
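For each pair we must store a 32-bit float similarity score plus a key derived from the two 32-bit HpoTermIds. If the key is the two IDs themselves, that is at least 12 bytes per entry, or roughly 1 GB for all 84,493,500 pairs before any hash-table overhead. The cache could therefore become huge, and we may need a way to limit its overall size. For example, we could keep one HashSet for all comparisons that result in 1 and another for all that result in 0, so that those entries do not need to store a score.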
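One way the idea could look in code is sketched below, using only the standard library. Everything here is an assumption for illustration: the names `SimilarityCache`, `Inner`, and `pair_key` are hypothetical, term IDs are treated as plain `u32` values, and the similarity is assumed to be symmetric so that (a, b) and (b, a) share one entry. A single `RwLock` keeps the sketch short; it does not bound the cache size, and a sharded or lock-free map may perform better under heavy write contention.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::RwLock;

/// Order-independent 64-bit key for a pair of 32-bit term IDs.
/// Assumes the similarity score is symmetric.
fn pair_key(a: u32, b: u32) -> u64 {
    let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
    (u64::from(lo) << 32) | u64::from(hi)
}

/// Storage split into three parts: scores of exactly 0 or 1 only
/// need the key, every other score is stored next to its key.
#[derive(Default)]
struct Inner {
    zeros: HashSet<u64>,
    ones: HashSet<u64>,
    other: HashMap<u64, f32>,
}

/// Cache that can be shared across threads, e.g. behind an `Arc`.
#[derive(Default)]
pub struct SimilarityCache {
    inner: RwLock<Inner>,
}

impl SimilarityCache {
    /// Returns the cached score for the pair, if any.
    /// Many readers can hold the read lock at the same time.
    pub fn get(&self, a: u32, b: u32) -> Option<f32> {
        let key = pair_key(a, b);
        let inner = self.inner.read().unwrap();
        if inner.zeros.contains(&key) {
            Some(0.0)
        } else if inner.ones.contains(&key) {
            Some(1.0)
        } else {
            inner.other.get(&key).copied()
        }
    }

    /// Stores a score for the pair; takes the exclusive write lock.
    pub fn insert(&self, a: u32, b: u32, score: f32) {
        let key = pair_key(a, b);
        let mut inner = self.inner.write().unwrap();
        if score == 0.0 {
            inner.zeros.insert(key);
        } else if score == 1.0 {
            inner.ones.insert(key);
        } else {
            inner.other.insert(key, score);
        }
    }
}
```

A worker thread would call `cache.get(a, b)` first and only compute and `insert` the score on a miss. How much memory the two sets actually save depends on how many pairs score exactly 0 or 1, which would have to be measured on the real ontology.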