Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HpoFrequency should return a proportion rather than a percentage #38

Open
pnrobinson opened this issue Jan 13, 2018 · 2 comments
Open

Comments

@pnrobinson
Copy link
Collaborator

Currently, the function public int lowerBound() returns a percentage as an integer (e.g., 30 for 30%).
This is not good because downstream code very probably would need to have the corresponding proportion (probability), i.e., 0.3. Also, we have frequency data such as 12/34 (for 12 of 34 patients in some study), and this would naturally be represented as a float or double.

The current implementation of fromPercent results in an unnecessary loss of information.

public static HpoFrequency fromPercent(int percent) {
    if (percent < 1) {
      return EXCLUDED;
    } else if (percent < 5) {
      return VERY_RARE;
    } else if (percent < 30) {
      return OCCASIONAL;
    } else if (percent < 80) {
      return FREQUENT;
    } else if (percent < 100) {
      return VERY_FREQUENT;
    } else {
      return ALWAYS_PRESENT;
    }
  }

Therefore, I think that we need to make a new class. Currently, HpoFrequency is an enum. I would suggest we introduce a class called Frequency that would have something like this

public class Frequency {
   Float freq=null;
   HpoFrequency frequencyClass;


   public float getFrequency() {
    if (freq !=null) return freq.getValue():
    else return frequencyClass.upperlimit();
  }
}

I would also suggest that the upper_limit and lower_limit functions in HpoFrequency be revised to return floats. I am in the process of writing code for an HpoDisease class that would
parse the phenotype_annotation.tab file and use all of the frequency information, and this would be very useful.
Please let me know if anything speaks against this solution. If not, I will implement it and make a PR.

@holtgrewe
Copy link
Contributor

Hm, what about renaming HpoFrequency to HpoFrequencyName and then create an interface hierarchy as below?

  • HpoNamedFrequency -- only contains HpoFrequencyName with the rough classification, functions for obtaining smallest/largest value in class
  • HpoFractionalFrequency -- adds a float for a describing the percentage/fraction
  • HpoRatioFrequency -- does not have the float but a numerator/denominator

@pnrobinson
Copy link
Collaborator Author

Do you mean

interface Frequency {
  float getLowerBound();
  float getUpperBound();
  float getFrequency();
}

with the above three classes implementing the interface? One difficulty is that only the enum has upper and lower bounds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants