Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add NetMHCIIpan to mhctools #29

Merged
merged 3 commits into from
Aug 26, 2015
Merged

Add NetMHCIIpan to mhctools #29

merged 3 commits into from
Aug 26, 2015

Conversation

tavinathanson
Copy link
Contributor

Somewhat less hacky version of adding NetMHCIIpan to mhctools. Trumps #16.

Review on Reviewable

@iskandr
Copy link
Contributor

iskandr commented Aug 25, 2015

Review status: 0 of 11 files reviewed at latest revision, 9 unresolved discussions, all commit checks successful.


mhctools/alleles.py, line 153 [r1] (raw file):
What's the motivation for a species argument and why wouldn't you want to do something like `parse_single_allele_name("HLA-A*02:01", "HLA").

Also, calling this argument species makes me think it should take values like "human" or "mus musculus", not like "HLA" and "H-2".


mhctools/alleles.py, line 200 [r1] (raw file):
If we're going to have an explicit species argument, shouldn't its default value be "HLA" in the args list?


mhctools/alleles.py, line 310 [r1] (raw file):
I think having if not is_compact duplicated all over this function is less readable than just having two functions.


mhctools/base_commandline_predictor.py, line 86 [r1] (raw file):
Different predictors use different formats, I think you should still be normalizing here (otherwise, how will you know in which format you should be comparing user input?). Maybe I'm misunderstanding the interaction between this change the way you're now passing normalize_allele_func to the base class for predictors.


mhctools/base_predictor.py, line 49 [r1] (raw file):
I'm confused why this needs to get passed to the base class, since the only place we actually care about the allele's format is when passing it directly to the program we're calling.


mhctools/file_formats.py, line 202 [r1] (raw file):
Oy vey.


mhctools/file_formats.py, line 203 [r1] (raw file):
Oy vey x2

(maybe make is_mhciipan an argument?)


mhctools/netmhcii_pan.py, line 38 [r1] (raw file):
What does it mean for DR alleles to be "standard"?


test/test_mhc_allele_names.py, line 64 [r1] (raw file):
Oh no! Really don't like returning a list here. There should be separate logic for parsing class II alleles which can come in alpha-beta pairings (since that's a class II specific concern)


Comments from the review on Reviewable.io

@tavinathanson
Copy link
Contributor Author

Addressed comments. Also added support for alpha/beta separation by a /, as discussed offline.


Review status: 0 of 10 files reviewed at latest revision, 8 unresolved discussions, some commit checks pending.


mhctools/alleles.py, line 153 [r1] (raw file):
Discussed offline: the top level parser figures out the "HLA" and then passes that down to parse_single_allele_name with that species name (now renamed back to parse_allele_name). I just renamed species to species_prefix here.


mhctools/alleles.py, line 200 [r1] (raw file):
Don't see why? I do use aNone-ness in my logic to see if the argument was provided.


mhctools/alleles.py, line 310 [r1] (raw file):
Yeah, the compact function was getting more complicated with the class II logic, so I decided to try this. But I agree with you. Done.


mhctools/base_commandline_predictor.py, line 86 [r1] (raw file):
Yeah, I think you're right. I think I just misunderstood the purpose of this.


mhctools/file_formats.py, line 203 [r1] (raw file):
Done. We talked offline about just breaking it out into a separate function; but when I tried that, it wasn't as simple looking as I had hoped and I couldn't handle the code duplication.


mhctools/netmhcii_pan.py, line 38 [r1] (raw file):
I just meant non-alpha/beta alleles. Bad wording. Updated.


test/test_mhc_allele_names.py, line 64 [r1] (raw file):
Like we talked about offline, there is separate logic: the old logic lives in parse_single_allele_name. But you make a good point that parse_allele_name should take the place of parse_single_allele_name, and we should have parse_classi_or_classii_allele_name for the extra logic. I just fixed that.


mhctools/base_predictor.py, line 49 [r1] (raw file):
Yeah, you're also right about that. More of me misunderstanding some of the code! I was confusing normalizing for the program vs. our own normalization that doesn't necessarily reach the predictor (base_predictor.check_hla_alleles). Fixed.


Comments from the review on Reviewable.io

@iskandr
Copy link
Contributor

iskandr commented Aug 25, 2015

One lingering concern: will parsing a haplotype like "HLA-DRA1_03:01/HLA-DRB1_01:01" work (due to repetition of the "HLA" prefix)?

If so, LGTM


Review status: 0 of 10 files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.


mhctools/alleles.py, line 193 [r3] (raw file):
What happens if you're parsing HLA-DRA1_03:04/HLA-DRB1_01:01. Does the repeating of "HLA" in a haplotype trigger this error? And if so, that seems undesirable, shouldn't we at least check that the species prefix is different?


Comments from the review on Reviewable.io

@tavinathanson
Copy link
Contributor Author

As discussed offline, more formats will be handled in #31

tavinathanson added a commit that referenced this pull request Aug 26, 2015
@tavinathanson tavinathanson merged commit 25e1585 into master Aug 26, 2015
@tavinathanson tavinathanson deleted the iipan_revised branch August 26, 2015 00:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants