Handle UNK residues in template sequences #5

Closed
benmwebb opened this issue Nov 19, 2021 · 0 comments

As noted in #3, in some cases the alignment file contains X residues in the template sequence. The example given is from the Arabidopsis thaliana bulk download, in NP_001030619.1_1.ali.xml, where the template is 4buj chain E.

We need the 3-letter name to output entity_poly_seq. Simply mapping X in the alignment to UNK, as proposed in #3, doesn't fully handle this, though: we need to output (UNK) for entity_poly.pdbx_seq_one_letter_code but X for entity_poly.pdbx_seq_one_letter_code_can. (I'm not sure what we should be writing out for ma_alignment.sequence; see ihmwg/ModelCIF#2.) We'd also need to add UNK to chem_comp.
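
To make the distinction concrete, here is a rough sketch of the outputs described above for a template sequence containing X. The `describe_template` function and `ONE_TO_THREE` table are hypothetical names used only for illustration; they are not taken from the converter or from any library, and the sketch assumes the template otherwise contains only the 20 standard amino acids.

```python
# Hypothetical lookup table: standard amino acids plus X -> UNK.
ONE_TO_THREE = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
    "X": "UNK",
}


def describe_template(seq):
    """Return the per-category values for a one-letter template sequence.

    Hypothetical helper, not part of any existing API.
    """
    # entity_poly_seq needs one 3-letter name per residue (X becomes UNK).
    three_letter = [ONE_TO_THREE[c] for c in seq]
    # pdbx_seq_one_letter_code writes the unknown residue as (UNK).
    one_letter = "".join("(UNK)" if c == "X" else c for c in seq)
    # pdbx_seq_one_letter_code_can keeps the plain X.
    canonical = seq
    # chem_comp must list every component used, including UNK.
    chem_comps = sorted(set(three_letter))
    return {
        "entity_poly_seq": three_letter,
        "pdbx_seq_one_letter_code": one_letter,
        "pdbx_seq_one_letter_code_can": canonical,
        "chem_comp": chem_comps,
    }


result = describe_template("AXG")
```

This deliberately leaves ma_alignment.sequence out, since what should be written there is still an open question.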
