Elevate lineage XEC as clade 24F and KP.2.3 as clade 24G #1152
This commit elevates Pango lineage XEC as clade 24F
ORF1a:A599T is not present in all XEC* sequences. Another branch, with ORF1a:1367, is designated XEC.2 and is therefore not included in raw XEC sequences (without *).
Hah, I came here to make this PR, didn't know it existed already, thanks @trvrb. I'll also add KP.2.3, since it also satisfies the criteria. What is and isn't XEC is not quite clear; it depends on one's assumptions surrounding the origin. Some of the branches of XEC are definitely multi-recombinants, and it's not quite clear which is the original recombinant. Maybe there was also a single host with coinfection that resulted in multi-spillover. It possibly makes sense to treat both XEC and XEK as clade 24F, as XEK has an identical spike. @trvrb I changed the parent of XEC to 24A (JN.1) from 24C (KP.3) in the clade hierarchy, because the most recent common ancestor of the parents is JN.1, not KP.3 (KS.1.1 is JN.1.13.1.1.1, hence not part of KP=JN.1.11.1). I also fixed the defining mutations as @aviczhl2 pointed out.
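The reasoning behind the parent change can be sketched in a few lines of Python. This is only an illustration, not code from the PR: `common_ancestor` is a hypothetical helper, and the expanded name for KP.3.3 is derived from the KP=JN.1.11.1 alias mentioned above. The names stay in their JN-prefixed form, which is enough here because both parents share that prefix.

```python
def common_ancestor(*lineages: str) -> str:
    """Return the longest shared dot-separated prefix of unaliased Pango names."""
    parts = [name.split(".") for name in lineages]
    shared = []
    for level in zip(*parts):
        # Stop at the first level where the lineages disagree.
        if len(set(level)) != 1:
            break
        shared.append(level[0])
    return ".".join(shared)


ks_1_1 = "JN.1.13.1.1.1"  # KS.1.1, as stated in the comment above
kp_3_3 = "JN.1.11.1.3.3"  # KP.3.3, expanded using KP = JN.1.11.1

mrca = common_ancestor(ks_1_1, kp_3_3)
print(mrca)
```

The shared prefix stops at JN.1 because the third level differs (13 vs 11), which is why the parent clade is 24A (JN.1) rather than 24C (KP.3).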
I discussed this PR yesterday with @rneher and he gave verbal approval, so I will merge this. Open also looks great
@corneliusroemer: Amazing! Thanks so much for cleaning things up and adding clade 24G / lineage KP.2.3. Everything here looks great to me.
With this PR, we are designating clade 24F (lineage XEC) and 24G (lineage KP.2.3)
Clade 24F, lineage XEC
Clade 24F, lineage XEC stems from a rather complex recombination event. Some discussion is visible at cov-lineages/pango-designation#2717. One parent is KS.1.1 and the other parent is KP.3.3. KS.1.1 belongs to clade 24A (lineage JN.1) and KP.3.3 belongs to clade 24C (lineage KP.3), which itself descends from JN.1. So this is a recombinant of JN.1-derived diversity. The breakpoint is around 21738~22599, right around the spike junction, which is common.
XEC likely originated somewhere in Central Europe, being particularly common early on in Germany, Czechia, and Austria, with first sequences collected in June 2024.
The mutational make-up relative to previously dominant lineage KP.3.1.1 can be seen on Cov-Spectrum: https://cov-spectrum.org/explore/World/AllSamples/Past6M/variants?pangoLineage=XEC&pangoLineage1=KP.3.1.1&analysisMode=CompareEquals&, where XEC bears the amino acid mutations ORF1a:A599T, S:T22N, S:F59S, N:G204P, ORF9b:P3H relative to KP.3.1.1. It's not immediately clear to me if the fitness of this lineage is due to the two spike mutations vs mutations elsewhere in the genome. @ryhisner discusses potential molecular change to nucleocapsid (N) here: https://x.com/LongDesertTrain/status/1837346366961451290
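To make the spike vs non-spike split concrete, here is a minimal sketch that parses the `GENE:REFposALT` notation used above and groups the five listed mutations by gene. The regular expression and the `parse_mutation` helper are illustrative assumptions, not part of any Nextstrain tooling.

```python
import re
from collections import defaultdict

# Assumed pattern for amino acid mutations such as "S:T22N";
# "-" (deletion) and "*" (stop) are allowed as the alternate residue.
MUTATION_RE = re.compile(r"^(?P<gene>[A-Za-z0-9]+):(?P<ref>[A-Z])(?P<pos>\d+)(?P<alt>[A-Z*-])$")


def parse_mutation(text: str) -> dict:
    """Split one mutation string into gene, reference residue, position, and alternate residue."""
    match = MUTATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized mutation: {text!r}")
    fields = match.groupdict()
    fields["pos"] = int(fields["pos"])
    return fields


# Mutations of XEC relative to KP.3.1.1, as listed above.
xec_vs_kp311 = "ORF1a:A599T, S:T22N, S:F59S, N:G204P, ORF9b:P3H"

by_gene = defaultdict(list)
for item in xec_vs_kp311.split(","):
    m = parse_mutation(item)
    by_gene[m["gene"]].append(f'{m["ref"]}{m["pos"]}{m["alt"]}')

for gene, changes in by_gene.items():
    print(gene, changes)
```

Grouped this way, the two spike changes (T22N, F59S) sit alongside one change each in ORF1a, N, and ORF9b.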
When forcing a regular bifurcating phylogeny (as the ncov workflow does), XEC shows up as derived from its parent KP.3.3.
At the moment, the latest Nextclade dataset was released on July 17 and doesn't contain XEC. Because of this, many of the normally referenced data systems also don't contain XEC. In the below, I've instead worked from the `pango_lineage` metadata provided by GISAID rather than the `Nextclade_pango` calls.
Running https://github.com/nextstrain/forecasts-ncov on `pango_lineage` counts results in an estimated fitness of 1.7 relative to clade 24A / lineage JN.1 and is currently significantly greater than other lineages:
This results in the prediction of XEC becoming dominant over KP.3.1.1 in most countries in late October:
If we look at frequency in the USA vs fitness, we see that XEC is currently inferred to be the 2nd most common Pango lineage (behind KP.3.1.1) and have the highest fitness relative to the population average:
If we do a simple linear regression of logit-transformed frequencies for countries with data, we get the following:
The average per-day growth rate is 0.07 and current predicted frequency varies in these countries between 5% and 23%.
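As a rough illustration of how a per-day growth rate comes out of such a regression, the sketch below fits a straight line to logit-transformed frequencies using ordinary least squares. The frequencies are synthetic, generated from an assumed 2% starting frequency and a 0.07 per-day slope purely for demonstration. They are not the country data behind the figures in this PR, and this is not the forecasts-ncov model.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a frequency p in (0, 1)."""
    return math.log(p / (1 - p))


def inv_logit(x: float) -> float:
    """Inverse of logit: map log-odds back to a frequency."""
    return 1 / (1 + math.exp(-x))


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic weekly frequencies (hypothetical): 2% at day 0, growing at 0.07/day in logit space.
days = list(range(0, 35, 7))
freqs = [inv_logit(logit(0.02) + 0.07 * d) for d in days]

slope, intercept = fit_line(days, [logit(f) for f in freqs])
print(f"growth per day: {slope:.3f}, frequency at day 0: {inv_logit(intercept):.3f}")
```

The fitted slope is the per-day growth rate in logit space, and applying `inv_logit` to the fitted line at a given day gives the predicted frequency.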
Thus, 24F / XEC fulfills our clade designation criterion 4: "A clade shows consistent >0.05 per day growth in frequency where it's circulating and has reached >5% regional frequency".
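The two thresholds in the quoted criterion can be written as a simple check. This is a hedged sketch only: `meets_criterion_4` is a hypothetical helper, and it does not capture the "consistent" part of the criterion, which requires looking at growth over time and across regions.

```python
def meets_criterion_4(growth_per_day: float, regional_frequency: float,
                      min_growth: float = 0.05, min_frequency: float = 0.05) -> bool:
    """Check the two numeric thresholds of criterion 4 (strictly greater than both)."""
    return growth_per_day > min_growth and regional_frequency > min_frequency


# Values quoted above: 0.07 average per-day growth, and 23% as the upper end
# of the predicted current frequency range.
print(meets_criterion_4(growth_per_day=0.07, regional_frequency=0.23))
```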
Clade 24G, lineage KP.2.3
Clade 24G, lineage KP.2.3, is a descendant of clade 24B (JN.1.11.1=KP) with extra spike substitutions S:R346T, S:H146Q and deletion S:S31-, and ORF3a:K67N.
KP.2.3 was first commonly observed in India, with first sequences from January 2024.
It was among the fittest lineages until KP.3, KP.3.1.1 and XEC came around and is still common in various countries around the globe, in particular Brazil, Singapore, India, and likely other countries in South East Asia with limited surveillance based on the high prevalence in Singapore.
In sequences collected in June and July 2024, per cov-spectrum, it was:
In sequences collected since August 2024, it was 38% in Brazil.
It satisfied clade criterion 4 "A clade shows consistent >0.05 per day growth in frequency where it's circulating and has reached >5% regional frequency" in sequences collected in April/May in Asia and only narrowly misses the regional criterion 3 in South America and Asia.
One reason KP.2.3 is only elevated now is that its high prevalence in many countries around the globe became clear only gradually due to delayed sequence upload. KP.2.3 is particularly prevalent in countries with relatively low sequencing intensity and long delays.
Another reason for designation is that KP.2.3 is a direct descendant of the US-recommended vaccine strain KP.2 (for Moderna/Pfizer, i.e. mRNA).
Prevalence and growth in Asia in months April and May 2024:
Pre-merge checklist
Post merge checklist