Commit

Fix monoexonic transcripts filtering
chbk committed Sep 26, 2022
1 parent 10275a0 commit b93c7d4
Showing 1 changed file with 9 additions and 5 deletions.
14 changes: 9 additions & 5 deletions bin/filter_rare_transcripts.py
@@ -156,11 +156,15 @@
['chromosome', 'strand', 'start', 'end']
)

-mono['group'] = (
-(mono['chromosome'] != mono['chromosome'].shift()) |
-(mono['strand'] != mono['strand'].shift()) |
-(mono['start'] > mono['end'].shift())
-).cumsum()
+previous_end_max = mono.set_index(['chromosome', 'strand'])['end'].groupby(
+['chromosome', 'strand'],
+observed = True
+).shift().groupby(
+['chromosome', 'strand'],
+observed = True
+).cummax().fillna(-1).to_numpy()
+
+mono['group'] = (mono['start'] > previous_end_max).cumsum()

mono = mono.set_index(['chromosome', 'strand', 'group'])
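
A minimal sketch of what this change appears to address, assuming `mono` holds one row per monoexonic transcript sorted by chromosome, strand, start and end. The function names and the sample coordinates below are hypothetical and do not come from the repository. The removed code compared each `start` only with the `end` of the row directly before it, so a short transcript nested inside a longer one could hide the longer one's `end` from the next row. The added code compares against the running maximum of all earlier `end` values within the same chromosome and strand.

```python
import pandas as pd

# Hypothetical sample: 300-400 lies inside 100-500 but starts after 150-200 ends.
mono = pd.DataFrame({
    'chromosome': ['chr1', 'chr1', 'chr1', 'chr1', 'chr2'],
    'strand':     ['+',    '+',    '+',    '+',    '+'],
    'start':      [100,    150,    300,    600,    50],
    'end':        [500,    200,    400,    700,    80],
}).sort_values(['chromosome', 'strand', 'start', 'end']).reset_index(drop=True)


def group_by_previous_end(df):
    """Removed logic: open a new group when start exceeds the previous row's end."""
    new_group = (
        (df['chromosome'] != df['chromosome'].shift()) |
        (df['strand'] != df['strand'].shift()) |
        (df['start'] > df['end'].shift())
    )
    return new_group.cumsum().to_numpy()


def group_by_running_max_end(df):
    """Added logic: open a new group when start exceeds every earlier end
    seen on the same chromosome and strand."""
    keys = [df['chromosome'], df['strand']]
    # End of the previous row within each (chromosome, strand); NaN for the first row.
    previous_end = df['end'].groupby(keys, observed=True).shift()
    # Largest end seen so far within each (chromosome, strand).
    previous_end_max = previous_end.groupby(keys, observed=True).cummax().fillna(-1)
    return (df['start'] > previous_end_max).cumsum().to_numpy()


old_groups = group_by_previous_end(mono)
new_groups = group_by_running_max_end(mono)
print(mono.assign(old_group=old_groups, new_group=new_groups))
```

With this sample, the removed logic places the 300-400 transcript in a different group from the 100-500 transcript that contains it, while the running maximum keeps the two together. The `fillna(-1)` sentinel makes the first row of every chromosome and strand open a new group, which relies on coordinates being non-negative.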
