I tried the line plot in dmslogo with toydata.csv. It fails with the error "`x_col` not sequential unbroken integers".
Then I turned to the example. Even after reading the instructions, I am still confused, especially because there is a gap between the original and new numbering in BG505_to_HXB2.csv (e.g. site: 141, 142; isite: 142, 151).
What does "not sequential unbroken integers" mean? How do I get the isite for SARS2?
Thanks in advance.
Code here:
import pandas as pd
import dmslogo

# load data
toydata = pd.read_csv("toydata.csv")

# logo plot check (works, only the sites flagged in show_site are drawn)
fig, ax = dmslogo.draw_logo(toydata.query('show_site'),
                            x_col='site',
                            letter_col='mutation',
                            letter_height_col='escape_score',
                            xtick_col='wt_site',
                            title='AZD8895',
                            addbreaks=False)

# line plot failed
fig, ax = dmslogo.draw_line(toydata,
                            x_col='site',  # how to get the isite in SARS2? what is "not sequential unbroken integers"?
                            height_col='tot_escape_score',
                            xtick_col='site',
                            show_col='show_site',
                            title='AZD8895',
                            widthscale=2)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/var/folders/mv/v7pv40mn6d3gwx8g563lpclm0000gn/T/ipykernel_14414/3092297124.py in <module>
----> 1 fig, ax = dmslogo.draw_line(toydata,
2 x_col='site', # how to get the isite in SARS2?what is "sequential unbroken integers"?
3 height_col='tot_escape_score',
4 xtick_col='site',
5 show_col='show_site',
~/anaconda3/envs/SARS2_RBD_Ab_escape_maps/lib/python3.8/site-packages/dmslogo/line.py in draw_line(data, x_col, height_col, height_col2, xtick_col, show_col, xlabel, ylabel, title, color, color2, show_color, linewidth, widthscale, heightscale, axisfontscale, hide_axis, ax, ylim_setter, fixed_ymin, fixed_ymax)
162 if (xlen != data[x_col].nunique()) or any(list(range(xmin, xmax + 1)) !=
163 data[x_col].unique()):
--> 164 raise ValueError('`x_col` not sequential unbroken integers')
165
166 if len(data[x_col]) != len(data[x_col].unique()):
ValueError: `x_col` not sequential unbroken integers
The line plot requires x_col to contain sequential, unbroken integers, because the line plot draws a value for every site. The logo plot does not require this because it can break the axis to show only certain sites of interest.
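To see which sites break the sequence, a small diagnostic along the following lines may help. This is only a sketch using pandas: the helper name `missing_sites` and the example site numbers are made up for illustration and are not part of dmslogo.

```python
import pandas as pd


def missing_sites(df, x_col):
    """Return the integers absent between the min and max of df[x_col]."""
    present = set(df[x_col].unique().tolist())
    full_range = range(min(present), max(present) + 1)
    return [site for site in full_range if site not in present]


# Hypothetical site numbers with two gaps (333 and 336-337 are absent).
example = pd.DataFrame({"site": [331, 332, 334, 335, 338]})
gaps = missing_sites(example, "site")
print(gaps)
```

An empty list means the column already covers every integer between its smallest and largest value.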
The x_col (or isite) column can be any index that goes 1, 2, 3, and so on. If you are using a protein that is already numbered that way, then it is just the site. But some proteins are no longer sequentially numbered. For instance, Omicron has some indels in the NTD but is still normally numbered using Wuhan-Hu-1 site numbering.
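One way to build such an index with pandas is sketched below. The data frame here is invented for illustration (it is not the real toydata.csv), and the column name `isite` is just a convention borrowed from the example; any name works.

```python
import pandas as pd

# Hypothetical data: site numbers with gaps, several mutations per site.
toy = pd.DataFrame({
    "site": [331, 331, 333, 334, 334, 340],
    "mutation": ["A", "C", "D", "E", "F", "G"],
    "escape_score": [0.1, 0.2, 0.05, 0.3, 0.4, 0.6],
})

# Give each distinct site a sequential index 1, 2, 3, ... in sorted order.
sites = sorted(toy["site"].unique().tolist())
site_to_isite = {site: i for i, site in enumerate(sites, start=1)}
toy["isite"] = toy["site"].map(site_to_isite)

print(toy[["site", "isite"]].drop_duplicates())
```

The resulting column could then be passed as `x_col='isite'`, with `xtick_col='site'` keeping the original numbering on the tick labels. Note that with this approach, sites absent from the data are simply skipped on the axis rather than drawn with a value of zero.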
OS: macOS Catalina 10.15.7
Python: 3.8.12
dmslogo: 0.6.2