Swan now comes with several utilities that can be used to compute and output various metrics using data in the SwanGraph.
- Calculating TPM values
- Calculating pi values
- Obtaining edge abundance information
- Obtaining TSS/TES abundance information
We'll be using the same SwanGraph as the rest of the tutorial pages to demonstrate these utilities. Load it using the following code:
import swan_vis as swan
# code to download this data is in the Getting started tutorial
sg = swan.read('../tutorials/data/swan.p')
Read in graph from ../tutorials/data/swan.p
Swan lets users calculate the TPM of their data, grouped by a chosen metadata column, using the calc_tpm() function. You can use it to calculate TPM for any of the AnnData objects in the SwanGraph (SwanGraph.adata for transcripts, SwanGraph.tss_adata for TSSs, SwanGraph.tes_adata for TESs, and SwanGraph.edge_adata for edges); see the Data structure FAQ page for more information on these tables.
First, we'll calculate the TPM for each transcript in each dataset:
df = swan.calc_tpm(sg.adata)
df.head()
tid | ENST00000000233.9 | ENST00000000412.7 | ENST00000000442.10 | ENST00000001008.5 | ENST00000001146.6 | ENST00000002125.8 | ENST00000002165.10 | ENST00000002501.10 | ENST00000002596.5 | ENST00000002829.7 | ... | TALONT000482711 | TALONT000482903 | TALONT000483195 | TALONT000483284 | TALONT000483315 | TALONT000483322 | TALONT000483327 | TALONT000483978 | TALONT000484004 | TALONT000484796 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
hepg2_1 | 196.138474 | 86.060760 | 8.005652 | 46.032497 | 0.0 | 16.011305 | 258.182281 | 60.042389 | 0.0 | 0.0 | ... | 0.000000 | 4.002826 | 2.001413 | 12.008478 | 0.000000 | 0.000000 | 4.002826 | 14.009891 | 8.005652 | 0.000000 |
hepg2_2 | 243.975174 | 77.789185 | 7.071744 | 61.288448 | 0.0 | 12.964864 | 380.695557 | 64.824318 | 0.0 | 0.0 | ... | 1.178624 | 14.143488 | 4.714496 | 7.071744 | 2.357248 | 8.250368 | 2.357248 | 11.786240 | 10.607616 | 1.178624 |
hffc6_1 | 131.320969 | 194.355042 | 0.000000 | 107.683197 | 0.0 | 6.566049 | 278.400452 | 0.000000 | 0.0 | 0.0 | ... | 6.566049 | 13.132097 | 9.192468 | 1.313210 | 6.566049 | 9.192468 | 6.566049 | 0.000000 | 15.758516 | 1.313210 |
hffc6_2 | 137.061584 | 242.395935 | 0.000000 | 124.370689 | 0.0 | 8.883621 | 219.552338 | 0.000000 | 0.0 | 0.0 | ... | 15.229064 | 10.152709 | 6.345443 | 8.883621 | 1.269089 | 10.152709 | 15.229064 | 0.000000 | 16.498154 | 8.883621 |
hffc6_3 | 147.986496 | 273.205841 | 3.252450 | 172.379868 | 0.0 | 9.757351 | 200.025696 | 1.626225 | 0.0 | 0.0 | ... | 14.636026 | 11.383576 | 8.131125 | 8.131125 | 11.383576 | 11.383576 | 6.504900 | 0.000000 | 24.393377 | 9.757351 |
5 rows × 208306 columns
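To make the numbers above easier to interpret, here is a minimal sketch of a TPM-style normalization written in plain pandas. It is not Swan's implementation: the `tpm_from_counts` helper, the sample names, and the transcript IDs are all hypothetical, and it assumes no length normalization (each dataset's counts are simply scaled so that they sum to one million).

```python
import pandas as pd

# Toy raw counts: rows are datasets, columns are transcripts (hypothetical IDs).
counts = pd.DataFrame(
    {"tx_a": [10.0, 0.0], "tx_b": [30.0, 50.0], "tx_c": [60.0, 50.0]},
    index=["sample_1", "sample_2"],
)


def tpm_from_counts(counts: pd.DataFrame) -> pd.DataFrame:
    """Scale each row (dataset) so that its values sum to one million."""
    totals = counts.sum(axis=1)  # library size per dataset
    return counts.div(totals, axis=0) * 1e6


tpm = tpm_from_counts(counts)
print(tpm)
```

Under this assumption, values are only comparable across datasets after scaling, which is why the table above reports TPM rather than raw counts.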
We can swap out the first argument with the different AnnData structures in the SwanGraph. For instance, say we want to calculate the TPM of each TSS:
df = swan.calc_tpm(sg.tss_adata)
df.head()
tss_id | ENSG00000000003.14_1 | ENSG00000000003.14_2 | ENSG00000000003.14_3 | ENSG00000000003.14_4 | ENSG00000000005.5_1 | ENSG00000000005.5_2 | ENSG00000000419.12_1 | ENSG00000000419.12_2 | ENSG00000000457.13_1 | ENSG00000000457.13_2 | ... | TALONG000085596_1 | TALONG000085799_1 | TALONG000085978_1 | TALONG000086022_1 | TALONG000086057_1 | TALONG000086218_1 | TALONG000086443_1 | TALONG000086539_1 | TALONG000086553_1 | TALONG000086766_1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
hepg2_1 | 0.0 | 232.163910 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 54.038151 | 0.0 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 60.042389 | 0.000000 | 0.000000 | 6.004239 | 0.000000 | 0.000000 | 8.005652 |
hepg2_2 | 0.0 | 276.976654 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 103.718910 | 0.0 | 2.357248 | ... | 0.000000 | 0.000000 | 0.000000 | 95.468544 | 0.000000 | 0.000000 | 31.822847 | 0.000000 | 1.178624 | 10.607616 |
hffc6_1 | 0.0 | 45.962341 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 101.117149 | 0.0 | 0.000000 | ... | 2.626419 | 6.566049 | 9.192468 | 0.000000 | 7.879258 | 11.818888 | 233.751328 | 9.192468 | 6.566049 | 15.758516 |
hffc6_2 | 0.0 | 53.301723 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 85.028938 | 0.0 | 1.269089 | ... | 6.345443 | 1.269089 | 12.690886 | 0.000000 | 8.883621 | 20.305418 | 119.294334 | 12.690886 | 2.538177 | 16.498154 |
hffc6_3 | 0.0 | 68.301460 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 89.442383 | 0.0 | 1.626225 | ... | 8.131125 | 8.131125 | 11.383576 | 0.000000 | 11.383576 | 17.888477 | 134.976685 | 27.645828 | 8.131125 | 24.393377 |
5 rows × 130176 columns
Finally, we can compute TPM over a different metadata column. For instance, we can use the cell_line column:
df = swan.calc_tpm(sg.adata, obs_col='cell_line')
df.head()
tid | ENST00000000233.9 | ENST00000000412.7 | ENST00000000442.10 | ENST00000001008.5 | ENST00000001146.6 | ENST00000002125.8 | ENST00000002165.10 | ENST00000002501.10 | ENST00000002596.5 | ENST00000002829.7 | ... | TALONT000482711 | TALONT000482903 | TALONT000483195 | TALONT000483284 | TALONT000483315 | TALONT000483322 | TALONT000483327 | TALONT000483978 | TALONT000484004 | TALONT000484796 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
hepg2 | 226.245346 | 80.854897 | 7.417881 | 55.634102 | 0.0 | 14.093972 | 335.288177 | 63.051983 | 0.0 | 0.0 | ... | 0.741788 | 10.385033 | 3.70894 | 8.901457 | 1.483576 | 5.192516 | 2.967152 | 12.610396 | 9.643245 | 0.741788 |
hffc6 | 138.145737 | 234.247116 | 0.924052 | 132.139404 | 0.0 | 8.316465 | 234.709137 | 0.462026 | 0.0 | 0.0 | ... | 12.012672 | 11.550647 | 7.85444 | 6.006336 | 6.006336 | 10.164569 | 9.702543 | 0.000000 | 18.481035 | 6.468362 |
2 rows × 208306 columns
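The grouped values above are not the plain mean of the replicate TPMs: for ENST00000000233.9, the mean of 196.14 and 243.98 is about 220.06, while the hepg2 row shows 226.25. That would be consistent with pooling raw counts within each group before normalizing, although this is an inference from the displayed numbers and not a statement about Swan's internals. The sketch below illustrates that pooling approach on toy data; every name in it is hypothetical.

```python
import pandas as pd

# Toy raw counts for three datasets and two transcripts (hypothetical values).
counts = pd.DataFrame(
    {"tx_a": [10.0, 30.0, 5.0], "tx_b": [90.0, 170.0, 95.0]},
    index=["a_1", "a_2", "b_1"],
)

# Hypothetical metadata column assigning each dataset to a group.
groups = pd.Series(["a", "a", "b"], index=counts.index, name="group")

# Pool counts within each group, then scale each group to one million.
grouped_counts = counts.groupby(groups).sum()
grouped_tpm = grouped_counts.div(grouped_counts.sum(axis=1), axis=0) * 1e6

# For comparison: per-dataset TPM averaged within each group.
per_dataset_tpm = counts.div(counts.sum(axis=1), axis=0) * 1e6
mean_tpm = per_dataset_tpm.groupby(groups).mean()

print(grouped_tpm)
print(mean_tpm)
```

The two results differ whenever the datasets in a group have different library sizes, because pooling weights deeper datasets more heavily.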
You can use the calc_pi() function to calculate percent isoform use (pi) per gene in nearly the same way as calc_tpm(): you can run it at the transcript, edge, TSS, or TES level, and you can choose the metadata variable to group by. The only difference is that calc_pi() also requires a DataFrame as the second argument that tells Swan which gene each entry comes from. The DataFrame to provide for each AnnData is listed below:
AnnData | DataFrame |
---|---|
SwanGraph.adata | SwanGraph.t_df |
SwanGraph.tss_adata | SwanGraph.tss_adata.var |
SwanGraph.tes_adata | SwanGraph.tes_adata.var |
First, we'll calculate the pi value for each transcript in each dataset:
df, sums = swan.calc_pi(sg.adata, sg.t_df)
df.head()
tid | ENST00000000233.9 | ENST00000000412.7 | ENST00000000442.10 | ENST00000001008.5 | ENST00000001146.6 | ENST00000002125.8 | ENST00000002165.10 | ENST00000002501.10 | ENST00000002596.5 | ENST00000002829.7 | ... | TALONT000482711 | TALONT000482903 | TALONT000483195 | TALONT000483284 | TALONT000483315 | TALONT000483322 | TALONT000483327 | TALONT000483978 | TALONT000484004 | TALONT000484796 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
hepg2_1 | 100.000000 | 100.0 | 100.000000 | 100.0 | 0.0 | 100.000000 | 100.0 | 93.750000 | 0.0 | 0.0 | ... | 0.000000 | 1.904762 | 6.666667 | 13.043478 | 0.000000 | 0.000000 | 1.333333 | 100.0 | 100.0 | 0.000000 |
hepg2_2 | 99.519226 | 100.0 | 60.000004 | 100.0 | 0.0 | 100.000000 | 100.0 | 80.882355 | 0.0 | 0.0 | ... | 5.263158 | 3.225806 | 13.793103 | 8.695652 | 0.884956 | 3.097345 | 0.884956 | 100.0 | 100.0 | 2.380952 |
hffc6_1 | 98.039215 | 100.0 | 0.000000 | 100.0 | 0.0 | 100.000000 | 100.0 | 0.000000 | 0.0 | 0.0 | ... | 2.604167 | 2.092050 | 16.279070 | 1.428571 | 0.854701 | 1.196581 | 0.854701 | 0.0 | 100.0 | 1.886792 |
hffc6_2 | 99.082573 | 100.0 | 0.000000 | 100.0 | 0.0 | 77.777779 | 100.0 | 0.000000 | 0.0 | 0.0 | ... | 4.285715 | 2.144772 | 11.627908 | 14.893617 | 0.166667 | 1.333333 | 2.000000 | 0.0 | 100.0 | 9.859155 |
hffc6_3 | 100.000000 | 100.0 | 100.000000 | 100.0 | 0.0 | 85.714287 | 100.0 | 100.000000 | 0.0 | 0.0 | ... | 4.326923 | 2.536232 | 15.151516 | 10.638298 | 1.711491 | 1.711491 | 0.977995 | 0.0 | 100.0 | 13.636364 |
5 rows × 208306 columns
As a note, the calc_pi() function outputs not only a table of pi values but also a table of counts per isoform per condition, which is used as an intermediate during DIE testing. It is returned here to avoid recalculating it.
sums.head()
| | ENST00000000233.9 | ENST00000000412.7 | ENST00000000442.10 | ENST00000001008.5 | ENST00000001146.6 | ENST00000002125.8 | ENST00000002165.10 | ENST00000002501.10 | ENST00000002596.5 | ENST00000002829.7 | ... | TALONT000482711 | TALONT000482903 | TALONT000483195 | TALONT000483284 | TALONT000483315 | TALONT000483322 | TALONT000483327 | TALONT000483978 | TALONT000484004 | TALONT000484796 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
hepg2_1 | 98.0 | 43.0 | 4.0 | 23.0 | 0.0 | 8.0 | 129.0 | 30.0 | 0.0 | 0.0 | ... | 0.0 | 2.0 | 1.0 | 6.0 | 0.0 | 0.0 | 2.0 | 7.0 | 4.0 | 0.0 |
hepg2_2 | 207.0 | 66.0 | 6.0 | 52.0 | 0.0 | 11.0 | 323.0 | 55.0 | 0.0 | 0.0 | ... | 1.0 | 12.0 | 4.0 | 6.0 | 2.0 | 7.0 | 2.0 | 10.0 | 9.0 | 1.0 |
hffc6_1 | 100.0 | 148.0 | 0.0 | 82.0 | 0.0 | 5.0 | 212.0 | 0.0 | 0.0 | 0.0 | ... | 5.0 | 10.0 | 7.0 | 1.0 | 5.0 | 7.0 | 5.0 | 0.0 | 12.0 | 1.0 |
hffc6_2 | 108.0 | 191.0 | 0.0 | 98.0 | 0.0 | 7.0 | 173.0 | 0.0 | 0.0 | 0.0 | ... | 12.0 | 8.0 | 5.0 | 7.0 | 1.0 | 8.0 | 12.0 | 0.0 | 13.0 | 7.0 |
hffc6_3 | 91.0 | 168.0 | 2.0 | 106.0 | 0.0 | 6.0 | 123.0 | 1.0 | 0.0 | 0.0 | ... | 9.0 | 7.0 | 5.0 | 5.0 | 7.0 | 7.0 | 4.0 | 0.0 | 15.0 | 6.0 |
5 rows × 208306 columns
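The relationship between the counts table and the pi table can be sketched as follows: each transcript's count is divided by the total count of its gene in the same dataset and expressed as a percentage. This is an illustrative pandas sketch on toy data, not Swan's code; the `tx_to_gene` mapping stands in for the gene information that the second argument of calc_pi() supplies, and treating genes with zero counts as 0 is an assumption chosen to match the zeros in the tables above.

```python
import pandas as pd

# Toy counts: rows are datasets, columns are transcripts (hypothetical IDs).
counts = pd.DataFrame(
    {"tx_1": [4.0, 0.0], "tx_2": [12.0, 0.0], "tx_3": [7.0, 3.0]},
    index=["sample_1", "sample_2"],
)

# Hypothetical transcript-to-gene mapping.
tx_to_gene = pd.Series({"tx_1": "gene_a", "tx_2": "gene_a", "tx_3": "gene_b"})

# For every transcript, compute the total count of its gene in each dataset.
gene_totals = counts.T.groupby(tx_to_gene).transform("sum").T

# Percent isoform use; genes with no counts give 0/0 = NaN, which is set to 0 here.
pi = (counts / gene_totals * 100).fillna(0.0)

print(pi)
```

In this toy example, tx_1 and tx_2 split gene_a 25/75 in sample_1, while tx_3 is the only isoform of gene_b and therefore has a pi of 100 wherever the gene is expressed.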
We can also calculate the pi value for the TSSs and TESs in each dataset:
df, sums = swan.calc_pi(sg.tss_adata, sg.tss_adata.var)
print(df.head())
print()
df, sums = swan.calc_pi(sg.tes_adata, sg.tes_adata.var)
print(df.head())
print()
tss_id ENSG00000000003.14_1 ENSG00000000003.14_2 ENSG00000000003.14_3 \
hepg2_1 0.0 100.0 0.0
hepg2_2 0.0 100.0 0.0
hffc6_1 0.0 100.0 0.0
hffc6_2 0.0 100.0 0.0
hffc6_3 0.0 100.0 0.0
tss_id ENSG00000000003.14_4 ENSG00000000005.5_1 ENSG00000000005.5_2 \
hepg2_1 0.0 0.0 0.0
hepg2_2 0.0 0.0 0.0
hffc6_1 0.0 0.0 0.0
hffc6_2 0.0 0.0 0.0
hffc6_3 0.0 0.0 0.0
tss_id ENSG00000000419.12_1 ENSG00000000419.12_2 ENSG00000000457.13_1 \
hepg2_1 0.0 100.0 0.0
hepg2_2 0.0 100.0 0.0
hffc6_1 0.0 100.0 0.0
hffc6_2 0.0 100.0 0.0
hffc6_3 0.0 100.0 0.0
tss_id ENSG00000000457.13_2 ... TALONG000085596_1 TALONG000085799_1 \
hepg2_1 0.0 ... 0.0 0.0
hepg2_2 100.0 ... 0.0 0.0
hffc6_1 0.0 ... 100.0 100.0
hffc6_2 100.0 ... 100.0 100.0
hffc6_3 100.0 ... 100.0 100.0
tss_id TALONG000085978_1 TALONG000086022_1 TALONG000086057_1 \
hepg2_1 0.0 100.0 0.0
hepg2_2 0.0 100.0 0.0
hffc6_1 100.0 0.0 100.0
hffc6_2 100.0 0.0 100.0
hffc6_3 100.0 0.0 100.0
tss_id TALONG000086218_1 TALONG000086443_1 TALONG000086539_1 \
hepg2_1 0.0 100.0 0.0
hepg2_2 0.0 100.0 0.0
hffc6_1 100.0 100.0 100.0
hffc6_2 100.0 100.0 100.0
hffc6_3 100.0 100.0 100.0
tss_id TALONG000086553_1 TALONG000086766_1
hepg2_1 0.0 100.0
hepg2_2 100.0 100.0
hffc6_1 100.0 100.0
hffc6_2 100.0 100.0
hffc6_3 100.0 100.0
[5 rows x 130176 columns]
tes_id ENSG00000000003.14_1 ENSG00000000003.14_2 ENSG00000000003.14_3 \
hepg2_1 0.0 100.0 0.0
hepg2_2 0.0 100.0 0.0
hffc6_1 0.0 100.0 0.0
hffc6_2 0.0 100.0 0.0
hffc6_3 0.0 100.0 0.0
tes_id ENSG00000000003.14_4 ENSG00000000003.14_5 ENSG00000000005.5_1 \
hepg2_1 0.0 0.0 0.0
hepg2_2 0.0 0.0 0.0
hffc6_1 0.0 0.0 0.0
hffc6_2 0.0 0.0 0.0
hffc6_3 0.0 0.0 0.0
tes_id ENSG00000000005.5_2 ENSG00000000419.12_1 ENSG00000000419.12_2 \
hepg2_1 0.0 92.592590 0.0
hepg2_2 0.0 98.863640 0.0
hffc6_1 0.0 98.701302 0.0
hffc6_2 0.0 95.522385 0.0
hffc6_3 0.0 98.181824 0.0
tes_id ENSG00000000419.12_3 ... TALONG000085596_1 TALONG000085799_1 \
hepg2_1 7.407407 ... 0.0 0.0
hepg2_2 1.136364 ... 0.0 0.0
hffc6_1 1.298701 ... 100.0 100.0
hffc6_2 4.477612 ... 100.0 100.0
hffc6_3 1.818182 ... 100.0 100.0
tes_id TALONG000085978_1 TALONG000086022_1 TALONG000086057_1 \
hepg2_1 0.0 100.0 0.0
hepg2_2 0.0 100.0 0.0
hffc6_1 100.0 0.0 100.0
hffc6_2 100.0 0.0 100.0
hffc6_3 100.0 0.0 100.0
tes_id TALONG000086218_1 TALONG000086443_1 TALONG000086539_1 \
hepg2_1 0.0 100.0 0.0
hepg2_2 0.0 100.0 0.0
hffc6_1 100.0 100.0 100.0
hffc6_2 100.0 100.0 100.0
hffc6_3 100.0 100.0 100.0
tes_id TALONG000086553_1 TALONG000086766_1
hepg2_1 0.0 100.0
hepg2_2 100.0 100.0
hffc6_1 100.0 100.0
hffc6_2 100.0 100.0
hffc6_3 100.0 100.0
[5 rows x 187454 columns]
We can also calculate pi values using a different metadata column, shown here with the cell_line column:
df, sums = swan.calc_pi(sg.adata, sg.t_df, obs_col='cell_line')
df.head()
tid | ENST00000000233.9 | ENST00000000412.7 | ENST00000000442.10 | ENST00000001008.5 | ENST00000001146.6 | ENST00000002125.8 | ENST00000002165.10 | ENST00000002501.10 | ENST00000002596.5 | ENST00000002829.7 | ... | TALONT000482711 | TALONT000482903 | TALONT000483195 | TALONT000483284 | TALONT000483315 | TALONT000483322 | TALONT000483327 | TALONT000483978 | TALONT000484004 | TALONT000484796 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
hepg2 | 99.673203 | 100.0 | 71.428574 | 100.0 | 0.0 | 100.000000 | 100.0 | 85.0 | 0.0 | 0.0 | ... | 4.545455 | 2.935011 | 11.363637 | 10.434782 | 0.531915 | 1.861702 | 1.06383 | 100.0 | 100.0 | 1.785714 |
hffc6 | 99.006622 | 100.0 | 100.000000 | 100.0 | 0.0 | 85.714287 | 100.0 | 100.0 | 0.0 | 0.0 | ... | 3.823529 | 2.218279 | 14.285715 | 7.926829 | 0.815558 | 1.380176 | 1.31744 | 0.0 | 100.0 | 8.333334 |
2 rows × 208306 columns
In case you're interested in doing outside analyses at the edge level (for instance, using intron counting to assess alternative splicing), Swan provides a tool to output a DataFrame with edge abundance at the dataset level.
To get the edge abundance DataFrame, use the get_edge_abundance() function:
df = sg.get_edge_abundance()
df.head()
| | strand | edge_type | annotation | chrom | start | stop | hepg2_1 | hepg2_2 | hffc6_1 | hffc6_2 | hffc6_3 |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | + | exon | True | chr1 | 11869 | 12227 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
1 | + | exon | True | chr1 | 12010 | 12057 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
2 | + | intron | True | chr1 | 12057 | 12179 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
3 | + | exon | True | chr1 | 12179 | 12227 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
4 | + | intron | True | chr1 | 12227 | 12613 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
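As a rough illustration of the kind of outside analysis mentioned above, the sketch below filters an edge abundance table down to introns that are detected in at least one dataset. It uses a small hand-made DataFrame with the same column layout as the output above rather than real Swan output; the values, the sample names, and the `dataset_cols` list are hypothetical.

```python
import pandas as pd

# Toy table mimicking the column layout of the edge abundance DataFrame.
edges = pd.DataFrame(
    {
        "strand": ["+", "+", "+", "-", "-"],
        "edge_type": ["exon", "intron", "exon", "intron", "intron"],
        "annotation": [True, True, True, False, True],
        "chrom": ["chr1"] * 5,
        "start": [100, 200, 300, 900, 700],
        "stop": [200, 300, 400, 700, 500],
        "sample_1": [12.0, 9.0, 12.0, 0.0, 0.0],
        "sample_2": [4.0, 0.0, 4.0, 3.0, 0.0],
    }
)
dataset_cols = ["sample_1", "sample_2"]  # assumed names of the abundance columns

# Keep only introns, then only those with a nonzero value in any dataset.
introns = edges[edges["edge_type"] == "intron"]
detected = introns[(introns[dataset_cols] > 0).any(axis=1)]

# Detected introns that are not in the annotation.
novel = detected[~detected["annotation"]]

print(detected)
print(novel)
```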
You can also specify whether you want the output in raw counts (kind='counts') or TPM (kind='tpm'). By default, this function returns counts. Here's an example with TPM:
df = sg.get_edge_abundance(kind='tpm')
df.head()
| | strand | edge_type | annotation | chrom | start | stop | hepg2_1 | hepg2_2 | hffc6_1 | hffc6_2 | hffc6_3 |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | + | exon | True | chr1 | 11869 | 12227 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
1 | + | exon | True | chr1 | 12010 | 12057 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
2 | + | intron | True | chr1 | 12057 | 12179 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
3 | + | exon | True | chr1 | 12179 | 12227 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
4 | + | intron | True | chr1 | 12227 | 12613 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
Finally, you can provide the function with a prefix value, which indicates that you want the output DataFrame to also be saved in TSV form:
df = sg.get_edge_abundance(kind='tpm', prefix='test')
df.head()
| | strand | edge_type | annotation | chrom | start | stop | hepg2_1 | hepg2_2 | hffc6_1 | hffc6_2 | hffc6_3 |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | + | exon | True | chr1 | 11869 | 12227 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
1 | + | exon | True | chr1 | 12010 | 12057 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
2 | + | intron | True | chr1 | 12057 | 12179 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
3 | + | exon | True | chr1 | 12179 | 12227 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
4 | + | intron | True | chr1 | 12227 | 12613 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
The results will be saved in '{prefix}_edge_abundance.tsv'.
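A saved TSV can be loaded back with pandas for later analysis. The sketch below is self-contained, so it first writes a small toy table to a temporary directory under the file name pattern described above and then reads it back. The table contents are hypothetical, and whether Swan writes the row index to the file is not stated here, so the `index=False` choice is an assumption made only for this example.

```python
import os
import tempfile

import pandas as pd

# Toy table standing in for a saved edge abundance file (hypothetical values).
toy = pd.DataFrame(
    {
        "strand": ["+", "-"],
        "edge_type": ["exon", "intron"],
        "chrom": ["chr1", "chr2"],
        "start": [100, 900],
        "stop": [200, 700],
        "sample_1": [12.5, 0.0],
    }
)

prefix = "test"
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, f"{prefix}_edge_abundance.tsv")
    toy.to_csv(path, sep="\t", index=False)  # stand-in for the file Swan saves

    # Tab-separated files are read with sep="\t".
    loaded = pd.read_csv(path, sep="\t")

print(loaded)
```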
Similarly, if you wish to analyze your TSS or TES data, you can output them using the get_tss_abundance() and get_tes_abundance() functions, respectively. These have the same options as get_edge_abundance(), so they can output either counts or TPM and optionally save to an output file.
First, let's output the TSS TPM to a file:
df = sg.get_tss_abundance(kind='tpm', prefix='test')
df.head()
| | tss_id | gid | gname | vertex_id | tss_name | chrom | coord | hepg2_1 | hepg2_2 | hffc6_1 | hffc6_2 | hffc6_3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | ENSG00000000003.14_1 | ENSG00000000003.14 | TSPAN6 | 926111 | TSPAN6_1 | chrX | 100636191 | 0.00000 | 0.000000 | 0.000000 | 0.000000 | 0.00000 |
1 | ENSG00000000003.14_2 | ENSG00000000003.14 | TSPAN6 | 926112 | TSPAN6_2 | chrX | 100636608 | 232.16391 | 276.976654 | 45.962341 | 53.301723 | 68.30146 |
2 | ENSG00000000003.14_3 | ENSG00000000003.14 | TSPAN6 | 926114 | TSPAN6_3 | chrX | 100636793 | 0.00000 | 0.000000 | 0.000000 | 0.000000 | 0.00000 |
3 | ENSG00000000003.14_4 | ENSG00000000003.14 | TSPAN6 | 926117 | TSPAN6_4 | chrX | 100639945 | 0.00000 | 0.000000 | 0.000000 | 0.000000 | 0.00000 |
4 | ENSG00000000005.5_1 | ENSG00000000005.5 | TNMD | 926077 | TNMD_1 | chrX | 100585066 | 0.00000 | 0.000000 | 0.000000 | 0.000000 | 0.00000 |
Now we'll get the counts of each TES without saving to a file:
df = sg.get_tes_abundance(kind='counts')
df.head()
| | tes_id | gid | gname | vertex_id | tes_name | chrom | coord | hepg2_1 | hepg2_2 | hffc6_1 | hffc6_2 | hffc6_3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | ENSG00000000003.14_1 | ENSG00000000003.14 | TSPAN6 | 926092 | TSPAN6_1 | chrX | 100627109 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
1 | ENSG00000000003.14_2 | ENSG00000000003.14 | TSPAN6 | 926093 | TSPAN6_2 | chrX | 100628670 | 116.0 | 235.0 | 35.0 | 42.0 | 42.0 |
2 | ENSG00000000003.14_3 | ENSG00000000003.14 | TSPAN6 | 926097 | TSPAN6_3 | chrX | 100632063 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
3 | ENSG00000000003.14_4 | ENSG00000000003.14 | TSPAN6 | 926100 | TSPAN6_4 | chrX | 100632541 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
4 | ENSG00000000003.14_5 | ENSG00000000003.14 | TSPAN6 | 926103 | TSPAN6_5 | chrX | 100633442 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
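One simple downstream summary of these tables is the number of distinct TSSs (or TESs) detected per gene in each dataset. The sketch below shows this on a toy DataFrame that borrows the tss_id, gid, and gname column names from the output above; the rows, the sample names, and the `dataset_cols` list are hypothetical.

```python
import pandas as pd

# Toy table mimicking part of the TSS abundance layout (hypothetical values).
tss = pd.DataFrame(
    {
        "tss_id": ["gene_a_1", "gene_a_2", "gene_b_1"],
        "gid": ["gene_a", "gene_a", "gene_b"],
        "gname": ["A", "A", "B"],
        "sample_1": [5.0, 10.0, 0.0],
        "sample_2": [0.0, 8.0, 2.0],
    }
)
dataset_cols = ["sample_1", "sample_2"]  # assumed names of the abundance columns

# Mark each TSS as detected (value > 0), then count detected TSSs per gene.
n_detected = (tss[dataset_cols] > 0).groupby(tss["gid"]).sum()

print(n_detected)
```

The same pattern applies to a TES table by swapping in the corresponding DataFrame.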