swan v2.0
General workflow update
- changed the workflow: instead of adding datasets / samples one at a time, users can now pass in a single GTF containing the union of all expressed transcripts in their data
SwanGraph representation updates
- removed strand from genomic location in `SwanGraph.loc_df`
- added strand to edges in `SwanGraph.edge_df`
- added AnnData representations for tracking transcript abundance as well as automatically-calculated edge, tss, and tes abundance
- added options to represent complex metadata in the AnnData.obs tables
- added a single-cell option to SwanGraph initialization for data with individual cells as samples
- percent isoform use (pi) is now calculated automatically per dataset per gene (except in single-cell mode)
- added functions to add and store color palettes for different metadata colors
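
To make the percent isoform use (pi) item above concrete, here is a minimal sketch of the calculation in plain Python. It assumes pi is a transcript's share of its gene's total expression within one dataset, scaled to 100. The function name `percent_isoform_use`, the input layout, and the sample values are hypothetical and are not part of Swan's API.

```python
from collections import defaultdict


def percent_isoform_use(tpm, gene_of):
    """Compute percent isoform use per dataset per gene.

    tpm:     {dataset: {transcript_id: expression}}  (hypothetical layout)
    gene_of: {transcript_id: gene_id}
    """
    pi = {}
    for dataset, values in tpm.items():
        # Total expression of each gene within this dataset.
        gene_total = defaultdict(float)
        for tid, value in values.items():
            gene_total[gene_of[tid]] += value

        # Each transcript's share of its gene total, as a percentage.
        # Genes with no expression in a dataset get 0 for all isoforms.
        pi[dataset] = {}
        for tid, value in values.items():
            total = gene_total[gene_of[tid]]
            pi[dataset][tid] = 100.0 * value / total if total > 0 else 0.0
    return pi


# Illustrative values only.
tpm = {
    "sample_1": {"tx_a": 30.0, "tx_b": 10.0, "tx_c": 5.0},
    "sample_2": {"tx_a": 0.0, "tx_b": 20.0, "tx_c": 0.0},
}
gene_of = {"tx_a": "gene_1", "tx_b": "gene_1", "tx_c": "gene_2"}

pi = percent_isoform_use(tpm, gene_of)
print(pi["sample_1"])
```

In this sketch the pi values of a gene's transcripts sum to 100 within each dataset in which the gene is expressed.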
Analysis options update
- implemented a more statistically robust, published method for isoform switching testing (also known as differential isoform expression [DIE] testing), as described by Joglekar et al., 2021
- reworked differential gene and transcript expression testing to work smoothly with AnnData representation
- changed output type of intron retention / exon skipping analysis to be a more descriptive pandas DataFrame
- all analysis code now automatically stores its results in the `SwanGraph.adata.uns` dictionary under an automatically generated key; the key can be easily regenerated, which makes it simpler to run different pairwise tests and to access previous results
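
The idea of a regenerable results key can be sketched as follows. This is an illustration of the pattern only: the helper `make_results_key`, the key format, and the example metadata values are assumptions made for this sketch and do not reproduce Swan's actual key scheme.

```python
def make_results_key(test, obs_col, obs_conditions=None):
    """Build a deterministic key from the parameters of a test.

    Hypothetical helper: the same inputs always give the same key, so a
    result can be looked up later without remembering the key itself.
    """
    parts = [test, obs_col]
    if obs_conditions:
        # Sorting is a choice made for this sketch so that the order in
        # which the two conditions are listed does not change the key.
        parts.extend(sorted(obs_conditions))
    return "_".join(parts)


# A plain dict stands in for a store such as SwanGraph.adata.uns.
results = {}

key = make_results_key("die", "cell_line", ["hffc6", "hepg2"])
results[key] = "placeholder for a results table"

# Later: regenerate the key from the same parameters to fetch the result.
same_key = make_results_key("die", "cell_line", ["hepg2", "hffc6"])
print(same_key, same_key in results)
```

Because the key depends only on the test parameters, results from several pairwise comparisons can sit side by side in the same dictionary without overwriting each other.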
Gene report update
- removed `indicate_dataset` option
- added option to group datasets based on metadata columns (`groupby`)
- added option to include / exclude / order datasets based on metadata information (`datasets`)
- added option to represent datasets using color-coded bars derived from either the dataset names or metadata columns (`metadata_cols`), as well as to draw a legend for the colors
- added option to plot either TPM or pi (`layer`)
- added option to change the color palette the heatmap is plotted in (`cmap`)
- reworked options to indicate differentially expressed transcripts in conjunction with how differential expression results are now stored (`include_qvals`, `q`, `log2fc`, `qval_obs_col`, `qval_obs_conditions`)
- added option to display values on top of each heatmap cell (`display_numbers`)
- added option to display transcript name as opposed to transcript ID
Other plotting changes
- removed `indicate_dataset` option
- added functions to change the colors of plotted Swan plots and browser plots
Other utilities
- added functions to output calculated edge, tss, or tes abundance along with details of genomic location
- added functions to calculate TPM or percent isoform use (pi) given specific metadata settings
- changed how SwanGraph saving and loading works
Note: Saved Swan objects that were generated with previous versions of Swan will not be compatible with 2.0!