Release swan v2.0 · mortazavilab/swan_vis

Swan 2.0

changed the workflow: instead of adding datasets / samples one at a time, users can now pass in a single GTF containing the union of all expressed transcripts in their data

removed strand from genomic location in SwanGraph.loc_df
added strand to edges in SwanGraph.edge_df
added AnnData representations for tracking transcript abundance as well as automatically-calculated edge, tss, and tes abundance
added options to represent complex metadata in the AnnData.obs tables
added a single-cell option to SwanGraph initialization for data with individual cells as samples
automatically calculates percent isoform use (pi) per dataset per gene (except in single-cell mode)
added functions to add and store color palettes for different metadata columns
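
As a rough illustration of what percent isoform use (pi) measures, the sketch below computes it with plain Python dictionaries: each transcript's abundance is divided by the summed abundance of all transcripts of the same gene in the same dataset, then multiplied by 100. This is not Swan's implementation. The function name, the input layout, and the choice to report 0 for genes with no expression in a dataset are all assumptions made for the example.

```python
from collections import defaultdict


def percent_isoform_use(tpm, transcript_to_gene):
    """Hypothetical helper, not part of Swan.

    tpm: {dataset: {transcript_id: TPM}}
    transcript_to_gene: {transcript_id: gene_id}
    Returns a dict of the same shape as tpm holding pi values (0 to 100).
    """
    pi = {}
    for dataset, abundances in tpm.items():
        # Total abundance of each gene within this dataset
        gene_totals = defaultdict(float)
        for tid, value in abundances.items():
            gene_totals[transcript_to_gene[tid]] += value

        pi[dataset] = {}
        for tid, value in abundances.items():
            total = gene_totals[transcript_to_gene[tid]]
            # Assumption: an unexpressed gene yields pi = 0 for its transcripts
            pi[dataset][tid] = 100.0 * value / total if total > 0 else 0.0
    return pi


# Made-up abundances for two datasets and two genes
tpm = {
    "sample_a": {"tx1": 30.0, "tx2": 10.0, "tx3": 5.0},
    "sample_b": {"tx1": 0.0, "tx2": 0.0, "tx3": 8.0},
}
transcript_to_gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

pi = percent_isoform_use(tpm, transcript_to_gene)
print(pi["sample_a"])  # tx1 and tx2 split geneA 75 / 25; tx3 is geneB's only isoform
```

Because pi is normalized within each gene, it highlights shifts in isoform choice even when the gene's overall expression differs between datasets.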

implemented a more statistically robust, published method of isoform switching testing (aka differential isoform expression [DIE] testing), as described by Joglekar et al., 2021
reworked differential gene and transcript expression testing to work smoothly with AnnData representation
changed output type of intron retention / exon skipping analysis to be a more descriptive pandas DataFrame
all analysis code now automatically stores its results in the SwanGraph.adata.uns dictionary under an automatically generated key; the key can be easily regenerated, which makes it simple to run different pairwise tests and to retrieve previous results
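
The idea of a regenerable results key can be sketched as follows. The key format, the function name `make_results_key`, and the plain dictionary standing in for `adata.uns` are all hypothetical; Swan's actual key scheme may differ. The point is only that a key built deterministically from the test settings can be rebuilt later to look up earlier results.

```python
def make_results_key(kind, obs_col, conditions=None):
    """Hypothetical key builder, not Swan's actual scheme.

    kind: short label for the test, e.g. "die"
    obs_col: metadata column the test was run on
    conditions: optional pair of condition names that were compared
    """
    parts = [kind, obs_col]
    if conditions:
        # Sort so the same pair gives the same key regardless of order
        parts.extend(sorted(conditions))
    return "_".join(parts)


# A plain dict stands in for SwanGraph.adata.uns in this sketch
uns = {}

key = make_results_key("die", "cell_line", ["cond_b", "cond_a"])
uns[key] = "results table would be stored here"

# Later, the same settings regenerate the same key
same_key = make_results_key("die", "cell_line", ["cond_a", "cond_b"])
print(same_key in uns)
```

Sorting the condition names is one possible design choice that keeps a pairwise comparison under a single entry regardless of argument order.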

removed indicate_dataset option
added option to group datasets based on metadata columns (groupby)
added option to include / exclude / order datasets based on metadata information (datasets)
added option to represent datasets using color-coded bars derived either from the dataset names or from metadata columns (metadata_cols), as well as to draw a legend for the colors
added option to plot either TPM or pi (layer)
added option to change the color palette the heatmap is plotted in (cmap)
reworked options to indicate differentially-expressed transcripts in conjunction with how differential expression results are now stored (include_qvals, q, log2fc, qval_obs_col, qval_obs_conditions)
added option to display values on top of each heatmap cell (display_numbers)
added option to display transcript names instead of transcript IDs

added functions to output calculated edge, tss, or tes abundance along with details of genomic location
added functions to calculate TPM or percent isoform use (pi) given specific metadata settings
changed how SwanGraph saving and loading works
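
To make the "TPM given specific metadata settings" item above more concrete, here is a minimal sketch that pools counts from datasets sharing a metadata value and then scales each group to one million. It is not Swan's code: the function name, the input layout, and the assumption that TPM is computed without length normalization are all illustrative choices.

```python
from collections import defaultdict


def tpm_by_group(counts, metadata, column):
    """Hypothetical helper, not part of Swan.

    counts: {dataset: {transcript_id: read count}}
    metadata: {dataset: {column name: value}}
    column: metadata column used to pool datasets
    """
    # Sum counts over all datasets that share the same metadata value
    grouped = defaultdict(lambda: defaultdict(float))
    for dataset, tx_counts in counts.items():
        group = metadata[dataset][column]
        for tid, n in tx_counts.items():
            grouped[group][tid] += n

    # Scale each group so its values sum to one million
    # (assumption: no transcript-length normalization)
    tpm = {}
    for group, tx_counts in grouped.items():
        total = sum(tx_counts.values())
        tpm[group] = {
            tid: (n * 1e6 / total if total > 0 else 0.0)
            for tid, n in tx_counts.items()
        }
    return tpm


# Made-up counts: rep1 and rep2 share a cell line, rep3 does not
counts = {
    "rep1": {"tx1": 30, "tx2": 10},
    "rep2": {"tx1": 50, "tx2": 10},
    "rep3": {"tx1": 5, "tx2": 15},
}
metadata = {
    "rep1": {"cell_line": "a"},
    "rep2": {"cell_line": "a"},
    "rep3": {"cell_line": "b"},
}

tpm = tpm_by_group(counts, metadata, "cell_line")
print(tpm["a"], tpm["b"])
```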

Note: Saved Swan objects that were generated with previous versions of Swan will not be compatible with 2.0!