swan v2.0

@fairliereese fairliereese released this 31 Aug 05:02
· 67 commits to master since this release

Swan 2.0

General workflow update

  • replaced the workflow of adding datasets / samples one at a time; users now pass in a single GTF containing the union of all expressed transcripts in their data

SwanGraph representation updates

  • removed strand from genomic location in SwanGraph.loc_df
  • added strand to edges in SwanGraph.edge_df
  • added AnnData representations for tracking transcript abundance as well as automatically-calculated edge, tss, and tes abundance
  • added options to represent complex metadata in the AnnData.obs tables
  • added a single-cell option to SwanGraph initialization for data with individual cells as samples
  • added automatic calculation of percent isoform use (pi) per dataset per gene (except in single-cell mode)
  • added functions to add and store color palettes for different metadata colors
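As a rough illustration of what percent isoform use (pi) means, the sketch below computes it in plain Python under the assumption that pi is a transcript's share of the total TPM of its gene within one dataset, expressed as a percentage. The function name, the dictionary layout, and the choice to report 0 for unexpressed genes are hypothetical and are not Swan's API or internal representation.

```python
from collections import defaultdict


def percent_isoform_use(tpm, transcript_to_gene):
    """Return {dataset: {transcript_id: pi}} from {dataset: {transcript_id: tpm}}.

    Assumption: pi = 100 * transcript TPM / summed TPM of all transcripts
    of the same gene in the same dataset.
    """
    pi = {}
    for dataset, values in tpm.items():
        # Total expression of each gene in this dataset.
        gene_totals = defaultdict(float)
        for tid, value in values.items():
            gene_totals[transcript_to_gene[tid]] += value

        pi[dataset] = {}
        for tid, value in values.items():
            total = gene_totals[transcript_to_gene[tid]]
            # Hypothetical convention: a gene with no expression gets pi = 0.
            pi[dataset][tid] = 100.0 * value / total if total > 0 else 0.0
    return pi


# Made-up values for illustration only.
tpm = {
    "sample_a": {"tx1": 30.0, "tx2": 10.0, "tx3": 5.0},
    "sample_b": {"tx1": 0.0, "tx2": 0.0, "tx3": 8.0},
}
transcript_to_gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

pi = percent_isoform_use(tpm, transcript_to_gene)
```

Here `tx1` and `tx2` share `geneA`, so their pi values in `sample_a` sum to 100, while `tx3` is the only isoform of `geneB` and always has a pi of 100 when expressed.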

Analysis options update

  • implemented a more statistically robust, published method of isoform switching testing (aka differential isoform expression [DIE] testing), as described by Joglekar et al., 2021
  • reworked differential gene and transcript expression testing to work smoothly with AnnData representation
  • changed output type of intron retention / exon skipping analysis to be a more descriptive pandas DataFrame
  • all analysis code now automatically stores its results in the SwanGraph.adata.uns dictionary under an automatically generated key; the key can be easily regenerated, which makes it simple to run different pairwise tests and to access previous results
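The idea of a regenerable results key can be sketched as follows. This is only an illustration of the pattern: the helper name `make_results_key`, the key format, and the example metadata values are all hypothetical and should not be read as the keys Swan actually writes to SwanGraph.adata.uns.

```python
def make_results_key(test, obs_col, conditions=None):
    """Build a deterministic key from the test type and the comparison.

    Hypothetical format: <test>_<obs_col>[_<condition>...]. The same
    arguments always give the same key, so a result can be found again
    without remembering a hand-written name.
    """
    parts = [test, obs_col]
    if conditions:
        parts.extend(conditions)
    return "_".join(parts)


# A plain dict stands in for an AnnData .uns mapping.
uns = {}

# Store a (placeholder) result for one pairwise comparison.
key = make_results_key("die", "cell_line", ["hepg2", "hffc6"])
uns[key] = {"note": "placeholder for a results table"}

# Later: rebuild the key from the same settings to fetch the result.
same_key = make_results_key("die", "cell_line", ["hepg2", "hffc6"])
previous = uns[same_key]

# A different pair of conditions maps to a different key.
other_key = make_results_key("die", "cell_line", ["hepg2", "k562"])
```

Because the key is a pure function of the test settings, several pairwise comparisons can sit side by side in the same dictionary without overwriting each other.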

Gene report update

  • removed indicate_dataset option
  • added option to group datasets based on metadata columns (groupby)
  • added option to include / exclude / order datasets based on metadata information (datasets)
  • added option to represent datasets using color-coded bars derived either from the dataset names or from metadata columns (metadata_cols), as well as to draw a legend for the colors
  • added option to either plot TPM or pi (layer)
  • added option to change the color palette the heatmap is plotted in (cmap)
  • reworked options to indicate differentially-expressed transcripts in conjunction with how differential expression results are now stored (include_qvals, q, log2fc, qval_obs_col, qval_obs_conditions)
  • added option to display values on top of each heatmap cell (display_numbers)
  • added option to display transcript name as opposed to transcript ID
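To make the thresholding options more concrete, the sketch below shows one way a q-value cutoff and a log2 fold change cutoff could be combined to decide which transcripts to flag. It assumes that q is a maximum adjusted p-value and that log2fc is a minimum absolute log2 fold change; that reading of the option names, the function, the default values, and the example data are all assumptions, not Swan's implementation.

```python
def significant_transcripts(results, q=0.05, log2fc=1.0):
    """Return IDs of transcripts passing both thresholds.

    results: list of dicts with 'tid', 'qval', and 'log2fc' keys
             (a hypothetical layout for a test results table).
    """
    return [
        row["tid"]
        for row in results
        if row["qval"] <= q and abs(row["log2fc"]) >= log2fc
    ]


# Made-up test results for illustration only.
results = [
    {"tid": "tx1", "qval": 0.001, "log2fc": 2.5},   # passes both
    {"tid": "tx2", "qval": 0.200, "log2fc": 3.0},   # fails the q-value cutoff
    {"tid": "tx3", "qval": 0.010, "log2fc": -1.5},  # passes (negative fold change)
    {"tid": "tx4", "qval": 0.010, "log2fc": 0.2},   # fails the fold change cutoff
]

flagged = significant_transcripts(results, q=0.05, log2fc=1.0)
```

Taking the absolute value of the fold change flags transcripts that change in either direction between the two conditions.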

Other plotting changes

  • removed indicate_dataset option
  • added functions to change the colors of plotted Swan plots and browser plots

Other utilities

  • added functions to output calculated edge, tss, or tes abundance along with details of genomic location
  • added functions to calculate TPM or percent isoform use (pi) given specific metadata settings
  • changed how SwanGraph saving and loading works
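As an illustration of calculating TPM for a metadata-defined grouping, the sketch below pools raw counts across datasets that share a metadata value and then scales each group to one million. Both choices are assumptions made for the example: pooling counts (rather than, say, averaging per-dataset TPMs) and treating TPM as counts per million without length normalization. The function name and data layout are hypothetical.

```python
from collections import defaultdict


def tpm_by_group(counts, metadata, column):
    """Return {group: {transcript_id: tpm}}.

    counts:   {dataset: {transcript_id: raw count}}
    metadata: {dataset: {column name: value}}
    column:   metadata column whose values define the groups
    """
    # Pool counts from all datasets that share the same metadata value.
    grouped = defaultdict(lambda: defaultdict(float))
    for dataset, values in counts.items():
        group = metadata[dataset][column]
        for tid, count in values.items():
            grouped[group][tid] += count

    # Scale each group so that its transcripts sum to one million.
    result = {}
    for group, values in grouped.items():
        total = sum(values.values())
        result[group] = {
            tid: (count * 1e6 / total if total else 0.0)
            for tid, count in values.items()
        }
    return result


# Made-up counts and metadata for illustration only.
counts = {
    "rep1": {"tx1": 60, "tx2": 40},
    "rep2": {"tx1": 40, "tx2": 60},
    "rep3": {"tx1": 10, "tx2": 30},
}
metadata = {
    "rep1": {"cell_line": "hepg2"},
    "rep2": {"cell_line": "hepg2"},
    "rep3": {"cell_line": "hffc6"},
}

tpm = tpm_by_group(counts, metadata, "cell_line")
```

The two `hepg2` replicates are collapsed into a single column of values, which is the kind of grouping a metadata column makes possible.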

Note: Saved Swan objects that were generated with previous versions of Swan will not be compatible with 2.0!