Releases: mortazavilab/swan_vis
Swan v3.2
Swan v3.0
Swan 3.0
SwanGraph initialization
- Added bool options to SwanGraph initialization (edge_adata, end_adata, and ic_adata) that the user can set to False if they don't wish to track abundance for these individual transcript features
AnnData compatibility
- Allows for addition of abundance information directly from an AnnData object, bypassing dense-matrix representations of the data (SwanGraph.add_adata())
Documentation
- Updated sample data links
- Added AnnData data format to the file format specs
- Added examples to showcase functionality added in v3.0 and v2.5
Deprecation of differential gene and transcript expression tests
- Deprecated SwanGraph.de_gene_test() and SwanGraph.de_transcript_test() as I have not had luck running diffxpy in a while
- Added example tutorial on how to directly use a Swan AnnData to perform differential expression testing with PyDESeq2
Known issues
- Currently Cerberus does not output transcript novelty assignments to GTFs and they are therefore not parsed by Swan; will fix in a future update
swan v2.5
Swan 2.5
SwanGraph structure changes
- Counts and other expression structures (i.e. TPM, PI) are now stored as sparse matrices, which substantially reduces both on-disk and in-memory storage
- Capability of storing gene-level abundance information (SwanGraph.gene_adata) calculated separately from transcript-level
- Added AnnData to store intron chain level abundance information (SwanGraph.ic_adata)
- Added tracking for stable gene ID in cases where reference annotation versions don't match (e.g. ENSG000000014.5 --> ENSG000000014)
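The stable gene ID idea can be sketched in a few lines of plain Python. This is only an illustration of stripping a version suffix, not Swan's actual code; the function name stable_gene_id is hypothetical.

```python
def stable_gene_id(gene_id: str) -> str:
    """Drop a trailing '.<version>' suffix from a gene ID, if present.

    Hypothetical helper for illustration; not part of the Swan API.
    """
    base, sep, version = gene_id.rpartition(".")
    # Only strip when the suffix after the last '.' is purely numeric, e.g. '.5'
    if sep and version.isdigit():
        return base
    return gene_id


ids = ["ENSG000000014.5", "ENSG000000014", "ENSG000000014.12"]
stable = [stable_gene_id(g) for g in ids]
print(stable)
```

With IDs reduced this way, the same gene can be matched across annotation versions that differ only in the version suffix.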
Native compatibility with cerberus transcriptomes
- Will track TSSs, ICs, and TESs called by cerberus based on the names of transcripts provided from the GTF
Other changes
- DIE test now reports top 2 DPI isoforms
- Faster counts and TPM calculations using Scanpy tools
- Added option to sort by isoform's cumulative PI value in the gene report sorting
- Added plotting option for plotting browser models directly onto a preexisting Matplotlib axis (SwanGraph.plot_browser())
- Added plotting option to plot bed regions (SwanGraph.pg.plot_regions())
- Added options to calculate TPM across multiple datasets as either the minimum or maximum of the values between the datasets
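As a rough illustration of the TPM options above, the sketch below computes TPM per dataset and then takes the per-transcript minimum or maximum across datasets. It uses plain Python rather than Scanpy, and it assumes TPM means counts scaled to one million per dataset with no length normalization. The function names and the toy counts are made up for the example.

```python
def counts_to_tpm(counts):
    """Scale one dataset's transcript counts so they sum to one million.

    Assumption: no transcript-length normalization is applied.
    """
    total = sum(counts.values())
    if total == 0:
        return {tid: 0.0 for tid in counts}
    return {tid: c * 1e6 / total for tid, c in counts.items()}


def combine_tpm(tpm_by_dataset, how="max"):
    """Per-transcript minimum or maximum TPM across datasets (hypothetical helper)."""
    if how not in ("min", "max"):
        raise ValueError("how must be 'min' or 'max'")
    agg = min if how == "min" else max
    transcripts = set()
    for tpm in tpm_by_dataset.values():
        transcripts.update(tpm)
    # A transcript absent from a dataset is treated as 0 TPM there
    return {
        tid: agg(tpm.get(tid, 0.0) for tpm in tpm_by_dataset.values())
        for tid in sorted(transcripts)
    }


# Toy counts, invented for this example
counts = {
    "dataset_a": {"tx1": 30, "tx2": 70},
    "dataset_b": {"tx1": 50, "tx2": 50},
}
tpm = {name: counts_to_tpm(c) for name, c in counts.items()}
max_tpm = combine_tpm(tpm, how="max")
min_tpm = combine_tpm(tpm, how="min")
```

Taking the maximum keeps a transcript if it is well expressed in at least one dataset, while the minimum requires expression in all of them.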
Minor bug fixes
- Fixed DIE test bug when there are >11 isoforms / gene
- Fixed bugs in SwanGraph.gen_report()
swan v2.0
Swan 2.0
General workflow update
- changed workflow from adding datasets / samples one at a time; now users can pass in one GTF with the union of all expressed transcripts in their data
SwanGraph representation updates
- removed strand from genomic location in SwanGraph.loc_df
- added strand to edges in SwanGraph.edge_df
- added AnnData representations for tracking transcript abundance as well as automatically-calculated edge, tss, and tes abundance
- added options to represent complex metadata in the AnnData.obs tables
- added a single-cell option to SwanGraph initialization for data with individual cells as samples
- automatically calculates percent isoform use (pi) per dataset per gene (except for in single-cell mode)
- added functions to add and store color palettes for different metadata colors
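Percent isoform use (pi) can be illustrated with a small standalone sketch. It assumes pi for a transcript is its TPM divided by the summed TPM of all transcripts of the same gene in one dataset, times 100. The function name, the transcript-to-gene mapping, and the toy values are hypothetical and not taken from Swan.

```python
from collections import defaultdict


def percent_isoform_use(tpm, gene_of):
    """pi = 100 * transcript TPM / summed TPM of all transcripts of the same gene.

    Hypothetical helper computing pi for a single dataset.
    """
    gene_total = defaultdict(float)
    for tid, value in tpm.items():
        gene_total[gene_of[tid]] += value
    pi = {}
    for tid, value in tpm.items():
        total = gene_total[gene_of[tid]]
        # Unexpressed genes get 0 for every isoform instead of dividing by zero
        pi[tid] = 100.0 * value / total if total > 0 else 0.0
    return pi


# Toy values, invented for this example
gene_of = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
tpm = {"tx1": 30.0, "tx2": 10.0, "tx3": 5.0}
pi = percent_isoform_use(tpm, gene_of)
print(pi)
```

Within each expressed gene the pi values sum to 100, so a single-isoform gene such as geneB always has pi = 100 for its one transcript.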
Analysis options update
- implemented a more statistically-robust, published method of isoform switching testing (aka differential isoform expression [DIE] testing), as described by Joglekar et al., 2021
- reworked differential gene and transcript expression testing to work smoothly with AnnData representation
- changed output type of intron retention / exon skipping analysis to be a more descriptive pandas DataFrame
- all analysis code will now automatically store results in the SwanGraph.adata.uns dictionary using an automatically-generated key that can be easily regenerated to facilitate different pairwise testing and accessing previous results
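To make the idea of a regenerable results key concrete, here is a minimal sketch. The key format, the function name make_results_key, and the example metadata column and condition names are all assumptions for illustration; Swan's real key scheme may differ. A plain dict stands in for the AnnData uns dictionary.

```python
def make_results_key(test, obs_col, conditions):
    """Build a deterministic key for one pairwise test (hypothetical scheme)."""
    if len(conditions) != 2:
        raise ValueError("expected exactly two conditions")
    # Sort so (a, b) and (b, a) map to the same stored result
    first, second = sorted(conditions)
    return f"{test}_{obs_col}_{first}_{second}"


uns = {}  # stand-in for the AnnData .uns dictionary
key = make_results_key("die", "cell_line", ["hepg2", "hffc6"])
uns[key] = "placeholder for a results table"

# The same inputs, in either order, regenerate the same key later
same_key = make_results_key("die", "cell_line", ["hffc6", "hepg2"])
print(key, same_key in uns)
```

Because the key depends only on the test type, the metadata column, and the two conditions, earlier results can be looked up again without remembering an arbitrary name.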
Gene report update
- removed indicate_dataset option
- added option to group datasets based on metadata columns (groupby)
- added option to include / exclude / order datasets based on metadata information (datasets)
- added option to represent datasets using color coded bars either derived from the dataset names or metadata columns (metadata_cols), as well as draw a legend for the colors
- added option to either plot TPM or pi (layer)
- added option to change what color palette heatmap is plotted in (cmap)
- reworked options to indicate differentially-expressed transcripts in conjunction with how differential expression results are now stored (include_qvals, q, log2fc, qval_obs_col, qval_obs_conditions)
- added option to display values on top of each heatmap cell (display_numbers)
- added option to display transcript name as opposed to transcript ID
Other plotting changes
- removed indicate_dataset option
- added functions to change the colors of plotted Swan plots and browser plots
Other utilities
- added functions to output calculated edge, tss, or tes abundance along with details of genomic location
- added functions to calculate TPM or percent isoform use (pi) given specific metadata settings
- changed how SwanGraph saving and loading works
Note: Saved Swan objects that were generated with previous versions of Swan will not be compatible with 2.0!
swan v1.0.3
Fixes a missed dependency allowing for exon entries to be in whatever order beneath the corresponding transcript entries when loading from a GTF.
swan v1.0.2
Minor improvements
- Allows for exon entries to be in whatever order when loading a GTF
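One way a loader can accept exon entries in any order is to group them by transcript and sort by coordinate afterwards. The sketch below shows that approach on already-parsed records. It is not Swan's GTF parser, and the record layout and function name are hypothetical.

```python
from collections import defaultdict


def group_exons(records):
    """Group exon records by transcript and order them by start coordinate.

    Hypothetical helper operating on pre-parsed GTF-like records.
    """
    exons = defaultdict(list)
    for rec in records:
        if rec["feature"] == "exon":
            exons[rec["transcript_id"]].append((rec["start"], rec["end"]))
    # Sorting makes the result independent of the order of lines in the file
    return {tid: sorted(coords) for tid, coords in exons.items()}


# Toy records with exons deliberately out of order
records = [
    {"feature": "transcript", "transcript_id": "tx1", "start": 100, "end": 900},
    {"feature": "exon", "transcript_id": "tx1", "start": 700, "end": 900},
    {"feature": "exon", "transcript_id": "tx1", "start": 100, "end": 200},
    {"feature": "exon", "transcript_id": "tx1", "start": 400, "end": 500},
]
exons = group_exons(records)
print(exons)
```

For minus-strand transcripts a loader would typically reverse the sorted list to get 5' to 3' order; that step is left out here.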
swan v1.0.1
Minor bug fixes and improvements
- Added threading option in gen_report which will help Swan run better on clusters
- Patched a bug where edges can be exons and introns
- Patched a bug that remained in the code after internal testing for loading from a TALON db
swan v1.0
First public release!
Features include:
- Transcriptome loading via GTF or TALON database
- Transcript abundance support
- Transcriptome merging and tracking presence/absence of transcript models in each dataset
- Differential gene and transcript expression tests
- Detection of isoform switching genes
- Detection of novel exon skipping and intron retention events
- Unique visualization options to facilitate complexity of alternative splicing in datasets
swan 0.0.9
Added patch networkx command
swan 0.0.5
Patching pip setups