- trio binning
- denovogear: A program to detect de novo variants using next-generation sequencing data.
- VBT-TrioAnalysis
- fitDNM: Enrichment of de novo mutations within genes
- denovo-db
- denovodb v.1.5 documentation
- DenovolyzeR
- TrioDeNovo
- PolyMutt
- DeNovoGear
- FamSeq
- DNMFilter
- Scalpel
- mirTrios
- VarScan
- TrioCaller
- SeqHBase
- adapterremoval: rapid adapter trimming, identification, and read merging
- Alfred: BAM alignment statistics, feature counting and feature annotation
- bamkit: Tools for common BAM file manipulations
- bam-readcount: count DNA sequence reads in BAM files
- bamtools
- biobambam2: Tools for early stage alignment file processing
- mosdepth: fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing.
- VariantBam: Filtering and profiling of next-generation sequencing data using region-specific rules
- bwa: Burrows-Wheeler Aligner for short-read alignment (see minimap2 for long-read alignment)
- arriba: Fast and accurate gene fusion detection from RNA-Seq data
- FuSeq: A fast detection of fusion genes from paired-end RNA-seq data
- GeneFuse: Gene fusion detection and visualization
- fusioncatcher: Finder of Somatic Fusion Genes in RNA-seq data
- STAR-Fusion: STAR-Fusion codebase
- STAR-Fusion-Tutorial: Tutorial for STAR-Fusion, FusionInspector, and de novo reconstruction of fusion transcripts using Trinity
- RDP: RDP provides quality-controlled, aligned and annotated Bacterial and Archaeal 16S rRNA sequences
- SILVA: SILVA provides comprehensive, quality checked and regularly updated datasets of aligned small (16S/18S, SSU) and large subunit (23S/28S, LSU) ribosomal RNA (rRNA) sequences
- Greengenes: a quality-controlled, comprehensive 16S reference database and taxonomy based on a de novo phylogeny that provides standard operational taxonomic unit sets
- rrnDB: A searchable database documenting variation in ribosomal RNA operons (rrn) in Bacteria and Archaea
- EzTaxon-e: It contains comprehensive 16S rRNA gene sequences of taxa with valid names as well as sequences of uncultured taxa
- dada2: Accurate sample inference from amplicon data with single nucleotide resolution
- microbiomeGWAS
- metaSNV
- Metagenomics
- metag-rev-sup
- PanPhlAn
- MetaBAT
- Galaxy ToolShed
- multi-metagenome: Genome sequences of rare, uncultured bacteria
- srst2: Short Read Sequence Typing for Bacterial Pathogens
- ariba: Antimicrobial Resistance Identification By Assembly
- abricate: Mass screening of contigs for antimicrobial and virulence genes
- snippy: Rapid bacterial SNP calling and core genome alignments
- mag: Assembly and binning of metagenomes
- decontam: Simple statistical identification and removal of contaminants in marker-gene and metagenomics sequencing data
- MetaCarvel: A scaffolder for metagenomes
- MetaXcan
- EBAME18: Long read metagenomics practical at EBAME18
- bac-ngs-book: Notes on high-throughput sequencing data analysis of pathogenic microorganisms
- Zhang2019NBT
- Microbiota
- BISCUIT_SingleCell_IMM_ICML_2016: R Codebase for BISCUIT: Infinite Mixture Model to cluster and impute single cells.
- cisTopic: Probabilistic modelling of cis-regulatory topics from single cell epigenomics data
- CONICS: COpy-Number analysis In single-Cell RNA-Sequencing
- DoubletDetection: Doublet detection in single-cell RNA-seq data.
- dropSeqPipe: A SingleCell RNASeq pre-processing pipeline built on snakemake
- HoneyBADGER: HMM-integrated Bayesian approach for detecting CNV and LOH events from single-cell RNA-seq data
- ImmuneResistance: Single-cell RNA-seq of melanoma ecosystems reveals sources of T cell exclusion linked to immunotherapy clinical outcomes
- inferCNV: Inferring CNV from Single-Cell RNA-Seq
- SAVER: Single-cell RNA-seq Gene Expression Recovery
- scanpy: Single-Cell Analysis in Python. Scales to >1M cells. http://scanpy.rtfd.io
- scde: R package for analyzing single-cell RNA-seq data
- scImpute: Accurate and robust imputation of scRNA-seq data
- scell: Single-CELL rna-seq analysis software
- scg_lib_structs: Collections of library structure and sequence of popular single cell genomic methods
- single_cell_portal_core: Rails/Docker application for the Broad Institute single cell RNA-seq data portal
- single-cell-pseudotime: An overview of algorithms for estimating pseudotime in single-cell RNA-seq data
- single-cell-tutorial
- SingleR: Single-cell RNA-seq cell types Recognition
- scRNA-tools: Table of software for the analysis of single-cell RNA-seq data.
- seurat: R toolkit for single cell genomics
- snATAC: Ren Lab in-house dual-barcode single nucleus ATAC-seq (snATAC-seq) analysis pipeline
- STREAM_atac: Single-cell Trajectories Reconstruction, Exploration And Mapping of single-cell data. Preprocessing steps for single cell atac-seq data
- STREAM: Single-cell Trajectories Reconstruction, Exploration And Mapping of single-cell data
- tenx: Pipelines for the analysis of 10x single-cell RNA-sequencing data
- awesome-single-cell
- scPipe: a pipeline for single cell RNA-seq data analysis
- Linnarsson Lab Single-cell analysis of mouse cortex
- Human MTG single nucleus RNA-seq data
- scMerge: Statistical technique for removing unwanted variation from multiple scRNA-seq datasets
- scRNA-seq-workshop-Fall-2018
- SoupX: R package to quantify and remove cell free mRNAs from droplet based scRNA-seq data
- NanoDJ: A Dockerized Jupyter Notebook for Interactive Oxford Nanopore MinION Sequence Manipulation and Genome Assembly
- gnomad-sv-pipeline: Code and custom scripts relevant to gnomAD-SV (Collins*, Brand*, et al., 2019)
- sv-benchmark: Public Benchmark of Long-Read Structural Variant Caller on PacBio CCS HG002 Data
- Pomoxis: comprises a set of basic bioinformatic tools tailored to nanopore sequencing
- Nanoflow: a NANOpore sequencing data bioinformatics workFLOW
- Scrappie: a technology demonstrator for the Oxford Nanopore Research Algorithms group
- wub: Tools and software library developed by the ONT Applications group
- nano-snakemake: A snakemake pipeline for SV analysis from nanopore genome sequencing
- pipeline-pinfish-analysis: Pipeline for annotating genomes using long read transcriptomics data with pinfish
- hpv_minION_analysis: Contains scripts used to analyze HPV samples sequenced on ONT minIONs.
- tiptoft: Predict plasmids from uncorrected long read data
- nanoflow: De novo assembly of nanopore reads using nextflow
- wub: Tools and software library developed by the ONT Applications group
- monica: MinION Open Nucleotide Identifier for Continuous Analysis - an open source pathogen identifier for real-time analysis on MinION output
- pipeline-polya-ng: Pipeline for calling poly(A) tail lengths from nanopore direct RNA data using nanopolish
- pomoxis: Analysis components from Oxford Nanopore Research
- scrappie: Scrappie is a technology demonstrator for the Oxford Nanopore Research Algorithms group
- albacore: a professional quality suite of Rake tasks for building .NET or Mono based systems
- Basecalling-comparison: A comparison of different Oxford Nanopore basecallers
- fast5_fetcher: A tool for fetching nanopore fast5 files after filtering via demultiplexing, alignment, or other, to improve downstream processing efficiency
- SquiggleKit: A toolkit for manipulating nanopore signal data
- fast5seek: Subset of fast5 files contained in a fastq, BAM, or SAM file
- albacore: Dockerfile for the Albacore basecaller from Oxford Nanopore
- npBarcode: Demultiplex barcoded Oxford Nanopore sequencing
- npReader: Real-time extraction and analysis Oxford Nanopore sequencing data
- nanopore adapters
- NanoFilt: https://github.com/wdecoster/nanofilt
- Deepbinner: a signal-level demultiplexer for Oxford Nanopore reads
- Porechop: adapter trimmer for Oxford Nanopore reads
- poretools: a toolkit for working with Oxford nanopore data
- NanoPlot: Plotting scripts for long read sequencing data
- longread_plots: A collection of plots for long read sequencing FastQ files from devices like Oxford Nanopore's MinION and PromethION.
- Nanopolish
- nanoQC: Quality control tools for nanopore sequencing data
- NanoR: R package for user-friendly analysis and comparison of ONT data
- pomoxis: Analysis components from Oxford Nanopore Research
- poretools document
- poretools github: a toolkit for working with Oxford nanopore data
- qcat: a Python command-line tool for demultiplexing Oxford Nanopore reads from FASTQ files
- pycoQC: pycoQC computes metrics and generates Interactive QC plots from the sequencing summary report generated by Oxford Nanopore technologies basecaller (Albacore/Guppy)
- nanopack: Easily install all nanopack scripts together
- nanocomp: Comparison of multiple long read datasets
- nanolyse: Remove lambda phage reads from a fastq file
- nanomath: A few simple math functions for other Oxford Nanopore processing scripts
- NovoGraph: building whole genome graphs from long-read-based de novo assemblies
- wtdbg2: A fuzzy Bruijn graph approach to long noisy reads assembly
- smartdenovo: Ultra-fast de novo assembler using long noisy reads
- MECAT2
- quickmerge: A simple and fast metassembler and assembly gap filler designed for long molecule based assemblies.
- npGraph: Resolve assembly graph in real-time using nanopore data
- Canu
- shasta: De novo assembly from Oxford Nanopore reads
- RaGOO: A tool to order and orient genome assembly contigs via Minimap2 alignments to a reference genome
- helen: H.E.L.E.N. (Homopolymer Encoded Long-read Error-corrector for Nanopore)
- racon: Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads
- ntEdit: scalable genome assembly polishing
- nanopolish: Signal-level algorithms for MinION data
- Apollo
- Quiver
- Longshot: diploid SNV caller for error-prone reads
- NanoSatellite: Dynamic time warping of Oxford Nanopore squiggle data to characterize tandem repeats
- Clairvoyante: a multi-task convolutional deep neural network for variant calling in Single Molecule Sequencing
- deepsignal: Detecting methylation using signal-level features from Nanopore sequencing reads
- tombo: a suite of tools primarily for the identification of modified nucleotides from raw nanopore sequencing data
- DeepMod: a deep-learning tool for genomic-scale, strand-sensitive and single-nucleotide based detection of DNA modifications
- nanopore-methylation
- mCaller: A python program to call methylation (m6A in DNA) from nanopore signal data
- EpiNano: Detection of RNA modifications from Oxford Nanopore direct RNA sequencing reads
- graphmap: A highly sensitive and accurate mapper for long, error-prone reads
- rkmh: Classify sequencing reads using MinHash
- minialign: fast and accurate alignment tool for PacBio and Nanopore long reads
- NanoSim: Nanopore sequence read simulator
- DeepSimulator: The first deep learning based Nanopore simulator which can simulate the process of Nanopore sequencing.
- dRNA-paper-scripts: Highly parallel direct RNA sequencing on an array of nanopores (https://www.nature.com/articles/nmeth.4577)
- pinfish: Tools to annotate genomes using long read transcriptomics data
- flair: Full-Length Alternative Isoform analysis of RNA
- pinfish: Tools to annotate genomes using long read transcriptomics data
- pipeline-pinfish-analysis: Pipeline for annotating genomes using long read transcriptomics data with pinfish
- pychopper: A tool to identify full length cDNA reads
- LoReAn: Long Reads Annotation pipeline
- poreplex: A versatile sequenced read processor for nanopore direct RNA sequencing
- Mandalorion: Analysis Pipeline to analyze Nanopore RNAseq data
- pipeline-polya-ng: Pipeline for calling poly(A) tail lengths from nanopore direct RNA data using nanopolish
- PacBioEDA: Python scripts for Exploratory Data Analysis of Pacific Biosciences sequence data
- GenomicConsensus: PacBio® variant and consensus caller
- pbalign: pbalign maps PacBio reads to reference sequences and saves alignments to a BAM file
- pbmm2: A minimap2 frontend for PacBio native data formats
- w2rap-contigger: An Illumina PE genome contig assembler, can handle large (17Gbp) complex (hexaploid) genomes.
- w2rap: WGS (Wheat) Robust Assembly Pipeline
- GFA-spec: Graphical Fragment Assembly (GFA) Format Specification
- HapCUT2: software tools for haplotype assembly from sequence data
- masurca: MaSuRCA Genome Assembler Quick Start Guide
- minia: Minia is a short-read assembler based on a de Bruijn graph
- npScarf: Scaffold and Complete assemblies in real-time fashion
- redundans: Redundans is a pipeline that assists an assembly of heterozygous/polymorphic genomes.
- Scaff10X: Pipeline for scaffolding and breaking a genome assembly using 10x genomics linked-reads
- SDA: Segmental Duplication Assembler (SDA)
- shovill: Faster SPAdes assembly of Illumina reads
- SOAPdenovo2
- FALCON-Phase: FALCON-Phase integrates PacBio long-read assemblies with Phase Genomics Hi-C data to create phased, diploid, chromosome-scale scaffolds
- wtdbg2: A fuzzy Bruijn graph approach to long noisy reads assembly
- DBG2OLC: The genome assembler that reduces the computational time of human genome assembly from 400,000 CPU hours to 2,000 CPU hours, utilizing long erroneous 3GS sequencing reads and short accurate NGS sequencing reads.
- Flye: Fast and accurate de novo assembler for single molecule sequencing reads
- PBcR (http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR)
- SALSA: A tool to scaffold long read assemblies with Hi-C data
- smartdenovo: Ultra-fast de novo assembler using long noisy reads
- NovoGraph: Genome Graph of Long-read De Novo Assemblies
- quickmerge: A simple and fast metassembler and assembly gap filler designed for long molecule based assemblies.
- PBJelly: Gap-closing-with-PBJelly
- GapCloser
- Corset: Software for clustering de novo assembled transcripts and counting overlapping reads
- ascatNgs: Somatic copy number analysis using paired-end whole-genome sequencing
- needlestack: Multi-sample somatic variant caller
- seurat: Tumor-Normal Variant Caller
- facets: Algorithm to implement Fraction and Copy number Estimate from Tumor/normal Sequencing.
- Shimmer: a software package for the characterization of genetic differences between two very similar samples, e.g., a tumor sample and its matched normal tissue sample
- neusomatic: Deep convolutional neural networks for accurate somatic mutation detection
- Pisces: Somatic and germline variant caller for amplicon data.
- deTiN: DeTiN is designed to measure tumor-in-normal contamination and improve somatic variant detection sensitivity when using a contaminated matched control.
- DeepSVR: a machine learning model approach to somatic variant refinement
- somaticseq: An ensemble approach to accurately detect somatic mutations using SomaticSeq
- MuSiC2: identifying mutational significance in cancer genomes
- benchmarking germline small-variant calls: Repository for the GA4GH Benchmarking Team work developing standardized benchmarking methods for germline small variant calls
- vt: A tool set for short variant discovery in genetic sequence data
- dna-seq-gatk-variant-calling: This Snakemake pipeline implements the GATK best-practices workflow
- deepvariant: an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data
- speedseq: A flexible framework for rapid genome analysis and interpretation
- vg: tools for working with genome variation graphs
- GEMINI: integrative exploration of genetic variation and genome annotations
- Complete assembly of parental haplotypes with trio binning
- Haplotype Reference Consortium (HRC)
- HipSTR: Genotype and phase short tandem repeats using Illumina whole-genome sequencing data
- Eagle: https://www.nature.com/articles/ng.3679
- SHAPEIT
- UK Biobank phasing and imputation documentation (including brief description of SHAPEIT3)
- Beagle
- PopLDdecay: a fast and effective tool for linkage disequilibrium decay analysis based on variant call format(VCF) files
- emeraLD: tools to efficiently retrieve and calculate LD
- ngsLD: Calculation of pairwise Linkage Disequilibrium (LD) under a probabilistic framework
- LD Hub
- LDSC and associated files
- abstar: VDJ assignment and antibody sequence annotation. Scalable from a single sequence to billions of sequences.
- svim: Structural Variant Identification Method using Long Reads
- SURVIVOR: Toolset for SV simulation, comparison and filtering
- Sniffles: Structural variation caller using third generation sequencing
- NanoSV: SV caller for nanopore data
- smrtsv2: long read structural variant caller
- pbsv: PacBio structural variant (SV) calling and analysis tools
- Picky: Structural Variants Pipeline for Long Reads
- NanoVar: Structural variant caller using low-depth Nanopore sequencing
- SV-plaudit: Pipeline for structural variant image curation and analysis.
- SVJedi: SV genotyping with long reads
- cuteSV: Long read based human genomic structural variation detection
- EnsembleSV: A workflow for SV inference allowing for multiple sequencing technologies and methods
- svtyper: Bayesian genotyper for structural variants
- lumpy-sv: a general probabilistic framework for structural variant discovery
- parliament2: Runs a combination of tools to generate structural variant calls on whole-genome sequencing data
- delly: Structural variant discovery by integrated paired-end and split-read analysis
- manta: Structural variant and indel caller for mapped sequencing data
- SV2: Support Vector Structural Variation Genotyper
- pindel: identify the breakpoints of structural variants from paired-end short reads
- MetaSV: An accurate and integrative structural-variant caller for next generation sequencing
- svaba: Structural variation and indel detection by local assembly
- wham: Structural variant detection and association testing
- gridss: Genomic Rearrangement IDentification Software Suite
- breakdancer: SV detection from paired end reads mapping
- SVenX: Pipeline for SV detection using 10X genomics data
- paragraph: Graph realignment tools for structural variants
- svtools: Tools for processing and analyzing structural variants
- SnowmanSV: Structural variation and indel detection using rolling local string graph assembly
- truvari: Structural variant comparison tool for VCFs
- Control-FREEC: a tool for assessing copy number and allelic content using next generation sequencing data
- canvas: Canvas Copy Number Variant Caller
- CNVnator: a tool for CNV discovery and genotyping from depth-of-coverage by mapped reads
- cnv_facets: Somatic copy number variant (CNV) caller for next generation sequencing
- CNV-Visualizer: Visualizing Copy Number Variations
- facets: Algorithm to implement Fraction and Copy number Estimate from Tumor/normal Sequencing.
- cnvkit: Copy number variant detection from targeted DNA sequencing
- ADTEx: detect somatic copy number variations (CNVs)
- NGSEPcore: an integrated framework for analysis of high throughput sequencing (HTS) reads. The main functionality of NGSEP is the variants detector, which performs integrated discovery and genotyping of single nucleotide variants (SNVs), insertions, deletions, and genomic regions with copy number variation (CNVs)
- aCNViewer: Comprehensive genome-wide visualization of absolute copy number and copy neutral variations
- cancerTitanCNA: Analysis of subclonal copy number alterations (CNA) and loss of heterozygosity (LOH)
- TCAG-WGS-CNV-workflow: Scripts involved in our workflow for detecting CNVs from WGS data using read depth-based methods
- gatk4-somatic-cnvs
- Call somatic copy number variants using GATK4 CNV
- svaba: Structural variation and indel detection by local assembly
- truvari: Structural variant comparison tool for VCFs
- smoove: structural variant calling and genotyping with existing tools, but, smoothly
- sv-pipeline: Pipeline for structural variation detection in cohorts
- svtools: Tools for processing and analyzing structural variants
- samplot: Plot structural variant signals from many BAMs and CRAMs
- svviz2: visual evaluation of read support for structural variation
- FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods
- seeksv: A bioinformatics tool for SV detection and virus integration discovery
- SViper: Swipe your Structural Variants called on long (ONT/PacBio) reads with short exact (Illumina) reads.
- AnnotSV: Annotation and Ranking of Human Structural Variations
- Nirvana: Nirvana provides clinical-grade annotation of genomic variants (SNVs, MNVs, insertions, deletions, indels, and SVs, including CNVs)
- StructuralVariantAnnotation: R package designed to simplify structural variant analysis
- RefSeq GRCh38
- Deciphering Developmental Disorders (DDD) Study
- OMIM (Online Mendelian Inheritance in Man): https://omim.org/downloads/
- gnomAD
- GeneHancer
- Exome Aggregation Consortium (ExAC)
- ENCODE: Encyclopedia of DNA Elements
- dbVar
- dbvar download
- American College of Medical Genetics and Genomics (ACMG)
- VarCards: an integrated genetic and clinical database for coding variants in the human genome
- SNPTEST: a program for the analysis of single SNP association in genome-wide studies.
- GTOOL: a program for transforming sets of genotype data for use with the programs SNPTEST and IMPUTE.
- fcGENE: A Versatile Tool for Processing and Transforming SNP Datasets.
- Fast GWAS download script
- h3agwas: GWAS Pipeline for H3Africa
- GWAS summary statistics
- QTLseqr: QTLseqr is an R package for QTL mapping using NGS Bulk Segregant Analysis
- ATAC-seq
- atac_dnase_pipelines: ATAC-seq and DNase-seq processing pipeline
- ATAC-seq Guidelines
- ATAC-seq pipeline: ATAC-seq Data Standards and Prototype Processing Pipeline
- ChIPseeker: ChIP peak Annotation, Comparison and Visualization
- metaseq: Framework for integrated analysis and plotting of ChIP/RIP/RNA/-seq data
- pyflow-ChIPseq: a snakemake pipeline to process ChIP-seq files from GEO or in-house
- ChIP-seq-analysis: ChIP-seq analysis notes from Tommy Tang
- chipseq_pipeline: AQUAS TF and histone ChIP-seq pipeline
- chip-seq-pipeline2
- Methylation QTL data for brain and blood
- methylpy: WGBS/NOMe-seq Data Processing & Differential Methylation Analysis
- ViewBS: a powerful toolkit for visualization of high-throughput bisulfite sequencing data
- mCaller: A python program to call methylation (m6A in DNA) from nanopore signal data
- DNA-methylation-analysis: notes on DNA methylation analysis (arrays and sequencing data)
- bs3: BS-Seeker3: An Ultra-fast, Versatile Pipeline for Mapping Bisulfite-treated Reads
- bsseq: Devel repository for bsseq
- Hi-C data
- tadtool: an interactive tool for the identification of meaningful parameters in TAD-calling algorithms for Hi-C data.
- juicebox_scripts: A collection of scripts for working with Hi-C data, Juicebox, and other genomic file formats
- ALLHiC: phasing and scaffolding polyploid genomes based on Hi-C data
- genomedisco: Software for comparing contact maps from HiC, CaptureC and other 3D genome data
- 3DChromatin_ReplicateQC: Software to compute reproducibility and quality scores for Hi-C data
- hic_breakfinder
- TitanCNA_10X_snakemake: Snakemake workflow for 10X Genomics WGS analysis using TitanCNA
- SVenX: Pipeline for SV detection using 10X genomics data
- 10x Genomics
- awesome-10x-genomics: List of tools and resources related to the 10x Genomics GEMCode/Chromium system
- bxtools: Tools for analyzing 10X Genomics data
- ngsDist: Estimation of pairwise distances under a probabilistic framework
- NGS-pipe: next-generation sequencing pipelines for precision oncology
- ngsPopGen: Population genetics analyses from NGS data
- ngsTools: Programs to analyse NGS data for population genetics purposes
- viral-ngs: Viral genomics analysis pipelines
- NGSCheckMate: Software program for checking sample matching for NGS data
- abtools: Analysis of antibody NGS data
- alignment-and-variant-calling-tutorial: basic walk-throughs for alignment and variant calling from NGS sequencing data
- ClustersPloter: visualize clusters of genome features
- Plot chromosome ideograms along with other genomic data
- Laying out multiple plots on a page
- plot2DO: A tool to assess the quality and distribution of genomic data
- CMplot
- ggcyto: Visualize Cytometry data with ggplot2
- ggsashimi: Command-line tool for the visualization of splicing events across multiple samples
- chromPlot: visualization of genomic data in chromosomal context
- karyoploteR
- DataVisualization: Data Visualization in Bioinformatics
- LocusCompare: https://github.com/boxiangliu/locuscompare
- LocusCompareR
- zUMIs: A fast and flexible pipeline to process RNA sequencing data with UMIs
- umis: Tools for processing UMI RNA-tag data
- tcR: Advanced Data Analysis of Immune Receptor Repertoires
- NovoAlign NGS Quick Start Tutorial
- In-depth-NGS-Data-Analysis-Course: Materials for 12-day course on analyzing RNA-Seq, ChIP-Seq and variant calling data https://hbctraining.github.io/In-depth-NGS-Data-Analysis-Course/
- LncPipe: A Nextflow-based pipeline for comprehensive analyses of long non-coding RNAs from RNA-seq datasets
- NGI-ChIPseq: Nextflow ChIP-seq data analysis pipeline, National Genomics Infrastructure, Science for Life Laboratory in Stockholm
- nf-GATK_Exome_Preprocess: Adapted from the GATK best practice guide to preprocess whole exome sequencing (WES) data
- pyflow-ATACseq: ATAC-seq snakemake pipeline
- pyflow-ChIPseq: a snakemake pipeline to process ChIP-seq files from GEO or in-house
- single-cell-rna-seq: A single cell RNA-seq workflow following http://dx.doi.org/10.12688/f1000research.9501.2
- pyflow-RNAseq: RNAseq pipeline based on snakemake
- atac-seq-pipeline: ENCODE ATAC-seq pipeline
- atac_dnase_pipelines: ATAC-seq and DNase-seq processing pipeline
- m6A-seq_analysis_workflow: A pipeline to process m6A-seq data and down stream analysis
- Harvard FAS Tutorials: Harvard FAS Tutorials and Training
- ENCODE ATAC-seq: ATAC-seq pipeline
- getting-started-with-genomics-tools-and-resources: Unix, R and python tools for genomics
- DNA-seq-analysis: notes on whole exome and whole genome sequencing analysis
- RNA-seq-analysis: RNAseq analysis notes from Tommy Tang
- scRNAseq-analysis-notes: scRNAseq analysis notes
- Zika-RNAseq-Pipeline: An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study
- superFreq: Analysis pipeline for cancer exomes
- Longitudinal_myoMyo_transcriptome
- TitanCNA: Analysis of subclonal copy number alterations (CNA) and loss of heterozygosity (LOH) in cancer
- TitanCNA/scripts/snakemake: Snakemake workflow for TITAN
- sequana: a set of Snakemake NGS pipelines
- Segmental duplications (hg38)
- self-chains (hg38)
- Pfam domains (hg38)
- hg38 gtf
- gnomAD
- CADD files for both indels and SNPs (hg38)
- MPC
- whole-exome MTR file
- REVEL file
- pLI file
- ClinVar VCF file
- de novo variants on s3 server
- Short Tandem Repeat DNA
- OMIM
- LDlink
- SIFT
- Polyphen-2
- MutationTaster
- ESP6500
- ExAC
- HGMD
- Tohoku Medical Megabank Genome Reference Panel
- iJGVD
- UK Biobank
- Database of Genotypes and Phenotypes (dbGaP): https://www.ncbi.nlm.nih.gov/gap
- FUMA software
- MAGMA software
- mvGWAMA and effective sample size calculation
- LD Score Regression software
- LD Hub (GWAS summary statistics)
- LD scores
- Psychiatric Genomics Consortium (GWAS summary statistics)
- MSigDB curated gene-set database
- msigdb/collections.jsp; NHGRI GWAS catalog
- GSMR software
- credible SNP set analysis software
- IGSR: The International Genome Sample Resource: Providing ongoing support for the 1000 Genomes Project data
- 1000genome vcf
- decipher
- DGVa: Database of Genomic Variants archive
- ClinVar
- COSMIC
- Database of Genomic Variants
- ENCODE
- 3D Genome Browser
- HPO: The Human Phenotype Ontology
- CHPO
- ICGC: International Cancer Genome Consortium
- GWAS Catalog: The NHGRI-EBI Catalog of published genome-wide association studies
- RefSeq GRCh38
- Deciphering Developmental Disorders (DDD) Study
- OMIM (Online Mendelian Inheritance in Man): https://omim.org/downloads/
- gnomAD
- dbVar: Non-Redundant Structural Variation Datasets
- GeneHancer
- Exome Aggregation Consortium (ExAC)
- dbVar
- American College of Medical Genetics and Genomics (ACMG)
- VarCards: an integrated genetic and clinical database for coding variants in the human genome
- 1000 Genomes Consortium GRCh38
- Replication Domain genome browser and analysis tool
- European Nucleotide Archive
- Sequence Read Archive
- ANNOVAR
- CLINVITAE
- dbNSFP
- dbSNP
- Ensembl (http://www.ensembl.org/)
- GERP++
- GWASdb
- HGMD
- InterVar
- MedGen
- NHLBI Exome Sequencing Project (ESP) Exome Variant Server
- OMIM
- OrphaNet
- PolyPhen-2
- RefSeq
- RepeatMasker
- SIFT
- UCSC Genome Browser
- wInterVar
- GenomeNet: integrated database
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- KOBAS
- InterPro
- ENCODE: Encyclopedia of DNA Elements
- HGNC: responsible for approving unique symbols and names for human loci, including protein coding genes, ncRNA genes and pseudogenes, to allow unambiguous scientific communication.
- InterVar: A bioinformatics software tool for clinical interpretation of genetic variants by the 2015 ACMG-AMP guideline
- Ira Hall lab
- IPD-IMGT/HLA: https://github.com/ANHIG/IMGTHLA
- iPSYCH download site
- iPSYCH project
- LISA cluster at SURFsara
- Lisa Genetic Cluster Computer
- /giab/ftp/data/NA12878
- GTEx eQTL data
- GTEx portal
- International Paediatric and Congenital Cardiac Codes
- InterProScan
- genome-in-a-bottle: A public-private-academic consortium hosted by NIST to develop reference materials and standards for clinical sequencing
- PharmVar: Pharmacogene Variation Consortium
- dbNSFP: variant annotation tools
- GeneMatcher
- variant annotation tool: Annotating variants using multiple annotation databases
- Mouse Genome Informatics (MGI)
- cyvcf2: fast VCF and BCF processing
- CyVCF document
- CyVCF: A fast Python library for VCF files leveraging Cython for speed.
- rtg-tools: Utilities for accurate VCF comparison and manipulation
- spVCF: Sparse Project VCF: evolution of VCF to encode population genotype matrices efficiently
- vcf2phylip: Convert SNPs in VCF format to PHYLIP, NEXUS, binary NEXUS, or FASTA alignments for phylogenetic analysis
- vcflib: a simple C++ library for parsing and manipulating VCF files, + many command-line utilities
- GTShark: Genotype compression in large projects
- The Cancer Genome Atlas (TCGA)
- International Cancer Genome Consortium (ICGC)
- Catalogue of Somatic Mutations in Cancer (COSMIC)
- cBioPortal: provides visualization, analysis and download of large-scale cancer genomics data sets
- Pan-Cancer Atlas
- DoCM: database of curated mutations
- CKB: JAX-Clinical Knowledgebase
- OncoKB: OncoKB Cancer Genes
- PMKB: Precision Medicine Knowledgebase
- GeneCards: human gene database
- MalaCards: The human disease database
- Cancer Cell Line Encyclopedia
- MiOncoCirc: A compendium of circular RNAs compiled from cancer clinical samples at The University of Michigan
- Trapnell Lab: aim to identify genes that control cellular transitions, primarily using single-cell genomics
- H3ABioNet
- Pacific Biosciences
- Hartwig Medical Foundation
- GenomicParisCentre
- BC Cancer Canada's Michael Smith Genome Sciences Centre
- Global Alliance for Genomics and Health
- SciHub desktop
- sci-hub: http://sci-hub.tw/, http://sci-hub.hk/, http://www.sci-hub.cn/
- libgen
- search code
- omictools
- anaconda cloud
- Gene Regulation Info: Protein-DNA binding: data, tools & models
- STHDA: Statistical tools for high-throughput data analysis
- awesome-cancer-variant-databases: A community-maintained repository of cancer clinical knowledge bases and databases focused on cancer variants.
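- libpku: An unofficial, community-compiled collection of course materials (https://lib-pku.github.io/)
- REKCARC-TSC-UHT: Course guide for the Tsinghua University Department of Computer Science
- USTC-Course: University of Science and Technology of China course resources (https://ustc-resource.github.io/USTC-Course/)
- linux-command: Search tool for a comprehensive collection of Linux commands, covering command manuals, detailed explanations, learning material, and collected references. https://git.io/linux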
- Algorithm_Interview_Notes-Chinese
- awesome-programming-books: A curated list of awesome programming books
- TheAlgorithms/Python: All Algorithms implemented in Python
- TheAlgorithms/C-Plus-Plus: All Algorithms implemented in C++
- git-tips
- PySnooper: Never use print for debugging again (a poor man's debugger)
- pandas-cookbook: Recipes for using Python pandas library
- intervaltree: Editable interval tree data structure for Python 2 and 3
- tensorflow_cookbook: Code for Tensorflow Machine Learning Cookbook
- pumpkin-book: Derivations and explanations of the formulas in the book "Machine Learning" (the "Watermelon Book"). Read online: https://datawhalechina.github.io/pumpkin-book
- deeplearningbook-chinese
- nndl.github.io: Neural Network and Deep Learning https://nndl.github.io
- DeepLearning_Summary: A list of awesome Deep Learning tutorials, projects and communities. Forked from other github repositories.
- Paddle: PArallel Distributed Deep LEarning
- parallel_ml_tutorial: Tutorial on scikit-learn and IPython for parallel machine learning
- DragoNN
- Kaggle machine learning competitions
- Keras
- Keras model zoos
- PyTorch
- PyTorch model zoos
- Tensorflow model zoos
- tensorflow
- STNeuroNet: Software for the paper "Fast and robust active neuron segmentation in two-photon calcium imaging using spatio-temporal deep learning," Proceedings of the National Academy of Sciences (PNAS), 2019
- deep-q-learning: Minimal Deep Q Learning (DQN & DDQN) implementations in Keras
- DeepSVR: a machine learning model approach to somatic variant refinement
- DeepLearning-500-questions: 500 Questions on Deep Learning
- deepvariant: an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data
- deTiN: DeTiN is designed to measure tumor-in-normal contamination and improve somatic variant detection sensitivity when using a contaminated matched control.
- homemade-machine-learning: Python examples of popular machine learning algorithms with interactive Jupyter demos and math being explained
- practicalAI: A practical approach to learning machine learning
- vowpal_wabbit: a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
- janggu: Deep learning infrastructure for bioinformatics
- learn-python: Playground and cheatsheet for learning Python. Collection of Python scripts that are split by topics and contain code examples with explanations.
- TensorFlow-Course
- awesome-nlp: A curated list of resources dedicated to Natural Language Processing (NLP)
- Coursera-ML-AndrewNg-Notes: Personal notes on Andrew Ng's machine learning course
- flair: A very simple framework for state-of-the-art Natural Language Processing (NLP)