This site contains course materials for SISG Module 17: WGS Data Analysis, June 12-14, 2024.

  • Instructors: Laura Raffield and Matthew Conomos

Course Description

This module will provide an introduction to analyzing genotype data generated from whole genome sequencing (WGS). It will focus on extensions of standard GWAS analyses (e.g. rare-variant association tests) and “post-GWAS” follow-up analyses (e.g. conditional analysis, fine-mapping), and how WGS may improve results or be best utilized for these analyses; methods that incorporate variant annotation information will be highlighted.

Methods and examples will be informed by the instructors’ experience in large human genetics consortia (e.g. TOPMed), and, therefore, will focus on analyzing human data, but may be applicable/extendable to other organisms. A basic introduction to cloud computing will be provided, and students will perform hands-on exercises on a genomic analysis cloud platform.

Learning Objectives

After attending this module, participants will be able to:

  1. Understand how to perform association analyses for rare variants measured in WGS data using aggregate tests
  2. Access variant annotation resources and understand how to incorporate annotation information into analyses to improve power and inform results
  3. Understand the theory of, and how and when to perform, various “post-GWAS” follow-up analyses
  4. Leverage multi-ancestry WGS data
  5. Appreciate the utility of existing genomic analysis cloud platforms and get hands-on experience with cloud computing on one of these platforms

Course Format

Lectures

Course material will be presented through lectures. Slides for lectures are linked in the schedule below.

Tutorials

Many of the lectures will be followed by hands-on tutorials/exercises. Students are encouraged to work through the tutorials together. Afterwards, the instructors will walk through the tutorials and lead a discussion.

To run the tutorials, log into NHLBI BioData Catalyst powered by Seven Bridges with your username and password -- we will use this platform for live demonstrations during the course.

  • You will retain access to the Seven Bridges platform, including your SISG Project with all of the course materials even after the course ends. The SISG24 Workshop billing group will remain available to you for a short period of time, after which you will need to set up another payment method to run analyses. You can request pilot cloud credits ($500 worth) from BioData Catalyst. Additionally, there is guidance available for writing BioData Catalyst cloud costs into your grant proposal budget.

All of the R code and data can also be downloaded from the GitHub repository from which this site is built, and then run on your local machine. Download the complete workshop data and tutorials: https://github.com/UW-GAC/SISG_2024/archive/main.zip
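For those working locally, the download step can also be done from within R. The following is a minimal sketch, not part of the official course materials: the helper name `download_materials` and the variable `materials_url` are hypothetical, and the sketch assumes the archive unpacks into a top-level folder named `SISG_2024-main` (GitHub's usual repository-branch naming for branch archives).

```r
# URL of the workshop archive, as given above
materials_url <- "https://github.com/UW-GAC/SISG_2024/archive/main.zip"

# Hypothetical helper: download the archive to a temporary file,
# unpack it into dest_dir, and return the path of the unpacked folder.
download_materials <- function(dest_dir = ".") {
  zip_path <- file.path(tempdir(), "SISG_2024-main.zip")
  # mode = "wb" keeps the zip file intact on Windows
  download.file(materials_url, destfile = zip_path, mode = "wb")
  unzip(zip_path, exdir = dest_dir)
  # Assumed folder name: <repository>-<branch>
  file.path(dest_dir, "SISG_2024-main")
}
```

Calling `course_dir <- download_materials()` and then `list.files(course_dir)` should list the tutorials and data, provided the machine has internet access.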

Schedule

NOTE: All times are Eastern Daylight Time (GMT-04:00)

Wednesday, June 12th

Time Topic Lecture Tutorials/Exercises
1:30pm-1:35pm Introduction Slides
1:35pm-3:00pm Intro to Cloud Computing for WGS Data Analysis; Intro to GDS Tutorial Slides .Rmd | .html
3:00pm-3:30pm Coffee Break
3:30pm-5:00pm GWAS Slides .Rmd | .html
Extra Population Structure and Relatedness Tutorial .Rmd | .html

Thursday, June 13th

Time Topic Lecture Tutorials/Exercises
8:30am-10:00am GWAS: Advanced Model Extensions Slides .Rmd | .html
Extra GENESIS Model Explorer Tutorial .Rmd | .html
10:00am-10:30am Coffee Break
10:30am-12:00pm Leveraging Multi-Ancestry Data: Lecture Slides
12:00pm-1:30pm Lunch Break
1:30pm-3:00pm Leveraging Multi-Ancestry Data: LD Exercise; Locus Zoom and Conditional Analysis Tutorials .docx | NEJM 2020 | Nature 2021 | KEY; .Rmd | .html
3:00pm-3:30pm Coffee Break
3:30pm-5:00pm Variant Annotation: Part 1; Annotation Explorer Tutorial Slides .Rmd | .html
5:00pm-6:00pm Tutorial Session

Friday, June 14th

Time Topic Lecture Tutorials/Exercises
8:30am-10:00am Variant Annotation: Part 2; UCSC Genome Browser and FAVOR Tutorial Slides .docx | chr16 SNPS | KEY
10:00am-10:30am Coffee Break
10:30am-12:00pm Multi-Variant Association Tests Slides .Rmd | .html
12:00pm-1:30pm Lunch Break
1:30pm-3:00pm STAAR Slides .Rmd | .html
3:00pm-3:30pm Coffee Break
3:30pm-5:00pm Recent Findings and Resources for WGS Analysis Slides

R packages used

Resources

A detailed tutorial and relevant R scripts for STAAR pipeline are available at https://github.com/xihaoli/STAARpipeline-Tutorial.

If you are new to R, you might find the following material helpful: