-
Notifications
You must be signed in to change notification settings - Fork 0
/
index.qmd
84 lines (66 loc) · 6.28 KB
/
index.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
---
title: "Metagenomics"
subtitle: "Environmental Applications"
# title-block-banner: true
bioschemas:
"@context": "https://schema.org"
"@type": "LearningResource"
"@id": "https://cloud-span.github.io/metagenomics00-overview/"
"http://purl.org/dc/terms/conformsTo":
- "@type": CreativeWork
"@id": "https://bioschemas.org/profiles/TrainingMaterial/1.0-RELEASE"
about:
- "@id": "http://edamontology.org/topic_3174"
- "@id": "http://edamontology.org/topic_3372"
- "@id": "http://edamontology.org/topic_0769"
- "@id": "http://edamontology.org/topic_3365"
abstract: "This course teaches data analysis for metagenomics projects. It covers how to (1) generate and QC a metagenome assembly, (2)' bin’ the assembly into metagenome assembled genomes (MAGs) (also known as bins), (2)identify the taxonomy of these MAGs, and (4) calculate diversity metrics and add functional annotation to identify the products of genes identified in the assembled MAGs."
author: ["Emma Rand", "Sarah Forrester", "Annabel Cansdale", "Jorge Buenabad-Chavez", "Evelyn Greeves"]
contributor: ["James Chong", "Emma Barnes", "University of York", "Software Sustainability Institute"]
educationalLevel: "Intermediate"
identifier: "10.5281/zenodo.7505859"
name: "Cloud-SPAN Genomics Course"
url: "https://cloud-span.github.io/metagenomics00-overview"
inLanguage: "en"
keywords: "metagenomics, assembly, polishing, taxonomic annotation, binning, functional annotation, shell, command line tools, cloud computing, AWS, HPC, data analysis"
license: CC-BY 4.0
timeRequired: "PT12H"
mentions: ["Git for Windows", "Cloud-SPAN Genomics course", "Data Carpentries Genomics workshop"]
---
This hands-on, online course teaches data analysis for metagenomics projects. It is aimed at those with little or no experience of using high performance computing (HPC) for data analysis. In the course we will cover:
- navigating file directories and using the command line
- logging into a remote cloud instance
- using common commands and running analysis programs in the command line
- what is metagenomics?
- following a metagenomics analysis workflow including:
- performing quality control on reads
- assembly of reads into a metagenome
- improving your assembly with polishing
- binning into species/metagenome-assembled genomes (MAGs)
- taxonomic assignment and functional annotation using your binned reads
The course is taught as a mixture of live coding, online lectures, self-study and drop-in sessions.
## Prerequisites
::: callout-important
## Biological concepts and software setup
This course assumes no prior experience with the tools covered in the workshop but learners are expected to have some familiarity with biological concepts, including the concept of genomes and microbiomes. Participants should bring their own laptops and plan to participate actively.
To get started, follow the directions in the "Precourse Instructions" tab to get access to the required software and data for this workshop. **Windows users** need to install Git Bash in their laptop. **Mac users** may need to configure the **terminal** program in their laptop to use the Bash shell.
:::
## Data
::: callout-note
## About the data used in the course
This course uses data from a 2022 paper published in BMC Environmental Microbiome titled [In-depth characterization of denitrifier communities across different soil ecosystems in the tundra](https://environmentalmicrobiome.biomedcentral.com/articles/10.1186/s40793-022-00424-2). In this course we will compare data from two of the sites studied.
You can read more about the data used in the course [here](docs/miscellanea/extras/data.qmd).
:::
## Course format
This workshop is designed to be run on pre-imaged Amazon Web Services (AWS) instances. All the software and data used in the workshop are hosted on an Amazon Machine Image (AMI). We will give you details as to how to access these instances after registration.
The course will take place over three weeks and will combine live coding and teaching with offline work. We will guide you through the analysis each session but some steps may need to completed offline due to the amount of time they take to complete. There will also be drop-in sessions to offer support and troubleshooting help, and a Slack workspace for questions.
## Course Overview
| Lesson | Overview |
|------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------|
| [Files and Directories](docs/lesson01-files-and-directories/index.qmd) | Learn about files and directories, log onto a cloud instance and get a basic introduction to the shell. |
| [Using the Command Line](docs/lesson02-using-the-command-line/index.qmd) | Learn more about using the shell to navigate directories, manipulate and search files, and combine existing commands to do new things. |
| [QC & Assembly](docs/lesson03-qc-assembly/index.qmd) | How to quality control and assemble a genome. |
| [Polishing](docs/lesson04-polishing/index.qmd) | How to use short reads to polish your metagenome assembly |
| [Binning & Functional Annotation](docs/lesson05-binning-functional-annotation/index.qmd) | How to separate an assembly into MAGs and add functional annotation to these MAGs. |
| [Taxonomic Annotations](docs/lesson06-taxonomic-annotations/index.qmd) | How to add taxonomic annotations onto contigs in an assembly. |
<!-- | [Automating Analyses with Bash Scripts](docs/lesson07-automation-bash-scripts/index.qmd) | How to automate the analyses tasks in the previous lessons using Bash scripts. | -->