Building Genome and Transcriptome Indexes for Ostreococcus tauri with HISAT2

Introduction

Ostreococcus tauri is a unicellular green alga with a compact genome, so genome and transcriptome indexes for it build quickly. This README outlines the steps to build both indexes with HISAT2 and to test them with a small set of RNA-seq reads.

Prerequisites

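Files: Ostreococcus tauri genome file (.fna) and gene transfer format annotation file (.gtf); both are downloaded in the steps below
Software: HISAT2 (provides hisat2-build, hisat2, hisat2_extract_splice_sites.py and hisat2_extract_exons.py), wget, gunzip, and the SRA Toolkit (provides fastq-dump, used in the testing step)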
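
Before starting, it can help to confirm that the command-line tools used in this README are on the PATH. The sketch below is one possible way to do that. The helper name missing_tools is made up for illustration, and the tool list is simply collected from the commands in this README.

```shell
# Hypothetical helper (not part of HISAT2): print every named tool
# that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools that appear in the commands of this README.
missing=$(missing_tools wget gunzip hisat2-build hisat2 \
    hisat2_extract_splice_sites.py hisat2_extract_exons.py fastq-dump)

if [ -n "$missing" ]; then
    printf 'Not found on PATH:\n%s\n' "$missing"
else
    echo "All required tools were found."
fi
```

If a tool is reported as missing, install it or add its folder to the PATH before continuing.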

Building the Genome Index

Run the following command to download the gzip-compressed genomic FASTA file from NCBI.

wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/214/015/GCF_000214015.3_version_140606/GCF_000214015.3_version_140606_genomic.fna.gz

Run the following command to decompress the genomic FASTA file.

gunzip GCF_000214015.3_version_140606_genomic.fna.gz
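
A quick sanity check after decompressing is to count the sequence records in the FASTA file. The sketch below assumes only that each record starts with a ">" header line. The helper name count_fasta_records and the toy file are made up for illustration.

```shell
# Hypothetical helper: count FASTA records, i.e. lines starting with ">".
count_fasta_records() {
    grep -c '^>' "$1"
}

# Toy two-record FASTA file, made up for this demonstration only.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
printf '>seq1\nACGT\n>seq2\nGGCC\n' > "$demo_dir/toy.fna"

count_fasta_records "$demo_dir/toy.fna"   # prints 2

# On the real data, after the gunzip step:
#   count_fasta_records GCF_000214015.3_version_140606_genomic.fna
#   grep '^>' GCF_000214015.3_version_140606_genomic.fna | head
```

The second commented command lists the first few header lines, which shows the sequence names that the index will use.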

Run the following command to create a genome index folder.

mkdir GenomeIndex

Run the following command to build the genome index with 4 threads. Afterwards, the GenomeIndex folder should contain the index files, named genome.1.ht2 through genome.8.ht2.

hisat2-build GCF_000214015.3_version_140606_genomic.fna GenomeIndex/genome -p 4
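
Instead of inspecting the folder by eye, the presence of the index files can be checked with a few lines of shell. This is a sketch that assumes the small-index naming used by HISAT2 (prefix.1.ht2 through prefix.8.ht2). The helper name index_is_complete is hypothetical, and the demonstration uses empty placeholder files rather than a real index.

```shell
# Hypothetical helper: succeed only if all eight <prefix>.N.ht2 files exist.
index_is_complete() {
    for n in 1 2 3 4 5 6 7 8; do
        [ -e "$1.$n.ht2" ] || return 1
    done
}

# Demonstration with empty placeholder files (not a real index).
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
for n in 1 2 3 4 5 6 7 8; do
    : > "$demo_dir/toy.$n.ht2"
done

if index_is_complete "$demo_dir/toy"; then
    echo "index files present"
else
    echo "index files missing"
fi

# On the real data:
#   index_is_complete GenomeIndex/genome && echo "genome index looks complete"
```

The check only confirms that the files exist. It does not validate their contents.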

Building the Transcriptome Index

Run the following command to download the annotation file in gene transfer format (GTF).

wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/214/015/GCF_000214015.3_version_140606/GCF_000214015.3_version_140606_genomic.gtf.gz

Run the following command to decompress the GTF annotation file.

gunzip GCF_000214015.3_version_140606_genomic.gtf.gz
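
The splice sites and exons extracted later are only useful if the sequence names in the GTF file match those in the FASTA file. The sketch below shows one way to compare them. It assumes that the sequence name is the first word of each FASTA header and the first column of each GTF line. The helper name gtf_names_missing_from_fasta and the toy files are made up for illustration.

```shell
# Hypothetical helper: print GTF sequence names with no matching FASTA header.
gtf_names_missing_from_fasta() {
    fasta=$1
    gtf=$2
    # FASTA side: first word of each header line, without the leading ">".
    fasta_names=$(grep '^>' "$fasta" | sed 's/^>//; s/[[:space:]].*$//' | sort -u)
    # GTF side: column 1 of every non-comment, non-empty line.
    awk -F'\t' '!/^#/ && NF { print $1 }' "$gtf" | sort -u |
    while read -r name; do
        printf '%s\n' "$fasta_names" | grep -qxF -- "$name" || printf '%s\n' "$name"
    done
}

# Toy files, made up for this demonstration only.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
printf '>chrA toy sequence\nACGT\n>chrB\nGGCC\n' > "$demo_dir/toy.fna"
printf '#!toy annotation\n' > "$demo_dir/toy.gtf"
printf 'chrA\tdemo\texon\t1\t4\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n' >> "$demo_dir/toy.gtf"
printf 'chrC\tdemo\texon\t1\t4\t.\t+\t.\tgene_id "g2"; transcript_id "t2";\n' >> "$demo_dir/toy.gtf"

gtf_names_missing_from_fasta "$demo_dir/toy.fna" "$demo_dir/toy.gtf"   # prints chrC

# On the real data (no output means every GTF name has a FASTA record):
#   gtf_names_missing_from_fasta GCF_000214015.3_version_140606_genomic.fna \
#       GCF_000214015.3_version_140606_genomic.gtf
```

Both files here come from the same NCBI assembly folder, so the names are expected to agree. The check matters more when the genome and annotation come from different sources.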

Run the following command to create a transcriptome index folder.

mkdir TranscriptomeIndex

Run the following command to extract the known splice sites from the Ostreococcus tauri annotation (the .gtf file). The output is written to TranscriptomeIndex/SpliceSites.ss.

hisat2_extract_splice_sites.py GCF_000214015.3_version_140606_genomic.gtf > TranscriptomeIndex/SpliceSites.ss
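
To get a feel for the extracted splice sites, the file can be summarised per strand. The sketch below assumes the four-column, tab-separated layout described in the HISAT2 manual (sequence name, left position, right position, strand). The helper name count_splice_sites_by_strand and the toy coordinates are made up for illustration.

```shell
# Hypothetical helper: count splice sites per strand (column 4).
count_splice_sites_by_strand() {
    awk -F'\t' '{ n[$4]++ } END { for (s in n) print s, n[s] }' "$1" | LC_ALL=C sort
}

# Toy splice-site file with made-up coordinates.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
printf 'chrA\t99\t199\t+\nchrA\t299\t399\t+\nchrB\t49\t149\t-\n' > "$demo_dir/toy.ss"

count_splice_sites_by_strand "$demo_dir/toy.ss"

# On the real data:
#   head TranscriptomeIndex/SpliceSites.ss
#   count_splice_sites_by_strand TranscriptomeIndex/SpliceSites.ss
```

An empty SpliceSites.ss file would suggest that the script found no multi-exon transcripts in the GTF file, which is worth investigating before building the index.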

Run the following command to extract the exons from the Ostreococcus tauri annotation (the .gtf file). The output is written to TranscriptomeIndex/Exons.exon.

hisat2_extract_exons.py GCF_000214015.3_version_140606_genomic.gtf > TranscriptomeIndex/Exons.exon

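Run the following command to build the transcriptome index with 4 threads. This is an index of the same genome that also incorporates the extracted splice sites and exons. Afterwards, the TranscriptomeIndex folder should also contain the index files, named transcriptome.1.ht2 through transcriptome.8.ht2.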

hisat2-build --ss TranscriptomeIndex/SpliceSites.ss --exon TranscriptomeIndex/Exons.exon GCF_000214015.3_version_140606_genomic.fna TranscriptomeIndex/transcriptome -p 4

Testing the Genome and Transcriptome Index

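Run the following command to download the first 1000 spots of the SRA run SRR7121135, an RNA-seq sample linked to Ostreococcus tauri. The --split-3 option writes the two mates of each read pair to SRR7121135_1.fastq.gz and SRR7121135_2.fastq.gz.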

fastq-dump --gzip --split-3 -X 1000 SRR7121135
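
To confirm that the download produced what was requested, the reads in a gzip-compressed FASTQ file can be counted. This sketch assumes the usual four-lines-per-read FASTQ layout. The helper name count_fastq_reads and the toy reads are made up for illustration.

```shell
# Hypothetical helper: count reads in a gzip-compressed FASTQ file
# (four lines per read).
count_fastq_reads() {
    gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Toy FASTQ file with two made-up reads.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
printf '@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nIIII\n' | gzip -c > "$demo_dir/toy.fastq.gz"

count_fastq_reads "$demo_dir/toy.fastq.gz"   # prints 2

# On the real data (with -X 1000, at most 1000 reads per file are expected):
#   count_fastq_reads SRR7121135_1.fastq.gz
#   count_fastq_reads SRR7121135_2.fastq.gz
```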

Run the following command to create a results folder.

mkdir Results

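Align the first-mate RNA-seq reads to the genome index in single-end mode (-U). To align both mates as pairs instead, HISAT2 accepts the two files through -1 and -2.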

hisat2 -x GenomeIndex/genome -U SRR7121135_1.fastq.gz -S Results/SRR7121135_1_genome.sam

Align the same first-mate reads to the transcriptome index:

hisat2 -x TranscriptomeIndex/transcriptome -U SRR7121135_1.fastq.gz -S Results/SRR7121135_1_transcriptome.sam
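
One way to compare the two alignments is to count, for each SAM file, how many reads aligned and how many of those alignments span a splice junction. The sketch below relies on two SAM conventions: FLAG bit 0x4 marks an unaligned read, and an "N" operation in the CIGAR string marks a skipped region such as an intron. The helper name summarise_sam and the toy records are made up for illustration.

```shell
# Hypothetical helper: summarise the primary records of a SAM file.
summarise_sam() {
    awk -F'\t' '
        /^@/ { next }                      # skip header lines
        int($2 / 256) % 2 == 1 { next }    # skip secondary alignments (FLAG 0x100)
        { total++ }
        int($2 / 4) % 2 == 0 {             # FLAG 0x4 unset: the read is aligned
            aligned++
            if ($6 ~ /N/) spliced++        # "N" in CIGAR: alignment skips an intron
        }
        END { printf "reads=%d aligned=%d spliced=%d\n", total, aligned, spliced }
    ' "$1"
}

# Toy SAM file with made-up records: one unspliced, one spliced,
# one unaligned, and one secondary alignment.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
{
    printf '@HD\tVN:1.0\tSO:unsorted\n'
    printf '@SQ\tSN:chrA\tLN:1000\n'
    printf 'r1\t0\tchrA\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n'
    printf 'r2\t0\tchrA\t200\t60\t2M100N2M\t*\t0\t0\tACGT\tIIII\n'
    printf 'r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n'
    printf 'r2\t256\tchrA\t500\t1\t4M\t*\t0\t0\tACGT\tIIII\n'
} > "$demo_dir/toy.sam"

summarise_sam "$demo_dir/toy.sam"   # prints reads=3 aligned=2 spliced=1

# On the real data:
#   summarise_sam Results/SRR7121135_1_genome.sam
#   summarise_sam Results/SRR7121135_1_transcriptome.sam
```

HISAT2 also prints its own alignment summary to standard error when it finishes, which can be compared with these counts.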

mathiasverbeke0/Building_index_Ostreococcus_tauri

Folders and files

Latest commit

History

Repository files navigation

Building Genome and Transcriptome Indexes for Ostreococcus tauri with HISAT2

Introduction

Prerequisites

Building the Genome Index

Building the Transcriptome Index

Testing the Genome and Transcriptome Index

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages