- Using the Great Lakes cluster and batch computing with SLURM
- You must establish a user login on Great Lakes by filling out this form.
- Contact IT to be added under root account if you have not done so.
- Do not run jobs directly on the login node; use Great Lakes OnDemand or submit batch jobs instead.
- Due to regulations, do not store any Garmire lab data or code under the default Great Lakes home directory:
- Instead, you should store all lab materials under Garmire's NAS:
/nfs/dcmb-lgarmire/uniqname #personal NAS home
/nfs/dcmb-lgarmire/shared/ #for group shared data
- Use scratch space for large temporary data (purged after 60 days) at
- Use turbo and Armis2 for HIPAA data at
- Greatlakes login node:
- Armis2 login node:
- Garmire lab login node:
- You can log in with a terminal, MobaXterm, PuTTY, etc. (authentication requires your Level 1 password and Duo):
- Access your NAS home directory:
cd /nfs/dcmb-lgarmire/uniqname
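Since lab data and code must live under the NAS rather than the default home directory, a small guard in your shell startup file or at the top of a job script can catch mistakes early. The sketch below is only an illustration: `LAB_NAS_ROOT` and `is_on_lab_nas` are hypothetical names, and the check is a plain string comparison that does not resolve symlinks.

```shell
# Hypothetical helper (not part of any cluster tooling):
# succeed only if the given path is under the lab NAS.
LAB_NAS_ROOT="/nfs/dcmb-lgarmire"

is_on_lab_nas() {
    case "$1" in
        "$LAB_NAS_ROOT"/*) return 0 ;;   # e.g. /nfs/dcmb-lgarmire/uniqname/...
        *) return 1 ;;                   # anything else, such as the default home
    esac
}

# Warn when the current working directory is outside the NAS.
if ! is_on_lab_nas "$PWD"; then
    echo "Warning: $PWD is not under $LAB_NAS_ROOT" >&2
fi
```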
- Log into
- Select from the Interactive Apps menu
- Request the time/memory/cores/modules you need
- Launch
Full SLURM guide:
| Command | Description |
| --- | --- |
| sbatch your_job.sbat | Submit your job |
| squeue -j jobid | Check job status by jobid |
| squeue -u uniqname | Check job status by user's uniqname |
| scancel jobid | Cancel submitted job by jobid |
| seff jobid | Show total time and memory usage for job |
| my_usage uniqname | List resource usage for user uniqname |
| sinfo | Show node status by partition |
| sstate | Show CPU, GPU, memory allocation for each node |
| scontrol show node node_name | Show details for a node by node name |
| scontrol show job jobid | Show details for a job by jobid |
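Several of the commands above take a jobid, so it helps to capture it at submission time. The following is a minimal sketch, assuming `sbatch` prints its usual `Submitted batch job <id>` confirmation line; `jobid_from_sbatch` is a hypothetical helper name, and `your_job.sbat` is the placeholder script name from the table.

```shell
# Extract the numeric job ID from sbatch's "Submitted batch job <id>" message.
jobid_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

# On the cluster only: submit the job, then query just that job.
if command -v sbatch >/dev/null 2>&1 && [ -f your_job.sbat ]; then
    jobid=$(sbatch your_job.sbat | jobid_from_sbatch)
    squeue -j "$jobid"     # status while the job is pending or running
    # seff "$jobid"        # after the job finishes: time and memory used
fi
```

`sbatch --parsable` is an alternative that prints the job ID without the surrounding text.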
#!/bin/bash #This has to be the first line
#SBATCH --job-name=your_job
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --time=24:00:00
#SBATCH --mem-per-cpu=100g
#SBATCH --partition=largemem #optional, largemem node has 1.5T
#regular node has 180G
#SBATCH --mail-user=your_email_address #replace with your own email address
#SBATCH --mail-type=END
#SBATCH --output=./%x-%j
#SBATCH --account=lgarmire99
module load packages
cd /nfs/dcmb-lgarmire/uniqname/work_directory/
Rscript "/nfs/dcmb-lgarmire/uniqname/work_directory/script.R"
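To see why the example requests the largemem partition, it can help to work out the memory per node: with `--mem-per-cpu`, each node needs tasks per node × CPUs per task × memory per CPU. The sketch below redoes that arithmetic for the values above, assuming the SLURM default of one CPU per task and the 180G regular-node figure quoted in the script comment; `mem_per_node_gb` is a hypothetical helper name.

```shell
# Memory requested per node = tasks per node x CPUs per task x memory per CPU.
mem_per_node_gb() {
    # $1 = --ntasks-per-node
    # $2 = --cpus-per-task (assumed 1 when not set in the script)
    # $3 = --mem-per-cpu in GB
    echo $(( $1 * $2 * $3 ))
}

REGULAR_NODE_GB=180   # regular node memory, as stated in the script comment

needed=$(mem_per_node_gb 2 1 100)   # values from the example script
if [ "$needed" -gt "$REGULAR_NODE_GB" ]; then
    echo "Requesting ${needed}G per node: needs largemem"
else
    echo "Requesting ${needed}G per node: fits a regular node"
fi
```

For the example script this gives 200G per node (400G across both nodes), which is more than a regular node offers. Lowering `--mem-per-cpu` or the task count would let the job run on the regular partition.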
- Great Lakes has pre-installed bioinformatics tools, including seurat, bcftools, cellprofiler, cellranger, gatk, picard, plink, monocle, etc. Load the Bioinformatics module first:
module load Bioinformatics
- Then load the packages (for example):
module load RSeurat/5.0.1
module load cellranger/7.2.0
- Search for a module:
module spider <keyword>
- Check full list of available modules:
module avail
- Check loaded modules:
module list
- Remove loaded modules:
module rm RSeurat/5.0.1
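In a job script it is often safer to load pinned module versions and stop at the first failure, so the job does not continue with missing software. This is a rough sketch, assuming `module load` returns a non-zero exit status when a module cannot be loaded (worth confirming on the cluster); `load_pinned_modules` is a hypothetical helper name.

```shell
# Load each named module in order; stop and report at the first failure.
load_pinned_modules() {
    for mod in "$@"; do
        if ! module load "$mod"; then
            echo "Failed to load module: $mod" >&2
            return 1
        fi
    done
}

# Only attempt the real load where the module command exists (on the cluster).
if command -v module >/dev/null 2>&1; then
    load_pinned_modules Bioinformatics RSeurat/5.0.1 cellranger/7.2.0 || exit 1
    module list   # confirm what ended up loaded
fi
```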
Instead of using the provided modules, you can also use conda to manage software dependencies.
- Create a conda environment called myenv and install R base:
conda create -n myenv conda-forge::r-base
- Activate the environment and install more packages:
conda activate myenv
conda install conda-forge::r-seurat
- To copy an existing environment from another machine, first activate it there:
conda activate old_env
- Export list of packages and versions
conda list --export > package-list.txt
- Upload package-list.txt to Great Lakes and create an identical environment:
conda create -n new_env --file package-list.txt
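After recreating the environment, one way to check that it matches the original is to export the new one as well and compare package versions. The sketch below assumes the `conda list --export` file format (comment lines starting with `#`, package lines of the form `name=version=build`) and ignores build strings; `pkg_versions`, `compare_exports`, and `new-package-list.txt` are hypothetical names.

```shell
# Reduce a `conda list --export` file to sorted name=version lines.
pkg_versions() {
    grep -v '^#' "$1" | cut -d= -f1,2 | sort
}

# Print packages whose versions differ; return non-zero if any differ.
compare_exports() {
    old_tmp=$(mktemp)
    new_tmp=$(mktemp)
    pkg_versions "$1" > "$old_tmp"
    pkg_versions "$2" > "$new_tmp"
    status=0
    diff "$old_tmp" "$new_tmp" || status=$?
    rm -f "$old_tmp" "$new_tmp"
    return "$status"
}

# Example use, once both export files exist in the current directory:
#   conda list -n new_env --export > new-package-list.txt
if [ -f package-list.txt ] && [ -f new-package-list.txt ]; then
    if compare_exports package-list.txt new-package-list.txt; then
        echo "Environments match"
    else
        echo "Environments differ (see diff above)" >&2
    fi
fi
```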
- General tutorial:
- Slides: Introduction to the Great Lakes cluster and batch computing with SLURM
- Slides: Advanced batch computing with SLURM on the Great Lakes cluster
- Slides: MPI profiling with Allinea MAP
- ARC-TS: Great Lakes overview
- ARC-TS: Great Lakes Cheat Sheet
- ARC-TS: SLURM user guide
- ARC-TS: Migrating from PBS-Torque to SLURM
- ARC-TS: Globus high-speed data transfer
- Kelly's example of using Snakemake on HPC
- Snakemake profile for SLURM
- conda on the cluster