From 48858be4141be5c46d241073fa6a0b718565414c Mon Sep 17 00:00:00 2001
From: Georgios Kafanas
Date: Tue, 26 Nov 2024 20:31:19 +0100
Subject: [PATCH] [REFACTOR] Use long form flags in the GPU job example script

---
 docs/jobs/gpu.md | 35 ++++++++++++++++++++---------------
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/docs/jobs/gpu.md b/docs/jobs/gpu.md
index ebd4a197..14a3bcf9 100644
--- a/docs/jobs/gpu.md
+++ b/docs/jobs/gpu.md
@@ -9,31 +9,36 @@ NVlink was designed as an alternative solution to PCI Express with higher bandwi
 Because of the hardware organization, you **MUST** follow the below recommendations:
 
 1. **Do not run jobs on GPU nodes if you have no use of GPU accelerators!**, _i.e._ if you are not using any of the software compiled against the `{foss,intel}cuda` toolchain.
-2. Avoid using more than 4 GPUs, ideally within the same node
+2. Avoid using more than 4 GPUs, ideally within the same node.
 3. Dedicated 1/4 of the available CPU cores for the management of each GPU card reserved.
 
 Thus your typical GPU launcher would match the [AI/DL launcher](../slurm/launchers.md#specialized-bigdatagpu-launchers) example:
 
 ```bash
-#!/bin/bash -l
+#!/usr/bin/bash --login
+
+#SBATCH --job-name=gpu_example
+#SBATCH --output=%x-%j.out
+#SBATCH --error=%x-%j.out
+
 ### Request one GPU tasks for 4 hours - dedicate 1/4 of available cores for its management
-#SBATCH -N 1
+#SBATCH --nodes=1
 #SBATCH --ntasks-per-node=1
-#SBATCH -c 7
-#SBATCH -G 1
-#SBATCH --time=04:00:00
-#SBATCH -p gpu
+#SBATCH --cpus-per-task=7
+#SBATCH --gpus-per-task=1
+#SBATCH --time=0-04:00:00
+
+### Submit to the `gpu` partition of Iris
+#SBATCH --partition=gpu
+#SBATCH --qos=normal
 
 print_error_and_exit() { echo "***ERROR*** $*"; exit 1; }
-module purge || print_error_and_exit "No 'module' command"
-module load numlib/cuDNN # Example with cuDNN
+
+module purge || print_error_and_exit "No 'module' command available"
+module load numlib/cuDNN # Example using the cuDNN module
 [...]
 ```
-
-
-
-
-
-You can quickly access a GPU node for [interactive jobs](../jobs/interactive.md) using `si-gpu`.
+!!! info "Interactive jobs"
+    On the UL HPC systems you can use `si-gpu`, a wrapper around the `salloc` command that allocates an [interactive job](../jobs/interactive.md) on a GPU node with sensible default options.
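
For reference, a minimal sketch of how the refactored launcher would be exercised once the patch is applied; the file name `gpu_example.sh` and the job id `123456` are illustrative, not part of the patch:

```bash
# Hypothetical usage: save the example script body as gpu_example.sh
# and submit it to Slurm.
sbatch gpu_example.sh

# The %x-%j.out pattern in --output/--error expands to <job-name>-<job-id>.out,
# so both streams of job 123456 would land in gpu_example-123456.out.
squeue --me                     # check the job's state in the queue
tail -f gpu_example-123456.out  # follow the combined output once the job runs
```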