Merge pull request #94 from gkaf89/refactor/jobs-gpu
[REFACTOR] Use long form flags in the GPU job example script
gkaf89 authored Nov 26, 2024
2 parents 55671bb + 48858be commit 3aca179
Showing 1 changed file with 20 additions and 15 deletions.
35 changes: 20 additions & 15 deletions docs/jobs/gpu.md
Because of the hardware organization, you **MUST** follow the recommendations below:

1. **Do not run jobs on GPU nodes if you have no use for the GPU accelerators**, _i.e._ if you are not using any of the software compiled against the `{foss,intel}cuda` toolchains.
2. Avoid using more than 4 GPUs, ideally within the same node.
3. Dedicate 1/4 of the available CPU cores to the management of each reserved GPU card.

Thus, your typical GPU launcher would match the [AI/DL launcher](../slurm/launchers.md#specialized-bigdatagpu-launchers) example:

```bash
#!/usr/bin/bash --login

#SBATCH --job-name=gpu_example
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.out

### Request one GPU task for 4 hours - dedicate 1/4 of the available cores to its management
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=7
#SBATCH --gpus-per-task=1
#SBATCH --time=0-04:00:00

### Submit to the `gpu` partition of Iris
#SBATCH --partition=gpu
#SBATCH --qos=normal

print_error_and_exit() { echo "***ERROR*** $*"; exit 1; }
module purge || print_error_and_exit "No 'module' command"
module load numlib/cuDNN # Example with cuDNN

module purge || print_error_and_exit "No 'module' command available"
module load numlib/cuDNN # Example using the cuDNN module

[...]
```
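
As a quick usage note (a minimal sketch, assuming the launcher above is saved as `gpu_example.sh` — the filename is illustrative):

```bash
# Submit the launcher to Slurm; sbatch prints the job ID on success
sbatch gpu_example.sh

# Check the job state in the queue
squeue -u $USER

# Inside the job itself (e.g. after the module loads in the script),
# nvidia-smi lists the GPU(s) actually allocated to the task
nvidia-smi
```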

!!! info "Interactive jobs"
    On the UL HPC systems you can use `si-gpu`, a wrapper around the `salloc` command that allocates an [interactive job](../jobs/interactive.md) on a GPU node with sensible default options.

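For illustration, a minimal interactive session could look like the sketch below (the wrapper's exact defaults are those documented on the linked interactive jobs page; the module shown is just the example from the launcher above):

```bash
# Request an interactive shell on a GPU node using the wrapper's defaults
si-gpu

# Once the allocation is granted, work exactly as in the batch launcher
module purge
module load numlib/cuDNN   # example module, as above
nvidia-smi                 # confirm the allocated GPU is visible
```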