Chap 3 - Final translation 1 / 2
plstonge committed Feb 29, 2024
1 parent ad85a67 commit 51fc996
Showing 1 changed file with 37 additions and 43 deletions.
80 changes: 37 additions & 43 deletions 3-task-arrays.ipynb
@@ -5,60 +5,57 @@
"metadata": {},
"source": [
"# Task Arrays for Data Parallelism\n",
"Le calcul haute-performance consiste non seulement au calcul parallèle par tâche (***parallélisme des tâches***),\n",
"mais aussi au calcul de données en parallèle dans plusieurs tâches et/ou processus en simultané (***parallélisme de données***).\n",
"Ce chapitre vous donnera les outils nécessaires pour gérer un grand nombre de tâches\n",
"lorsque le projet de recherche requiert plusieurs centaines de résultats."
"While high-performance computing is usually designed for\n",
"task parallelism, it can also be used to run multiple\n",
"serial tasks simultaneously for data parallelism.\n",
"This chapter will present useful tools to manage a large number of\n",
"compute tasks when the research project requires hundreds of results."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## GNU Parallel\n",
"La commande `parallel` de\n",
"[GNU Parallel](https://docs.alliancecan.ca/wiki/GNU_Parallel/fr)\n",
"permet d'utiliser pleinement les ressources locales d'un noeud de\n",
"calcul, et ce, en gérant l'exécution d'une **longue liste de tâches\n",
"de *petite* taille**.\n",
"C'est un peu comme l'ordonnanceur Slurm, mais à plus petite échelle et\n",
"en gérant des processus au lieu de scripts de tâche.\n",
"The [GNU `parallel` command](https://docs.alliancecan.ca/wiki/GNU_Parallel)\n",
"makes it possible to fully use the resources of a compute node by managing\n",
"the execution of a **long list of _small_ compute tasks**.\n",
"It works like the Slurm scheduler, but at a smaller\n",
"scale, managing processes instead of job scripts.\n",
"\n",
"![Fonctionnement de GNU Parallel](images/gnu-parallel.svg)\n",
"![GNU Parallel workflow](images/gnu-parallel.svg)\n",
"\n",
"* [Documentation officielle](https://www.gnu.org/software/parallel/parallel.html)\n",
"* [Tutoriel](https://www.gnu.org/software/parallel/parallel_tutorial.html)"
"* [Official documentation](https://www.gnu.org/software/parallel/parallel.html)\n",
"* [Tutorial](https://www.gnu.org/software/parallel/parallel_tutorial.html)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Why Not Slurm?\n",
"OK, mais pourquoi ne pas tout simplement soumettre\n",
"**des centaines de tâches à Slurm**?\n",
"* À tout moment, Slurm **limite chaque usager à 1000 tâches**\n",
" au total dans `squeue` (*pending* + *running*)\n",
"* Certains calculs sont tellement **courts (< 5 minutes)** que le\n",
" démarrage et la fin de la tâche compteraient pour un pourcentage\n",
" significatif du temps réel utilisé, ce qui diminue leur efficacité"
"Why not simply submit **hundreds of jobs to Slurm**?\n",
"* At any time, Slurm **limits each user to 1000 jobs**\n",
"  in its queue (including *pending* and *running* jobs)\n",
"* Some compute tasks are so **short (< 5 minutes)**\n",
"  that the overhead of starting and ending each one as a\n",
"  separate job would significantly reduce overall efficiency"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Les avantages de GNU Parallel à considérer :\n",
"* Nous **évite d'utiliser une boucle** soumettant des centaines de\n",
" scripts similaires, ce qui, dans bien des cas, facilite\n",
" l'exécution de centaines de cas de calcul semblables\n",
"* Le nombre de **processeurs disponibles limite** automatiquement le\n",
" nombre de cas de calcul exécutés en simultané\n",
" * Dans le cas de calculs parallèles, c'est possible de spécifier\n",
" le nombre de cas en simultané\n",
"* GNU Parallel peut\n",
" [reprendre la séquence des cas de calcul](https://docs.alliancecan.ca/wiki/GNU_Parallel/fr#Suivi_des_commandes_ex.C3.A9cut.C3.A9es_ou_des_commandes_ayant_.C3.A9chou.C3.A9.3B_fonctionnalit.C3.A9s_de_red.C3.A9marrage)\n",
" en situation de fin hâtive de la tâche Slurm"
"GNU Parallel advantages:\n",
"* **No need for a loop** that submits hundreds of similar job\n",
"  scripts, which makes it easier to manage hundreds of compute tasks\n",
"* The number of **available CPU cores automatically limits**\n",
"  the number of simultaneously running tasks\n",
" * For a set of parallel tasks, it is possible to specify\n",
" a smaller number of processes than the number of CPU cores\n",
"* GNU Parallel can\n",
" [resume the sequence of compute tasks](https://docs.alliancecan.ca/wiki/GNU_Parallel#Keeping_Track_of_Completed_and_Failed_Commands,_and_Restart_Capabilities)\n",
"  if the Slurm job ends earlier than expected"
]
},
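The behaviour listed in the cell above (a cap on simultaneous tasks tied to the CPU core count, plus the ability to resume after an early stop) can be illustrated with a minimal Python sketch. This is not GNU Parallel and does not use its interface: `run_tasks`, its parameters, and the one-command-per-line log format are hypothetical names chosen only for this example.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_tasks(commands, log_path, max_workers=None):
    """Run each command as its own process, at most max_workers at a time.

    Commands already recorded in log_path are skipped, so calling the
    function again after an interruption resumes where it stopped.
    (Hypothetical helper: an illustration, not GNU Parallel itself.)
    """
    # By default, the number of CPU cores limits the simultaneous tasks.
    max_workers = max_workers or os.cpu_count() or 1

    # Load the commands that completed successfully in a previous run.
    done = set()
    if os.path.exists(log_path):
        with open(log_path) as log:
            done = {line.rstrip("\n") for line in log}

    # Each command is identified in the log by its space-joined arguments.
    pending = [cmd for cmd in commands if " ".join(cmd) not in done]

    def run_one(cmd):
        result = subprocess.run(cmd, capture_output=True, text=True)
        return cmd, result.returncode

    finished = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool, \
            open(log_path, "a") as log:
        # Results are consumed in the main thread, so only one writer
        # ever touches the log file.
        for cmd, code in pool.map(run_one, pending):
            if code == 0:
                log.write(" ".join(cmd) + "\n")
                log.flush()
                finished.append(cmd)
    return finished
```

Calling `run_tasks(commands, "tasks.log")` a second time with the same arguments would skip every command that already succeeded, which is the idea behind resuming a sequence of tasks after a job ends early.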
{
@@ -221,21 +218,18 @@
"source": [
"### Other Tools\n",
"* GLOST\n",
" [pour des calculs séquentiels seulement](https://docs.alliancecan.ca/wiki/GLOST/fr)\n",
" [for serial tasks only](https://docs.alliancecan.ca/wiki/GLOST)\n",
"\n",
"* META-Farm\n",
" [pour le meilleur de GNU Parallel et GLOST](https://docs.alliancecan.ca/wiki/META-Farm/fr)\n",
" [for the best of GNU Parallel and GLOST](https://docs.alliancecan.ca/wiki/META-Farm)\n",
"\n",
"Alors que les précédents outils s'utilisent bien avec un lot de\n",
"calculs séquentiels ou parallèles de petite taille (16 processeurs\n",
"ou moins), **ils ne sont pas** vraiment **appropriés pour**\n",
"un lot de **longs calculs parallèles de plus grande taille**\n",
"(plus de 16 processeurs par calcul) :\n",
"* on veut éviter les longues tâches qui dépassent trois (3) jours et\n",
"* on veut réduire le risque de subir une défaillance matérielle.\n",
"While the above tools can be useful with a set of serial tasks or\n",
"small parallel tasks (16 cores or fewer), **they are not appropriate\n",
"for long and large parallel jobs** (more than 16 cores per task):\n",
"1. we want to avoid jobs longer than 3 days, and\n",
"1. we want to reduce the risk of being affected by a defective node.\n",
"\n",
"C'est pourquoi, dans certains cas, il vaut\n",
"mieux utiliser les vecteurs de tâches."
"That is why, in some cases, it is better to use job arrays."
]
},
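To make the job array idea more concrete, here is a minimal Python sketch of how each element of an array could pick its own input. Slurm sets the `SLURM_ARRAY_TASK_ID` environment variable for every element of a job array; the `select_input` helper and the assumption that the array was submitted with indices starting at 0 (for example `sbatch --array=0-99`) are illustrative choices, not something taken from the notebook.

```python
import os


def select_input(inputs, env=None):
    """Return the input assigned to the current job array element.

    Assumes the array indices run from 0 to len(inputs) - 1.
    (Hypothetical helper, for illustration only.)
    """
    env = os.environ if env is None else env
    # Slurm exports the index of the current array element as a string.
    task_id = int(env["SLURM_ARRAY_TASK_ID"])
    return inputs[task_id]
```

Since every array element is scheduled as a separate job, each long parallel calculation gets its own time limit and its own node allocation, which fits the two goals listed in the cell above.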
{