diff --git a/.github/workflows/notebooks.yml b/.github/workflows/notebooks.yml index b66759c..9a5481b 100644 --- a/.github/workflows/notebooks.yml +++ b/.github/workflows/notebooks.yml @@ -35,4 +35,4 @@ jobs: shell: bash -l {0} - run: > - flux start - papermill notebooks/example.ipynb example-out.ipynb -k "python3" + run: | + flux start papermill notebooks/example_config.ipynb example-config-out.ipynb -k "python3" + flux start papermill notebooks/example_queue_type.ipynb example-queue-type-out.ipynb -k "python3" diff --git a/README.md b/README.md index ffe21f0..96686a4 100644 --- a/README.md +++ b/README.md @@ -3,7 +3,7 @@ [![Unittests](https://github.com/pyiron/pysqa/actions/workflows/unittest.yml/badge.svg)](https://github.com/pyiron/pysqa/actions/workflows/unittest.yml) [![Documentation Status](https://readthedocs.org/projects/pysqa/badge/?version=latest)](https://pysqa.readthedocs.io/en/latest/?badge=latest) [![Coverage Status](https://coveralls.io/repos/github/pyiron/pysqa/badge.svg?branch=main)](https://coveralls.io/github/pyiron/pysqa?branch=main) -[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/pyiron/pysqa/HEAD?labpath=example.ipynb) +[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/pyiron/pysqa/HEAD?labpath=example_config.ipynb) High-performance computing (HPC) does not have to be hard. In this context the aim of the Python Simple Queuing System Adapter (`pysqa`) is to simplify the submission of tasks from python to HPC clusters as easy as starting another @@ -57,11 +57,15 @@ from within `pysqa`, which are represented to the user as a single resource. 
* [SGE](https://pysqa.readthedocs.io/en/latest/queue.html#sge) * [SLURM](https://pysqa.readthedocs.io/en/latest/queue.html#slurm) * [TORQUE](https://pysqa.readthedocs.io/en/latest/queue.html#torque) -* [Python Interface](https://pysqa.readthedocs.io/en/latest/example.html) - * [List available queues](https://pysqa.readthedocs.io/en/latest/example.html#list-available-queues) - * [Submit job to queue](https://pysqa.readthedocs.io/en/latest/example.html#submit-job-to-queue) - * [Show jobs in queue](https://pysqa.readthedocs.io/en/latest/example.html#show-jobs-in-queue) - * [Delete job from queue](https://pysqa.readthedocs.io/en/latest/example.html#delete-job-from-queue) +* [Python Interface Dynamic](https://pysqa.readthedocs.io/en/latest/example_queue_type.html) + * [Submit job to queue](https://pysqa.readthedocs.io/en/latest/example_queue_type.html#submit-job-to-queue) + * [Show jobs in queue](https://pysqa.readthedocs.io/en/latest/example_queue_type.html#show-jobs-in-queue) + * [Delete job from queue](https://pysqa.readthedocs.io/en/latest/example_queue_type.html#delete-job-from-queue) +* [Python Interface Config](https://pysqa.readthedocs.io/en/latest/example_config.html) + * [List available queues](https://pysqa.readthedocs.io/en/latest/example_config.html#list-available-queues) + * [Submit job to queue](https://pysqa.readthedocs.io/en/latest/example_config.html#submit-job-to-queue) + * [Show jobs in queue](https://pysqa.readthedocs.io/en/latest/example_config.html#show-jobs-in-queue) + * [Delete job from queue](https://pysqa.readthedocs.io/en/latest/example_config.html#delete-job-from-queue) * [Command Line Interface](https://pysqa.readthedocs.io/en/latest/command.html) * [Submit job](https://pysqa.readthedocs.io/en/latest/command.html#submit-job) * [Enable reservation](https://pysqa.readthedocs.io/en/latest/command.html#enable-reservation) diff --git a/docs/_toc.yml b/docs/_toc.yml index 9634b6a..90021b3 100644 --- a/docs/_toc.yml +++ b/docs/_toc.yml @@ -3,7 
+3,8 @@ root: README chapters: - file: installation.md - file: queue.md -- file: example.ipynb +- file: example_queue_type.ipynb +- file: example_config.ipynb - file: command.md - file: advanced.md - file: debug.md diff --git a/docs/installation.md b/docs/installation.md index 200887f..f626375 100644 --- a/docs/installation.md +++ b/docs/installation.md @@ -14,7 +14,7 @@ On `pypi` the `pysqa` package exists in three different versions: * `pip install pysaq` - base version - with minimal requirements only depends on `jinja2`, `pandas` and `pyyaml`. * `pip install pysaq[sge]` - sun grid engine (SGE) version - in addition to the base dependencies this installs `defusedxml` which is required to parse the `xml` files from `qstat`. -`pip install pysaq[remote]` - remote version - in addition to the base dependencies this installs `paramiko` and +`pip install pysqa[remote]` - remote version - in addition to the base dependencies, this installs `paramiko` and `tqdm`, to connect to remote HPC clusters using SSH and report the progress of the data transfer visually. ## conda-based installation diff --git a/docs/queue.md b/docs/queue.md index 2bda170..099cc30 100644 --- a/docs/queue.md +++ b/docs/queue.md @@ -1,11 +1,13 @@ # Queuing Systems -`pysqa` is based on the idea of reusable templates. These templates are defined in the `jinja2` templating language. By -default `pysqa` expects to find these templates in `~/.queues`. Still it is also possible to store them in a different -directory. +The Python Simple Queuing System Adapter (`pysqa`) is based on the idea of reusable templates. These templates are +defined in the `jinja2` templating language. By default, `pysqa` expects to find these templates in the configuration +directory, which is specified with the `directory` parameter. Alternatively, they can be defined dynamically by +specifying the queuing system type with the `queue_type` parameter. 
-In this directory `pysqa` expects to find one queue configuration and one jinja template per queue. The `queue.yaml` -file which defines the available queues and their restrictions in terms of minimum and maximum number of CPU cores, -required memory or run time. In addition, this file defines the type of the queuing system and the default queue. +When using the configuration directory, `pysqa` expects to find one queue configuration and one jinja template per +queue. The `queue.yaml` file defines the available queues and their restrictions in terms of minimum and maximum +number of CPU cores, required memory or run time. In addition, this file defines the type of the queuing system and the +default queue. A typical `queue.yaml` file looks like this: ``` @@ -55,7 +57,8 @@ The queue named `flux` is defined based on a submission script template named `f {{command}} ``` In this case only the number of cores `cores`, the name of the job `job_name` , the maximum run time of the job -`run_time_max` and the command `command` are communicated. +`run_time_max` and the command `command` are communicated. The same template is stored in the `pysqa` package and can be +imported using `from pysqa.wrapper.flux import template`. So the flux interface can be enabled by setting `queue_type="flux"`. ## LFS For the load sharing facility framework from IBM the `queue.yaml` file defines the `queue_type` as `LSF`: @@ -86,7 +89,8 @@ The queue named `lsf` is defined based on a submission script template named `ls In this case the name of the job `job_name`, the number of cores `cores,` the working directory of the job `working_directory` and the command that is executed `command` are defined as mendatory inputs. Beyond these two optional inputs can be defined, namely the maximum run time for the job `run_time_max` and the maximum memory used by -the job `memory_max`. +the job `memory_max`. 
The same template is stored in the `pysqa` package and can be imported using +`from pysqa.wrapper.lsf import template`. So the LSF interface can be enabled by setting `queue_type="lsf"`. ## MOAB For the Maui Cluster Scheduler the `queue.yaml` file defines the `queue_type` as `MOAB`: @@ -102,7 +106,9 @@ The queue named `moab` is defined based on a submission script template named `m {{command}} ``` -Currently, no template for the Maui Cluster Scheduler is available. +The template for the Maui Cluster Scheduler is stored in the `pysqa` package +and can be imported using `from pysqa.wrapper.moab import template`. So the MOAB interface can be enabled by setting +`queue_type="moab"`. ## SGE For the sun grid engine (SGE) the `queue.yaml` file defines the `queue_type` as `SGE`: @@ -134,7 +140,8 @@ The queue named `sge` is defined based on a submission script template named `sg In this case the name of the job `job_name`, the number of cores `cores,` the working directory of the job `working_directory` and the command that is executed `command` are defined as mendatory inputs. Beyond these two optional inputs can be defined, namely the maximum run time for the job `run_time_max` and the maximum memory used by -the job `memory_max`. +the job `memory_max`. The same template is stored in the `pysqa` package and can be imported using +`from pysqa.wrapper.sge import template`. So the SGE interface can be enabled by setting `queue_type="sge"`. ## SLURM For the Simple Linux Utility for Resource Management (SLURM) the `queue.yaml` file defines the `queue_type` as `SLURM`: @@ -165,7 +172,8 @@ The queue named `slurm` is defined based on a submission script template named ` In this case the name of the job `job_name`, the number of cores `cores,` the working directory of the job `working_directory` and the command that is executed `command` are defined as mendatory inputs. 
Beyond these two optional inputs can be defined, namely the maximum run time for the job `run_time_max` and the maximum memory used by -the job `memory_max`. +the job `memory_max`. The same template is stored in the `pysqa` package and can be imported using +`from pysqa.wrapper.slurm import template`. So the SLURM interface can be enabled by setting `queue_type="slurm"`. ## TORQUE For the Terascale Open-source Resource and Queue Manager (TORQUE) the `queue.yaml` file defines the `queue_type` as @@ -199,4 +207,5 @@ The queue named `torque` is defined based on a submission script template named In this case the name of the job `job_name`, the number of cores `cores,` the working directory of the job `working_directory` and the command that is executed `command` are defined as mendatory inputs. Beyond these two optional inputs can be defined, namely the maximum run time for the job `run_time_max` and the maximum memory used by -the job `memory_max`. +the job `memory_max`. The same template is stored in the `pysqa` package and can be imported using +`from pysqa.wrapper.torque import template`. So the TORQUE interface can be enabled by setting `queue_type="torque"`. diff --git a/notebooks/example.ipynb b/notebooks/example_config.ipynb similarity index 92% rename from notebooks/example.ipynb rename to notebooks/example_config.ipynb index 6867974..9430c6a 100644 --- a/notebooks/example.ipynb +++ b/notebooks/example_config.ipynb @@ -4,7 +4,10 @@ "cell_type": "markdown", "id": "097a5f9f-69a2-42ae-a565-e3cdb17da461", "metadata": {}, - "source": "# Python Interface \nThe `pysqa` package primarily defines one class, that is the `QueueAdapter`. It loads the configuration from a configuration directory, initializes the corrsponding adapter for the specific queuing system and provides a high level interface for users to interact with the queuing system. 
The `QueueAdapter` can be imported using:" + "source": [ + "# Python Interface Config\n", + "The `pysqa` package primarily defines one class, the `QueueAdapter`. It loads the configuration from a configuration directory, initializes the corresponding adapter for the specific queuing system and provides a high-level interface for users to interact with the queuing system. The `QueueAdapter` can be imported using:" + ] }, { "cell_type": "code", @@ -92,7 +95,10 @@ "cell_type": "markdown", "id": "451180a6-bc70-4053-a67b-57357522da0f", "metadata": {}, - "source": "# List available queues \nList available queues as list of queue names: " + "source": [ + "## List available queues\n", + "List available queues as a list of queue names: " + ] }, { "cell_type": "code", @@ -149,7 +155,10 @@ "cell_type": "markdown", "id": "42a53d33-2916-461f-86be-3edbe01d3cc7", "metadata": {}, - "source": "# Submit job to queue\nSubmit a job to the queue - if no queue is specified it is submitted to the default queue defined in the queue configuration:" + "source": [ + "## Submit job to queue\n", + "Submit a job to the queue - if no queue is specified it is submitted to the default queue defined in the queue configuration:" + ] }, { "cell_type": "code", @@ -192,7 +201,10 @@ "cell_type": "markdown", "id": "672854fd-3aaa-4287-b29c-d5370e4adc14", "metadata": {}, - "source": "# Show jobs in queue \nGet status of all jobs currently handled by the queuing system:" + "source": [ + "## Show jobs in queue\n", + "Get status of all jobs currently handled by the queuing system:" + ] }, { "cell_type": "code", @@ -275,7 +287,10 @@ "cell_type": "markdown", "id": "f89528d3-a3f5-4adb-9f74-7f70270aec12", "metadata": {}, - "source": "# Delete job from queue \nDelete a job with the queue id `queue_id` from the queuing system:" + "source": [ + "## Delete job from queue\n", + "Delete a job with the queue id `queue_id` from the queuing system:" + ] }, { "cell_type": "code", @@ -321,4 +336,4 @@ }, "nbformat": 4, 
"nbformat_minor": 5 -} \ No newline at end of file +} diff --git a/notebooks/example_queue_type.ipynb b/notebooks/example_queue_type.ipynb new file mode 100644 index 0000000..b4884ba --- /dev/null +++ b/notebooks/example_queue_type.ipynb @@ -0,0 +1,236 @@ +{ + "metadata": { + "kernelspec": { + "display_name": "Flux", + "language": "python", + "name": "flux" + }, + "language_info": { + "name": "python", + "version": "3.10.14", + "mimetype": "text/x-python", + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "pygments_lexer": "ipython3", + "nbconvert_exporter": "python", + "file_extension": ".py" + } + }, + "nbformat_minor": 5, + "nbformat": 4, + "cells": [ + { + "id": "097a5f9f-69a2-42ae-a565-e3cdb17da461", + "cell_type": "markdown", + "source": [ + "# Dynamic Python Interface\n", + "The `pysqa` package primarily defines one class, the `QueueAdapter`. It can either load the configuration from a configuration directory or initialize the corresponding adapter for the specified queuing system type `queue_type`, and it provides a high-level interface for users to interact with the queuing system. The `QueueAdapter` can be imported using:" + ], + "metadata": {} + }, + { + "id": "04e9d4a2-6161-448b-81cd-1c6f8689867d", + "cell_type": "code", + "source": "from pysqa import QueueAdapter", + "metadata": { + "tags": [], + "trusted": true + }, + "outputs": [], + "execution_count": 1 + }, + { + "id": "7e3cf646-d4e7-4b1e-ab47-f07342d7a5a2", + "cell_type": "markdown", + "source": "After the initial import, the class is initialized using the queuing system type specified by the `queue_type` parameter. 
In this example we load `flux` as the queuing system interface: ", + "metadata": {} + }, + { + "id": "7e234eaf-80bc-427e-bd65-9acf70802689", + "cell_type": "code", + "source": "qa = QueueAdapter(queue_type=\"flux\")", + "metadata": { + "tags": [], + "trusted": true + }, + "outputs": [], + "execution_count": 2 + }, + { + "id": "514a7f2e-04ec-4fed-baa5-a181dace7123", + "cell_type": "markdown", + "source": "The configuration is explained in more detail in the [documentation](https://pysqa.readthedocs.io/en/latest/queue.html#flux). ", + "metadata": {} + }, + { + "id": "42a53d33-2916-461f-86be-3edbe01d3cc7", + "cell_type": "markdown", + "source": [ + "## Submit job to queue\n", + "Submit a job to the queue - if no queue is specified it is submitted to the default queue defined in the queue configuration:" + ], + "metadata": {} + }, + { + "id": "a3f2ba3a-0f82-4a0a-aa63-b5e71f8f8b39", + "cell_type": "code", + "source": "queue_id = qa.submit_job(\n    job_name=None,\n    working_directory=\".\",\n    cores=1,\n    memory_max=None,\n    run_time_max=None,\n    dependency_list=None,\n    submission_template=None,\n    command=\"sleep 5\",\n)\nqueue_id", + "metadata": { + "trusted": true + }, + "outputs": [ + { + "execution_count": 3, + "output_type": "execute_result", + "data": { + "text/plain": "64156073984" + }, + "metadata": {} + } + ], + "execution_count": 3 + }, + { + "id": "9aa0fdf9-0827-4706-bfed-6b95b95dd061", + "cell_type": "markdown", + "source": "The only required parameter is: \n* `command` the command that is executed as part of the job \n\nAdditional options for the submission of the job are:\n* `job_name` the name of the job submitted to the queuing system. \n* `working_directory` the working directory the job submitted to the queuing system is executed in.\n* `cores` the number of cores used for the calculation. If the cores are not defined, the minimum number of cores defined for the selected queue is used. \n* `memory_max` the memory used for the calculation. 
\n* `run_time_max` the run time for the calculation. If the run time is not defined the maximum run time defined for the selected queue is used. \n* `dependency_list` other jobs the calculation depends on.\n* `submission_template` the template submission script.\n* `**kwargs` allows writing additional parameters to the job submission script if they are available in the corresponding template.\n", + "metadata": {} + }, + { + "id": "e9cef4ba-ddf6-4cd5-9519-ba93ce13256a", + "cell_type": "markdown", + "source": "The submission script template can be imported directly using `from pysqa.wrapper.flux import template`: ", + "metadata": {} + }, + { + "id": "5379ef70-39a5-45ac-b325-d71abe1ba4b0", + "cell_type": "code", + "source": "from pysqa.wrapper.flux import template\n\ntemplate.split(\"\\n\")", + "metadata": { + "trusted": true + }, + "outputs": [ + { + "execution_count": 4, + "output_type": "execute_result", + "data": { + "text/plain": "['#!/bin/bash',\n '# flux: --job-name={{job_name}}',\n '# flux: --env=CORES={{cores}}',\n '# flux: --output=time.out',\n '# flux: --error=error.out',\n '# flux: -n {{cores}}',\n '{%- if run_time_max %}',\n '# flux: -t {{ [1, run_time_max // 60]|max }}',\n '{%- endif %}',\n '{%- if dependency %}',\n '{%- for jobid in dependency %}',\n '# flux: --dependency=afterok:{{jobid}}',\n '{%- endfor %}',\n '{%- endif %}',\n '',\n '{{command}}',\n '']" + }, + "metadata": {} + } + ], + "execution_count": 4 + }, + { + "id": "672854fd-3aaa-4287-b29c-d5370e4adc14", + "cell_type": "markdown", + "source": [ + "## Show jobs in queue\n", + "Get status of all jobs currently handled by the queuing system:" + ], + "metadata": {} + }, + { + "id": "73518256-faf8-4fea-bc40-9b2198903bf5", + "cell_type": "code", + "source": "qa.get_queue_status()", + "metadata": { + "trusted": true + }, + "outputs": [ + { + "execution_count": 5, + "output_type": "execute_result", + "data": { + "text/plain": " jobid user jobname status\n0 64156073984 jovyan None running", + 
"text/html": "<div>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>jobid</th>\n      <th>user</th>\n      <th>jobname</th>\n      <th>status</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>64156073984</td>\n      <td>jovyan</td>\n      <td>None</td>\n      <td>running</td>\n    </tr>\n  </tbody>\n</table>\n</div>" + }, + "metadata": {} + } + ], + "execution_count": 5 + }, + { + "id": "9338f32f-b127-4700-8aba-25aded6b548f", + "cell_type": "markdown", + "source": "With the additional parameter `user` the listing can be restricted to the jobs of a specific user. \n\nAnalogously, the jobs of the current user can be listed with: ", + "metadata": {} + }, + { + "id": "cf6e59e8-f117-4d4a-9637-f83ec84c62fa", + "cell_type": "code", + "source": "qa.get_status_of_my_jobs()", + "metadata": { + "trusted": true + }, + "outputs": [ + { + "execution_count": 6, + "output_type": "execute_result", + "data": { + "text/plain": " jobid user jobname status\n0 64156073984 jovyan None running", + "text/html": "
<div>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>jobid</th>\n      <th>user</th>\n      <th>jobname</th>\n      <th>status</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>64156073984</td>\n      <td>jovyan</td>\n      <td>None</td>\n      <td>running</td>\n    </tr>\n  </tbody>\n</table>\n</div>" + }, + "metadata": {} + } + ], + "execution_count": 6 + }, + { + "id": "d2566873-2d30-4801-9d86-287a247fb7c6", + "cell_type": "markdown", + "source": "Finally, the status of a specific job with the queue id `queue_id` can be retrieved from the queuing system using:", + "metadata": {} + }, + { + "id": "ee8e14db-cc6e-47e7-a1e5-035427ca83a9", + "cell_type": "code", + "source": "qa.get_status_of_job(process_id=queue_id)", + "metadata": { + "trusted": true + }, + "outputs": [ + { + "execution_count": 7, + "output_type": "execute_result", + "data": { + "text/plain": "'running'" + }, + "metadata": {} + } + ], + "execution_count": 7 + }, + { + "id": "f89528d3-a3f5-4adb-9f74-7f70270aec12", + "cell_type": "markdown", + "source": [ + "## Delete job from queue\n", + "Delete a job with the queue id `queue_id` from the queuing system:" + ], + "metadata": {} + }, + { + "id": "06e1535b-eafd-4b94-ba33-ba24da088a33", + "cell_type": "code", + "source": "qa.delete_job(process_id=queue_id)", + "metadata": { + "tags": [], + "trusted": true + }, + "outputs": [ + { + "execution_count": 8, + "output_type": "execute_result", + "data": { + "text/plain": "''" + }, + "metadata": {} + } + ], + "execution_count": 8 + } + ] +}