The HPC Client is a self-service solution that allows Flywheel jobs and gears to run on a High Performance Computing environment. Use on-premises hardware that's already available for highly concurrent scientific workloads!
The client, also called Cast, can support several queue mechanisms out of the box; however, Flywheel currently provides support only for Slurm. If you require assistance with other schedulers, contact Flywheel.
| Common name | Code name |
|---|---|
| IBM Spectrum LSF | lsf |
| Oracle / Sun Grid Engine | sge |
| Slurm | slurm |
If your site uses one of these, it may only need a config file to get running.
Otherwise, some light Python development will be required.
Reference this article for the minimum software and computing requirements of the system where the HPC Client will be installed.
1. Before using Cast, decide how it will run on your cluster. Choose an integration method and keep it in mind for later. This determines how frequently Cast will look for, pull, and queue HPC jobs from your Flywheel site to your HPC.
2. It is strongly recommended that you create a private GitHub repo to track your changes. This will make Cast much easier to manage.
3. Perform the initial cluster setup. If you are unfamiliar with Singularity, it is recommended that you read, at a minimum, SingularityCE's introduction and quick start guides.
4. Create an authorization token so Singularity and Flywheel can work with each other.
5. If your queue type is not in the above table, or is sufficiently different, review the guide for adding a queue type.
6. Collaborate with Flywheel staff to install the Flywheel engine in your HPC repo. They will also configure the hold engine on your Flywheel site to ensure that other engines do not pick up gear jobs tagged with "hpc".
7. Complete the integration method you chose in step one. Confirm Cast is running regularly by monitoring `logs/cast.log` and the Flywheel user interface (see the example after this list).
8. Test and run your first HPC job in collaboration with Flywheel. It is recommended that you test with MRIQC (non-BIDS version), a gear available from Flywheel's Gear Exchange. Note: as of 11 May 2022, Flywheel will have to change the rootfs-url (the location where the Docker image resides) for any gears installed from the Gear Exchange. For more about how Cast uses a rootfs-url, see the Background/Motivation section of this article.
9. Enjoy!
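For step seven, one simple way to confirm Cast is running on schedule is to follow its log file from the root of your Cast install; `logs/cast.log` is the file mentioned above:

```bash
# Follow Cast's log to confirm jobs are being picked up and submitted
tail -f logs/cast.log
```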
How do I set RAM and CPU settings for my job?
Starting in version 2.0.0, the HPC Client performs the following checks to set RAM and CPU settings:
1. Was `scheduler_ram` or `scheduler_cpu` set in the gear config when the Flywheel job was launched? If so, use this. The gear must have these as config variables for them to be set. See the table below for formatting.
2. If no setting was found for that specific job, check the `settings/cast.yml` file for these variables. Settings here apply to all HPC jobs submitted by the HPC Client; only step 1 overrides them.
3. If the setting is still not found, use the default for that specific scheduler type (e.g., Slurm). This is hardcoded and should not be changed.
| Scheduler/cluster | RAM | CPU |
|---|---|---|
| Slurm | '8G' | '8' |
| LSF | 'rusage[mem=4000]' | '1' |
| SGE | '8G' | '4-8' (sets a CPU range) |
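For example, to apply a site-wide default (step 2 above), these variables can be added to `settings/cast.yml`. This is a minimal sketch: the key names come from the checks above and the value formats follow the table, but the values themselves are illustrative, and the exact placement of these keys within the file may differ between Cast versions.

```yaml
# settings/cast.yml (fragment) -- site-wide defaults for HPC jobs submitted by Cast.
# Slurm-style formatting per the table above; values are illustrative.
scheduler_ram: '16G'
scheduler_cpu: '8'
```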
How do I use a custom script template for the jobs submitted to my HPC?
The HPC Client creates a shell script (`.sh`) for every job that is submitted to your HPC through your scheduler (e.g., Slurm). It creates this script from a default script template for the type of scheduler on your HPC. If you would like to use a custom template, you can do so by setting the `script` variable in the `settings/cast.yml` file. It is not recommended to edit the default templates in the source code (e.g., `src/cluster/slurm.py`).
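As a sketch only: assuming the `script` variable takes the template text inline as a multiline string (check the comments in your `settings/cast.yml` and the default template for your scheduler for the exact form and the placeholders Cast substitutes), a custom Slurm template could begin like this. The directives shown are illustrative, not required:

```yaml
# settings/cast.yml (fragment) -- illustrative; assumes `script` accepts the
# template text inline. Keep the placeholders from your scheduler's default
# template (see src/cluster/) so Cast can fill in job-specific values.
script: |
  #!/bin/bash
  #SBATCH --job-name=fw-cast-job
  #SBATCH --partition=general
```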
How do I send my jobs to a specific partition on my HPC?
When you use a custom script template, you can set the partition(s) to which all your jobs will be sent. For example, if your scheduler is Slurm, you can add the following line in your custom script template:
#SBATCH --partition=<partition1_name>,<partition2_name>
Example:
#SBATCH --partition=gpu-1,gpu-2
How do I check my version of the HPC Client?
The version of the HPC Client is in `src/__init__.py` under the variable `__version__`. This was not available prior to 2.0.0.
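For example, from the root of your Cast repository:

```bash
# Print the HPC Client version string (available in 2.0.0 and later)
grep __version__ src/__init__.py
```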