The deploy/ directory contains scripts for deploying the DAPHNE system. With these scripts one can:
- build the DAPHNE system (using build.sh),
- package it,
- deliver and install it on a deployment platform (e.g. HPC), and
- utilize the resources of multiple machines/nodes.

The scripts can also be used to just try out DAPHNE on a single machine.
Once deployed, the DAPHNE system consists of multiple `DistributedWorker`s and a single `coordinator`, which is responsible for handling distributed execution.
- deployDistributed.sh deploys DAPHNE manually, using only SSH. When executed without parameters, it prints a help message.
- deploy-distributed-on-slurm.sh is intended for environments that provide the Slurm workload manager. When executed without parameters, it prints a help message.
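Since both scripts print their help message when run without parameters, a quick way to explore them is to invoke each one with no arguments. The following is a minimal sketch, assuming both scripts are executable in the given directory; the helper `show_deploy_help` and the variable `DEPLOY_DIR` are illustrative names and not part of DAPHNE.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run each deployment script without parameters
# so that it prints its help message.
show_deploy_help() {
  local dir="${1:-.}"
  local script
  for script in deployDistributed.sh deploy-distributed-on-slurm.sh; do
    if [ -x "$dir/$script" ]; then
      echo "== $script =="
      # No arguments: according to the README, this prints the help message.
      "$dir/$script"
    else
      echo "== $script: not found or not executable in $dir =="
    fi
  done
}

# DEPLOY_DIR is an assumed variable; it defaults to the current directory.
show_deploy_help "${DEPLOY_DIR:-.}"
```

Run it from inside deploy/, or set `DEPLOY_DIR` to the path of that directory.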
The DAPHNE deployment scheme encompasses the following:
- A compilation node (where the DAPHNE system is compiled)
- Deployment platform (e.g. an HPC with SLURM support)
  - Login node (or other type of access)
  - HPC task submission interface (e.g. SLURM)
  - Compute node(s)
    - DAPHNE `coordinator`
    - DAPHNE `DistributedWorker`s
DAPHNE Deployment Scheme
+--------------------------------------------------------------------------------------+
|                                                                                      |
|   +------------------+                                                               |
|   | Compilation node |                                                               |
|   |                  |                                                               |
|   +------------------+                                                               |
|            |                                                                         |
|            |                                                                         |
|            | (SSH connection)                                                        |
|            |                                                                         |
|            |                                                                         |
| +----------------------------------------------------------------------------------+ |
| | Deployment Platform (e.g. an HPC with SLURM support)                             | |
| |                                                                                  | |
| | +------------------------------+                                                 | |
| | | Access/Submission/Login Node |                                                 | |
| | |                              |                                                 | |
| | +------------------------------+                                                 | |
| |          |                                                                       | |
| |          |                                                                       | |
| |          | Network connections, e.g. Infiniband, to e.g. SLURM interfaces,       | |
| |          | used also for communications between MT and DWs.                      | |
| |          +------------------------------------+-------------------------+        | |
| |          |                                    |                         |        | |
| | +--------------------------+     +--------------------------+     +-----------+  | |
| | | Node 1                   |     | Node 2                   |     | Node n    |  | |
| | | - Resources              | ... |                          | ... |           |  | |
| | |   - CPU/GPU/FPGA         |     | CPU/GPU/FPGAs            |     | Resources |  | |
| | | - Running Tasks          |     | (e.g. 128+)              |     |           |  | |
| | |   - `coordinator`        |     | {DistributedWorker (DW)} |     | DWs       |  | |
| | |   - (optional: more DWs) |     | (e.g. DWs 1..128)        |     |           |  | |
| | +--------------------------+     +--------------------------+     +-----------+  | |
| |                                                                                  | |
| +----------------------------------------------------------------------------------+ |
|                                                                                      |
+--------------------------------------------------------------------------------------+
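The scheme above relies on an SSH connection from the compilation node to the access/login node of the deployment platform. Before deploying, it can help to confirm that this connection works non-interactively. The sketch below is only an illustration: the helper `check_login_node`, the `DRY_RUN` switch, and the address `user@login.example.org` are placeholders that do not appear in the DAPHNE scripts.

```shell
#!/usr/bin/env bash
# Hypothetical check that the compilation node can reach the login node over SSH.
check_login_node() {
  local target="$1"   # placeholder of the form user@host
  # BatchMode=yes disables interactive password prompts;
  # ConnectTimeout=5 bounds the connection attempt to five seconds.
  local cmd=(ssh -o BatchMode=yes -o ConnectTimeout=5 "$target" hostname)
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Only print the command that would be executed.
    echo "${cmd[*]}"
    return 0
  fi
  "${cmd[@]}"
}

# Dry run with a placeholder address; replace it with a real login node.
DRY_RUN=1 check_login_node "user@login.example.org"
```

Without `DRY_RUN=1`, the function runs `hostname` on the remote side and returns a non-zero status if key-based login is not set up.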
This directory includes a set of bash scripts providing support for:
- packaging/virtualization of the deployment (installation) package,
- containerized packaging,
- virtualized installation,
- managed deployment,
- deployment of the `daphne` executable,
- starting and managing DAPHNE processes within containerized environments (scheduling and executing SLURM tasks remotely), and
- stopping and cleaning of a deployment.
- This short README file, which explains the directory structure and points to further documentation at Deploy.
- A script that builds the "daphne.sif" Singularity image from the Docker image daphneeu/daphne-dev.
- The deploy-distributed-on-slurm script, which allows the user to deploy DAPHNE with SLURM.
- The deployDistributed script, which builds DAPHNE and sends it to remote machines manually over SSH (no tools like Slurm needed).
- example-time.daphne: a DAPHNE example script that prints the running time of a simple operation.
- The Singularity image configuration file.
- Documentation about deployment, including tutorial-like explanation examples about how to package, distributively deploy, manage, and execute workloads using DAPHNE.
- Getting started guide
- Building the DAPHNE system
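The list above mentions a script that builds the "daphne.sif" Singularity image from the Docker image daphneeu/daphne-dev. As a rough sketch of what such a build can look like, assuming the `singularity` command-line tool is installed: the helper `build_daphne_sif` and the `DRY_RUN` switch are hypothetical, and the script shipped in this directory may use different options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a Singularity image from the Docker image
# named in this README (daphneeu/daphne-dev).
build_daphne_sif() {
  local image="${1:-daphne.sif}"
  local source="docker://daphneeu/daphne-dev"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Only print the command that would be executed.
    echo "singularity build $image $source"
    return 0
  fi
  if ! command -v singularity >/dev/null 2>&1; then
    echo "singularity not found in PATH" >&2
    return 1
  fi
  singularity build "$image" "$source"
}

# Dry run: shows the build command without pulling or building anything.
DRY_RUN=1 build_daphne_sif
```

Calling `build_daphne_sif` without `DRY_RUN=1` performs the actual build, which downloads the Docker image and may take a while.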