
Cluster parallel processing


Creating jobs on the cluster:

To create massively parallel jobs on the cluster, we first need a few key files.

Shell script for qsub:

The first file we need is the one that launches our job on the cluster.
It tells the scheduler how much RAM we want per thread, how long the job should take, and which program to run.
An example can be found below:


#!/bin/bash

#Change the time you want: time=1:: = 1 hour | time=:1: = 1 min | time=::1 = 1 sec
#Change the allocated ram you want per thread: mem=1G = 1 gig per thread | mem=1M = 1 meg per thread
#$ -l mem=0.1G,time=::30

#Change the number of threads you want: orte 4 = 4 threads | orte 10 = 10 threads
#$ -pe orte 5

#Change the path to the python file you want to run
programPath="/your/job/file/path/here"

#calling the program
/nfs/apps/openmpi/openmpi-4.1.1/bin/mpiexec /ifs/data/ag4522_gp/software/miniconda3/bin/python3 $programPath

This script takes mpiexec from the /nfs/apps path (DO NOT CHANGE UNLESS YOU KNOW WHAT YOU ARE DOING).
It then takes the Python executable from our group's Conda/Python directory (DO NOT CHANGE UNLESS YOU KNOW WHAT YOU ARE DOING).
Lastly, it runs the Python script we want to make massively parallel (this is the path you change).
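
Before running a real job, it can help to sanity-check the MPI + Python setup. The sketch below is a minimal, hypothetical test script (assuming mpi4py is installed in the shared Conda environment): point programPath at it, submit the job, and each rank should print its own line.

#!/usr/bin/env python
#Minimal MPI sanity check: each rank reports itself
from mpi4py import MPI

comm = MPI.COMM_WORLD
print("Hello from rank %d of %d" % (comm.Get_rank(), comm.Get_size()))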

Python command file 1:

The next file we need is the task distributor.
The approach that works well is to have a Python script build a list of commands, split the commands across the threads, and then run each thread's share with subprocess.
The example below is useful if you need to relax a lot of different structures.


"""DO NOT CHANGE FROM HERE (1)"""

#!/usr/bin/env python
import math
import os
import subprocess
import signal
import ctypes
from mpi4py import MPI

#Ask the kernel to send SIGTERM to the child process when the parent (this script) dies,
#so child jobs do not keep running if the MPI job is killed.
def _set_pdeathsig(sig=signal.SIGTERM):

    def callable():
        libc = ctypes.CDLL("libc.so.6")
        return libc.prctl(1, sig)  #1 = PR_SET_PDEATHSIG
    return callable

#Creating MPI interface
comm = MPI.COMM_WORLD

#Getting the rank and size for dividing the work up
size = comm.Get_size()
rank = comm.Get_rank()

commands = []
rankCommands = []

"""TO HERE (1)"""

#create your commands here and save them into the commands array
#-> You can change the for loop below:
for x in range(0,5):
    commands.append("rosetta blah blah blah")

#If you're using Rosetta, you can change where the output files are saved by changing the directory below
os.chdir("/ifs/scratch/ag4522_gp/kcw2152/relaxedMono")

"""DO NOT CHANGE FROM HERE (2)"""
#Build a CPU affinity mask that pins this rank's subprocesses to the core it is currently on
processerMask = 1 << psutil.Process().cpu_num()
"""how many commands a node should have"""
commandPerNode = math.floor(len(commands)/size) + (1 if len(commands) % size > rank % size else 0)
#assigning commands to each node
rankCommands = [commands[rank+i*size] for i in range(commandPerNode)]

#Cycle through this rank's commands, waiting for each one to finish before starting the next.
#schedtool -a pins the subprocess to the affinity mask built above; -e runs the given command.
for cmd in rankCommands:
    test = subprocess.Popen("/ifs/data/ag4522_gp/software/schedtool/schedtool/schedtool -a " + str(hex(processerMask)) + " -e " + cmd, shell=True, preexec_fn=_set_pdeathsig(signal.SIGTERM))
    test.wait()

"""TO HERE (2)"""

Submitting jobs:

To submit the job, run qsub followed by the name of the shell script from the first section.
You can check the status of the job with qstat.

Installing python/conda packages

We maintain a shared lab Python installation. If you need to add packages/libraries, you can do so.

If you need to install a package with pip, run the following command: /ifs/data/ag4522_gp/software/miniconda3/bin/python3 -m pip install "name"

If you need to install a package with conda, run the following command: /ifs/data/ag4522_gp/software/miniconda3/bin/conda install "name"
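
As a quick check (a sketch; the task distributor above needs mpi4py and psutil), you can confirm that a package is importable from the shared interpreter by running a short script like the hypothetical checkPackages.py below:

#Run with the full interpreter path, e.g.:
#/ifs/data/ag4522_gp/software/miniconda3/bin/python3 checkPackages.py
import mpi4py
import psutil

print(mpi4py.__version__, psutil.__version__)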
