
MultiUser Support


See #170

This approach is a fairly lightweight way of adding users: it gives all users the same permissions. If you want a more robust multi-user setup, follow this guide instead: https://aws.amazon.com/blogs/opensource/managing-aws-parallelcluster-ssh-users-with-openldap/

In order to create a user for the cluster, that user needs to exist on all of the compute nodes. If they don't, Slurm won't be able to schedule their jobs and you won't be able to run MPI jobs across multiple nodes.
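For example, you can check whether a given user already exists on a node by querying the passwd database (shown here for the hypothetical user newuser):

$ getent passwd newuser
# prints the user's passwd entry if it exists; exits non-zero otherwise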

  1. Create users on the head node:
$ sudo useradd newuser --create-home
$ sudo mkdir -p /home/newuser/.ssh
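If you are adding several users at once, a simple loop can save typing. A minimal sketch using the example names newuser and newuser2:

$ for u in newuser newuser2; do sudo useradd "$u" --create-home; sudo mkdir -p "/home/$u/.ssh"; done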
  2. Copy the keypair and authorized_keys over to the new user's .ssh directory:
$ sudo cp ~/.ssh/* /home/newuser/.ssh
$ sudo chmod 600 /home/newuser/.ssh/*
$ sudo chmod 700 /home/newuser/.ssh
$ sudo chown -R newuser:newuser /home/newuser/.ssh
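If the new user should log in with their own key pair rather than the one copied from the default user, you can append their public key as well. A minimal sketch, where /path/to/newuser_key.pub is a placeholder for that user's public key:

$ cat /path/to/newuser_key.pub | sudo tee -a /home/newuser/.ssh/authorized_keys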
  3. Create a file in the shared directory (assuming /shared) with each user's username and UID, like so:
$ echo "newuser,`id -u newuser`" >> /shared/userlistfile
$ echo "newuser2,`id -u newuser2`" >> /shared/userlistfile
  4. Create a script create_users.sh that contains:
#!/bin/bash

. "/etc/parallelcluster/cfnconfig"

get_node_type() {
    # Versions earlier than 3.0.0 use cfn_node_type.
    # Versions 3.0.0 and later use node_type.
    if [ -n "${cfn_node_type}" ]; then
        echo $cfn_node_type
    elif [ -n "${node_type}" ]; then
        echo $node_type
    else
        echo 1>&2 "Unable to determine node type"
        exit 1
    fi
}

IFS=","

if [ "$(get_node_type)" = "ComputeFleet" ]; then
    while read USERNAME USERID
    do
        # -M: do not create a home directory, since the head node exports /home via NFS
        # -u: set the UID to match what is set on the head node
        # Only add the user if it does not already exist on this node
        if ! id -u "$USERNAME" >/dev/null 2>&1; then
            useradd -M -u "$USERID" "$USERNAME"
        fi
    done < "/shared/userlistfile"
fi
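Before uploading, you can sanity-check the script on the head node; the one-liner below only echoes what would be created and is not part of the setup itself:

$ bash -n create_users.sh
$ while IFS="," read USERNAME USERID; do echo "would create $USERNAME with UID $USERID"; done < /shared/userlistfile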
  5. Upload it to S3 and add it to your config:
$ aws s3 cp create_users.sh s3://[your_bucket]/

For versions prior to 3.0.0, update your config like the following:

[cluster clustername]
s3_read_resource = arn:aws:s3:::[your_bucket]/*
post_install = s3://[your_bucket]/create_users.sh

For versions 3.0.0 and later, configure the script to run on each compute queue (and make sure you grant the nodes permission to read the script from S3 if necessary):

Scheduling:
  Scheduler: slurm
  SlurmQueues:
  - Name: queue0
    ...
    CustomActions:
      OnNodeConfigured:
        Script: s3://[your_bucket]/create_users.sh
    ...
    Iam:
      S3Access:
        - BucketName: [your_bucket]
          KeyName: create_users.sh
    ...
  - Name: queue1
    ...
    CustomActions:
      OnNodeConfigured:
        Script: s3://[your_bucket]/create_users.sh
    ...
    Iam:
      S3Access:
        - BucketName: [your_bucket]
          KeyName: create_users.sh
    ...
  6. Stop and update the running cluster:

For versions prior to 3.0.0:

$ pcluster stop [clustername]
# no need to wait 
$ pcluster update [clustername]
$ pcluster start [clustername]

For versions 3.0.0 and later:

$ pcluster update-compute-fleet --cluster-name [clustername] --status STOP_REQUESTED
# Wait for the following command to return "STOPPED"
$ pcluster describe-compute-fleet --cluster-name [clustername] --query 'status'
$ pcluster update-cluster --cluster-configuration /path/to/updated/config.yaml --cluster-name [clustername]
# Wait for the following command to return "UPDATE_COMPLETE"
$ pcluster describe-cluster --cluster-name [clustername] --query 'clusterStatus'
$ pcluster update-compute-fleet --cluster-name [clustername] --status START_REQUESTED
# Wait for the following command to return "RUNNING"
$ pcluster describe-compute-fleet --cluster-name [clustername] --query 'status'
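Once the fleet is running again, a quick way to confirm the new user works on the compute nodes is to submit a trivial job as that user (this assumes the queue can start at least two nodes):

$ sudo su - newuser
$ srun -N 2 hostname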
