AWS ParallelCluster v3.0.0
lukeseawalker released this on 10 Sep 15:51
We're excited to announce the release of AWS ParallelCluster 3.0.0
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
3.0.0
ENHANCEMENTS
- Add support for pcluster actions (e.g., create-cluster, update-cluster, delete-cluster) through HTTP endpoints with Amazon API Gateway.
- Revamp custom AMI creation and management by leveraging EC2 Image Builder. This also includes the implementation of `build-image`, `delete-image`, `describe-image` and `list-image` commands to manage custom ParallelCluster images.
- Add `list-official-images` command to describe ParallelCluster official AMIs.
- Add `export-cluster-logs`, `list-cluster-logs` and `get-cluster-log-events` commands to retrieve both CloudWatch Logs and CloudFormation Stack Events. Add `export-image-logs`, `list-image-logs` and `get-image-log-events` commands to retrieve both Image Builder Logs and CloudFormation Stack Events.
- Enable the possibility to restart / reboot the head node also for instance types with instance store. These operations remain the responsibility of the user, who must manage the status of the cluster while operating on the head node, e.g. by stopping the compute fleet first.
- Add support to use an existing Private Route53 Hosted Zone when using Slurm as scheduler.
- Add the possibility to configure the instance profile as an alternative to configuring the IAM role for the head node and for each compute queue.
- Add the possibility to configure IAM role, profile and policies for head node and for each compute queue.
- Add possibility to configure different security groups for each queue.
- Allow full control on the name of CloudFormation stacks created by ParallelCluster by removing the `parallelcluster-` prefix.
- Add multiple queues and compute resources support for `pcluster configure` when the scheduler is Slurm.
- Add prompt for availability zone in `pcluster configure` automated subnets creation.
- Add configuration `HeadNode / Imds / Secured` to enable/disable restricted access to Instance Metadata Service (IMDS).
- Implement scaling protection mechanism with Slurm scheduler: compute fleet is automatically set to 'PROTECTED' state in case recurrent failures are encountered when provisioning nodes.
- Add `--suppress-validators` and `--validation-failure-level` parameters to `create` and `update` commands.
- Add support for associating an existing Elastic IP to the head node.
- Extend limits for supported number of Slurm queues (10) and compute resources (5).
- Encrypt root EBS volumes and shared EBS volumes by default. Note that if the scheduler is AWS Batch, the root volumes
of the compute nodes cannot be encrypted by ParallelCluster.
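
Settings in these notes are written as slash-separated paths, such as `HeadNode / Imds / Secured`. The following is a minimal Python sketch of how such a path can be read as nested keys in the cluster configuration. The helper `set_by_path` is hypothetical and not part of ParallelCluster, and the value `True` is only an illustration, not a statement about the default.

```python
import json


def set_by_path(config, path, value):
    """Set a value in a nested dict using the 'A / B / C' notation.

    Hypothetical helper for illustration; not a ParallelCluster API.
    """
    keys = [key.strip() for key in path.split("/")]
    node = config
    # Walk (and create) every intermediate section, then set the leaf key.
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return config


config = {}
# Illustrative value only; pick the setting appropriate for your cluster.
set_by_path(config, "HeadNode / Imds / Secured", True)

# Print the nested structure; in a real cluster configuration file the same
# nesting would be expressed with YAML indentation.
print(json.dumps(config, indent=2))
```

Each path segment becomes one level of nesting, so `Secured` ends up as a key under `Imds`, which in turn sits under `HeadNode`.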
CHANGES
- Upgrade EFA installer to version 1.13.0
  - EFA configuration: `efa-config-1.9`
  - EFA profile: `efa-profile-1.5`
  - EFA kernel module: `efa-1.13.0`
  - RDMA core: `rdma-core-35`
  - Libfabric: `libfabric-1.13.0`
  - Open MPI: `openmpi40-aws-4.1.1-2`
- Upgrade NICE DCV to version 2021.1-10851.
- Upgrade Slurm to version 20.11.8.
- Upgrade NVIDIA driver to version 470.57.02.
- Upgrade CUDA library to version 11.4.0.
- Upgrade Cinc Client to version 17.2.29.
- Upgrade Python runtime used by Lambda functions in AWS Batch integration to python3.8.
- Remove support for SGE and Torque schedulers.
- Remove support for CentOS8.
- Change format and syntax of the configuration file to be used to create the cluster, from ini to YAML. A cluster configuration file now only includes the definition of a single cluster.
- Remove `--cluster-template`, `--extra-parameters` and `--tags` parameters for the `create` command.
- Remove `--cluster-template`, `--extra-parameters`, `--reset-desired` and `--yes` parameters for the `update` command.
- Remove `--config` parameter for `delete`, `status`, `start`, `stop`, `instances` and `list` commands.
- Remove possibility to specify aliases for `ssh` command in the configuration file.
- Distribute AWS Batch commands: `awsbhosts`, `awsbkill`, `awsbout`, `awsbqueues`, `awsbstat` and `awsbsub` as a separate `aws-parallelcluster-awsbatch-cli` PyPI package.
- Add timestamp suffix to CloudWatch Log Group name created for the cluster.
- Remove `pcluster-config` CLI utility.
- Remove `amis.txt` file.
- Remove additional EBS volume attached to the head node by default.
- Change NICE DCV session storage path to `/home/{UserName}`.
- Create a single ParallelCluster S3 bucket for each AWS region rather than for each cluster.
- Adopt inclusive language
  - Rename MasterServer to HeadNode in CLI outputs.
  - Rename variable exported in the AWS Batch job environment from MASTER_IP to PCLUSTER_HEAD_NODE_IP.
  - Rename all CFN outputs from Master* to HeadNode*.
  - Rename NodeType and tags from Master to HeadNode.
- Rename tags (Note: the following tags are crucial for ParallelCluster scaling logic):
  - `aws-parallelcluster-node-type` -> `parallelcluster:node-type`
  - `ClusterName` -> `parallelcluster:cluster-name`
  - `aws-parallelcluster-attributes` -> `parallelcluster:attributes`
  - `Version` -> `parallelcluster:version`
- Remove tag: `Application`.
- Remove runtime creation method of custom ParallelCluster AMIs.
- Retain CloudWatch logs on cluster deletion by default. If you want to delete the logs during cluster deletion, set `Monitoring / Logs / CloudWatch / RetainOnDeletion` to False in the configuration file.
- Remove instance store software encryption option (encrypted_ephemeral) and rely on default hardware encryption provided by NVMe instance store volumes.
- Add tag 'Name' to every shared storage with the value specified in the shared storage name config.
- Remove installation of MPICH and FFTW packages.
- Remove Ganglia support.
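
Scripts that filter or report on resources by tag may need to account for the tag renames and the removed `Application` tag listed above. The sketch below shows one possible way to translate a dictionary of old tag keys to the new ones. The function `migrate_tags` and the sample tag values are hypothetical. It only renames keys and does not touch tag values, so the Master to HeadNode value change is not covered.

```python
# Old tag key -> new tag key, as listed in the release notes above.
TAG_RENAMES = {
    "aws-parallelcluster-node-type": "parallelcluster:node-type",
    "ClusterName": "parallelcluster:cluster-name",
    "aws-parallelcluster-attributes": "parallelcluster:attributes",
    "Version": "parallelcluster:version",
}

# Tag keys that no longer exist in 3.0.0.
REMOVED_TAGS = {"Application"}


def migrate_tags(tags):
    """Return a copy of `tags` with 2.x keys renamed to their 3.0.0 names.

    Hypothetical helper for illustration; values are copied unchanged and
    keys not mentioned in the release notes are kept as they are.
    """
    migrated = {}
    for key, value in tags.items():
        if key in REMOVED_TAGS:
            continue  # dropped in 3.0.0
        migrated[TAG_RENAMES.get(key, key)] = value
    return migrated


# Sample input with made-up values.
old_tags = {"ClusterName": "demo", "Application": "demo-app", "Owner": "hpc-team"}
new_tags = migrate_tags(old_tags)
print(new_tags)
```

Keeping the mapping in a single dictionary makes it easy to review against the release notes and to reuse in tag-based filters.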