AWS ParallelCluster v3.0.0

@lukeseawalker lukeseawalker released this 10 Sep 15:51

We're excited to announce the release of AWS ParallelCluster 3.0.0.

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

3.0.0

ENHANCEMENTS

  • Add support for pcluster actions (e.g., create-cluster, update-cluster, delete-cluster) through HTTP endpoints
    with Amazon API Gateway.
  • Revamp custom AMI creation and management by leveraging EC2 Image Builder. This also includes the implementation of
    build-image, delete-image, describe-image and list-images commands to manage custom ParallelCluster images.
  • Add list-official-images command to describe ParallelCluster official AMIs.
  • Add export-cluster-logs, list-cluster-logs and get-cluster-log-events commands to retrieve both CloudWatch Logs
    and CloudFormation Stack Events. Add export-image-logs, list-image-logs and get-image-log-events commands to
    retrieve both Image Builder Logs and CloudFormation Stack Events.
  • Allow the head node to be restarted or rebooted even for instance types with instance store.
    These operations are still managed by the user, who is responsible for the status of the cluster while operating
    on the head node, e.g. by stopping the compute fleet first.
  • Add support for using an existing Private Route53 Hosted Zone when using Slurm as scheduler.
  • Add the possibility to configure the instance profile as an alternative to configuring the IAM role for the head node
    and for each compute queue.
  • Add the possibility to configure IAM role, profile and policies for the head node and for each compute queue.
  • Add the possibility to configure different security groups for each queue.
  • Allow full control over the name of CloudFormation stacks created by ParallelCluster by removing the
    parallelcluster- prefix.
  • Add multiple queues and compute resources support for pcluster configure when the scheduler is Slurm.
  • Add a prompt for the availability zone in pcluster configure automated subnet creation.
  • Add configuration HeadNode / Imds / Secured to enable/disable restricted access to Instance Metadata Service (IMDS).
  • Implement a scaling protection mechanism with the Slurm scheduler: the compute fleet is automatically set to the
    'PROTECTED' state when recurrent failures are encountered while provisioning nodes.
  • Add --suppress-validators and --validation-failure-level parameters to create and update commands.
  • Add support for associating an existing Elastic IP to the head node.
  • Raise the limits on the supported number of Slurm queues (10) and compute resources (5).
  • Encrypt root EBS volumes and shared EBS volumes by default. Note that if the scheduler is AWS Batch, the root volumes
    of the compute nodes cannot be encrypted by ParallelCluster.
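
Several entries above name configuration settings with a slash-separated path, such as HeadNode / Imds / Secured. The sketch below illustrates how such a path can be read as nested keys in the YAML cluster configuration. The helper names set_by_path and to_yaml are hypothetical and not part of ParallelCluster, the renderer handles nested mappings of scalars only, and the value true is chosen purely for illustration.

```python
from typing import Any, Dict


def set_by_path(config: Dict[str, Any], path: str, value: Any) -> Dict[str, Any]:
    """Set a value in a nested dict using the 'A / B / C' notation of these notes."""
    keys = [part.strip() for part in path.split("/")]
    node = config
    for key in keys[:-1]:
        # Create intermediate mappings on demand.
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return config


def to_yaml(config: Dict[str, Any], indent: int = 0) -> str:
    """Render nested dicts of scalars as YAML text (mappings only, no lists or quoting)."""
    lines = []
    pad = "  " * indent
    for key, value in config.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.append(to_yaml(value, indent + 1))
        elif isinstance(value, bool):
            # YAML booleans are conventionally written in lowercase.
            lines.append(f"{pad}{key}: {str(value).lower()}")
        else:
            lines.append(f"{pad}{key}: {value}")
    return "\n".join(lines)


# Illustrative value only; choose the setting that fits your cluster.
fragment = set_by_path({}, "HeadNode / Imds / Secured", True)
print(to_yaml(fragment))
```

The printed fragment shows only the nesting of this one setting; a complete cluster configuration needs further sections that these notes do not list.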

CHANGES

  • Upgrade EFA installer to version 1.13.0
    • EFA configuration: efa-config-1.9
    • EFA profile: efa-profile-1.5
    • EFA kernel module: efa-1.13.0
    • RDMA core: rdma-core-35
    • Libfabric: libfabric-1.13.0
    • Open MPI: openmpi40-aws-4.1.1-2
  • Upgrade NICE DCV to version 2021.1-10851.
  • Upgrade Slurm to version 20.11.8.
  • Upgrade NVIDIA driver to version 470.57.02.
  • Upgrade CUDA library to version 11.4.0.
  • Upgrade Cinc Client to version 17.2.29.
  • Upgrade Python runtime used by Lambda functions in AWS Batch integration to python3.8.
  • Remove support for SGE and Torque schedulers.
  • Remove support for CentOS 8.
  • Change the format and syntax of the configuration file used to create the cluster from INI to YAML. A cluster
    configuration file now only includes the definition of a single cluster.
  • Remove --cluster-template, --extra-parameters and --tags parameters for the create command.
  • Remove --cluster-template, --extra-parameters, --reset-desired and --yes parameters for the update command.
  • Remove --config parameter for delete, status, start, stop, instances and list commands.
  • Remove the possibility to specify aliases for the ssh command in the configuration file.
  • Distribute AWS Batch commands: awsbhosts, awsbkill, awsbout, awsbqueues, awsbstat and awsbsub as a
    separate aws-parallelcluster-awsbatch-cli PyPI package.
  • Add timestamp suffix to CloudWatch Log Group name created for the cluster.
  • Remove pcluster-config CLI utility.
  • Remove amis.txt file.
  • Remove additional EBS volume attached to the head node by default.
  • Change NICE DCV session storage path to /home/{UserName}.
  • Create a single ParallelCluster S3 bucket for each AWS region rather than for each cluster.
  • Adopt inclusive language:
    • Rename MasterServer to HeadNode in CLI outputs.
    • Rename variable exported in the AWS Batch job environment from MASTER_IP to PCLUSTER_HEAD_NODE_IP.
    • Rename all CFN outputs from Master* to HeadNode*.
    • Rename NodeType and tags from Master to HeadNode.
  • Rename tags (Note: the following tags are crucial for ParallelCluster scaling logic):
    • aws-parallelcluster-node-type -> parallelcluster:node-type
    • ClusterName -> parallelcluster:cluster-name
    • aws-parallelcluster-attributes -> parallelcluster:attributes
    • Version -> parallelcluster:version
  • Remove tag: Application.
  • Remove runtime creation method of custom ParallelCluster AMIs.
  • Retain CloudWatch logs on cluster deletion by default. If you want to delete the logs during cluster deletion, set
    Monitoring / Logs / CloudWatch / RetainOnDeletion to False in the configuration file.
  • Remove instance store software encryption option (encrypted_ephemeral) and rely on default hardware encryption provided
    by NVMe instance store volumes.
  • Add a 'Name' tag to every shared storage resource, with the value specified as the shared storage name in the
    configuration.
  • Remove installation of MPICH and FFTW packages.
  • Remove Ganglia support.
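
Scripts that select resources by the old tag keys will need to use the new ones. As a minimal sketch, the mapping below copies the renamed and removed tag keys from the list above and applies them to a plain dictionary of tags. The function name translate_tags and the sample values are hypothetical; only tag keys are translated, and tag values are left untouched.

```python
from typing import Dict

# Old tag key -> new tag key, copied from the "Rename tags" entries above.
TAG_RENAMES: Dict[str, str] = {
    "aws-parallelcluster-node-type": "parallelcluster:node-type",
    "ClusterName": "parallelcluster:cluster-name",
    "aws-parallelcluster-attributes": "parallelcluster:attributes",
    "Version": "parallelcluster:version",
}

# Tag keys removed in 3.0.0.
REMOVED_TAGS = {"Application"}


def translate_tags(old_tags: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of old_tags with 2.x tag keys mapped to their 3.0.0 names."""
    new_tags: Dict[str, str] = {}
    for key, value in old_tags.items():
        if key in REMOVED_TAGS:
            continue  # dropped in 3.0.0, nothing to map to
        # Keys without a rename (e.g. user-defined tags) pass through unchanged.
        new_tags[TAG_RENAMES.get(key, key)] = value
    return new_tags


# Hypothetical sample tags, for illustration only.
sample = {"ClusterName": "demo", "Application": "demo-app", "Owner": "alice"}
print(translate_tags(sample))
```

Keeping the renames in one table makes it straightforward to update tag-based filters in one place rather than scattering the new key names through a script.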