AWS ParallelCluster v3.9.0
We're excited to announce the release of AWS ParallelCluster 3.9.0.
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
ENHANCEMENTS
- Allow updating external shared storage of type Efs, FsxLustre, FsxOntap, FsxOpenZfs and FileCache without replacing the compute and login fleet.
- Allow updating the `MinCount`, `MaxCount`, `Queue` and `ComputeResource` configuration parameters without stopping the compute fleet. These can now be updated by setting `Scheduling/SlurmSettings/QueueUpdateStrategy` to TERMINATE. ParallelCluster will terminate only the nodes removed during a resize of the cluster capacity performed through a cluster update.
- Add support for RHEL9.
- Add support for Rocky Linux 9 as `CustomAmi` created through the `build-image` process. No public official ParallelCluster Rocky Linux 9 AMI is made available at this time.
- Remove `CommunicationParameters` from the Custom Slurm Settings deny list.
- Add the configuration parameter `DeploymentSettings/DefaultUserHome` to allow users to move the default user's home directory to `/local/home` instead of `/home` (default).
- Add the configuration parameter `DeploymentSettings/DisableSudoAccessForDefaultUser` to disable sudo access of the default user in supported OSes.
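
As a rough illustration of how several of these enhancements might appear together in a cluster configuration file, here is a minimal sketch. It is a fragment, not a complete configuration. The queue name, compute resource name, instance type, counts and AMI ID are placeholders. The `Os: rocky9` identifier and the `Local` value for `DefaultUserHome` are assumptions based on the parameter descriptions above and should be checked against the configuration reference for 3.9.0.

```yaml
# Illustrative fragment only; networking, head node and other required
# sections are omitted.
Image:
  Os: rocky9                         # assumed OS identifier for Rocky Linux 9
  CustomAmi: ami-0123456789abcdef0   # placeholder; an AMI produced by build-image
DeploymentSettings:
  DefaultUserHome: Local             # assumed value that selects /local/home instead of /home
  DisableSudoAccessForDefaultUser: true
Scheduling:
  Scheduler: slurm
  SlurmSettings:
    QueueUpdateStrategy: TERMINATE   # permits resize updates without stopping the compute fleet
  SlurmQueues:
    - Name: queue1                   # hypothetical queue name
      ComputeResources:
        - Name: cr1                  # hypothetical compute resource name
          InstanceType: c5.xlarge    # example instance type
          MinCount: 0
          MaxCount: 10               # change this value, then apply a cluster update
```

With `QueueUpdateStrategy` set to TERMINATE, lowering `MaxCount` and applying a cluster update should terminate only the nodes removed by the resize, as described in the list above.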
CHANGES
- Upgrade Slurm to 23.11.4 (from 23.02.7).
- Upgrade Pmix to 4.2.9 (from 4.2.6).
- Add support for Python 3.11, 3.12 in pcluster CLI and aws-parallelcluster-batch-cli.
- Build network interfaces using network card index from `NetworkCardIndex` list of EC2 DescribeInstances response, instead of looping over `MaximumNetworkCards` range.
- Fail cluster creation when using instance types P3, G3, P2 and G2 because their GPU architecture is not compatible with Open Source Nvidia Drivers (OpenRM) introduced as part of 3.8.0 release.
- Upgrade the default FSx Lustre server version managed by ParallelCluster to 2.15.
- Upgrade NVIDIA driver to version 535.154.05.
- Upgrade EFA installer to `1.30.0`.
  - Efa-driver: `efa-2.6.0-1`
  - Efa-config: `efa-config-1.15-1`
  - Efa-profile: `efa-profile-1.6-1`
  - Libfabric-aws: `libfabric-aws-1.19.0`
  - Rdma-core: `rdma-core-46.0-1`
  - Open MPI: `openmpi40-aws-4.1.6-2` and `openmpi50-aws-5.0.0-11`
- Upgrade NICE DCV to version `2023.1-16388`.
  - server: `2023.1.16388-1`
  - xdcv: `2023.1.565-1`
  - gl: `2023.1.1047-1`
  - web_viewer: `2023.1.16388-1`
- Upgrade ARM PL to version 23.10.
- Upgrade third-party cookbook dependencies:
  - nfs-5.1.2 (from nfs-5.0.0)
BUG FIXES
- Refactor IAM policies defined in CloudFormation template `parallelcluster-policies.yaml` to prevent ParallelCluster API deployment failure caused by policies exceeding IAM limits.
- Fix issue making jobs fail when submitted as an Active Directory user from login nodes. The issue was caused by an incomplete configuration of the integration with the external Active Directory on the head node.
- Fix issue making login nodes fail to bootstrap when the head node takes more time than expected in writing keys.