AWS ParallelCluster v2.4.0

@lukeseawalker lukeseawalker released this 11 Jun 15:29
· 112 commits to master since this release
e94e9c2

We're excited to announce the release of AWS ParallelCluster Cookbook 2.4.0.

This is associated with AWS ParallelCluster v2.4.0.

Enhancements

  • Add support for EFA on CentOS 7, Amazon Linux and Ubuntu 16.04
  • Add support for Ubuntu in China region cn-northwest-1

Changes

  • SGE: change the following parameters in the global configuration
    • max_unheard 00:03:00: allows a faster reaction in case of faulty nodes
    • reschedule_unknown 00:00:30: enables rescheduling of jobs running on failing nodes
    • qmaster_params ENABLE_FORCED_QDEL_IF_UNKNOWN: forces job deletion on unresponsive nodes
    • qmaster_params ENABLE_RESCHEDULE_KILL: forces rescheduling or killing of jobs running on failing nodes
  • Slurm: decrease SlurmdTimeout to 120 seconds to speed up replacement of faulty nodes
  • Always use the full master FQDN when mounting NFS on compute nodes. This resolves issues seen with certain networking
    setups and custom DNS configurations
  • Set soft and hard ulimit on open files to 10000 for all supported OSs
  • Pin the Python supervisor package to version 3.4.0
  • Remove unused compute_instance_type from jobwatcher.cfg
  • Remove unused max_queue_size from sqswatcher.cfg
  • Remove double quoting of the post_install args
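
The open-files limit and the Slurm timeout listed above can be checked on a node with a short script. The following is a minimal sketch, not part of the release: the helper names `nofile_limits_match` and `parse_slurmd_timeout` are hypothetical, the expected values (10000 and 120) are taken from the notes above, and the location of slurm.conf is left to the caller because it depends on the installation.

```python
import re
import resource  # Unix-only standard library module

EXPECTED_NOFILE = 10000        # soft and hard limit stated in the notes above
EXPECTED_SLURMD_TIMEOUT = 120  # seconds, as stated in the notes above


def nofile_limits_match(limits, expected=EXPECTED_NOFILE):
    """Return True if both the soft and hard open-file limits equal `expected`."""
    soft, hard = limits
    return soft == expected and hard == expected


def parse_slurmd_timeout(conf_text):
    """Return SlurmdTimeout (in seconds) from slurm.conf text, or None if absent."""
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.match(r"(?i)^SlurmdTimeout\s*=\s*(\d+)$", line)
        if match:
            return int(match.group(1))
    return None


if __name__ == "__main__":
    # Limits of the current process; a login shell on the node should inherit them.
    limits = resource.getrlimit(resource.RLIMIT_NOFILE)
    status = "ok" if nofile_limits_match(limits) else "differs"
    print("open files (soft, hard):", limits, status)
```

To check the Slurm setting, read the cluster's slurm.conf into a string and compare the result of `parse_slurmd_timeout` with `EXPECTED_SLURMD_TIMEOUT`.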

Bug Fixes

  • Fix issue that was preventing Torque from being used on CentOS 7
  • Start node daemons at the end of instance initialization. The time spent on the post-install script and node
    initialization is no longer counted as part of node idle time.
  • Fix issue that was causing an additional, invalid EBS mount point to be added when multiple EBS volumes are configured
  • Install Slurm libpmpi/libpmpi2, which has been distributed in a separate package since Slurm 17

Support

Need help / have a feature request?
AWS Support: https://console.aws.amazon.com/support/home
ParallelCluster issue tracker on GitHub: https://github.com/aws/aws-parallelcluster
The HPC Forum on the AWS Forums page: https://forums.aws.amazon.com/forum.jspa?forumID=192