Home
Welcome to the AWS ParallelCluster Wiki
- Upgrade NVIDIA GPU Drivers on a cluster
- Upgrade the OpenPMIx package on a Slurm cluster managed with AWS ParallelCluster
- Upgrade Slurm in an AWS ParallelCluster cluster
- Interactive Jobs with qlogin, qrsh (sge) or srun (slurm)
- Deprecation of SGE and Torque in ParallelCluster
- Transition from SGE to SLURM
- How to enable slurmrestd on ParallelCluster
- How to setup Public Private Networking
- Open MPI Install from Source and Uninstall
- Git Pull Request Instructions
- Use ED25519 Keys with Ubuntu 22.04
- Using a Multi-NIC instance as single NIC
- ParallelCluster: Launching a Login Node
- Launch instances with ODCR (On-Demand-Capacity-Reservations)
- Configuring all_or_nothing_batch launches
- MultiUser Support
- ParallelCluster Awesomeness
- Self patch a Cluster Used for Submitting Multi node Parallel Jobs through AWS Batch
- AWS Batch with a custom Dockerfile
- Use an Existing Elastic IP
- Create cluster with encrypted root volumes
- How to use a native NICE DCV Client
- Create Ubuntu AMI with Unattended Upgrades disabled
- Update cluster when snapshot associated to EBS volume is deleted
- Installing Alternate CUDA Versions on AWS ParallelCluster
- (3.0.0-3.9.0) Build image CloudFormation stacks fail to delete after images are successfully built
- (3.6.0) NVIDIA GPU nodes fail to start with custom AMI built from DLAMI
- (3.0.0-3.6.0) Ptrace_scope not disabled for Ubuntu compute nodes
- (3.0.0-3.6.0) Compute Nodes Belonging To More Than One Partition Causes Compute Scaling To Overscale
- (3.2.0-3.5.1) GPU nodes not coming back online after scontrol reboot
- (3.3.0-3.5.1) Cluster updates can break Slurm accounting functionality
- (3.0.0-3.5.0) DCV virtual session on Ubuntu 20.04 might show a black screen
- (3.3.0-3.4.1) Custom AMI creation fails on Ubuntu 20.04 during MySQL packages installation
- (3.3.0-3.4.0) Slurm cluster NodeName and NodeAddr mismatch after cluster scaling
- (3.0.0-3.2.1) Running nodes might be mistakenly replaced when new jobs are scheduled
- (3.0.0-3.1.4) ParallelCluster API Stack Upgrade Fails for ECR resources
- (3.0.0-3.1.4) Unable to perform cluster update when using API or documented user policies
- (3.0.0-3.1.3) AWSBatch Multi node Parallel jobs fail if no EBS defined in cluster
- (3.1.1-3.1.2) Profiles not loaded when connected through NICE DCV session
- (3.0.0-3.1.3) build image creates invalid images when using aws-cdk.aws-imagebuilder==1.153
- (3.0.0-3.1.2) build image stack deletion failed after image successfully created
- (3.0.0) Cluster scaling fails after a head node reboot on Ubuntu 18.04 and Ubuntu 20.04
- (3.0.0) Deleting API Infrastructure produces CFN Stacks failure
- (2.2.1-3.3.0) Risk of deletion of managed FSx for Lustre file system when updating a cluster
- (3.0.2 / 2.11.3 and earlier) Possible performance degradation due to Log4j CVE-2021-44228 hotpatch service on Amazon Linux 2
- (2.10.1-2.11.2 and 3.0.0) Custom AMI creation (pcluster createami or pcluster build-image) fails with ARM architecture
- (3.0.2 / 2.11.3 and earlier) Custom AMI creation fails for centos7 and ubuntu1804 (issue started on 12/8/2021, resolved on 1/20/2022)
- (2.8.0-2.10.1) Configuration validation failure: architecture of AMI and instance type does not match
- (2.10.0) Issue with CentOS 8 Custom AMI creation
- (2.5.0-2.10.0) Issue with Ubuntu 18.04 Custom AMI creation
- (2.10.1-2.10.2) Issue running Ubuntu 18 ARM AMI on first generation AWS Graviton instances
- (2.10.1-2.10.2) P4d support on Amazon Linux 1
- (2.6.0-2.10.3) Custom AMI creation (pcluster createami) fails
- (2.9.1 and earlier) Custom AMI creation (pcluster createami) fails
- (2.10.0 and earlier) Cluster creation fails if enable_intel_hpc_platform=true is in the configuration file
- (2.10.4 and earlier) Batch cluster creation fails in China regions
- (2.11.0) Possible performance degradation on Amazon Linux 2 when enabling CloudWatch Logging
- (2.10.0-2.11.1) NVIDIA Fabric Manager stops running on Ubuntu 18.04 and Ubuntu 20.04
- (2.11.2 and earlier) Custom AMI creation (pcluster createami) fails when building SGE
- (2.11.4) DCV Connection Through Web Browsers Does Not Work
- (2.10.0-2.11.4) Numeric tags interpreted as integers instead of strings can cause a value error in the compute resource launch template
- (2.11.7 and earlier) Cluster creation fails with awsbatch scheduler