Skip to content

PBSPro/hpcm-pbspro-connector

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

20 Commits
 
 
 
 

Repository files navigation

#
# Copyright (C) 1994-2019 Altair Engineering, Inc.
# For more information, contact Altair at www.altair.com.
#
# This file is part of the PBS Professional ("PBS Pro") software.
#
# Open Source License Information:
#
# PBS Pro is free software. You can redistribute it and/or modify it under the
# terms of the GNU Affero General Public License as published by the Free
# Software Foundation, either version 3 of the License, or (at your option) any
# later version.
#
# PBS Pro is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE.
# See the GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
# Commercial License Information:
#
# For a copy of the commercial license terms and conditions,
# go to: (http://www.pbspro.com/UserArea/agreement.html)
# or contact the Altair Legal Department.
#
# Altair’s dual-license business model allows companies, individuals, and
# organizations to create proprietary derivative works of PBS Pro and
# distribute them - whether embedded or bundled with other software -
# under a commercial license agreement.
#
# Use of Altair’s trademarks, including but not limited to "PBS™",
# "PBS Professional®", and "PBS Pro™" and Altair’s logos is subject to Altair's
# trademark licensing policies.
#


Introduction
   PBS Professional (PBS Pro) software optimizes job scheduling and workload
   management in high-performance computing (HPC) and artificial intelligent
   (AI) environments – clusters, clouds, and supercomputers – improving system
   efficiency and people’s productivity.

   HPE Performance Cluster Manager (HPCM) delivers an integrated system
   management solution for Linux®-based high performance computing (HPC)
   clusters. It provides complete provisioning, management, and monitoring for
   clusters scaling to 100,000 nodes. 

   PBS Pro and HPCM Connector project allows Admins using HPCM GUI to perform
   workload manager tasks such as configuring the server, start/stop services,
   monitor and control jobs and nodes etc.

   The Connector is supported on both Flat and Hierarchical systems.


Prerequisites
   HPCM software installed on Admin Node.

   PBS Professional installation on a non HPCM Admin Node
   - PBS Professional Server/Scheduler is installed on the non HPCM Admin Node
   - PBS Professional Client commands are installed on the HPCM Admin Node.
   - PBS Professional MOM is installed in HPCM image_groups (os images).
   - Compute nodes should be reachable from Admin node.
     You may have to configure DNS search path on Admin node.
     Please refer Release Notes of HPCM.

   PBS Professional installation on a HPCM Admin Node
   - PBS Professional Server/Scheduler is installed on the HPCM Admin Node
   - PBS Professional MOM is installed in HPCM image_groups (os images)
   - Compute nodes should be reachable from Admin node.
     You may have to configure DNS search path on Admin node.
     Please refer Release Notes of HPCM.


Installation Instructions
   On the Admin Node do the following:

   1. Download hpcm-pbspro-connector-<version>.tar.gz tarball to /opt/clmgr/contrib

      cp hpcm-pbspro-connector*.tar.gz /opt/clmgr/contrib

   2. Change directory to /opt/clmgr/contrib

      cd /opt/clmgr/contrib

   3. Unpackage hpcm-pbspro-connector tarball

      tar zxvf hpcm-pbspro-connector*.tar.gz

   4. Move hpcm_pbspro_connector directory from hpcm-pbspro-connector-<version>
      to /opt/clmgr/contrib

      mv hpcm-pbspro-connector-<version>/hpcm_pbspro_connector .

   5. Update /opt/clmgr/etc/ActionAndAlertsFile.txt w/ the PBS_Specific_ActionAndAlerts.txt
      NOTE: File contains PBS Professional specific configurations that need to be incorporated in /opt/clmgr/etc/ActionAndAlertsFile.txt file.

      MANUALLY ADDED PBS PROFESSIONAL SPECIFIC CONTENT TO THE DESIGNATED AREAS OF THE FILE.

   6. Update /opt/clmgr/etc/cmu_custom_menu w/ the PBS_Specific_hpcm_custom_menu
      NOTE: File contains PBS Professional specific configurations that need to be incorporated in /opt/clmgr/etc/cmu_custom_menu

      mv ../etc/cmu_custom_menu ../etc/cmu_custom_menu.orig
      cat ../etc/cmu_custom_menu.orig hpcm_pbspro_connector/etc/PBS_Specific_hpcm_custom_menu > ../etc/cmu_custom_menu

   7. Restart CMU service for configuration to take effect

      /etc/init.d/cmu restart

   8. You can optionally add users to be managers to be able to make changes to
      the configuration by adding user to managers list as:
      qmgr -c "set server managers += <site_user_account>"

      Refer PBS Professional Admin Guide for more details on how to add user
      account as manager.


PBS Pro Failover
   - Connector supports PBS Pro Failover out of the box. Refer PBS Professional
     Admin Guide on how to setup failove in PBS Pro.
   - You need to enable start/stop functionality of PBS Pro server in failover mode.
     By default, it is not enabled. If used as is, you might be able to start/
     stop PBS primary server only.
   - cmu_custom_menu file has configuration for this.
   - To enable start/stop of All PBS servers

   1. Simply comment out section for "No PBS Pro Failover".
   2. Uncomment section for "PBS Pro Failover".


Troubleshooting Tips
   - Cannot connect to server or mom.
     Check DNS configuration or /etc/hosts file.
     Check if PBS Pro services are running.
     Check if /etc/pbs.conf file on all nodes is consistent.