Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* Updated services to use Managed Prometheus. I made slight changes to install file to remove conflicts and unused code
* Update README.md
* working on deployment script
* removing deploy script I will aadd later
* Update README.md saving
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* more descriptive naming
* fix bug not copying config files
* fix Alma bugs not allowing service to start

---------

Co-authored-by: Ubuntu <rafsalas@ub22h3e8f000003.dqjt2vfkzmou3mpiz4ibgqqi1c.ax.internal.cloudapp.net>
Co-authored-by: almalinux Cloud User <hpcuser@hbv22ec6c000000.ypfoet0fe2mefochwlryckov1h.dxbx.internal.cloudapp.net>
1 parent ea3427b commit 26f88e2
Showing 8 changed files with 142 additions and 146 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,74 +3,81 @@ Moneo as a Linux Service | |
Description | ||
----- | ||
Setting up Moneo exporters as Linux service will allow for easy management and deployment of exporters. | ||
This guide will walk you through how to set up Linux services for Moneo exporters. | ||
|
||
Prerequisites | ||
----- | ||
If using [Azure's Ubuntu HPC AI VM image](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/microsoft-dsvm.ubuntu-hpc?tab=overview) all dependencies will already be installed. Dependencies can be installed on workers using this script [Install Script](../src/worker/install/install.sh). | ||
|
||
Below are the dependencies needed (installed by the install script): | ||
1. Python Packages: | ||
- prometheus-client==0.16.0 | ||
- psutil==5.9.4 | ||
- filelock==3.10.0 | ||
2. DCGM 3.1.6 | ||
|
||
Instructions without Publisher service | ||
----- | ||
1. Install dependencies using install script (not needed if dependencies already installed) | ||
- ```sudo ../src/worker/install/install.sh``` | ||
|
||
2. Run the [configure_service.sh](./configure_service.sh) with the full Moneo path as an argument | ||
- ```sudo ./configure_service.sh <Moneo_PATH>``` | ||
- If an argument isn't provided, it will use the default directory, i.e. /opt/azurehpc/tools/Moneo | ||
|
||
Note: The configure script will modify the moneo@.service file to point to the exporter scripts. | ||
Three launch methods provided: | ||
1. The basic launch method launches the exporters on the compute node. It is up to the user to either: | ||
- Use Moneo CLI to launch the manager Grafana and Prometheus containers on a head node. | ||
- Or use your own method to scrape from the exporter ports ("nvidia_exporter": 8000 "net_exporter": 8001 "node_exporter": 8002). | ||
2. Launch exporters and an [Azure Monitor](../docs/AzureMonitorAgent.md) publisher. | ||
- Before launch you must modify the "azure_monitor_agent_config" section of [publisher_config](../src/worker/publisher/config/publisher_config.json) file with the Azure Monitor workspace connection string. | ||
3. Azure Managed Grafana/Prometheus. | ||
- This will require you to set up Managed Prometheus and Managed Grafana | ||
- See prereqs for [Managed Prometheus](../docs/ManagedPrometheusAgent.md) | ||
- Once Managed Prometheus is set up you can link it to a Grafana Dashboard. | ||
- See [Azure Managed Grafana overview](https://learn.microsoft.com/en-us/azure/managed-grafana/overview) for info on setting up Grafana. | ||
|
||
3. To start the services run the following commands: | ||
- With start script: | ||
``` sudo ./start_moneo_services.sh``` | ||
- Manually: | ||
``` | ||
sudo systemctl start moneo@node_exporter.service | ||
sudo systemctl start moneo@net_exporter.service | ||
sudo systemctl start moneo@nvidia_exporter.service | ||
``` | ||
4. To stop the services run: | ||
- With stop script: | ||
``` sudo ./stop_moneo_services.sh ``` | ||
- Manually: | ||
``` | ||
sudo systemctl stop moneo@node_exporter.service | ||
sudo systemctl stop moneo@net_exporter.service | ||
sudo systemctl stop moneo@nvidia_exporter.service | ||
``` | ||
5. To run these commands on multiple VMs in parallel you can use a tool like parallel-ssh: | ||
- ```parallel-ssh -i -t 0 -h hostfile "<command>"``` | ||
This guide will walk you through how to set up Linux services for Moneo exporters. | ||
|
||
Instructions for Moneo services with Publisher service | ||
Prerequisites | ||
----- | ||
The publisher service is experimental and requires additional setup to use. | ||
1. Modify publisher config files | ||
- Moneo/src/worker/install/config/geneva_config.json | ||
- Moneo/src/worker/publisher/config/publisher_config.json | ||
If using [Azure's Ubuntu HPC AI VM image](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/microsoft-dsvm.ubuntu-hpc?tab=overview) all dependencies will already be installed. Additional dependencies are installed as part of this guide. Please see [Install Script](../src/worker/install/install.sh) for details on what Python and Ubuntu packages are installed. DCGM 3.1.6 and higher is required for GPU nodes. This will also be checked/installed via the install script as part of this guide. | ||
|
||
2. Install dependencies using install script (not needed if dependencies already installed) | ||
- Include Geneva agent install: ```sudo ../src/worker/install/install.sh geneva``` | ||
- Include Azure monitor install: ```sudo ../src/worker/install/install.sh azure_monitor``` | ||
|
||
3. Run the [configure_service.sh](./configure_service.sh) with the full Moneo path as an argument | ||
- ```sudo ./configure_service.sh <Moneo_PATH> <publisher type>``` | ||
- Publisher types: "geneva" and "azure_monitor" | ||
|
||
4. To start the services run the following commands based on the publisher type: | ||
- ```sudo ./start_moneo_services.sh geneva <moneo path>``` | ||
- ```sudo ./start_moneo_services.sh azure_monitor``` | ||
5. To stop the services run: | ||
- ```sudo ./stop_moneo_services.sh ``` | ||
6. To run these commands on multiple VMs in parallel you can use a tool like parallel-ssh: | ||
- ```parallel-ssh -i -t 0 -h hostfile "<command>"``` | ||
Below are the prereqs needed: | ||
- PSSH (This can be interchanged with other tools that can do distributed commands. The instructions will use PSSH for Ubuntu) | ||
- AlmaLinux 8.7 | ||
- Ubuntu 20.04/22.04 | ||
- Moneo cloned/installed in the same directory on all compute nodes. | ||
- A host file with the target compute nodes. | ||
|
||
|
||
Instructions for Configuring, Installing and Launching Moneo services | ||
----- | ||
### Configuration and Installation ### | ||
Configuration/Installation is only required once. After that is complete the Linux services can be started and stopped as desired. | ||
1. Configuration and installation of the Linux service is done with the following command: | ||
```parallel-ssh -i -t 0 -h hostfile "sudo <Full Path to Moneo>/linux_service/configure_service.sh <Full Path to Moneo>"``` | ||
- If you will only be launching the exporters without Azure Monitor or Managed Prometheus, continue to the Launch Services section; otherwise continue with the next step. | ||
2. For Azure Monitor or Managed Prometheus methods if you have not yet modified the configuration files reference the following: | ||
- For Azure Managed Prometheus: | ||
- modify [prom_sidecar_config.json](../src/worker/publisher/config) and copy the file to the compute nodes. | ||
- ```parallel-scp -h hostfile <Full Path to Moneo>/src/worker/publisher/config/prom_sidecar_config.json <Full Path to Moneo>/src/worker/publisher/config``` | ||
- Lastly, check that the managed user identity used to set up Managed Prometheus (Azure role assignments) is assigned to your VMSS. | ||
- For Azure Monitor: | ||
- modify the connection string of "azure_monitor_agent_config" section and copy the file to the compute nodes. | ||
- ```parallel-scp -h hostfile <Full Path to Moneo>/src/worker/publisher/config/publisher_config.json <Full Path to Moneo>/src/worker/publisher/config``` | ||
### Launch Services ### | ||
The [start_moneo_services.sh ](./start_moneo_services.sh) script is used to start the Linux services once configuration/installation is complete. | ||
The script takes 3 arguments: | ||
1. Full directory path of Moneo | ||
2. Start with Managed Prometheus (true/false) | ||
3. Start with Azure Monitor (true/false) | ||
An example command would look like (Exporters only): /home/<user>/Moneo/linux_service/start_moneo_services.sh /home/<user>/Moneo false false | ||
|
||
#### Exporters only Launch #### | ||
```parallel-ssh -i -t 0 -h hostfile "sudo <Full Path to Moneo>/linux_service/start_moneo_services.sh <Full Path to Moneo> false false"``` | ||
#### Exporters with Azure Monitor #### | ||
```parallel-ssh -i -t 0 -h hostfile "sudo <Full Path to Moneo>/linux_service/start_moneo_services.sh <Full Path to Moneo> false true"``` | ||
#### Exporters with Managed Prometheus #### | ||
```parallel-ssh -i -t 0 -h hostfile "sudo <Full Path to Moneo>/linux_service/start_moneo_services.sh <Full Path to Moneo> true false"``` | ||
|
||
### Stop Services ### | ||
Stopping services is the same command for all methods. | ||
```parallel-ssh -i -t 0 -h hostfile "sudo <Full Path to Moneo>/linux_service/stop_moneo_services.sh"``` | ||
|
||
### Recap ### | ||
Assuming configuration files have been updated and user managed ID applied if necessary (Managed Prometheus) reference these commands for the work flow: | ||
- Configuration/Install: | ||
```parallel-ssh -i -t 0 -h hostfile "sudo <Full Path to Moneo>/linux_service/configure_service.sh <Full Path to Moneo>"``` | ||
- Extra Configure step for AZ Monitor and/or Managed Prometheus | ||
```parallel-scp -h hostfile <Full Path to Moneo>/src/worker/publisher/config/<Respective config file> <Full Path to Moneo>/src/worker/publisher/config``` | ||
- Start | ||
```parallel-ssh -i -t 0 -h hostfile "sudo <Full Path to Moneo>/linux_service/start_moneo_services.sh <Full Path to Moneo> <Managed Prom true/false> <Az Monitor true/false>"``` | ||
- Stop | ||
```parallel-ssh -i -t 0 -h hostfile "sudo <Full Path to Moneo>/linux_service/stop_moneo_services.sh"``` | ||
|
||
Note: This guide uses PSSH to distribute the commands. Any similar tool, such as PDSH, can be used instead. The scripts can also be called from job schedulers or individually. | ||
|
||
|
||
Updating job ID | ||
----- | ||
|
@@ -80,5 +87,5 @@ To update job name/ID we can use the [job ID update script](../src/worker/jobIdU | |
|
||
or see [Update Job Id With Moneo CLI](../docs/JobFiltering.md) | ||
|
||
Note: use parallel-ssh to distribute this command to a cluster (i.e. step 5 of the instructions) | ||
Note: use parallel-ssh to distribute this command to a cluster | ||
|
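The start command documented in the README diff above takes the full Moneo path plus two true/false flags (Managed Prometheus, then Azure Monitor). As a minimal sketch, assuming a hypothetical helper named `moneo_start_cmd` and a default host file called `hostfile` (neither is part of Moneo), a small wrapper can validate the flags and print the parallel-ssh command before anything is run on the nodes:

```shell
#!/bin/bash
# Hypothetical helper (not part of Moneo): builds the parallel-ssh command
# for start_moneo_services.sh from the three arguments the README documents.
moneo_start_cmd() {
    local moneo_path="$1"            # full directory path of Moneo
    local managed_prom="${2:-false}" # start with Managed Prometheus (true/false)
    local az_monitor="${3:-false}"   # start with Azure Monitor (true/false)
    local hostfile="${4:-hostfile}"  # PSSH host file (assumed default name)

    # Reject anything other than true/false so a typo never reaches the nodes.
    local flag
    for flag in "$managed_prom" "$az_monitor"; do
        case "$flag" in
            true|false) ;;
            *) echo "Error: flags must be 'true' or 'false', got '$flag'" >&2
               return 1 ;;
        esac
    done

    # Only print the command; running it is left to the caller.
    printf 'parallel-ssh -i -t 0 -h %s "sudo %s/linux_service/start_moneo_services.sh %s %s %s"\n' \
        "$hostfile" "$moneo_path" "$moneo_path" "$managed_prom" "$az_monitor"
}

# Exporters with Managed Prometheus; the command is printed, not executed.
moneo_start_cmd /opt/azurehpc/tools/Moneo true false
```

Printing the command rather than executing it keeps the sketch safe to try on a machine without PSSH installed; the output can be reviewed and then pasted or piped to a shell.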
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,37 +1,20 @@ | ||
#!/bin/bash | ||
|
||
MONEO_PATH=$1 | ||
PUBLISHER=$2 | ||
|
||
if [[ -z "$MONEO_PATH" ]]; | ||
then | ||
MONEO_PATH=/opt/azurehpc/tools/Moneo | ||
echo 'default Moneo path used' | ||
fi | ||
|
||
if [[ ! -d "$MONEO_PATH" ]]; | ||
then | ||
echo "Error: Moneo path $MONEO_PATH does not exist. Ensure you are using the correct arguments | ||
(i.e. ./configure_service.sh <Moneo_path>, or ./configure_service.sh <Moneo_path> <publisher-type>). Exiting." | ||
(i.e. ./configure_service.sh <Moneo Full Path>, or ./configure_service.sh <Moneo Full Path> true/false). Exiting." | ||
exit 1 | ||
fi | ||
|
||
if [[ -n $PUBLISHER ]]; | ||
then | ||
if [ "$PUBLISHER" != "geneva" ] && [ "$PUBLISHER" != "azure_monitor" ]; | ||
then | ||
echo "Error: $PUBLISHER is not an acceptable value for publisher type. Options are 'geneva' or 'azure_monitor'. Exiting." | ||
exit 1 | ||
|
||
fi | ||
fi | ||
|
||
# replace the Moneo path placeholder with the actual Moneo path and move the service file to the systemd directory | ||
sed "s#<Moneo_Path>#$MONEO_PATH#g" $MONEO_PATH/linux_service/moneo@.service > /etc/systemd/system/moneo@.service | ||
|
||
if [[ -n $PUBLISHER ]]; | ||
then | ||
sed "s#<Moneo_Path>#$MONEO_PATH#g; s#<pub-type>#$PUBLISHER#g;" $MONEO_PATH/linux_service/moneo_publisher.service > /etc/systemd/system/moneo_publisher.service | ||
fi | ||
echo "configuring publisher service" | ||
sed "s#<Moneo_Path>#$MONEO_PATH#g;" $MONEO_PATH/linux_service/moneo_publisher.service > /etc/systemd/system/moneo_publisher.service | ||
|
||
$MONEO_PATH/src/worker/install/install.sh azure_monitor | ||
|
||
systemctl daemon-reload |
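The `sed` calls in the script above fill in a `<Moneo_Path>` placeholder in the service files before they are written to the systemd directory. The following sketch illustrates that substitution in isolation. The function name `render_unit` and the stand-in template contents are assumptions for illustration, since the real unit files are not shown in this diff:

```shell
#!/bin/bash
# Hypothetical helper: render a unit template by replacing the <Moneo_Path>
# placeholder. The sed expression mirrors the one in configure_service.sh.
render_unit() {
    local moneo_path="$1" template="$2"
    # '#' is the sed delimiter, so the '/' characters in the path need no escaping.
    sed "s#<Moneo_Path>#$moneo_path#g" "$template"
}

# Stand-in template; these two lines are made up for the example.
template="$(mktemp)"
printf 'ExecStart=<Moneo_Path>/example/start.sh\nWorkingDirectory=<Moneo_Path>\n' > "$template"

# Print the rendered template to stdout instead of writing to /etc/systemd/system.
render_unit /opt/azurehpc/tools/Moneo "$template"
rm -f "$template"
```

One limitation of this approach: a Moneo path containing `#` or `&` would be misinterpreted by `sed`, so such paths would need escaping first.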