kennygarreau/nvidia-collector

NVIDIA Collector Extension for AppDynamics Machine Agent

This is an AppDynamics Machine Agent monitor (extension) that gathers metrics from NVIDIA GPUs used in AI (or other) workloads. Metrics are collected every minute and published to the AppDynamics Metric Browser. The following metrics are captured:

  • Fan Speed
  • GPU Temperature
  • Power Draw
  • Graphics Clock (MHz)
  • Max Graphics Clock (MHz)
  • Mem Clock (MHz)
  • Max Mem Clock (MHz)
  • SM Clock (MHz)
  • Max SM Clock (MHz)
  • Video Clock (MHz)
  • Max Video Clock (MHz)
  • Mem Free
  • Mem Reserved
  • Mem Total
  • Mem Used
  • Per-Process Memory Usage
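
As a rough illustration of how metrics like these could be collected, the sketch below queries `nvidia-smi` in CSV mode and turns each value into a line in the `name=...,value=...` form that Machine Agent script monitors read from standard output. It is not taken from this repository's code: the field subset, the function names, and the `Custom Metrics|NVIDIA|GPU` path prefix are assumptions chosen for the example.

```python
import csv
import io
import subprocess

# (nvidia-smi query field, display label). The labels and the metric path
# prefix used below are illustrative assumptions, not this extension's names.
QUERY_FIELDS = [
    ("fan.speed", "Fan Speed"),
    ("temperature.gpu", "GPU Temperature"),
    ("power.draw", "Power Draw"),
    ("clocks.gr", "Graphics Clock"),
    ("clocks.max.gr", "Max Graphics Clock"),
    ("memory.free", "Mem Free"),
    ("memory.used", "Mem Used"),
    ("memory.total", "Mem Total"),
]


def run_nvidia_smi():
    """Return raw CSV text with one row per GPU (needs the NVIDIA driver)."""
    fields = ",".join(["index"] + [field for field, _ in QUERY_FIELDS])
    cmd = ["nvidia-smi", f"--query-gpu={fields}", "--format=csv,noheader,nounits"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def to_metric_lines(csv_text, prefix="Custom Metrics|NVIDIA|GPU"):
    """Convert nvidia-smi CSV rows into Machine Agent metric lines."""
    lines = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if not row:
            continue
        gpu_index, values = row[0].strip(), row[1:]
        for (_, label), raw in zip(QUERY_FIELDS, values):
            try:
                # Values are rounded because the agent expects integers.
                value = round(float(raw))
            except ValueError:
                continue  # skip non-numeric readings such as "[N/A]"
            lines.append(
                f"name={prefix}|{gpu_index}|{label},value={value},aggregator=OBSERVATION"
            )
    return lines
```

On a host with the driver installed, `print("\n".join(to_metric_lines(run_nvidia_smi())))` would emit one line per GPU and metric. Keeping the parser separate from the subprocess call lets it be exercised with sample text on machines without a GPU.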

Requirements:

  • nvidia-smi (provided by the NVIDIA drivers)
  • python3
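
Before installing, it may help to confirm that both requirements are on the `PATH` of the user running the Machine Agent. The following is a minimal sketch of such a check; the function name `missing_requirements` is hypothetical and not part of this extension.

```python
import shutil


def missing_requirements(which=shutil.which):
    """Return the required executables that cannot be found on PATH."""
    required = ("nvidia-smi", "python3")
    return [tool for tool in required if which(tool) is None]
```

Calling `missing_requirements()` returns an empty list when both tools are available; the `which` parameter exists only so the lookup can be substituted in tests.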

Installation:

git clone https://github.com/kennygarreau/nvidia-collector.git
mv nvidia-collector /path/to/machine/agent/monitors/
sudo systemctl restart appdynamics-machine-agent

Example metrics screenshots below:

[Screenshot, 2024-03-07 3:05:04 PM] [Screenshot, 2024-03-07 3:06:18 PM]
