NVIDIA Collector Extension for AppDynamics Machine Agent

This is an AppDynamics Machine Agent monitor (extension) that gathers metrics from NVIDIA GPUs used in AI (or other) workloads. Metrics are gathered every minute and published to the AppDynamics Metric Browser. The following metrics are captured:

  • Fan Speed
  • GPU Temperature
  • Power Draw
  • Graphics Clock (MHz)
  • Max Graphics Clock (MHz)
  • Mem Clock (MHz)
  • Max Mem Clock (MHz)
  • Sm Clock (MHz)
  • Max Sm Clock (MHz)
  • Video Clock (MHz)
  • Max Video Clock (MHz)
  • Mem Free
  • Mem Reserved
  • Mem Total
  • Mem Used
  • Per-Process Memory Usage
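
As a rough illustration of how metrics like these can be collected, the sketch below queries `nvidia-smi` in CSV mode and prints lines in the `name=<metric path>,value=<integer>` form that Machine Agent script monitors read from standard output. It is not taken from this extension's source: the function names, the subset of fields, and the `Custom Metrics|NVIDIA` path prefix are assumptions chosen for the example.

```python
import csv
import io
import subprocess

# nvidia-smi --query-gpu field -> metric label (a subset, for illustration only).
FIELDS = {
    "fan.speed": "Fan Speed",
    "temperature.gpu": "GPU Temperature",
    "power.draw": "Power Draw",
    "clocks.current.graphics": "Graphics Clock (MHz)",
    "clocks.max.graphics": "Max Graphics Clock (MHz)",
    "memory.free": "Mem Free",
    "memory.used": "Mem Used",
    "memory.total": "Mem Total",
}


def query_gpus():
    """Run nvidia-smi and return its CSV output, one row per GPU."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=index," + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def format_metrics(csv_text, prefix="Custom Metrics|NVIDIA"):
    """Turn nvidia-smi CSV rows into Machine Agent metric lines.

    The prefix is a hypothetical metric path, not the one this extension uses.
    """
    lines = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if not row:
            continue
        index, values = row[0].strip(), row[1:]
        for label, raw in zip(FIELDS.values(), values):
            try:
                value = round(float(raw))  # the agent expects integer values
            except ValueError:
                continue  # skip non-numeric readings such as "[N/A]"
            lines.append(f"name={prefix}|GPU {index}|{label},value={value}")
    return lines


# On a host with the NVIDIA driver installed:
#   print("\n".join(format_metrics(query_gpus())))
```

Keeping the parsing separate from the `nvidia-smi` call means the formatting can be checked against sample text on a machine without a GPU. Per-process memory usage would need a second query (for example `--query-compute-apps=pid,used_memory`), which is omitted here.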

Requirements:

  • nvidia-smi (provided by NVIDIA Drivers)
  • python3
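
Before installing, it can help to confirm that both requirements are on the `PATH` of the user running the Machine Agent. The following is a minimal sketch of such a check; `missing_tools` is a hypothetical helper, not part of this extension.

```python
import shutil


def missing_tools(tools=("nvidia-smi", "python3"), which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


# Example:
#   missing = missing_tools()
#   if missing:
#       print("Missing requirements: " + ", ".join(missing))
```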

Installation:

git clone this repo
mv nvidia-collector /path/to/machine/agent/monitors
sudo systemctl restart appdynamics-machine-agent

Example metrics screenshots below:

[Screenshot, 2024-03-07 3:05:04 PM] [Screenshot, 2024-03-07 3:06:18 PM]