I came across the metrics exporter but am not able to set it up.
The errors are:
{"level":"info","ts":1679291005.7844253,"msg":"reading metrics file","metricsFile":""}
{"level":"error","ts":1679291005.7844558,"msg":"failed to read metrics file","error":"open : no such file or directory","stacktrace":"main.main\n\t/workspace/cmd/metricsexporter/metricsexporter.go:62\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"}
Can someone please point me to how to set this up? We need to collect per-pod GPU utilization metrics.
suchisur changed the title from "Metrics-exporter" to "Metrics-exporter setup; How to go about it?" on Mar 20, 2023
Hi @suchisur, thanks for your interest in nos! The metrics exporter in nos does not provide GPU utilization metrics and is only used to optionally share basic telemetry data during nos installation as described in this documentation page.
For collecting GPU utilization metrics, I'd suggest using Prometheus with the NVIDIA DCGM Exporter. If you are already using the NVIDIA GPU Operator, you can easily set up the DCGM exporter as described here. Hope this helps!
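Once Prometheus is scraping the DCGM exporter, per-pod utilization can be read back through the Prometheus HTTP query API. The snippet below is only a minimal sketch under a few assumptions that are not confirmed in this thread: that the exporter publishes the `DCGM_FI_DEV_GPU_UTIL` gauge, that its Kubernetes pod mapping is enabled so samples carry `namespace` and `pod` labels (some scrape configs rename these, for example to `exported_pod`), and that the helper names `build_query_url` and `parse_per_pod_utilization` are purely illustrative.

```python
import json
from urllib.parse import urlencode

# Assumption: metric and label names as exposed by the DCGM exporter with
# Kubernetes pod mapping enabled. Adjust them to match your own setup.
PROMQL = 'avg by (namespace, pod) (DCGM_FI_DEV_GPU_UTIL{pod!=""})'


def build_query_url(base_url, promql=PROMQL):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_per_pod_utilization(payload):
    """Map (namespace, pod) to GPU utilization from an instant-query response."""
    body = json.loads(payload)
    if body.get("status") != "success":
        raise ValueError(f"Prometheus query failed: {body.get('error', 'unknown error')}")
    utilization = {}
    for sample in body["data"]["result"]:
        labels = sample["metric"]
        key = (labels.get("namespace", ""), labels.get("pod", ""))
        # Each instant-vector sample is [timestamp, "value-as-string"].
        utilization[key] = float(sample["value"][1])
    return utilization
```

To try it, fetch the URL with any HTTP client, for instance `urllib.request.urlopen(build_query_url("http://localhost:9090")).read()`, and pass the response body to `parse_per_pod_utilization`. The address here is a placeholder for wherever your Prometheus server is reachable.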