-
-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Aqua Computer Quadro data read inconsistently #440
Comments
Another thing I just noticed after a fresh boot: While the water was apparently at around 23° C, OhmGraphite was reporting 32° C. Might this be a wrong reading of the bits, maybe because of an old version or something similar? |
Can you try out the build from #441: https://github.com/nickbabcock/OhmGraphite/actions/runs/8214992430/artifacts/1311727196 It adds trace logging of the sensors that will get reported downstream. You'll need to swap the level of the file logger to trace: <logger name="*" minlevel="Trace" writeTo="file" /> Then tail |
I looked at the logs for this sensor "Temperatuer 1" on my Quadro, and while it started out quite right, it slowly gets out of sync with the reporting from my other tools. I am currently sitting at 26.8° C water, while OhmGraphite has
Two minutes later, at I am running Another data point: 3 tools (FanControl, LibreHardwareMonitor, HWInfo) reporting |
Hmm I wonder if somehow previous reported values are influencing current ones. I pushed commit that makes it a bit more explicit. I don't /think/ it'll change anything but perhaps worth a shot: https://github.com/nickbabcock/OhmGraphite/actions/runs/8215332382/artifacts/1311777388 |
Sorry for the late test. I just tested it. Stable version reported 28°C, the one you linked lasted jumped down to 21.8° but now that water is at 22.7 it still says 22.0, lagging behind in reporting. So nothing changed, unfortunately. I can see the same values being reported in the log. Can I look somewhere else for more details? |
Thank you for your patience. I am equally stumped. But a couple of things to try: One is to set the update interval in either LibreHardwareMonitor or OhmGraphite to match each other. By default, LibreHardwareMonitoris every second while OhmGraphite is once every 5 seconds. Maybe try changing the OhmGraphite update interval to 5 second (or if you are using prometheus, update the interval there). Another test is to run OhmGraphite interactively with |
I was actually running the test versions interactively but had already played around with other settings for the service you suggested for simular problems, like "allow service to interact with desktop". Nothing helped. Now, upon setting the interval to "1" (down from "5"), Grafana started reporting correct results! There are minimal differences that are probably due to the fact that my dashboard is only reloading every 5 seconds anyway. While the data is up to date, I notice that the update interval keeps on increasing. It starts with saying "Sent 197 metrics in 70ms". This time slowly increases with each reporting, and after around 20 minutes it was at around 9770ms. Might this be the/a bug? Could also be a problem on my slow server but hadn't seen such behaviour before with the 5 seconds reporting interval. Now, I would like to not have that much data and keep my reporting interval at 5 seconds. |
This is excellent news! We have found the culprit. Nicely done.
Polling hardware at different intervals is not currently supported. It is technically possible to solve it at the OhmGraphite level, but I'm wary of the time effort needed to workaround this bugs.
I'm not sure how the Quadro works under the hood, so your guess is as good as mine. What I would do is report this issue upstream in LibreHardwareMonitor (I'm assuming you can reproduce the issue if you change the LibreHardwareMonitor refresh rate to something like 5 seconds). And tag the people who wrote the feature initially to see if they have any insights on how this problem can be resolved. LibreHardwareMonitor/LibreHardwareMonitor#938
This is concerning. What backend are you using? Hopefully the machine it's running on is something underpowered 😄 Most of the metrics sinks are well optimized to send metrics in batches, but if something still seems fishy, let me know. |
Even better. I set it to 5 seconds in the LibreHardwareMonitor UI, then increased and decreased the actual temperature value (as reported by FanControl) from 29 up to 32, and back to 29. No changes in the UI. You can, however, set the refresh interval to 250ms which works like a "fast-forward" in this case and I was able to observe the increase and decrease after the fact in a sort of timelapse! 😆
Its a VM with an SSD and enough free capacity in space and CPU. My reporting didn't say anything about any spikes and when looking at the time for 5 second report intervals, it says at around 30-40 ms constantly. Might keep this in mind but my first prio is to report the error to upstream. I had already looked at the PR before but couldn't find any documentation on the reverse engineering of the protocol, so maybe they know more. |
I added an Aqua Computer Quadro to my setup and noticed that both its temperature reading and reported fan RPM values were incorrect in Grafana. I checked with Libre Hardware Monitor and FanControl as well as HWInfo64, all report correct values while OhmGraphite seems incorrect.
The observed behavior is very confusing since the reported value seems to lag behind its actual value. Example: The water temperature starts out correct at ~22 degrees C and stays at that level for a few minutes while the water is actually already heating up to say >30° C. Then, while all other programs are reporting the water to cool down to ~28°, it is reported at >30° in Grafana. At some point, it catches up but only keeps it at 30° for a while, while it actually already is at 27-28°. Similar behavior can be seen for the RPM reporting.
Sometimes, values randomly jump or drop to zero.
In the following image, you can see the water pump's speed (yellow) and the reported flow (blue). Other tools have it at zero for a few minutes already (which might be a problem on my side), but OhmGraphite still reports a different value for a while, then jumps up and drops to zero.
Now, since other programs are agreeing on the same values and restarting the OhmGraphite service does trigger jumps to "correct" values, it seems like some sort of wrong reading of the data? Maybe not reading enough of some buffer? Since it's using LibreHardwareMonitor under the hood, I'm not sure how that would be when their app works fine.
Happy to test out things and report back!
The text was updated successfully, but these errors were encountered: