draft: Fix dashboard, add variable for pod and amend each expression to filter on pod. #157
Minor nit and a question, but LGTM
examples/dashboard/kube-prometheus-stack-opencost-dashboard.json
}, | ||
"editorMode": "code", | ||
"expr": "topk( 20, \n sum(sum(container_memory_allocation_bytes) by (namespace,instance) * on(instance) group_left() (\n\t\t\t\tnode_ram_hourly_cost{} / 1024 / 1024 / 1024 * 730\n\t\t\t\t+ on(node,instance_type) group_left()\n\t\t\t\t\tlabel_replace\n\t\t\t\t\t(\n\t\t\t\t\t\tkube_node_labels{}, \"instance_type\", \"$1\", \"label_node_kubernetes_io_instance_type\", \"(.*)\"\n\t\t\t\t\t) * 0\n\t\t\t)\n + \n sum(container_cpu_allocation) by (namespace,instance) * on(instance) group_left() (\n\t \t\t\tnode_cpu_hourly_cost{} + on(node,instance_type) group_left()\n\t\t \t\t\tlabel_replace\n\t\t \t\t\t(\n\t\t \t\t\t\tkube_node_labels{}, \"instance_type\", \"$1\", \"label_node_kubernetes_io_instance_type\", \"(.*)\"\n\t\t \t\t\t) * 0\n\t\t \t) * 730) by (namespace)\n)", | ||
"expr": "topk( 20, \n sum(sum(container_memory_allocation_bytes) by (namespace,instance,pod) * on(instance) group_left() (\n\t\t\t\tnode_ram_hourly_cost {pod=~\".*opencost.*\",pod=~\"$pod\"} / 1024 / 1024 / 1024 * 730\n\t\t\t\t+ on(node,instance_type,pod) group_left()\n\t\t\t\t\tlabel_replace\n\t\t\t\t\t(\n\t\t\t\t\t\tkube_node_labels{}, \"instance_type\", \"$1\", \"label_node_kubernetes_io_instance_type\", \"(.*)\"\n\t\t\t\t\t) * 0\n\t\t\t)\n + \n sum(container_cpu_allocation) by (namespace,instance,pod) * on(instance) group_left() (\n\t \t\t\tnode_cpu_hourly_cost {pod=~\".*opencost.*\",pod=~\"$pod\"} + on(node,instance_type,pod) group_left()\n\t\t \t\t\tlabel_replace\n\t\t \t\t\t(\n\t\t \t\t\t\tkube_node_labels{}, \"instance_type\", \"$1\", \"label_node_kubernetes_io_instance_type\", \"(.*)\"\n\t\t \t\t\t) * 0\n\t\t \t) * 730) by (namespace)\n)", |
I noticed all the expressions changed to include pod=~\".*opencost.*\",pod=~\"$pod\". What's the significance?
@mattray In my testing, the dashboard broke when I changed the OpenCost deployment to 2 replicas. As I understand it, this is what caused the "many-to-many matching not allowed: matching labels must be unique on one side" error in the original issue.
Adding the pod variable and changing the expressions to filter on a single pod resolved the issue in my testing.
Before merging, I would urge you to test the same scenario:
- In a values file, change the replica value to 2.
- Deploy OpenCost with the values file.
- Test the dashboard without these changes. If you get the "many-to-many matching not allowed: matching labels must be unique on one side" error on some of the panes, test updating the dashboard in this branch and see if it resolves the issue.
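The failure mode described above can be illustrated with a toy model of the matching rule. This is only a sketch under stated assumptions, not Prometheus's implementation: the series, label values (node-a, opencost-1, opencost-2) and helper names are hypothetical, and the pod filter uses plain equality where the dashboard uses a regex match. It shows that two replicas exporting the same node cost metric leave two series per instance on the "one" side of the join, and that selecting a single pod makes the match unique again.

```python
class ManyToManyError(Exception):
    """Raised when the "one" side has more than one series per match key."""


def match_key(labels, on):
    # Build the match key from the labels listed in on(...).
    return tuple(labels.get(name) for name in on)


def group_left_multiply(many, one, on):
    """Toy model of `many * on(<labels>) group_left() one`."""
    one_by_key = {}
    for labels, value in one:
        key = match_key(labels, on)
        if key in one_by_key:
            raise ManyToManyError(
                "many-to-many matching not allowed: "
                "matching labels must be unique on one side"
            )
        one_by_key[key] = value
    # Each series on the "many" side is multiplied by its single match.
    return [
        (labels, value * one_by_key[match_key(labels, on)])
        for labels, value in many
        if match_key(labels, on) in one_by_key
    ]


def filter_pod(series, pod):
    # Stand-in for the pod=~"$pod" selector (equality instead of regex).
    return [(labels, value) for labels, value in series if labels.get("pod") == pod]


# Hypothetical data: one allocation series, and the same node cost
# exported once per OpenCost replica.
allocation = [({"namespace": "default", "instance": "node-a"}, 2.0)]
node_cost = [
    ({"instance": "node-a", "pod": "opencost-1"}, 0.5),
    ({"instance": "node-a", "pod": "opencost-2"}, 0.5),
]

try:
    group_left_multiply(allocation, node_cost, on=["instance"])
    unfiltered_error = None
except ManyToManyError as err:
    unfiltered_error = str(err)

filtered = group_left_multiply(
    allocation, filter_pod(node_cost, "opencost-1"), on=["instance"]
)
```

With both replicas present the join is rejected; after filtering to one pod, the allocation series matches exactly one cost series per instance.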
@mattray I have made some more improvements, including:
- Simplified the expressions from pod=~\".*opencost.*\",pod=~\"$pod\" to pod=~\"$pod\". This was achieved by changing the metric for the pod variable to node_ram_hourly_cost.
- Added some visualisation changes to the tables.
- Added a new panel for Estimated Top 20 by Pod (30 days).
- Added links to GitHub, Docs and opencost.io.
- Added a Text panel at the top with a link to Metrics. This uses markdown and could include any information you see fit.
- Migrated all panes using the Graph (old) plugin to Time Series due to a deprecation warning.
Can you test the dashboard and give me any feedback?
Converting to draft until further testing
Got a couple more questions, @mattray.
Firstly, the visualisations under the App All by pod row don't look right to me. For a start, they all have the same title, Live Hour Price. Compare that to the visualisations under the By Namespace All row.
Should they be titled Live Hour Price, Live Day Price and Live Month Price respectively, and then be reordered to match the order in the By Namespace All row for consistency?
More importantly, for the visualisations under the App All by pod row, the expressions are filtered by "$app.*", for example in the APP Pods Hour Price visualisation below:
I think the Label in the Query on the app variable should be changed from label_app to label_app_kubernetes_name, as this is the recommended label.
When I made this change in testing, the query returned all apps in my EKS and AKS environments instead of just cert-manager, kube-prometheus-stack-operator and a couple of others.
Before:
After:
Hope that makes sense. Let me know and I will make the fixes.
f7b1f20 to bd9abd2
bd9abd2 to bbca123
bbca123 to 190fee3
Sorry, I will cycle back around to this shortly; I've been off work.
Briefly tested with my dev setup; the dashboard shows data.
@sossickd good to review and merge?
Closing this in favor of https://github.com/opencost/opencost-grafana-dashboard
Fix dashboard, add variable for pod and amend each pane to filter on pod.
Issue was caused when running the opencost deployment with more than one replica.
Pod variable:
Example query:
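As a rough sketch of the arithmetic in the dashboard expressions touched by this PR: allocated memory in bytes is divided by 1024 three times to get GiB, multiplied by the hourly RAM cost, and scaled by 730 to approximate a month; CPU allocation is multiplied by the hourly CPU cost and by 730. The Python below restates that calculation. It assumes the hourly RAM cost is per GiB (inferred from the division in the expression), it ignores the label_replace(...) * 0 term, which adds zero to the value, and the function names and sample prices are hypothetical.

```python
HOURS_PER_MONTH = 730        # 8760 hours per year / 12 months
BYTES_PER_GIB = 1024 ** 3    # matches "/ 1024 / 1024 / 1024" in the expression


def monthly_ram_cost(allocated_bytes, hourly_cost_per_gib):
    # bytes -> GiB, then GiB * hourly price * hours per month
    return allocated_bytes / BYTES_PER_GIB * hourly_cost_per_gib * HOURS_PER_MONTH


def monthly_cpu_cost(allocated_cores, hourly_cost_per_core):
    # cores * hourly price * hours per month
    return allocated_cores * hourly_cost_per_core * HOURS_PER_MONTH


# Hypothetical workload: 2 GiB of RAM and half a core, with made-up prices.
ram = monthly_ram_cost(2 * BYTES_PER_GIB, hourly_cost_per_gib=0.004)
cpu = monthly_cpu_cost(0.5, hourly_cost_per_core=0.03)
estimated_monthly_total = ram + cpu
```

In the dashboard this sum is computed per series and then aggregated by namespace before topk(20, ...) is applied.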