fix the proxy server backend metric error #295
Conversation
Welcome @YRXING!
Hi @YRXING. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
pkg/server/metrics/metrics.go (Outdated)
@@ -164,6 +167,16 @@ func (a *ServerMetrics) SetBackendCount(count int) {
	a.backend.WithLabelValues().Set(float64(count))
}

// BackendCountInc records a new backend connection.
func (a *ServerMetrics) BackendCountInc(manager string, idType string) {
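For orientation, here is a minimal, hypothetical sketch of the increment-style approach this hunk introduces, written against prometheus/client_golang. The metric and label names are illustrative assumptions, not the repository's exact identifiers:

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

// backendConnections is an assumed gauge vector keyed by manager and idType.
var backendConnections = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Namespace: "konnectivity_network_proxy", // illustrative
		Subsystem: "server",
		Name:      "backend_connections",
		Help:      "Number of backend connections registered with the proxy server.",
	},
	[]string{"manager", "idType"},
)

func init() {
	prometheus.MustRegister(backendConnections)
}

// BackendCountInc records one new backend connection for a manager/idType pair.
func BackendCountInc(manager, idType string) {
	backendConnections.WithLabelValues(manager, idType).Inc()
}
```

A matching decrement would call Dec() on the same series when a backend disconnects, which is the Inc()/Dec() pattern contrasted with len()-based Set() later in this thread.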
Thanks for your contribution.
We have only one manager supported by the server at a time, so I guess it's not necessary to record metrics for each backend manager. The introduction of idType is valuable, but for the current use cases I'd suggest we keep the code as it is now.
Multiple backend managers are supported for one proxy server; closing this comment.
Is there anything else I need to do about this issue?
pkg/server/server.go (Outdated)
@@ -200,14 +200,17 @@ func (s *ProxyServer) addBackend(agentID string, conn agent.AgentService_Connect
	for _, ipv4 := range agentIdentifiers.IPv4 {
		klog.V(5).InfoS("Add the agent to DestHostBackendManager", "agent address", ipv4)
		s.BackendManagers[i].AddBackend(ipv4, pkgagent.IPv4, conn)
		metrics.Metrics.BackendCountInc("DestHostBackendManager", "ipv4")
I think leaving this logic in AddBackend is preferable, which makes the backend-adding logic clearer.
How about adding a field to the backend storage to differentiate the backend count for different kinds of backend managers?
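A minimal sketch of that suggestion, with assumed type and field names rather than the project's exact code: the storage remembers which manager owns it, so the shared AddBackend path can label the count itself.

```go
package server

import "sync"

// backend stands in for the real connection-holding type.
type backend struct{}

type DefaultBackendStorage struct {
	mu          sync.RWMutex
	backends    map[string][]*backend
	backendType string // e.g. "DestHostBackendManager", fixed at construction
}

func newStorage(backendType string) *DefaultBackendStorage {
	return &DefaultBackendStorage{
		backends:    make(map[string][]*backend),
		backendType: backendType,
	}
}

// AddBackend registers a backend and reports the new count under this
// storage's own manager label, so each manager needs no extra metric call.
func (s *DefaultBackendStorage) AddBackend(identifier, idType string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.backends[identifier] = append(s.backends[identifier], &backend{})
	setBackendCount(s.backendType, idType, len(s.backends))
}

// setBackendCount is a stand-in for metrics.Metrics.SetBackendCount.
func setBackendCount(manager, idType string, count int) {}
```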
Please review the code.
dibm.mu.RLock()
defer dibm.mu.RUnlock()
if len(dibm.backends) == 0 {
func (drbm *DefaultRouteBackendManager) Backend(ctx context.Context) (Backend, error)
👍
@@ -204,7 +214,6 @@ func (s *DefaultBackendStorage) AddBackend(identifier string, idType pkgagent.Id
	return addedBackend
}
s.backends[identifier] = []*backend{addedBackend}
metrics.Metrics.SetBackendCount(len(s.backends))
Thanks for your contribution, @YRXING.
We are complicating the logic of the backend count metrics: we don't have to rewrite AddBackend for each manager to collect backend count metrics. Code maintenance and reusability should be taken into consideration.
How about we keep SetBackendCount here and add the manager name and idType to it? The manager name can be defined as a member of the backend storage.
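A sketch of the labeled SetBackendCount this comment describes; the metric name and constructor below are assumptions based on the discussion, not the final code:

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

type ServerMetrics struct {
	backend *prometheus.GaugeVec
}

func newServerMetrics() *ServerMetrics {
	backend := prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Namespace: "konnectivity_network_proxy", // illustrative
			Subsystem: "server",
			Name:      "ready_backend_connections",
			Help:      "Number of backend connections, by manager and idType.",
		},
		[]string{"manager", "idType"},
	)
	prometheus.MustRegister(backend)
	return &ServerMetrics{backend: backend}
}

// SetBackendCount sets the backend count for one manager/idType pair;
// callers derive count from len(s.backends), as before.
func (a *ServerMetrics) SetBackendCount(manager, idType string, count int) {
	a.backend.WithLabelValues(manager, idType).Set(float64(count))
}
```

At the call site this would read metrics.Metrics.SetBackendCount(s.backendType, string(idType), len(s.backends)), matching the suggestion further down.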
Does the DefaultBackendStorage need to record a count metric per idType, or just the total backend count?
Following the direction already taken in this PR, record separate backend counts for different idTypes and backend managers. After the other changes are also included, SetBackendCount will finally look like this:

- metrics.Metrics.SetBackendCount(len(s.backends))
+ metrics.Metrics.SetBackendCount(s.backendType, idType, len(s.backends))
cc @cheftako and @jkh52, who will have the final say on this PR.
/ok-to-test
/lgtm I like depending on len() only, compared with Inc() / Dec().
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
[]string{
	"manager",
	"idType",
},
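As an illustration of how these two labels behave, here is a small test using prometheus's testutil package; the metric name is an assumption, and each distinct (manager, idType) pair becomes its own time series:

```go
package metrics

import (
	"testing"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/testutil"
)

func TestBackendCountLabels(t *testing.T) {
	gv := prometheus.NewGaugeVec(
		prometheus.GaugeOpts{Name: "backend_connections_test"}, // assumed name
		[]string{"manager", "idType"},
	)
	// Each distinct label pair is tracked independently.
	gv.WithLabelValues("DefaultBackendManager", "ipv4").Set(2)
	gv.WithLabelValues("DestHostBackendManager", "host").Set(1)

	if got := testutil.ToFloat64(gv.WithLabelValues("DefaultBackendManager", "ipv4")); got != 2 {
		t.Fatalf("expected 2 ipv4 backends, got %v", got)
	}
}
```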
@logicalhan Is a change like this backward compatible, and if not, is it more appropriate to create a new metric and deprecate the old?
https://kubernetes.io/blog/2021/04/23/kubernetes-release-1.21-metrics-stability-ga/
This is definitely not backwards compatible. This isn't in k8s-proper, so the same rules of API conformance do not necessarily hold here. I would at the very least add a release note clearly denoting the API change to the metric.
Also what's the expected cardinality of manager and idType?
I think cardinality is likely not a concern. (manager is currently 3, idType is about 3 or 4).
> This is definitely not backwards compatible. This isn't in k8s-proper, so the same rules of API conformance do not necessarily hold here. I would at the very least add a release note clearly denoting the API change to the metric.

This is a binary we include in some distros of K8s (i.e. it's an optional binary in K8s), so we should probably consider it part of K8s.
> I think cardinality is likely not a concern. (manager is currently 3, idType is about 3 or 4).

Technically the idType is part of a CLI flag on the agent (e.g. https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/addons/konnectivity-agent/konnectivity-agent-ds.yaml#L43), so theoretically it's not bounded.
Practically, the number of managers and idTypes is under the cloud provider's control, so I would expect this number to be small.
Agreed. However, that is on the agent side, and the agent is at some level a reference implementation.
If we had similar code on the server side to limit the cardinality of idType, then I think we could reasonably claim it was limited.
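A hypothetical sketch of such server-side limiting: values outside an assumed allowlist of known identifier types collapse into one bucket before they ever become a label.

```go
package server

// knownIDTypes is an assumed allowlist of identifier types the server
// recognizes; the exact set here is illustrative.
var knownIDTypes = map[string]struct{}{
	"ipv4":          {},
	"ipv6":          {},
	"host":          {},
	"default-route": {},
}

// sanitizeIDType bounds label cardinality: unknown, agent-supplied values
// all map to the single "other" series.
func sanitizeIDType(idType string) string {
	if _, ok := knownIDTypes[idType]; ok {
		return idType
	}
	return "other"
}
```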
/remove-lgtm I think we should keep this metric as-is, and add a new one. That way we avoid breaking current integrations with metrics readers.
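A sketch of that compatibility approach, with the second gauge and its name as assumptions: the original unlabeled gauge stays untouched, and a labeled one is registered beside it, so existing dashboards and alerts keep working.

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

type ServerMetrics struct {
	backend          *prometheus.GaugeVec // existing metric, no labels
	backendByManager *prometheus.GaugeVec // assumed new metric, labeled
}

func newServerMetrics() *ServerMetrics {
	m := &ServerMetrics{
		backend: prometheus.NewGaugeVec(
			prometheus.GaugeOpts{Name: "ready_backend_connections"}, // existing name kept as-is
			[]string{},
		),
		backendByManager: prometheus.NewGaugeVec(
			prometheus.GaugeOpts{Name: "ready_backend_connections_by_manager"}, // assumed new name
			[]string{"manager", "idType"},
		),
	}
	prometheus.MustRegister(m.backend, m.backendByManager)
	return m
}

// SetBackendCount keeps the legacy total exactly as it is today.
func (a *ServerMetrics) SetBackendCount(count int) {
	a.backend.WithLabelValues().Set(float64(count))
}

// SetBackendCountByManager feeds the new, labeled metric.
func (a *ServerMetrics) SetBackendCountByManager(manager, idType string, count int) {
	a.backendByManager.WithLabelValues(manager, idType).Set(float64(count))
}
```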
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue or PR with /reopen
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closed this PR in response to the /close command above.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/remove-lifecycle rotten
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by: YRXING
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing /approve in a comment.
@YRXING: The following tests failed; say /retest to rerun all failed tests.
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
@YRXING: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Mark this PR as fresh with /remove-lifecycle stale
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Mark this PR as fresh with /remove-lifecycle rotten
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Reopen this PR with /reopen
- Mark this PR as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closed this PR in response to the /close command above.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
What type of PR is this?
/kind bug
What this PR does / why we need it:
The backend metric should be recorded by the proxy server.
Which issue(s) this PR fixes:
Fixes #294