Code review comments from @ialidzhikov #5

Open
23 of 29 tasks
ialidzhikov opened this issue Feb 20, 2024 · 0 comments


ialidzhikov commented Feb 20, 2024

First, you could address the findings from Oliver's review against the main branch in a dedicated PR.


Mid

Minor

Nits (really, really, really minor)

Questions:

3: Storing the same Pod labels with every metric value would waste a lot of memory. I see that you need the Pod labels to allow selecting metrics by object labelSelector. Maybe the whole model has to be adapted. For example, we could accept that Pod labels are immutable and store them only once per Pod, not for every new metric value. [under-discussion]

  • https://github.com/gardener/gardener-custom-metrics/blob/392b48aab8e985dfbbea4a9cffdf753fcc315cb3/pkg/ha/ha_service.go#L47: IIUC, the only benefit of running 2 replicas is that the 2nd Pod waits in "stand by" mode and, on issues with the leader replica, can take over faster. By faster I mean that we don't have to wait for a new Pod to be scheduled and started. Updating the Endpoint manually to steer traffic to the leader replica looks hacky. We were running metrics-server for Shoots and ManagedSeeds for years with a single replica and I don't recall us having issues related to it. https://github.com/kubernetes-sigs/metrics-server/tree/master?tab=readme-ov-file#high-availability: metrics-server seems to have a real HA mode where 2 of the replicas are serving (?). We can check what they do and how. And I agree with Proposed #3 (comment): this approach is very error-prone.
    • [andrerun]: The main benefit I see in the second replica is that it ties compute resources in another AZ, so it guards against AZ resource shortage disrupting failover. Overall, I have my reservations regarding the need for a second replica, considering the intended use of the component, but that was a hard requirement introduced by the GEP review process. I'll elaborate offline. [under-discussion]
  • I didn't manage to test the component in a local setup at all (due to missing docs/instructions), but I wanted to ask how it behaves on restarts and whether the HPA acting on the custom metric is fine with that. I assume that on Pod restart the leader will change and the newly elected replica won't report any metrics (or will report zeroed metric values). Is HPA able to deal with unavailability of the gardener-custom-metrics component?

Final notes: I didn't dive deep into non-trivial packages like ./pkg/input/metrics_scraper.
