Periodically check connectivity between peer proxies #48838
Merged
The old PR was marked as merged due to an unfortunate rebase/merge incident; the content should be the same, save for the QUIC implementation, which now lives in #47587 instead.
greedy52 approved these changes Nov 12, 2024
rosstimothy approved these changes Nov 12, 2024
public-teleport-github-review-bot (bot) removed request for strideynet and doggydogworld November 12, 2024 20:09
strideynet approved these changes Nov 13, 2024
Base automatically changed from espadolini/quic-proxy-peering-preparation to master November 13, 2024 16:24
espadolini force-pushed the espadolini/proxy-peering-ping branch from 5cc26a9 to 9422a05 November 13, 2024 16:42
github-merge-queue (bot) removed this pull request from the merge queue due to failed status checks Nov 13, 2024
@espadolini See the table below for backport results.
github-merge-queue (bot) pushed a commit that referenced this pull request Nov 14, 2024
* Make the peer clientConn generic
* Convert the peer server to slog
* Move lib/proxy/clusterdial to lib/peer/dial
* Move peer.clientConn to lib/proxy/peer/internal
* Periodically check connectivity between peer proxies (#48838)
The current implementation of proxy peering reports the state of each connection through the `proxy_peer_client_connections` metric, but each connection follows the default gRPC behavior of dropping to `IDLE` after 30 minutes of disuse, so any connectivity problems will only be noticed when a new connection is attempted as a result of user interaction.

This PR adds a periodic health check of proxy peering connections, initiated by the client side of the connection. The state of the health checks is exposed through two new metrics, `teleport_proxy_peer_client_pings_total` and `teleport_proxy_peer_client_failed_pings_total`, labeled with the host ID, hostname and group ID of the peer. The metrics can be used to proactively alert for connectivity issues, either for a specific cluster or across clusters (if the group ID matches some geographical region or deployment group, for example).

changelog: added periodic health checks between proxies in proxy peering