Client auto updates integration for {tctl,tsh} #47815
Fix recursive version check for darwin platform
Fix cleanup for multi-package support
Update must be able to be canceled, re-execute with latest version or last updated
Show progress bar before request is made
    // Display a progress bar before initiating the update request to inform the user that
    // an update is in progress, allowing them the option to cancel before actual response
    // which might be delayed with slow internet connection or complete isolation to CDN.
    pw, finish := newProgressWriter(10)
    defer finish()
Do we want to silence the progress writer if this is a non-interactive terminal? If so, no need to do this in this PR, we can fix later.
    var pkgNames []string
    for _, pkg := range packages {
        if err := u.update(ctx, pkg); err != nil {
        pkgName := fmt.Sprint(uuid.New().String(), updatePackageSuffix)
        if err := u.update(ctx, pkg, pkgName); err != nil {
            return trace.Wrap(err)
        }
        pkgNames = append(pkgNames, pkgName)
    }
The uuid is here to have a temporary package location unique to the process? Then we copy the content and remove it?
Later, we do os.CreateTemp(u.toolsDir), putting the temporary package in the user's home rather than the system's default tmp? Is there any advantage/issue leading us to use the user's home as a temporary directory versus the default temp location?
We should store the original binaries in the same directory, since these are not temporary files:
~/.tsh/bin/tctl symlink-> ~/.tsh/bin/UUID/tctl
~/.tsh/bin/tsh symlink-> ~/.tsh/bin/UUID/tsh
I misinterpreted your comment.
I think the issue is with the archive, not the extracted location:
https://github.com/gravitational/teleport/pull/47815/files#diff-6ddd22bd400fc52c2062de3b1fc775fdaf05594f85c5ffa5a02262af40f5d62dR275
In general, it's unusual to use os.CreateTemp outside of /tmp.
Looks good, have you done manual tests on this one yet?
Before backporting, can you add (in another PR) a list of manual test cases we'll want to add to the major test plan, and to execute after backporting the feature? For example:
- [ ] Test tsh updater
  - [ ] no calls are made when automatic update version is off
  - [ ] can install a custom version when `TELEPORT_TOOLS_VERSION` is set
  - [ ] does nothing when target version == local version
  - [ ] does nothing if AUs are disabled for the cluster
  - [ ] only looks up new version on `tsh login`
  - [ ] downloads new version just once, then uses the local one
- [ ] Test tctl updater
  - [ ] no calls are made when automatic update version is off
  - [ ] can install a custom version when `TELEPORT_TOOLS_VERSION` is set
  - [ ] does nothing when target version == local version
  - [ ] does nothing if AUs are disabled for the cluster
  - [ ] only looks up new version on `tsh login`
  - [ ] downloads new version just once, then uses the local one
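The version-related cases in the checklist above can be read as a small decision table. The sketch below is my own reading of them, not the updater's actual logic: `targetVersion` is a hypothetical function, an empty `clusterVersion` is assumed to mean that auto-updates are disabled for the cluster, and the environment variable is assumed to take precedence over the cluster setting.

```go
package main

import "fmt"

// targetVersion decides which version should run and whether an update
// (download + re-exec) is needed.
//
//	envVersion     value of TELEPORT_TOOLS_VERSION ("" when unset)
//	clusterVersion version advertised by the cluster ("" when AUs are disabled)
//	localVersion   version of the binary that is currently running
//
// Hypothetical helper; the precedence order is an assumption.
func targetVersion(envVersion, clusterVersion, localVersion string) (version string, update bool) {
	switch {
	case envVersion == "off":
		// Updates explicitly turned off: no lookups, no downloads.
		return "", false
	case envVersion != "":
		// A pinned custom version wins over the cluster setting.
		return envVersion, envVersion != localVersion
	case clusterVersion == "" || clusterVersion == localVersion:
		// AUs disabled for the cluster, or already up to date.
		return "", false
	default:
		return clusterVersion, true
	}
}

func main() {
	cases := [][3]string{
		{"off", "17.0.1", "17.0.0"},
		{"16.4.0", "17.0.1", "17.0.0"},
		{"", "17.0.0", "17.0.0"},
		{"", "", "17.0.0"},
		{"", "17.0.1", "17.0.0"},
	}
	for _, c := range cases {
		v, update := targetVersion(c[0], c[1], c[2])
		fmt.Printf("env=%q cluster=%q local=%q => version=%q update=%v\n", c[0], c[1], c[2], v, update)
	}
}
```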
Also, I'm very interested in testing the "proxy is not reachable" failure mode. Can we make sure we have proper timeouts to avoid hanging and that we're surfacing a clear and actionable error telling the user that the proxy/CDN is not reachable? I'm worried that users will see a "failed to automatically update tools" error and think AUs are the issue while they just don't have access to the proxy. (This is a very common Teleport issue: we usually throw a bunch of "certificate invalid" errors and users think a certificate is expired/not trusted while they just can't reach the proxy.)
With an unreachable proxy we have a 30-second timeout. It can also be canceled with Ctrl-C as part of the update process, after which tsh/tctl continues normal execution.
Address review comments
@hugoShaka @ryanclark @doggydogworld I would appreciate your review
I'm on PTO until next week
Thanks, that's very useful. Does it make sense to lower this timeout to 10 seconds? 30 seconds without feedback is very long for a user-facing tool. cc @sclevine
What happens after we hit the timeout (e.g. a firewall rule is silently dropping the traffic)? Do we skip the update check and still try to login, or do we stop and return an error? If it's the latter, can we make sure it's actionable for the user? This will become the first thing people see when they try to login but don't have network connectivity.
LGTM once the remaining active threads are resolved
@hugoShaka the update check against the proxy happens as part of the login action, so if the proxy is not available we return the update error; login would fail anyway if we skipped the update. I also changed the timeout to 10 seconds for client tools.
* Client auto updates integration for tctl/tsh
* Add version validation
  Fix recursive version check for darwin platform
  Fix cleanup for multi-package support
* Fix identifying tools removal from home directory
* Replace ToolsMode with ToolsAutoUpdate
* Reuse insecure flag for tests
* Fix CheckRemote with login
* Fix windows administrative access requirement
  Update must be able to be canceled, re-execute with latest version or last updated
  Show progress bar before request is made
* Fix update cancellation for login action
  Address review comments
* Add signal handler with stack context cancellation
* Use copy instead of hard link for windows
  Fix progress bar if we can't receive size of package
* Replace with list in order to support manual cancel
* Download archive package to temp directory
* Decrease timeout for client tools proxy call
* Client auto updates integration for {tctl,tsh} (#47815)
  * Client auto updates integration for tctl/tsh
  * Add version validation
    Fix recursive version check for darwin platform
    Fix cleanup for multi-package support
  * Fix identifying tools removal from home directory
  * Replace ToolsMode with ToolsAutoUpdate
  * Reuse insecure flag for tests
  * Fix CheckRemote with login
  * Fix windows administrative access requirement
    Update must be able to be canceled, re-execute with latest version or last updated
    Show progress bar before request is made
  * Fix update cancellation for login action
    Address review comments
  * Add signal handler with stack context cancellation
  * Use copy instead of hard link for windows
    Fix progress bar if we can't receive size of package
  * Replace with list in order to support manual cancel
  * Download archive package to temp directory
  * Decrease timeout for client tools proxy call
* Add audit logs for auto update resources (#48218)
    // At process startup, check if a version has already been downloaded to
    // $TELEPORT_HOME/bin or if the user has set the TELEPORT_TOOLS_VERSION
    // environment variable. If so, re-exec that version of {tsh, tctl}.
    toolsDir, err := tools.Dir()
This broke VNet in v17.0.1. VNet has a launch daemon on macOS which starts tsh vnet-daemon in an environment where $HOME doesn't exist.
Before 9d2b5c5, it was possible to execute tsh version with no env vars:
env -i "$(which tsh)" version
What do you reckon would be the best way forward? I don't want to set some dummy value as $HOME because I don't want to interfere with how launch daemons on macOS behave. Setting TELEPORT_TOOLS_VERSION=off is not enough as tools.Dir() is read before we check that env var.
I tried to explicitly provide TELEPORT_HOME with a dummy dir so that tools.Dir() doesn't need to read HOME, but that doesn't work either. Not because of autoupdate, but because I think when you explicitly provide TELEPORT_HOME, tsh later tries to read the config file which it cannot in this case.
~: env -i TELEPORT_HOME=/dev/null TELEPORT_TOOLS_VERSION=off "$(which tsh)" version
ERROR: failed to load tsh config from "/dev/null/config/config.yaml"
failed to execute command /dev/null/config/config.yaml error: not a directory
@ravicious thanks for reporting this one, I will prepare a fix for it.
Thanks, keep me updated. I'll update the plist for the launch daemon with the appropriate env vars once ready.
    // At process startup, check if a version has already been downloaded to
    // $TELEPORT_HOME/bin or if the user has set the TELEPORT_TOOLS_VERSION
    // environment variable. If so, re-exec that version of {tsh, tctl}.
    toolsDir, err := tools.Dir()
Also, the RFD says that Connect is going to call tsh with TELEPORT_TOOLS_VERSION=off, but from what I see this hasn't been implemented: grepping for TELEPORT_TOOLS_VERSION doesn't bring up anything.
Addressed by #49180
* Expose client tools auto update for find endpoint (#46785)
  * Expose client tools auto update for find endpoint
  * Group auto update settings in find response
    Log error instead returning error
    Add tests auto update settings in find endpoint
    Add check for not implemented error
  * Add more test cases
* Client AutoUpdate proto structure changes (#47532)
  * Update client autoupdate proto structure
  * Replace with reserved
  * Fix unit tests
  * Add more info in proto
  * Rename proto to be aligned RFD namings
  * Replace enum type for ToolsMode to string
* Add packaging utility for client tools auto updates (#47060)
  * Add packaging utility for client tools auto updates
  * Add error handling for close functions
  * Move archive to existing utils package
  * Move archive helpers to integration/helper
    CR changes
  * CR changes
  * CR changes
  * CR changes
    Replace creating directory with extract path as argument
  * CR changes
  * Validate full size before un-archive
    Extract files to extractDir with ignore dir structure
  * Change compressing with relative paths
    Add test for cleanup and fix skip logic
  * CR changes
  * CR changes
  * Fix linter
* Client tools auto update (#47466)
  * Add client tools auto update
  * Replace fork for posix platform for re-exec
    Move integration tests to client tools specific dir
    Use context cancellation with SIGTERM, SIGINT
    Remove cancelable tee reader with context replacement
    Renaming
  * Fix syscall path execution
    Fix archive cleanup if hash is not valid
    Limit the archive write bytes
  * Cover the case with single package for darwin platform after v17
  * Move updater logic to tools package
  * Move context out from the library
    Base URL renaming
  * Add more context in comments
  * Changes in find endpoint
  * Replace test http server with `httptest`
    Replace hash for bytes matching
    Proper temp file close for archive download
  * Add more context to comments
  * Move feature flag to main package to be reused
  * Constant rename
  * Replace build tag with lib/modules to identify enterprise build
  * Replace fips tag with modules flag
* Client auto updates integration for {tctl,tsh} (#47815)
  * Client auto updates integration for tctl/tsh
  * Add version validation
    Fix recursive version check for darwin platform
    Fix cleanup for multi-package support
  * Fix identifying tools removal from home directory
  * Replace ToolsMode with ToolsAutoUpdate
  * Reuse insecure flag for tests
  * Fix CheckRemote with login
  * Fix windows administrative access requirement
    Update must be able to be canceled, re-execute with latest version or last updated
    Show progress bar before request is made
  * Fix update cancellation for login action
    Address review comments
  * Add signal handler with stack context cancellation
  * Use copy instead of hard link for windows
    Fix progress bar if we can't receive size of package
  * Replace with list in order to support manual cancel
  * Download archive package to temp directory
  * Decrease timeout for client tools proxy call
* Add audit logs for auto update resources (#48218)
* Connect: Make sure tsh auto-updates are turned off (#49180)
  * Add dir for code shared between Node.js processes
  * Connect: Make sure tsh auto-updates are turned off
  * Pass TELEPORT_TOOLS_VERSION=off to tsh vnet-daemon
* Disable client tools auto update disabled if there are no home dir (#49159)
  Move updater to general tools package
* Move client auto update helper to lib package (#49247)
---------
Co-authored-by: Rafał Cieślak <[email protected]>
This PR implements the integration of client auto-updates proposed in the RFD.
changelog: Client tools {tctl,tsh} auto-updates controlled by cluster configuration
client_tools_autoupdate.mov