[teleport-update] Add support for reloading the agent & reverting symlinks on failed reload #47929

Merged: 24 commits into master from sclevine/teleport-update-systemd on Nov 4, 2024

Conversation


@sclevine sclevine commented Oct 25, 2024

This PR adds support for reloading the Teleport agent process when the enable subcommand of the teleport-update binary is executed. If the reload fails, it also reverts the system to the previously installed version of the agent.
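
The link, reload, and revert sequence described above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: `linker`, `reloader`, `switchVersion`, and `errReload` are hypothetical names, and whether the real updater reloads again after reverting is not shown here.

```go
package main

import (
	"errors"
	"fmt"
)

// linker and reloader are hypothetical stand-ins for the updater's real
// symlink and process-management code.
type linker func(version string) error
type reloader func() error

// errReload is a placeholder error used only by the demo in main.
var errReload = errors.New("reload failed")

// switchVersion links the target version, reloads the agent, and re-links
// the previous version if the reload fails.
func switchVersion(link linker, reload reloader, target, previous string) error {
	if err := link(target); err != nil {
		return fmt.Errorf("linking %s: %w", target, err)
	}
	reloadErr := reload()
	if reloadErr == nil {
		return nil
	}
	// The reload failed: point the symlinks back at the previous version.
	if err := link(previous); err != nil {
		return errors.Join(reloadErr, fmt.Errorf("reverting to %s: %w", previous, err))
	}
	return fmt.Errorf("reverted to %s: %w", previous, reloadErr)
}

func main() {
	var linked []string
	link := func(v string) error { linked = append(linked, v); return nil }
	failing := func() error { return errReload }

	err := switchVersion(link, failing, "16.4.3", "16.4.2")
	fmt.Println("linked in order:", linked)
	fmt.Println("error:", err)
}
```

Passing the link and reload steps in as functions keeps the revert logic testable without touching the filesystem or systemd.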

This PR also adds:

  • Additional validation when creating symlinks so that they cannot replace regular files (e.g., real Teleport binaries)
  • Fixes to symlink directory permissions (0750 -> 0755)
  • Ungraceful restart fallback when graceful reloading fails
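
As a rough illustration of the symlink validation in the first bullet, a check along these lines refuses to overwrite anything that is not already a symlink. The names `forceSymlink`, `errNotSymlink`, and `run` are hypothetical, the `.tmp` suffix is an arbitrary choice, and the sketch ignores the race between the check and the rename; the PR's actual implementation may differ.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

var errNotSymlink = errors.New("refusing to replace non-symlink")

// forceSymlink points linkPath at target. It replaces an existing symlink
// but never a regular file or directory.
func forceSymlink(target, linkPath string) error {
	fi, err := os.Lstat(linkPath) // Lstat does not follow symlinks
	switch {
	case errors.Is(err, fs.ErrNotExist):
		// Nothing exists at linkPath, so there is nothing to protect.
	case err != nil:
		return err
	case fi.Mode()&fs.ModeSymlink == 0:
		return fmt.Errorf("%s: %w", linkPath, errNotSymlink)
	}
	// Create the link under a temporary name, then rename it into place so
	// an existing symlink is swapped in a single step.
	tmp := linkPath + ".tmp"
	_ = os.Remove(tmp)
	if err := os.Symlink(target, tmp); err != nil {
		return err
	}
	if err := os.Rename(tmp, linkPath); err != nil {
		_ = os.Remove(tmp)
		return err
	}
	return nil
}

// run exercises forceSymlink in a temporary directory.
func run() error {
	dir, err := os.MkdirTemp("", "symlink-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	// A regular file (e.g., a real binary) must not be replaced.
	bin := filepath.Join(dir, "teleport")
	if err := os.WriteFile(bin, []byte("real binary"), 0o755); err != nil {
		return err
	}
	err = forceSymlink("/var/lib/teleport/versions/16.4.3/bin/teleport", bin)
	if !errors.Is(err, errNotSymlink) {
		return fmt.Errorf("expected refusal, got: %v", err)
	}

	// An existing symlink may be re-pointed at a newer version.
	link := filepath.Join(dir, "tsh")
	for _, v := range []string{"16.4.2", "16.4.3"} {
		if err := forceSymlink("/var/lib/teleport/versions/"+v+"/bin/tsh", link); err != nil {
			return err
		}
	}
	got, err := os.Readlink(link)
	if err != nil {
		return err
	}
	if got != "/var/lib/teleport/versions/16.4.3/bin/tsh" {
		return fmt.Errorf("unexpected link target: %s", got)
	}
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Println("demo failed:", err)
		os.Exit(1)
	}
	fmt.Println("ok")
}
```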

Notably, this PR is missing:

  • Proper healthchecking for the Teleport service (see TODO)

This is the fourth in a series of PRs implementing teleport-update:
Linking: #47879
Enable Command: #47565
Initial scaffolding PR: #46418

The teleport-update binary will be used to enable, disable, and trigger automatic Teleport agent updates. The new auto-updates system manages a local installation of the cluster-specified version of Teleport stored in /var/lib/teleport/versions.

RFD: #47126
Goal (internal): https://github.com/gravitational/cloud/issues/10289


Example:

ubuntu@legendary-mite:~/mounts/teleport/tool/teleport-update$ sudo ./teleport-update enable --proxy=levine.teleport.sh --force-version=16.4.3
2024-10-28T21:25:23Z INFO [UPDATER]   Version already present. version:16.4.3 agent/installer.go:145
2024-10-28T21:25:23Z INFO [UPDATER]   Target version successfully installed. version:16.4.3 agent/updater.go:334
2024-10-28T21:25:23Z INFO [UPDATER]   Teleport gracefully reloaded. agent/process.go:75
2024-10-28T21:25:23Z INFO [UPDATER]   Backup version set. version:16.4.2 agent/updater.go:356
2024-10-28T21:25:23Z INFO [UPDATER]   Configuration updated. agent/updater.go:375
ubuntu@legendary-mite:~/mounts/teleport/tool/teleport-update$ ls -la /usr/local/bin
total 8
drwxr-xr-x  2 root root 4096 Oct 28 21:27 .
drwxr-xr-x 10 root root 4096 Oct  8 23:12 ..
lrwxrwxrwx  1 root root   53 Oct 28 21:27 fdpass-teleport -> /var/lib/teleport/versions/16.4.3/bin/fdpass-teleport
lrwxrwxrwx  1 root root   42 Oct 28 21:27 tbot -> /var/lib/teleport/versions/16.4.3/bin/tbot
lrwxrwxrwx  1 root root   42 Oct 28 21:27 tctl -> /var/lib/teleport/versions/16.4.3/bin/tctl
lrwxrwxrwx  1 root root   46 Oct 28 21:27 teleport -> /var/lib/teleport/versions/16.4.3/bin/teleport
lrwxrwxrwx  1 root root   41 Oct 28 21:27 tsh -> /var/lib/teleport/versions/16.4.3/bin/tsh
ubuntu@legendary-mite:~/mounts/teleport/tool/teleport-update$ ls -la /usr/local/lib/systemd/system/
total 12
drwxr-xr-x 2 root root 4096 Oct 28 21:27 .
drwxr-xr-x 3 root root 4096 Oct 28 19:03 ..
lrwxrwxrwx 1 root root   62 Oct 28 21:27 teleport.service -> /var/lib/teleport/versions/16.4.3/etc/systemd/teleport.service
ubuntu@legendary-mite:~/mounts/teleport/tool/teleport-update$ cat /var/lib/teleport/versions/update.yaml 
version: v1
kind: update_config
spec:
    proxy: levine.teleport.sh
    group: ""
    url_template: ""
    enabled: true
status:
    active_version: 16.4.3
    backup_version: 16.4.2

Base automatically changed from sclevine/teleport-update-link1 to master October 28, 2024 16:23
@sclevine sclevine force-pushed the sclevine/teleport-update-systemd branch from 9935bf5 to 02738ef Compare October 28, 2024 17:17
@sclevine sclevine changed the title [teleport-update] Add support for reloading the agent and reverting symlinks on failure [teleport-update] Add support for reloading the agent & reverting symlinks on failed reload Oct 28, 2024
@sclevine sclevine marked this pull request as ready for review October 29, 2024 15:42
@sclevine sclevine requested review from hugoShaka and vapopov October 29, 2024 15:42
@sclevine sclevine added the no-changelog Indicates that a PR does not require a changelog entry label Oct 29, 2024
@sclevine sclevine force-pushed the sclevine/teleport-update-systemd branch from 7c53b9b to 4429a58 Compare October 30, 2024 00:31
@sclevine sclevine requested a review from hugoShaka November 1, 2024 16:07
)

// SystemdService manages a Teleport systemd service.
type SystemdService struct {

nit: btw, Teleport has an indirect dependency on https://github.com/coreos/go-systemd, which we could use here (see https://github.com/coreos/go-systemd/blob/main/dbus/methods_test.go#L205).
I'm not saying it's better, just a suggestion.


Thanks for pointing this out. I lean towards using systemctl for now, since it's guaranteed to be present whenever the agent is running on a machine with systemd, and we're likely going to port the existing updater logic. We can revisit later.
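
For context, shelling out to systemctl with a graceful reload and an ungraceful restart fallback might look roughly like this. It is a sketch under assumptions rather than the PR's code: `runner`, `systemctl`, `reloadOrRestart`, and `errSimulated` are hypothetical names, and the demo in `main` uses a fake runner so that running it does not touch a real service.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// runner executes systemctl with the given arguments. It is injectable so
// the fallback logic can be exercised without systemd.
type runner func(args ...string) error

// systemctl is the real runner; it shells out to the systemctl binary.
func systemctl(args ...string) error {
	return exec.Command("systemctl", args...).Run()
}

// errSimulated is a placeholder failure used only by the demo in main.
var errSimulated = errors.New("simulated reload failure")

// reloadOrRestart tries a graceful reload first and falls back to an
// ungraceful restart. It reports which action succeeded.
func reloadOrRestart(run runner, unit string) (string, error) {
	if err := run("reload", unit); err == nil {
		return "reload", nil
	}
	if err := run("restart", unit); err != nil {
		return "", fmt.Errorf("restarting %s: %w", unit, err)
	}
	return "restart", nil
}

func main() {
	// Fake runner: reload always fails, restart succeeds.
	fake := func(args ...string) error {
		if args[0] == "reload" {
			return errSimulated
		}
		return nil
	}
	action, err := reloadOrRestart(fake, "teleport.service")
	fmt.Println(action, err)

	// On a systemd host, the real runner would be passed instead:
	//   reloadOrRestart(systemctl, "teleport.service")
	_ = systemctl
}
```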

@sclevine sclevine added this pull request to the merge queue Nov 4, 2024
Merged via the queue into master with commit 323e56e Nov 4, 2024
39 checks passed
@sclevine sclevine deleted the sclevine/teleport-update-systemd branch November 4, 2024 22:08