-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Machine ID: Include Bot Instance ID in audit logs #43786
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This creates new instances for bots when they initially join the cluster, and persists instance IDs in new certificate fields on join and during renewal. Note that this does not yet handle instance reuse for non-token join methods. Additionally, bot instance creation is locked behind a `BOT_INSTANCE_EXPERIMENT` flag; it must be set to `1` to enable creation.
This makes various (admittedly breaking) protobuf changes, including removing the TTL field (calculating resource expiry based on cert requests), removing public key fingerprints, and changing the data type of the generation counter to match the preexisting internal datatype. These changes _should_ be safe as no consumers of the proto API currently exist. Additionally, this also updates bot authentications on renewal.
This now creates bot instances for bots whose certs are missing the BotInstanceID field. Additionally, it fixes two related bugs: expiration dates are extended on renewal, the generated UUID is properly appended to certs on initial join, and instances are only created or updated when the experiment is enabled.
strideynet
added
the
no-changelog
Indicates that a PR does not require a changelog entry
label
Jul 3, 2024
10 tasks
espadolini
approved these changes
Jul 24, 2024
AntonAM
approved these changes
Jul 24, 2024
timothyb89
added a commit
that referenced
this pull request
Aug 6, 2024
* Create Bot Instances during initial bot join This creates new instances for bots when they initially join the cluster, and persists instance IDs in new certificate fields on join and during renewal. Note that this does not yet handle instance reuse for non-token join methods. Additionally, bot instance creation is locked behind a `BOT_INSTANCE_EXPERIMENT` flag; it must be set to `1` to enable creation. * Proto cleanup, and update bot auth records on cert renewal This makes various (admittedly breaking) protobuf changes, including removing the TTL field (calculating resource expiry based on cert requests), removing public key fingerprints, and changing the data type of the generation counter to match the preexisting internal datatype. These changes _should_ be safe as no consumers of the proto API currently exist. Additionally, this also updates bot authentications on renewal. * Fix proto lints * Fix misleading doc comment in the bot instance experiment * Create bot instances for old bots on join; other fixes This now creates bot instances for bots whose certs are missing the BotInstanceID field. Additionally, it fixes two related bugs: expiration dates are extended on renewal, the generated UUID is properly appended to certs on initial join, and instances are only created or updated when the experiment is enabled. * Add a minimal test for bot instance creation on initial join * Validate bot instance state in generation counter checks * Remove outdated TODO comment and fix test lints * Add an expiration change check to the generation test * Add BotInstanceID to audit events * Fix borked conflict resolution * Fix further borked conflict resolution * Add test for cert create/join --------- Co-authored-by: Tim Buckley <[email protected]>
timothyb89
added a commit
that referenced
this pull request
Aug 8, 2024
* Create Bot Instances during initial bot join This creates new instances for bots when they initially join the cluster, and persists instance IDs in new certificate fields on join and during renewal. Note that this does not yet handle instance reuse for non-token join methods. Additionally, bot instance creation is locked behind a `BOT_INSTANCE_EXPERIMENT` flag; it must be set to `1` to enable creation. * Proto cleanup, and update bot auth records on cert renewal This makes various (admittedly breaking) protobuf changes, including removing the TTL field (calculating resource expiry based on cert requests), removing public key fingerprints, and changing the data type of the generation counter to match the preexisting internal datatype. These changes _should_ be safe as no consumers of the proto API currently exist. Additionally, this also updates bot authentications on renewal. * Fix proto lints * Fix misleading doc comment in the bot instance experiment * Create bot instances for old bots on join; other fixes This now creates bot instances for bots whose certs are missing the BotInstanceID field. Additionally, it fixes two related bugs: expiration dates are extended on renewal, the generated UUID is properly appended to certs on initial join, and instances are only created or updated when the experiment is enabled. * Add a minimal test for bot instance creation on initial join * Validate bot instance state in generation counter checks * Remove outdated TODO comment and fix test lints * Add an expiration change check to the generation test * Add BotInstanceID to audit events * Fix borked conflict resolution * Fix further borked conflict resolution * Add test for cert create/join --------- Co-authored-by: Tim Buckley <[email protected]>
github-merge-queue bot
pushed a commit
that referenced
this pull request
Aug 26, 2024
* Add basic CRUD for BotInstanceService (#42899) * Add basic CRUD for BotInstanceService This adds an early backend for the bot instance service, with resource definitions and basic CRUD operations. * Update api/proto/teleport/machineid/v1/bot_instance.proto Co-authored-by: Noah Stride <[email protected]> * Further work on BotInstance boilerplate Among other things, this adds event definitions, RBAC and presets, and client definitions. This also includes a few proto changes, namely moving bot name and instance ID from .Status to .Spec so metadata can be uniformly generated. This also adds tests for the backend service showing functional CRUD. * gRPC tests; fix filtering; fix import lints; misc cleanup * Consistently use `instance_id` in gRPC fields * Add tctl support for bot_instance resource * protos update * Various typo fixes * Fix more comment typos * More doc comments and minor test tweaks --------- Co-authored-by: Noah Stride <[email protected]> * Create Bot Instances during initial bot join (#43577) * Create Bot Instances during initial bot join This creates new instances for bots when they initially join the cluster, and persists instance IDs in new certificate fields on join and during renewal. Note that this does not yet handle instance reuse for non-token join methods. Additionally, bot instance creation is locked behind a `BOT_INSTANCE_EXPERIMENT` flag; it must be set to `1` to enable creation. * Proto cleanup, and update bot auth records on cert renewal This makes various (admittedly breaking) protobuf changes, including removing the TTL field (calculating resource expiry based on cert requests), removing public key fingerprints, and changing the data type of the generation counter to match the preexisting internal datatype. These changes _should_ be safe as no consumers of the proto API currently exist. Additionally, this also updates bot authentications on renewal. * Fix proto lints * Fix misleading doc comment in the bot instance experiment * Create bot instances for old bots on join; other fixes This now creates bot instances for bots whose certs are missing the BotInstanceID field. Additionally, it fixes two related bugs: expiration dates are extended on renewal, the generated UUID is properly appended to certs on initial join, and instances are only created or updated when the experiment is enabled. * Add a minimal test for bot instance creation on initial join * Validate bot instance state in generation counter checks * Remove outdated TODO comment and fix test lints * Add an expiration change check to the generation test * Machine ID: Provide existing identity when rejoining as part of renewals (#43732) * Allow existing bot identity to be used for token based renewals * Add test for register with auth client * Fix tests * Fix test * Fix partial assignment to interface * Improve loggging * Fix spelling * Fix missing import * Fix built gRPC * Machine ID: Bot Instance heartbeating (#43463) * Add basic CRUD for BotInstanceService This adds an early backend for the bot instance service, with resource definitions and basic CRUD operations. * Update api/proto/teleport/machineid/v1/bot_instance.proto Co-authored-by: Noah Stride <[email protected]> * Further work on BotInstance boilerplate Among other things, this adds event definitions, RBAC and presets, and client definitions. This also includes a few proto changes, namely moving bot name and instance ID from .Status to .Spec so metadata can be uniformly generated. This also adds tests for the backend service showing functional CRUD. * gRPC tests; fix filtering; fix import lints; misc cleanup * Consistently use `instance_id` in gRPC fields * Add tctl support for bot_instance resource * protos update * Various typo fixes * Fix more comment typos * More doc comments and minor test tweaks * Add SubmitHeartbeat RPC implementation * Fix log keys * Update to use proper BotInstanceID field in identiy * Add rough outline of heartbeat service --------- Co-authored-by: Tim Buckley <[email protected]> * Allow bots to provide client cert authentication when rejoining (#44257) * Allow bots to provide client cert authentication when rejoining Traditionally, bots using non-`token` join methods "renew" their certs by rejoining, taking advantage of their repeatable join methods to avoid relying on `token` rejoin logic. However, as part of [RFD 162], we need a way for rejoining bots to assert an existing identity (if any) even when rejoining "from scratch". When paired with #43732, this change allows bots to use an authenticated auth client for all "initial" joins. If present, an existing `BotInstanceID` is extracted from the client identity and used to associate the bot with its previous identity. As certain join methods (TPM, Azure) are initially handled by the `JoinService` using gRPC, this additionally adds a new `BotInstanceID` field to `RegisterUsingTokenRequest` so the `JoinService` can verify the client identity and extract the `BotInstanceID` if available. Like `RemoteAddr`, this value is only trusted when set by the proxy and should be ignored in all other cases. [RFD 162]: #36510 * Add tests for authenticated bot rejoining * Create a new bot instance if the requested instance is not found Also, includes a new check in for this behavior in `TestRegisterBotWithInvalidInstanceID`. * Fix imports * Update lib/auth/auth_with_roles.go Co-authored-by: Zac Bergquist <[email protected]> * Fix `cmp` import --------- Co-authored-by: Zac Bergquist <[email protected]> * Machine ID: Include Bot Instance ID in audit logs (#43786) * Create Bot Instances during initial bot join This creates new instances for bots when they initially join the cluster, and persists instance IDs in new certificate fields on join and during renewal. Note that this does not yet handle instance reuse for non-token join methods. Additionally, bot instance creation is locked behind a `BOT_INSTANCE_EXPERIMENT` flag; it must be set to `1` to enable creation. * Proto cleanup, and update bot auth records on cert renewal This makes various (admittedly breaking) protobuf changes, including removing the TTL field (calculating resource expiry based on cert requests), removing public key fingerprints, and changing the data type of the generation counter to match the preexisting internal datatype. These changes _should_ be safe as no consumers of the proto API currently exist. Additionally, this also updates bot authentications on renewal. * Fix proto lints * Fix misleading doc comment in the bot instance experiment * Create bot instances for old bots on join; other fixes This now creates bot instances for bots whose certs are missing the BotInstanceID field. Additionally, it fixes two related bugs: expiration dates are extended on renewal, the generated UUID is properly appended to certs on initial join, and instances are only created or updated when the experiment is enabled. * Add a minimal test for bot instance creation on initial join * Validate bot instance state in generation counter checks * Remove outdated TODO comment and fix test lints * Add an expiration change check to the generation test * Add BotInstanceID to audit events * Fix borked conflict resolution * Fix further borked conflict resolution * Add test for cert create/join --------- Co-authored-by: Tim Buckley <[email protected]> * Show bot instance ID in tbot log output (#44578) This tweaks the "fetched new bot identity" message to show the bot name and instance ID as embedded in the bot's certificate. Example: ``` 2024-07-23T15:51:20-06:00 INFO [TBOT:IDEN] Fetched new bot identity identity:tpm-test, id=5a2865d3-d3dc-4eaa-853c-5377a1fe83f6 | valid: after=2024-07-23T21:50:20Z, before=2024-07-23T21:56:19Z, duration=5m59s | kind=tls, renewable=false, disallow-reissue=false, roles=[bot-tpm-test], principals=[-teleport-internal-join], generation=0 tbot/service_bot_identity.go:223 ``` * Machine ID: Validate generation counter using bot instances (#44583) * Show bot instance ID in tbot log output This tweaks the "fetched new bot identity" message to show the bot name and instance ID as embedded in the bot's certificate. Example: ``` 2024-07-23T15:51:20-06:00 INFO [TBOT:IDEN] Fetched new bot identity identity:tpm-test, id=5a2865d3-d3dc-4eaa-853c-5377a1fe83f6 | valid: after=2024-07-23T21:50:20Z, before=2024-07-23T21:56:19Z, duration=5m59s | kind=tls, renewable=false, disallow-reissue=false, roles=[bot-tpm-test], principals=[-teleport-internal-join], generation=0 tbot/service_bot_identity.go:223 ``` * Machine ID: Validate generation counter using bot instances This change moves the generation counter from bot user labels to a field in `BotInstanceStatusAuthentication`. This allows for tracking of individual generations regardless of join method, allows for multiple `token`-type joins per bot, and opens the door for multi-use `token`-type tokens in the future. For now, the new behavior remains behind the `BOT_INSTANCE_EXPERIMENT` feature flag, and generation counter handling is largely unchanged when it is unset. * Update legacy generation counter for downgrade compatibility This updates the legacy user label generation counter to support some downgrade compatibility if a user either disables the bot instance experiment or downgrades to a Teleport version predating it. Also includes some logger cleanup, and adds a warning when a generation counter mismatch is detected but not enforced. * Set initial generation value for legacy mode `validateGenerationLabel` expects a nonzero generation counter value for renewable certs to populate the counter, so this provides it again. * Add new generation counter tests This adds various new tests to validate new per-instance generation counter behavior. Also includes backcompat fixes to import old generation counters, and a small fix to ensure downgrades don't leave bots in a bad state. * Fix imports * Tweak TODO message * Fix HostCAs not being returned during bot renewal (#44832) * Machine ID: Remove `BOT_INSTANCE_EXPERIMENT` feature flag (#44904) This removes the `BOT_INSTANCE_EXPERIMENT` feature flag to enable the feature universally. It also adjusts some comments to add deprecation notices. * Machine ID: Only set legacy generation counter for renewable joins (#44993) We unconditionally committed the generation counter to the legacy user label to allow for downgrade compatibility. This is not valid for non-token join methods as previous Teleport versions will attempt to check the generation counter whenever the label is nonzero, and because these older versions don't support authenticated rejoining, will never be able to compare to the client's actual generation value. (Plus, it was hard-coded to zero.) This disables the legacy generation counter for non-renewable join methods. This should not create a later upgrade problem as bots will lose their instance ID upon their first rejoin after downgrading, and so will be issued a fresh ID and generation counter if and when the cluster is upgraded again. * Remove unused function sshPublicKeyToPKIXPEM Left-over function from backport * Always use PEM PKIX DER representation of public keys for BotInstances (#43845) * Always use PEM PKIX DER representation of public keys for BotInstances * Prefer using `keys.MarshalPublicKey` * Remove mistakenly included TestVerifyALPNUpgradedConn The recent merge included an unrelated unit test. This removes it. --------- Co-authored-by: Noah Stride <[email protected]> Co-authored-by: Zac Bergquist <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Part of #41456
Adds the BotInstanceID and BotName to most common audit events.