UserTasks for Discover EKS: create tasks when auto enrolling fails #50024

Merged: 2 commits into master from marco/usertasks_discovereks_impl on Dec 13, 2024

marcoandredinis
Contributor

This PR changes the DiscoveryService so that it creates UserTasks when EKS clusters fail to auto-enroll.

Demo:
tctl get user_tasks

kind: user_task
metadata:
  expires:
    nanos: 62859000
    seconds: 1733832779
  name: 66ab3b42-6c41-5923-9e27-bfda6c767127
  revision: 9f9de9e6-39f8-469e-8c89-1fda10abd69b
spec:
  discover_eks:
    account_id: "123456789012"
    app_auto_discover: true
    clusters:
      MarcoUserTasks01:
        discovery_config: b118fe1b-4db6-4705-80cd-a7142063ff78
        discovery_group: aws-prod
        name: MarcoUserTasks01
        sync_time:
          nanos: 59118000
          seconds: 1733832179
      MarcoUserTasks03:
        discovery_config: b118fe1b-4db6-4705-80cd-a7142063ff78
        discovery_group: aws-prod
        name: MarcoUserTasks03
        sync_time:
          nanos: 40327000
          seconds: 1733832179
    region: eu-west-2
  integration: teleportdev
  issue_type: eks-agent-not-connecting
  state: OPEN
  task_type: discover-eks
version: v1
---
kind: user_task
metadata:
  expires:
    nanos: 70007000
    seconds: 1733832779
  name: dd4d668c-5cea-591d-a2e8-0f0c125db479
  revision: c9887168-0b80-4323-8551-7e1ebe98c8a4
spec:
  discover_eks:
    account_id: "123456789012"
    app_auto_discover: true
    clusters:
      MarcoUserTasks02:
        discovery_config: b118fe1b-4db6-4705-80cd-a7142063ff78
        discovery_group: aws-prod
        name: MarcoUserTasks02
        sync_time:
          nanos: 68190000
          seconds: 1733832179
    region: eu-west-2
  integration: teleportdev
  issue_type: eks-cluster-unreachable
  state: OPEN
  task_type: discover-eks
version: v1

marcoandredinis added the no-changelog (indicates that a PR does not require a changelog entry), backport/branch/v16, and backport/branch/v17 labels on Dec 10, 2024
marcoandredinis force-pushed the marco/usertasks_discovereks_proto branch from b841ab3 to b7a9d46 on December 11, 2024 08:49
marcoandredinis force-pushed the marco/usertasks_discovereks_impl branch from 2353a8d to 05458f1 on December 11, 2024 08:50
marcoandredinis marked this pull request as ready for review on December 11, 2024 09:48
marcoandredinis force-pushed the marco/usertasks_discovereks_proto branch 2 times, most recently from dbf3063 to 0e77b4d, on December 11, 2024 10:31
Base automatically changed from marco/usertasks_discovereks_proto to master December 11, 2024 11:09
appAutoDiscoverString := strconv.FormatBool(parts.AppAutoDiscover)
bs = append(bs, binary.LittleEndian.AppendUint64(nil, uint64(len(appAutoDiscoverString)))...)
bs = append(bs, []byte(appAutoDiscoverString)...)
return uuid.NewSHA1(discoverEKSNamespace, bs).String()
Contributor

I'm curious: do you know why instead of hashing a result of fmt.Sprintf (or even simply using plain strings), we decided to use this kind of pattern for generating task names? It seems brittle.

Contributor

Also: we are appending string lengths as if we were about to parse the resulting binary data, but we only hash it. What does appending lengths give us here?

Contributor

Oh, I think I can answer my own question: it's because we want to be able to unambiguously serialize groups of strings and be able to tell apart "ab"+"c" from "a"+"bc". Here's the thing: I still think it's brittle and very easy to get wrong if we repeat this pattern. Perhaps we could use some well-established serialization here? Protobuf, JSON?

Contributor Author

Using protobuf would add a dependency here, which I don't think we should introduce.

json.Marshal does not ensure determinism (at least not officially), so I would rather not use it.

I can pursue other types of hashing, but we are already using this method for DiscoverEC2 tasks.

Contributor

Got it. I guess the current way is our best choice, then.
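
The ambiguity this thread settles on can be shown with a small sketch. It is illustrative only: `naiveKey` and `lengthPrefixedKey` are hypothetical helpers, and the sketch hashes with plain `crypto/sha1` from the standard library rather than the `uuid.NewSHA1` call in the diff, so its outputs are not real task names.

```go
package main

import (
	"crypto/sha1"
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// naiveKey hashes the plain concatenation of its parts (hypothetical helper).
// Part boundaries are lost, so different groupings can produce the same input.
func naiveKey(parts ...string) string {
	var bs []byte
	for _, p := range parts {
		bs = append(bs, p...)
	}
	sum := sha1.Sum(bs)
	return hex.EncodeToString(sum[:])
}

// lengthPrefixedKey writes each part's length as a little-endian uint64
// before the part's bytes, mirroring the pattern in the diff (hypothetical helper).
func lengthPrefixedKey(parts ...string) string {
	var bs []byte
	for _, p := range parts {
		bs = binary.LittleEndian.AppendUint64(bs, uint64(len(p)))
		bs = append(bs, p...)
	}
	sum := sha1.Sum(bs)
	return hex.EncodeToString(sum[:])
}

func main() {
	// Both calls hash the bytes "abc", so the keys collide.
	fmt.Println(naiveKey("ab", "c") == naiveKey("a", "bc")) // true

	// The length prefixes make the two byte sequences differ.
	fmt.Println(lengthPrefixedKey("ab", "c") == lengthPrefixedKey("a", "bc")) // false
}
```

The lengths are never parsed back; they only keep the hashed byte sequence unambiguous, which is the property the reviewer identified.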

// - EKS is not reachable by the Teleport Auth Service
// In the first case, it should be handled in a pre-install check, however, for the second one, we'll get the following message:
// > Kubernetes cluster unreachable: Get \"https://<longid>.gr7.<region>.eks.amazonaws.com/version\": dial tcp: lookup <longid>.gr7.<region>.eks.amazonaws.com: no such host"
if strings.Contains(checkErr.Error(), "Kubernetes cluster unreachable: Get") && strings.Contains(checkErr.Error(), "eks.amazonaws.com: no such host") {
Contributor

Is this error message also generated by us? If so, can we put a link to the place that generates it, similar to what we wrote above to "link" with the UI logic? (Or better yet, if it's our Go code, we could extract constants.)

Contributor Author

It's not. It comes from the Helm kube client.

// awsEKSTasks contains the Discover EKS User Tasks that must be reported to the user.
type awsEKSTasks struct {
mu sync.RWMutex
// clusterIssues maps the Discover EKS User Task grouping parts to a set of clusters metadata.
Contributor

My parser failed on this sentence. :D Can you explain what the [string] key in this map means?

Contributor Author

I've simplified the types.
Can you please take another look?
The string was the EKS Cluster name.

appAutoDiscover bool
}

// iterationStarted clears out any in memory issues that were recorded.
Contributor

Hmmm... Was reset previously named iterationStarted?

Contributor Author

You are correct 😅
Thank you, fixed.

d.mu.Lock()
defer d.mu.Unlock()

d.clusterIssues = make(map[awsEKSTaskKey]map[string]*usertasksv1.DiscoverEKSCluster)
Contributor

It might be useful to extract this to a named type.

Contributor Author

I've simplified the types
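
For readers following the thread, here is a minimal sketch of the map shape under discussion: a mutex-guarded map from a grouping key to a set of clusters indexed by EKS cluster name. It reflects the line shown in the reviewed diff, not necessarily the simplified types that were merged. `taskKey`, `cluster`, `eksTasks`, and the method names other than `reset` are hypothetical stand-ins, and the key fields are assumptions taken from the demo output.

```go
package main

import (
	"fmt"
	"sync"
)

// taskKey stands in for awsEKSTaskKey. The fields are assumptions based on
// the grouping visible in the demo output.
type taskKey struct {
	integration     string
	issueType       string
	accountID       string
	region          string
	appAutoDiscover bool
}

// cluster stands in for usertasksv1.DiscoverEKSCluster.
type cluster struct {
	name string
}

// eksTasks stands in for awsEKSTasks: issues grouped by key, then by cluster name.
type eksTasks struct {
	mu            sync.RWMutex
	clusterIssues map[taskKey]map[string]*cluster
}

// reset clears any in-memory issues that were recorded.
func (t *eksTasks) reset() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.clusterIssues = make(map[taskKey]map[string]*cluster)
}

// add records a cluster under its grouping key, creating maps as needed.
func (t *eksTasks) add(k taskKey, c *cluster) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.clusterIssues == nil {
		t.clusterIssues = make(map[taskKey]map[string]*cluster)
	}
	if t.clusterIssues[k] == nil {
		t.clusterIssues[k] = make(map[string]*cluster)
	}
	t.clusterIssues[k][c.name] = c
}

// count returns how many clusters are recorded under a grouping key.
func (t *eksTasks) count(k taskKey) int {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return len(t.clusterIssues[k])
}

func main() {
	k := taskKey{"teleportdev", "eks-agent-not-connecting", "123456789012", "eu-west-2", true}
	var tasks eksTasks
	tasks.add(k, &cluster{name: "MarcoUserTasks01"})
	tasks.add(k, &cluster{name: "MarcoUserTasks03"})
	fmt.Println(tasks.count(k)) // 2
	tasks.reset()
	fmt.Println(tasks.count(k)) // 0
}
```

Keying the inner map by cluster name means a cluster that fails repeatedly within one iteration is recorded once, which matches the one-entry-per-cluster layout in the demo output.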

marcoandredinis force-pushed the marco/usertasks_discovereks_impl branch from 05458f1 to d4fae69 on December 13, 2024 11:29
marcoandredinis added this pull request to the merge queue on Dec 13, 2024
Merged via the queue into master with commit d37da51 Dec 13, 2024
41 checks passed
marcoandredinis deleted the marco/usertasks_discovereks_impl branch on December 13, 2024 12:05
@public-teleport-github-review-bot

@marcoandredinis See the table below for backport results.

| Branch | Result |
|---|---|
| branch/v17 | Failed |

marcoandredinis added a commit that referenced this pull request Dec 16, 2024
…50024)

* UserTasks for Discover EKS: create tasks when auto enrolling fails

This PRs changes the DiscoveryService to ensure it creates UserTasks
when EKS Clusters fail to auto-enroll.

* simplify types
marcoandredinis added a commit that referenced this pull request Dec 16, 2024
marcoandredinis added a commit that referenced this pull request Dec 17, 2024
marcoandredinis added a commit that referenced this pull request Dec 17, 2024
github-merge-queue bot pushed a commit that referenced this pull request Dec 17, 2024
…50024) (#50272)

Labels
backport/branch/v17, discovery, no-changelog, size/lg

3 participants