[v17] Add debugging steps for DiscoverEC2 User Task issues #50329

Merged
merged 1 commit into from
Dec 17, 2024
8 changes: 4 additions & 4 deletions api/types/usertasks/object.go
@@ -154,8 +154,8 @@ const (
AutoDiscoverEC2IssueSSMInvocationFailure = "ec2-ssm-invocation-failure"
)

-// discoverEC2IssueTypes is a list of issue types that can occur when trying to auto enroll EC2 instances.
-var discoverEC2IssueTypes = []string{
+// DiscoverEC2IssueTypes is a list of issue types that can occur when trying to auto enroll EC2 instances.
+var DiscoverEC2IssueTypes = []string{
AutoDiscoverEC2IssueSSMInstanceNotRegistered,
AutoDiscoverEC2IssueSSMInstanceConnectionLost,
AutoDiscoverEC2IssueSSMInstanceUnsupportedOS,
@@ -261,8 +261,8 @@ func validateDiscoverEC2TaskType(ut *usertasksv1.UserTask) error {
)
}

-if !slices.Contains(discoverEC2IssueTypes, ut.GetSpec().IssueType) {
-return trace.BadParameter("invalid issue type state, allowed values: %v", discoverEC2IssueTypes)
+if !slices.Contains(DiscoverEC2IssueTypes, ut.GetSpec().IssueType) {
+return trace.BadParameter("invalid issue type state, allowed values: %v", DiscoverEC2IssueTypes)
}

if len(ut.Spec.DiscoverEc2.Instances) == 0 {
39 changes: 39 additions & 0 deletions lib/usertasks/descriptions.go
@@ -0,0 +1,39 @@
/*
* Teleport
* Copyright (C) 2024 Gravitational, Inc.
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/

package usertasks

import (
	"embed"
	"fmt"
)

//go:embed descriptions/*.md
var descriptionsFS embed.FS

// DescriptionForDiscoverEC2Issue returns the description of the issue and fixing steps.
// The returned string contains a markdown document.
// If the issue type is not recognized or doesn't have a specific description, then an empty string is returned.
func DescriptionForDiscoverEC2Issue(issueType string) string {
	filename := fmt.Sprintf("descriptions/%s.md", issueType)
	bs, err := descriptionsFS.ReadFile(filename)
	if err != nil {
		return ""
	}
	return string(bs)
}
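The lookup above can be tried without embedded files. The following is a minimal sketch, not the code from this PR: it assumes the same `descriptions/<issue type>.md` naming, but reads from an `fs.FS` so that an in-memory `fstest.MapFS` can stand in for `descriptionsFS`. The names `descriptionFor` and `exampleFS`, and the file contents, are hypothetical.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// descriptionFor mirrors the lookup in DescriptionForDiscoverEC2Issue, but
// takes an fs.FS so it can run against an in-memory file system.
// It returns an empty string when no matching file exists.
func descriptionFor(fsys fs.FS, issueType string) string {
	bs, err := fs.ReadFile(fsys, fmt.Sprintf("descriptions/%s.md", issueType))
	if err != nil {
		return ""
	}
	return string(bs)
}

// exampleFS is a hypothetical stand-in for the embedded descriptions directory.
func exampleFS() fs.FS {
	return fstest.MapFS{
		"descriptions/ec2-ssm-script-failure.md": &fstest.MapFile{
			Data: []byte("The install script returned an error.\n"),
		},
	}
}

func main() {
	fsys := exampleFS()
	// Known issue type: the file content is returned.
	fmt.Printf("%q\n", descriptionFor(fsys, "ec2-ssm-script-failure"))
	// Unknown issue type: the read fails and an empty string is returned.
	fmt.Printf("%q\n", descriptionFor(fsys, "unknown-issue"))
}
```

Returning an empty string instead of an error keeps callers simple, at the cost of hiding a missing file; the test added in this PR covers that case for every known issue type.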
25 changes: 25 additions & 0 deletions lib/usertasks/descriptions/ec2-ssm-agent-connection-lost.md
@@ -0,0 +1,25 @@
Auto-enrolling EC2 instances requires the SSM Agent to be installed and running on them.
Some instances appear to have lost their connection to AWS Systems Manager.

You can see which instances lost connection using the [SSM Fleet Manager](https://console.aws.amazon.com/systems-manager/fleet-manager/managed-nodes).

The most common causes of instances losing connection are:

**SSM Agent is not running**

Ensure the SSM Agent is running on the instance and is not reporting any errors.
Please check the instructions [here](https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent-status-and-restart.html).

**SSM Agent can't reach the AWS Systems Manager service**

Ensure the instance's security groups allow outbound connections to AWS Systems Manager endpoints.
Allowing outbound traffic on port 443 is enough for the agent to connect to AWS.

**Instance is missing IAM policy**

The SSM Agent requires the `AmazonSSMManagedInstanceCore` managed policy.
Ensure the instance has an IAM instance profile and that its role includes the above policy.
For more information please refer to [this page](https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-getting-started-instance-profile.html).

After following the steps above, you can mark the task as resolved.
Teleport will try to auto-enroll these instances again.
25 changes: 25 additions & 0 deletions lib/usertasks/descriptions/ec2-ssm-agent-not-registered.md
@@ -0,0 +1,25 @@
Auto-enrolling EC2 instances requires the SSM Agent to be installed and running on them.
Some instances failed to connect to AWS Systems Manager.

You can see which instances were able to connect by opening the [SSM Fleet Manager](https://console.aws.amazon.com/systems-manager/fleet-manager/managed-nodes).

The most common reasons for instances not being visible are:

**SSM Agent is not running**

Ensure the SSM Agent is installed and running on the instance.
Please check the instructions [here](https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent-status-and-restart.html).

**SSM Agent can't reach the AWS Systems Manager service**

Ensure the instance's security groups allow outbound connections to AWS Systems Manager endpoints.
Allowing outbound traffic on port 443 is enough for the agent to connect to AWS.

**Instance is missing IAM policy**

The SSM Agent requires the `AmazonSSMManagedInstanceCore` managed policy.
Ensure the instance has an IAM instance profile and that its role includes the above policy.
For more information please refer to [this page](https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-getting-started-instance-profile.html).

After following the steps above, you can mark the task as resolved.
Teleport will try to auto-enroll these instances again.
19 changes: 19 additions & 0 deletions lib/usertasks/descriptions/ec2-ssm-invocation-failure.md
@@ -0,0 +1,19 @@
Teleport failed to access the SSM Agent to auto-enroll the instance.
Some instances failed to communicate with the AWS Systems Manager service to execute the install script.

This usually happens for one of the following reasons:

**Missing policies**

The IAM Role used by the integration might be missing some required permissions.
Ensure the following actions are allowed in the IAM Role used by the integration:
- `ec2:DescribeInstances`
- `ssm:DescribeInstanceInformation`
- `ssm:GetCommandInvocation`
- `ssm:ListCommandInvocations`
- `ssm:SendCommand`

**SSM Document is invalid**

Teleport uses an SSM Document to run an installation script.
If the document has been changed or removed, the installation might no longer work.
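As an illustration only (not part of the description file), the actions listed under "Missing policies" can be assembled into an IAM policy document. This is a minimal sketch: the helper names are hypothetical, and `Resource: "*"` is an assumption made for brevity, so scope it to your own resources before using anything like this.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statement and policyDocument model the subset of the IAM policy JSON
// format needed for this sketch.
type statement struct {
	Effect   string   `json:"Effect"`
	Action   []string `json:"Action"`
	Resource string   `json:"Resource"`
}

type policyDocument struct {
	Version   string      `json:"Version"`
	Statement []statement `json:"Statement"`
}

// requiredActions returns the actions listed in the description above.
func requiredActions() []string {
	return []string{
		"ec2:DescribeInstances",
		"ssm:DescribeInstanceInformation",
		"ssm:GetCommandInvocation",
		"ssm:ListCommandInvocations",
		"ssm:SendCommand",
	}
}

// buildPolicy renders a policy document allowing the required actions.
// Resource "*" is an assumption; narrow it for real deployments.
func buildPolicy() ([]byte, error) {
	doc := policyDocument{
		Version: "2012-10-17",
		Statement: []statement{{
			Effect:   "Allow",
			Action:   requiredActions(),
			Resource: "*",
		}},
	}
	return json.MarshalIndent(doc, "", "  ")
}

func main() {
	out, err := buildPolicy()
	if err != nil {
		fmt.Println("failed to build policy:", err)
		return
	}
	fmt.Println(string(out))
}
```

The printed JSON can be compared against the policy attached to the IAM Role used by the integration to spot a missing action.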
3 changes: 3 additions & 0 deletions lib/usertasks/descriptions/ec2-ssm-script-failure.md
@@ -0,0 +1,3 @@
Teleport was able to reach the SSM Agent inside the EC2 instance, but the install script returned an error.

Click the Invocation URL below for further details on why the script failed.
3 changes: 3 additions & 0 deletions lib/usertasks/descriptions/ec2-ssm-unsupported-os.md
@@ -0,0 +1,3 @@
Auto-enrolling EC2 instances requires a compatible operating system.

Teleport only supports Linux instances when auto-enrolling them into the cluster.
33 changes: 33 additions & 0 deletions lib/usertasks/descriptions_test.go
@@ -0,0 +1,33 @@
/*
* Teleport
* Copyright (C) 2024 Gravitational, Inc.
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/

package usertasks

import (
	"testing"

	"github.com/stretchr/testify/require"

	usertasksapi "github.com/gravitational/teleport/api/types/usertasks"
)

func TestAllDescriptions(t *testing.T) {
	for _, issueType := range usertasksapi.DiscoverEC2IssueTypes {
		require.NotEmpty(t, DescriptionForDiscoverEC2Issue(issueType), "issue type %q is missing descriptions/%s.md file", issueType, issueType)
	}
}
6 changes: 5 additions & 1 deletion lib/web/ui/usertask.go
@@ -24,6 +24,7 @@ import (
"github.com/gravitational/trace"

usertasksv1 "github.com/gravitational/teleport/api/gen/proto/go/teleport/usertasks/v1"
+"github.com/gravitational/teleport/lib/usertasks"
)

// UserTask describes UserTask fields.
@@ -45,8 +46,10 @@ type UserTask struct {

// UserTaskDetail contains all the details for a User Task.
type UserTaskDetail struct {
-// UserTask has the basic fields that all taks include.
+// UserTask has the basic fields that all tasks include.
UserTask
+// Description is a markdown document that explains the issue and how to fix it.
+Description string `json:"description,omitempty"`
// DiscoverEC2 contains the task details for the DiscoverEC2 tasks.
DiscoverEC2 *usertasksv1.DiscoverEC2 `json:"discoverEc2,omitempty"`
}
@@ -91,6 +94,7 @@ func MakeUserTasks(uts []*usertasksv1.UserTask) []UserTask {
func MakeDetailedUserTask(ut *usertasksv1.UserTask) UserTaskDetail {
return UserTaskDetail{
UserTask: MakeUserTask(ut),
+Description: usertasks.DescriptionForDiscoverEC2Issue(ut.GetSpec().GetIssueType()),
DiscoverEC2: ut.GetSpec().GetDiscoverEc2(),
}
}
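Because `Description` is tagged `omitempty`, an issue type without a description file leaves the field out of the JSON response rather than sending an empty string. The sketch below shows that behavior with a simplified stand-in struct: `taskDetail`, its `name` field, and `encode` are hypothetical and do not reflect the real `UserTaskDetail` layout.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// taskDetail is a simplified, hypothetical stand-in for UserTaskDetail.
type taskDetail struct {
	Name        string `json:"name"`
	Description string `json:"description,omitempty"`
}

// encode marshals the detail to JSON, returning an empty string on error.
func encode(d taskDetail) string {
	bs, err := json.Marshal(d)
	if err != nil {
		return ""
	}
	return string(bs)
}

func main() {
	// With a description, the field is present in the output.
	fmt.Println(encode(taskDetail{Name: "task-1", Description: "# Fix steps"}))
	// Without one, omitempty drops the field entirely.
	fmt.Println(encode(taskDetail{Name: "task-1"}))
}
```

A client consuming this response would therefore need to treat a missing `description` key as "no guidance available" rather than expecting an empty string.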