Skip to content

Terraform module to provision an AWS AutoScaling Group, IAM Role, and Security Group for EKS Workers

License

Notifications You must be signed in to change notification settings

SMR39/terraform-aws-eks-workers

 
 

Repository files navigation

README Header

Cloud Posse

terraform-aws-eks-workers Build Status Latest Release Slack Community

Terraform module to provision AWS resources to run EC2 worker nodes for Elastic Container Service for Kubernetes.

Instantiate it multiple times to create many EKS worker node pools with specific settings such as GPUs, EC2 instance types, or autoscale parameters.


This project is part of our comprehensive "SweetOps" approach towards DevOps.

Terraform Open Source Modules

It's 100% Open Source and licensed under the APACHE2.

We literally have hundreds of terraform modules that are Open Source and well-maintained. Check them out!

Introduction

The module provisions the following resources:

  • IAM Role and Instance Profile to allow Kubernetes nodes to access other AWS services
  • Security Group with rules for EKS workers to allow networking traffic
  • AutoScaling Group with Launch Template to configure and launch worker instances
  • AutoScaling Policies and CloudWatch Metric Alarms to monitor CPU utilization on the EC2 instances and scale the number of instance in the AutoScaling Group up or down. If you don't want to use the provided functionality, or want to provide your own policies, disable it by setting the variable autoscaling_policies_enabled to "false".

Usage

For a complete example, see examples/complete

provider "aws" {
  region = "${var.region}"
}

locals {
  # The usage of the specific kubernetes.io/cluster/* resource tags below are required
  # for EKS and Kubernetes to discover and manage networking resources
  # https://www.terraform.io/docs/providers/aws/guides/eks-getting-started.html#base-vpc-networking
  tags = "${merge(var.tags, map("kubernetes.io/cluster/${var.cluster_name}", "shared"))}"
}

data "aws_availability_zones" "available" {}

module "vpc" {
  source     = "git::https://github.com/cloudposse/terraform-aws-vpc.git?ref=master"
  namespace  = "${var.namespace}"
  stage      = "${var.stage}"
  name       = "${var.name}"
  attributes = "${var.attributes}"
  tags       = "${local.tags}"
  cidr_block = "10.0.0.0/16"
}

module "subnets" {
  source              = "git::https://github.com/cloudposse/terraform-aws-dynamic-subnets.git?ref=master"
  availability_zones  = ["${data.aws_availability_zones.available.names}"]
  namespace           = "${var.namespace}"
  stage               = "${var.stage}"
  name                = "${var.name}"
  attributes          = "${var.attributes}"
  tags                = "${local.tags}"
  region              = "${var.region}"
  vpc_id              = "${module.vpc.vpc_id}"
  igw_id              = "${module.vpc.igw_id}"
  cidr_block          = "${module.vpc.vpc_cidr_block}"
  nat_gateway_enabled = "true"
}

module "eks_workers" {
  source                             = "git::https://github.com/cloudposse/terraform-aws-eks-workers.git?ref=master"
  namespace                          = "${var.namespace}"
  stage                              = "${var.stage}"
  name                               = "${var.name}"
  attributes                         = "${var.attributes}"
  tags                               = "${var.tags}"
  image_id                           = "${var.image_id}"
  eks_worker_ami_name_filter         = "${var.eks_worker_ami_name_filter}"
  instance_type                      = "${var.instance_type}"
  vpc_id                             = "${module.vpc.vpc_id}"
  subnet_ids                         = ["${module.subnets.public_subnet_ids}"]
  health_check_type                  = "${var.health_check_type}"
  min_size                           = "${var.min_size}"
  max_size                           = "${var.max_size}"
  wait_for_capacity_timeout          = "${var.wait_for_capacity_timeout}"
  associate_public_ip_address        = "${var.associate_public_ip_address}"
  cluster_name                       = "${var.cluster_name}"
  cluster_endpoint                   = "${var.cluster_endpoint}"
  cluster_certificate_authority_data = "${var.cluster_certificate_authority_data}"
  cluster_security_group_id          = "${var.cluster_security_group_id}"

  # Auto-scaling policies and CloudWatch metric alarms
  autoscaling_policies_enabled           = "${var.autoscaling_policies_enabled}"
  cpu_utilization_high_threshold_percent = "${var.cpu_utilization_high_threshold_percent}"
  cpu_utilization_low_threshold_percent  = "${var.cpu_utilization_low_threshold_percent}"
}

Makefile Targets

Available targets:

  help                                Help screen
  help/all                            Display help for all targets
  help/short                          This help short screen
  lint                                Lint terraform code

Inputs

Name Description Type Default Required
allowed_cidr_blocks List of CIDR blocks to be allowed to connect to the worker nodes list <list> no
allowed_security_groups List of Security Group IDs to be allowed to connect to the worker nodes list <list> no
associate_public_ip_address Associate a public IP address with an instance in a VPC string false no
attributes Additional attributes (e.g. 1) list <list> no
autoscaling_policies_enabled Whether to create aws_autoscaling_policy and aws_cloudwatch_metric_alarm resources to control Auto Scaling string true no
aws_iam_instance_profile_name The name of the existing instance profile that will be used in autoscaling group for EKS workers. If empty will create a new instance profile. string `` no
block_device_mappings Specify volumes to attach to the instance besides the volumes specified by the AMI list <list> no
bootstrap_extra_args Passed to the bootstrap.sh script to enable --kublet-extra-args or --use-max-pods. string `` no
cluster_certificate_authority_data The base64 encoded certificate data required to communicate with the cluster string - yes
cluster_endpoint EKS cluster endpoint string - yes
cluster_name The name of the EKS cluster string - yes
cluster_security_group_id Security Group ID of the EKS cluster string - yes
cpu_utilization_high_evaluation_periods The number of periods over which data is compared to the specified threshold string 2 no
cpu_utilization_high_period_seconds The period in seconds over which the specified statistic is applied string 300 no
cpu_utilization_high_statistic The statistic to apply to the alarm's associated metric. Either of the following is supported: SampleCount, Average, Sum, Minimum, Maximum string Average no
cpu_utilization_high_threshold_percent The value against which the specified statistic is compared string 90 no
cpu_utilization_low_evaluation_periods The number of periods over which data is compared to the specified threshold string 2 no
cpu_utilization_low_period_seconds The period in seconds over which the specified statistic is applied string 300 no
cpu_utilization_low_statistic The statistic to apply to the alarm's associated metric. Either of the following is supported: SampleCount, Average, Sum, Minimum, Maximum string Average no
cpu_utilization_low_threshold_percent The value against which the specified statistic is compared string 10 no
credit_specification Customize the credit specification of the instances list <list> no
default_cooldown The amount of time, in seconds, after a scaling activity completes before another scaling activity can start string 300 no
delimiter Delimiter to be used between namespace, stage, name and attributes string - no
disable_api_termination If true, enables EC2 Instance Termination Protection string false no
ebs_optimized If true, the launched EC2 instance will be EBS-optimized string false no
eks_worker_ami_name_filter AMI name filter to lookup the most recent EKS AMI if image_id is not provided string amazon-eks-node-* no
eks_worker_ami_name_regex A regex string to apply to the AMI list returned by AWS string ^amazon-eks-node-[1-9,\.]+-v\d{8}$ no
elastic_gpu_specifications Specifications of Elastic GPU to attach to the instances list <list> no
enable_monitoring Enable/disable detailed monitoring string true no
enabled Whether to create the resources. Set to false to prevent the module from creating any resources string true no
enabled_metrics A list of metrics to collect. The allowed values are GroupMinSize, GroupMaxSize, GroupDesiredCapacity, GroupInServiceInstances, GroupPendingInstances, GroupStandbyInstances, GroupTerminatingInstances, GroupTotalInstances list <list> no
environment Environment, e.g. 'testing', 'UAT' string `` no
force_delete Allows deleting the autoscaling group without waiting for all instances in the pool to terminate. You can force an autoscaling group to delete even if it's in the process of scaling a resource. Normally, Terraform drains all the instances before deleting the group. This bypasses that behavior and potentially leaves resources dangling string false no
health_check_grace_period Time (in seconds) after instance comes into service before checking health string 300 no
health_check_type Controls how health checking is done. Valid values are EC2 or ELB string EC2 no
image_id EC2 image ID to launch. If not provided, the module will lookup the most recent EKS AMI. See https://docs.aws.amazon.com/eks/latest/userguide/eks-optimized-ami.html for more details on EKS-optimized images string `` no
instance_initiated_shutdown_behavior Shutdown behavior for the instances. Can be stop or terminate string terminate no
instance_market_options The market (purchasing) option for the instances list <list> no
instance_type Instance type to launch string - yes
key_name SSH key name that should be used for the instance string `` no
load_balancers A list of elastic load balancer names to add to the autoscaling group names. Only valid for classic load balancers. For ALBs, use target_group_arns instead list <list> no
max_size The maximum size of the autoscale group string - yes
metrics_granularity The granularity to associate with the metrics to collect. The only valid value is 1Minute string 1Minute no
min_elb_capacity Setting this causes Terraform to wait for this number of instances to show up healthy in the ELB only on creation. Updates will not wait on ELB instance number changes string 0 no
min_size The minimum size of the autoscale group string - yes
name Solution name, e.g. 'app' or 'cluster' string app no
namespace Namespace, which could be your organization name, e.g. 'eg' or 'cp' string - yes
placement The placement specifications of the instances list <list> no
placement_group The name of the placement group into which you'll launch your instances, if any string `` no
protect_from_scale_in Allows setting instance protection. The autoscaling group will not select instances with this setting for terminination during scale in events string false no
scale_down_adjustment_type Specifies whether the adjustment is an absolute number or a percentage of the current capacity. Valid values are ChangeInCapacity, ExactCapacity and PercentChangeInCapacity string ChangeInCapacity no
scale_down_cooldown_seconds The amount of time, in seconds, after a scaling activity completes and before the next scaling activity can start string 300 no
scale_down_policy_type The scalling policy type, either SimpleScaling, StepScaling or TargetTrackingScaling string SimpleScaling no
scale_down_scaling_adjustment The number of instances by which to scale. scale_down_scaling_adjustment determines the interpretation of this number (e.g. as an absolute number or as a percentage of the existing Auto Scaling group size). A positive increment adds to the current capacity and a negative value removes from the current capacity string -1 no
scale_up_adjustment_type Specifies whether the adjustment is an absolute number or a percentage of the current capacity. Valid values are ChangeInCapacity, ExactCapacity and PercentChangeInCapacity string ChangeInCapacity no
scale_up_cooldown_seconds The amount of time, in seconds, after a scaling activity completes and before the next scaling activity can start string 300 no
scale_up_policy_type The scalling policy type, either SimpleScaling, StepScaling or TargetTrackingScaling string SimpleScaling no
scale_up_scaling_adjustment The number of instances by which to scale. scale_up_adjustment_type determines the interpretation of this number (e.g. as an absolute number or as a percentage of the existing Auto Scaling group size). A positive increment adds to the current capacity and a negative value removes from the current capacity string 1 no
service_linked_role_arn The ARN of the service-linked role that the ASG will use to call other AWS services string `` no
stage Stage, e.g. 'prod', 'staging', 'dev', or 'test' string - yes
subnet_ids A list of subnet IDs to launch resources in list - yes
suspended_processes A list of processes to suspend for the AutoScaling Group. The allowed values are Launch, Terminate, HealthCheck, ReplaceUnhealthy, AZRebalance, AlarmNotification, ScheduledActions, AddToLoadBalancer. Note that if you suspend either the Launch or Terminate process types, it can prevent your autoscaling group from functioning properly. list <list> no
tags Additional tags (e.g. { BusinessUnit = "XYZ" } map <map> no
target_group_arns A list of aws_alb_target_group ARNs, for use with Application Load Balancing list <list> no
termination_policies A list of policies to decide how the instances in the auto scale group should be terminated. The allowed values are OldestInstance, NewestInstance, OldestLaunchConfiguration, ClosestToNextInstanceHour, Default list <list> no
use_custom_image_id If set to true, will use variable image_id to run EKS workers inside autoscaling group string false no
vpc_id VPC ID for the EKS cluster string - yes
wait_for_capacity_timeout A maximum duration that Terraform should wait for ASG instances to be healthy before timing out. Setting this to '0' causes Terraform to skip all Capacity Waiting behavior string 10m no
wait_for_elb_capacity Setting this will cause Terraform to wait for exactly this number of healthy instances in all attached load balancers on both create and update operations. Takes precedence over min_elb_capacity behavior string false no

Outputs

Name Description
autoscaling_group_arn ARN of the AutoScaling Group
autoscaling_group_default_cooldown Time between a scaling activity and the succeeding scaling activity
autoscaling_group_desired_capacity The number of Amazon EC2 instances that should be running in the group
autoscaling_group_health_check_grace_period Time after instance comes into service before checking health
autoscaling_group_health_check_type EC2 or ELB. Controls how health checking is done
autoscaling_group_id The AutoScaling Group ID
autoscaling_group_max_size The maximum size of the AutoScaling Group
autoscaling_group_min_size The minimum size of the AutoScaling Group
autoscaling_group_name The AutoScaling Group name
config_map_aws_auth Kubernetes ConfigMap configuration for worker nodes to join the EKS cluster. https://www.terraform.io/docs/providers/aws/guides/eks-getting-started.html#required-kubernetes-configuration-to-join-worker-nodes
launch_template_arn ARN of the launch template
launch_template_id The ID of the launch template
security_group_arn ARN of the worker nodes Security Group
security_group_id ID of the worker nodes Security Group
security_group_name Name of the worker nodes Security Group
worker_role_arn ARN of the worker nodes IAM role
worker_role_name Name of the worker nodes IAM role

Share the Love

Like this project? Please give it a ★ on our GitHub! (it helps us a lot)

Are you using this project or any of our other projects? Consider leaving a testimonial. =)

Related Projects

Check out these related projects.

Help

Got a question?

File a GitHub issue, send us an email or join our Slack Community.

README Commercial Support

Commercial Support

Work directly with our team of DevOps experts via email, slack, and video conferencing.

We provide commercial support for all of our Open Source projects. As a Dedicated Support customer, you have access to our team of subject matter experts at a fraction of the cost of a full-time engineer.

E-Mail

  • Questions. We'll use a Shared Slack channel between your team and ours.
  • Troubleshooting. We'll help you triage why things aren't working.
  • Code Reviews. We'll review your Pull Requests and provide constructive feedback.
  • Bug Fixes. We'll rapidly work to fix any bugs in our projects.
  • Build New Terraform Modules. We'll develop original modules to provision infrastructure.
  • Cloud Architecture. We'll assist with your cloud strategy and design.
  • Implementation. We'll provide hands-on support to implement our reference architectures.

Terraform Module Development

Are you interested in custom Terraform module development? Submit your inquiry using our form today and we'll get back to you ASAP.

Slack Community

Join our Open Source Community on Slack. It's FREE for everyone! Our "SweetOps" community is where you get to talk with others who share a similar vision for how to rollout and manage infrastructure. This is the best place to talk shop, ask questions, solicit feedback, and work together as a community to build totally sweet infrastructure.

Newsletter

Signup for our newsletter that covers everything on our technology radar. Receive updates on what we're up to on GitHub as well as awesome new projects we discover.

Contributing

Bug Reports & Feature Requests

Please use the issue tracker to report any bugs or file feature requests.

Developing

If you are interested in being a contributor and want to get involved in developing this project or help out with our other projects, we would love to hear from you! Shoot us an email.

In general, PRs are welcome. We follow the typical "fork-and-pull" Git workflow.

  1. Fork the repo on GitHub
  2. Clone the project to your own machine
  3. Commit changes to your own branch
  4. Push your work back up to your fork
  5. Submit a Pull Request so that we can review your changes

NOTE: Be sure to merge the latest changes from "upstream" before making a pull request!

Copyright

Copyright © 2017-2019 Cloud Posse, LLC

License

License

See LICENSE for full details.

Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.

Trademarks

All other trademarks referenced herein are the property of their respective owners.

About

This project is maintained and funded by Cloud Posse, LLC. Like it? Please let us know by leaving a testimonial!

Cloud Posse

We're a DevOps Professional Services company based in Los Angeles, CA. We ❤️ Open Source Software.

We offer paid support on all of our projects.

Check out our other projects, follow us on twitter, apply for a job, or hire us to help with your cloud strategy and implementation.

Contributors

Erik Osterman
Erik Osterman
Andriy Knysh
Andriy Knysh
Igor Rodionov
Igor Rodionov

README Footer Beacon

About

Terraform module to provision an AWS AutoScaling Group, IAM Role, and Security Group for EKS Workers

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • HCL 95.1%
  • Smarty 2.4%
  • Makefile 1.3%
  • Shell 1.2%