
Terraform Style Guide




Project Structure

Project Structure, Documentation, and our opinionated approach on how to Terraform


Table of Contents

  1. Project Structure
  2. TLDR
  3. Overview
  4. Basic Usage
  5. Conventions for Variable Names and the use of locals
  6. Data Sources
  7. Resource Names
  8. Argument Names
  9. Use of modules
  10. Remote State
  11. Naming of GitHub repos
  12. Handling Secrets
  13. Direnv and dot env files
  14. HCL and Terraform Style Guides
  15. GitHub Pull Request and Deployment Steps

TLDR

  • We used Charity Majors' control repo model as our starting point
  • We are trying to adhere to the 12-factor app methodology
  • All environment-specific .tf files and variables go in the env-<environment> directories
  • We use the "base" env fairly sparingly
  • Spines and Spikes
    • A Spine is a "parent" repo that contains the foundation for application configuration, including Hosted Zones and VPC
    • A Spike is the configuration for an application deployment that uses the resources provided in the Spine

And yes... "Spine" and "Spike" were inspired by @tfhartmann watching far too much children's television recently:

Figure 1-1


Overview

We based our Terraform design patterns on the "control repo" pattern described by Charity Majors, but have modified it slightly to enable collaborators to work on services and applications that may use the core infrastructure but don't necessarily need to be coded alongside it.

For example, to deploy or manage a particular application you may need to use a particular set of networks and resources within a given VPC, but don't necessarily need to manage the networks. To address this, we came up with the concept of Spines and Spikes. A Spine is where resources such as the VPC, DNS, public networks, private networks, and core services live. Spikes are complementary repos that use or reference the Spine's infrastructure but can deploy and redeploy the application independently of the Spine. This is accomplished through the use of Terraform data sources and the magic of Terraform shared state.

We are following a pretty standard multi-tier pattern where we have an infrastructure repo with three tiers: dev, stage, and production; e.g. FitnessKeeper/terraform-runkeeper. We use that control repo to build the VPC, ECS clusters, DNS zones, and other resources that can be presented as a platform for use by services.

Services are created as atomic control repos, using the FitnessKeeper/terraform-reference repo as a skeleton. We have distinct state files for each of the tiers. This way, we can make changes to the state of a service living atop our infrastructure without having to push changes to the underlying resources.


Basic usage of the terraform-reference repo.

  • There are two ways of getting started with the reference repo:
    • Clone the repo using: git clone https://github.com/FitnessKeeper/terraform-reference
    • Clone or move the repo into an existing code repo under a terraform/ directory in the root of the project. There are a couple of ways to do this:
      • If starting a new project:
        • Download the .zip file from GitHub into the root of your new project/git repo
        • unzip the file: unzip terraform-reference-master.zip
        • Rename the directory: mv terraform-reference-master terraform
        • git add terraform to add it to the repo, then follow the instructions below
      • If migrating a project from an older pattern (for example, if you have an app repo and a terraform-app repo where your terraform code lives and you would like to consolidate them):
        • cd $HOME/terraform-project
        • mkdir terraform
        • mv !(terraform) terraform (this uses bash's extglob pattern; make sure you've moved all the hidden files too)
        • git commit -a -m "Preparing old project for move"
        • Move to your new project, i.e. the repo you want to move the terraform code into. (app repo)
        • cd $HOME/app-project
        • git remote add temp $HOME/terraform-project
        • git fetch temp
        • git merge --allow-unrelated-histories temp/master
        • git remote rm temp
  • Edit .env in the root of the repo; in particular, make sure you add a TF_PROJECT_NAME environment variable
  • Initialize variables.tf:
    • This only needs to be done once.
    • When the repo is created, run ./init-variables.tf.sh
  • Remove the old origin: git remote rm origin
  • Add your new repo as the origin: git remote add origin https://github.com/FitnessKeeper/terraform-<service>.git
  • Commit your changes
  • git push -u origin master
  • Edit variables.tf to reflect your new service

To use an environment in the control repo.

  • cd into the base dir for the env you want to work on
    • cd terraform-<service>/env-development/
  • ./init.sh to initialize your environment
  • Once you have run the init.sh script, make sure to delete init.sh from your repo.
    • This step only needs to be run once.
  • terraform plan to manage all the things!

A word about the "Base env"

Charity described the base env as:

“Base” has a few resources that don’t correspond to any environment — s3 buckets, certain IAM roles and policies, the root route53 zone

In our environment, we haven't yet found uses for the base env. By and large it exists for completeness.


The magic of "${var.env}" and "${var.stack}"

Nearly all of our TF code relies heavily on the ${var.env} and ${var.stack} variables. We use stack to differentiate code bases separated by repos. For example, in FitnessKeeper/terraform-runkeeper we set the stack to rk, and in the Slack lunchbot Spine we set the stack to lunchbot.

  • "${var.stack}" is defined globally in variables.tf
  • ${var.env} is defined in the env-<enviroment>/<enviroment>.tfvars files
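
A minimal sketch of what this looks like in practice (the default value below is illustrative):

# variables.tf: stack is set once for the whole repo
variable "stack" {
  description = "Short name for this code base"
  default     = "lunchbot"
}

variable "env" {
  description = "Environment/tier, set per environment"
}

The matching env-development/development.tfvars would then simply contain env = "development", which lets the same code compose names like "${var.stack}-${var.env}" in every tier.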

Naming Conventions


Naming Variables and the use of Locals

Use a local when a value is derived from other variables or is repeated across several resources (for example, a composed name prefix).

As a best practice, try to keep the names of variables and resources to roughly 20 characters.
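
A minimal illustration (the name_prefix local is hypothetical, not an established convention in these repos):

locals {
  # Compose the common prefix once and reference it everywhere.
  name_prefix = "${var.stack}-${var.env}"
}

resource "aws_alb" "consul_cluster" {
  name = "tf-consul-${local.name_prefix}"
}

This keeps individual names short while still keeping them unique per stack and environment.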


Naming Datasources

Place data sources at the top of main.tf when possible, or at the top of a resource-specific .tf file.

For example, when calling a data source specific to Consul in consul.tf, put that data source at the top of that file. If the data source is used by multiple resources, place it in main.tf.
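
For instance (the Route 53 zone name is hypothetical):

# consul.tf: used only by the Consul resources defined in this file
data "aws_route53_zone" "consul" {
  name = "example.com."
}

# main.tf: shared by multiple resources, so it lives at the top of main.tf
data "aws_region" "current" {}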


Naming Terraform Resources

Terraform resource names should use underscores when declaring a new resource. Please note, this is somewhat different from the name argument on specific resources, which needs to be unique and human-readable.

In the below example, note that the resource name consul_cluster uses underscores, whereas the name argument uses a different convention. See below for the convention used on argument names.

resource "aws_alb" "consul_cluster" {
  name            = "tf-consul-cluster-${var.env}"
}

Naming arguments

This is for resources that have a name argument that can be passed. (For example, AWS ALB). These values need to be unique and human readable.

Where possible, use hyphens and camelCase combined with unique variables for an argument like name. We do this because we may need to create multiple resources under a single AWS account. The below example allows us to re-use the same block of code in dev, stage, and prod:

resource "aws_alb" "consul_cluster" {
  name            = "tf-consul-cluster-${var.env}"
}

Using Modules

We use, and publish, modules heavily. Wherever possible we should attempt to encapsulate and version code that is shared across multiple repos in modules.

If creating a new module, see https://www.terraform.io/docs/registry/modules/publish.html regarding naming of the module:

  • The repository name must be terraform-PROVIDER-NAME where:
    • PROVIDER is the primary provider to associate with the module
    • NAME is a unique name for the module
  • The name may contain hyphens.
  • Example: terraform-aws-consul or terraform-google-vault

Modules should be broken out into separate repositories, versioned semantically, and the version declared explicitly for each environment.

For example, a module declaration in env-development/ecs.tf might look like this:

module "infra-svc-pub" {
  source                      = "github.com/terraform-community-modules/tf_aws_ecs?ref=v5.1.0"
  ami                         = "${var.infra_svc_pub_ami}"
  name                        = "${var.stack}-${var.env}-infra-svc-pub"
  servers                     = "${var.cluster_size_infra_svc_pub}"
  instance_type               = "${var.instance_type_infra_svc_pub}"
  docker_storage_size         = "${var.docker_storage_size_infra_svc}"
  subnet_id                   = "${module.vpc.public_subnets}"
  allowed_cidr_blocks         = "${concat("${var.public_cidrs}", "${var.private_cidrs}")}"
  vpc_id                      = "${module.vpc.vpc_id}"
  key_name                    = "${var.aws_key_name}"
  dockerhub_token             = "${var.rk_devops_dockerhub_token}"
  dockerhub_email             = "${var.rk_devops_dockerhub_email}"
  additional_user_data_script = "${data.template_file.ecs_consul_agent_json.rendered}"
  region                      = "${data.aws_region.current.name}"
  iam_path                    = "/tf/${data.aws_region.current.name}/"
}

This way, different environments/tiers can make use of different versions of the same module without affecting each other or other declarations of the same module within any given environment.

Note, when developing on a submodule:

  • Use a feature branch of the submodule
  • Use the module source of ref=<feature branch name> in the dev environment for testing
  • Make sure to use safe characters (i.e. use _ rather than / notation), because the <feature branch name> will appear in the Terraform module source URL

This way, the versioned release process of the submodule is not blocked by ongoing development if another change needs to be pushed through the submodule. When development is done, release a new semantic version of the submodule and use that new version number in dev again.
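
For example, only the source argument changes between testing and release (the branch name and version below are hypothetical):

# env-development/ecs.tf while testing a feature branch of the submodule
source = "github.com/terraform-community-modules/tf_aws_ecs?ref=fix_alb_draining"

# env-development/ecs.tf once the change ships in a tagged release
source = "github.com/terraform-community-modules/tf_aws_ecs?ref=v5.2.0"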


Remote State

We use the Terraform S3 remote state backend for all of our repos. The Terraform configuration for state is initialized via the repo init scripts.
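
A minimal sketch of the kind of backend block those scripts produce (the bucket, key, and region are hypothetical):

terraform {
  backend "s3" {
    bucket = "example-terraform-state"
    key    = "terraform-myservice/env-development/terraform.tfstate"
    region = "us-east-1"
  }
}

Each environment/tier gets its own key, which is what gives each tier the distinct state file described above.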


Naming of GitHub Repos

GitHub repos should be named terraform-<${var.stack}>.

For example:

  • terraform-runkeeper
  • terraform-lunchbot
  • terraform-smartling
  • terraform-aciscs-services

Modules should be named following the Hashicorp guidelines of terraform-<provider>-<topic>.

For example:

  • terraform-aws-consul
  • terraform-aws-vault

Handling Secrets

Secrets should never be checked into the repositories. If for some reason a secret is required in a Terraform Spine or Spike, it should be protected by strong encryption; for example, by using Amazon KMS, checking the encrypted string into the repo, and then using the KMS data source to decrypt the value of the secret.
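
One way to do that is with the aws_kms_secrets data source (a minimal sketch; the secret name and ciphertext are placeholders):

data "aws_kms_secrets" "app" {
  secret {
    name    = "consul_htpasswd"
    # Base64-encoded KMS ciphertext, safe to commit to the repo.
    payload = "AQICAHh...placeholder-ciphertext..."
  }
}

The decrypted value is then referenced as "${data.aws_kms_secrets.app.plaintext["consul_htpasswd"]}".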

Ideally, secrets should be pulled from Vault and either injected into the environment, or written to a _*.auto.tfvars variables file that can be used during the Terraform run. We have helper functions in direnv (more info below) that are used to query Vault and populate variable files.

For example, to query Vault in order to populate a basic auth password for use by Consul, you would put this code into your env-development/.envrc file:

if get_vault_kv "secret/consul_htpasswd"; then
  echo "consul_htpasswd = \"${VAULT_KV}\"" > _consul_htpasswd.auto.tfvars
fi

Then each time you cd into the env-development directory, _consul_htpasswd.auto.tfvars is automatically populated with the current password from Vault.
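
For Terraform to pick the value up, the variable still has to be declared in the Spike's code, for example (a minimal sketch):

# variables.tf: no default, so the value comes from _consul_htpasswd.auto.tfvars
variable "consul_htpasswd" {
  description = "Basic auth password for Consul, populated from Vault via direnv"
}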


Direnv and Dot Env Files

When working locally, we use direnv to manage our local environment variables by sourcing directory-specific .env files. Things to note:

  • Variable files must match the _*.auto.tfvars pattern
  • Direnv configuration files are hidden files named .envrc and are checked into the repository. As such they should never contain any secrets.
  • Both .env files and _*.auto.tfvars files may contain secrets, and as such should not be checked into the repository.

Install the following via Homebrew:

  • Direnv: brew install direnv
  • Vault: brew install vault
  • Consul: brew install consul

Data Source and Shared State

Wherever possible we attempt to use Terraform data sources for AWS. Using data sources allows for a looser coupling of the code bases, whereas reading the remote state of the Spine repo from a Spike adds additional considerations.

For example, if a Spike is deployed using Terraform version 0.9.0 and the Spine is deployed using 0.9.3, then as long as the Spike uses only native TF data sources, the particular version of Terraform used by the Spine isn't a consideration, since the data sources query the AWS APIs directly.

However, data sources often do not expose as much information back to Terraform as reading the remote state does. Both options are valid, but come with tradeoffs.
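
Side by side, the two approaches look roughly like this (a sketch; the VPC Name tag and state bucket/key are hypothetical):

# Option 1: a native data source queries AWS directly and is independent of
# the Spine's Terraform version or state layout.
data "aws_vpc" "spine" {
  tags {
    Name = "${var.stack}-${var.env}"
  }
}

# Option 2: remote state exposes whatever outputs the Spine declares, but
# couples the Spike to the Spine's state location.
data "terraform_remote_state" "spine" {
  backend = "s3"

  config {
    bucket = "example-terraform-state"
    key    = "spine/terraform.tfstate"
    region = "us-east-1"
  }
}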


HCL and Terraform Style Guides


Some notes/thoughts on items to review and include in the doc:


Merge and Deployment Steps


We use CircleCI to push infrastructure changes. Changes are deployed whenever a matching tag is pushed to the repo. We use semantic versioning to version and release the master branch of our repos, and a number of tag patterns to trigger Terraform to apply changes. Whenever possible, work out of a feature branch.


To Merge into master

  • Create a Pull Request, e.g. against master. Click Pull request.
  • Compare master to your branch. Enter a good comment.
  • Add reviewer(s) for code review.
  • Approve pull request.
    • Merge Pull Request drop down - Pick Squash and Merge.
  • Confirm squash merge.
  • Delete branch.

Deploying Changes

To deploy changes to development, staging, or production you can push a tag or use the GitHub releases feature to deploy a release. The following tag styles are supported:

  • ^deploy-dev-.+ - deploys changes from tag into development, typically done from a feature branch
  • ^deploy-staging-.+ - deploys changes from tag into staging, typically done from a feature branch
  • ^deploy-hotfix-.+ - deploys changes from a feature branch into production - for use when changes need to be deployed but are not yet merged to master
  • v[0-9]+(\.[0-9]+)* - deploys changes from tag into production, typically done from master

A few example tagged releases are:

deploy-dev-tfhartmann-1
deploy-dev-tfhartmann-2
deploy-staging-tfhartmann-1
deploy-staging-tfhartmann-2
v0.0.1
v0.0.2

Merging to a release branch:

Our older pattern of deployment is being deprecated. We now use the master branch as our acceptance branch, and branches matching the *-release naming convention as release branches for infrastructure.

We have a two-phased process to add changes to our Spine and Spikes. First, changes are merged into master with a "Squash and Merge", i.e. a rebase that squashes commits. Second, we open a PR and perform a merge commit into our release branches.

  • Create a Pull Request
  • Base: production-release, compare: master.
  • Merge Pull Request drop down - pick Create a merge commit.
  • Confirm merge.
  • Check CircleCI for results.

Sample CircleCI configuration file here