Terraform Style Guide
Project Structure, Documentation and our opinionated approach on how to Terraform
- Project Structure
- TLDR
- Overview
- Basic Usage
- Conventions for Variable Names and the use of locals
- Data Sources
- Resource Names
- Argument Names
- Use of modules
- Remote State
- Naming of GitHub repos
- Handling Secrets
- Direnv and dot env files
- HCL and Terraform Style Guides
- GitHub Pull Request and Deployment Steps
- We used Charity Majors' model for a control repo as our starting point
- We are trying to adhere to the 12-factor app methodology
- All environment-specific .tf files and variables go in the `env-<environment>` directories
- We use the "base" env fairly sparingly
Spines and Spikes
- A Spine is a "parent" repo that contains the foundation for application configuration, including Hosted Zones and VPC
- A Spike is the configuration for an application deployment that uses the resources provided in the Spine
And yes... "Spine" and "Spike" were inspired by @tfhartmann watching far too much children's television recently.
We designed our Terraform patterns around the "control repo" pattern described by Charity Majors, but have modified it slightly to enable collaborators to work on services and applications that may use the core infrastructure but don't necessarily need to be coded alongside it.
For example, to deploy or manage a particular application you may need to use a particular set of networks and resources within a given VPC, but don't necessarily need to manage the networks. To address this, we came up with the concept of Spines and Spikes. A Spine is where resources such as VPC, DNS, Public Networks, Private Networks, and core services live. Spikes are complementary repos that use or reference the Spine's infrastructure but can deploy and redeploy the application independently of the Spine. This is accomplished through the use of Terraform data sources, and the magic of Terraform shared state.
We are following a pretty standard multi-tier pattern where we have an infrastructure repo with three tiers: dev, stage, and production; e.g. FitnessKeeper/terraform-runkeeper. We use that control repo to build VPC, ECS Clusters, DNS Zones, and other resources that can be presented as a platform for use by services.
Services are created using atomic control repos using the FitnessKeeper/terraform-reference repo as a skeleton. We have distinct state files for each of the tiers. This way, we can make changes to the state of a service living atop our infrastructure without having to push changes to the underlying resources.
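Based on the description above, a service control repo cloned from terraform-reference ends up with roughly this layout (the exact file names beyond the `env-*` tiers are illustrative):

```
terraform-<service>/
├── env-development/   # dev tier, with its own distinct state file
├── env-staging/       # stage tier
├── env-production/    # production tier
└── variables.tf       # globally defined variables such as stack
```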
There are two ways of getting started with the reference repo:
- Clone the repo using: `git clone https://github.com/FitnessKeeper/terraform-reference`
- Clone or move the repo into an existing code repo under a `terraform/` directory in the root of the project. There are a couple of ways to do this:
  - If starting a new project:
    - Download the `.zip` file from GitHub into the root of your new project/git repo
    - Unzip the file: `unzip terraform-reference-master.zip`
    - Rename the directory: `mv terraform-reference-master terraform`
    - Run `git add terraform` to add it to the repo, and follow the instructions below
  - If migrating a project from an older pattern (for example, if you have an `app` repo and a `terraform-app` repo where your Terraform code lives and you would like to consolidate them):
    - `cd $HOME/terraform-project`
    - `mkdir terraform`
    - `mv !(terraform) terraform` (make sure you've moved all the hidden files too)
    - `git commit -a -m "Preparing old project for move"`
    - Move to your new project, i.e. the repo you want to move the Terraform code into (the `app` repo):
      - `cd $HOME/app-project`
      - `git remote add temp $HOME/terraform-project`
      - `git fetch temp`
      - `git merge --allow-unrelated-histories temp/master`
      - `git remote rm temp`
- Edit `.env` in the root of the repo; in particular, make sure you add a `TF_PROJECT_NAME` environment variable
- Initialize variables.tf:
  - This only needs to be done once.
  - When the repo is created, run `./init-variables.tf.sh`
- Remove the old origin: `git remote rm origin`
- Add your new repo: `git remote add origin https://github.com/FitnessKeeper/terraform-reference.git`
- Commit your changes and push: `git push -u origin master`
- Edit variables.tf to reflect your new service
- `cd` into the base dir for the env you want to work on: `cd terraform-<service>/env-development/`
- Run `./init.sh` to initialize your environment
  - Once you have run the init.sh script, make sure to delete init.sh from your repo.
  - This step only needs to be run once.
- Run `terraform plan` to manage all the things!
Charity described the base env as:
“Base” has a few resources that don’t correspond to any environment — s3 buckets, certain IAM roles and policies, the root route53 zone
In our environment, we haven't yet found uses for the base env. By and large it exists for completeness.
Nearly all of our TF code relies heavily on both the `${var.env}` and `${var.stack}` variables. We use `stack` to differentiate different code bases separated by repos. For example, in FitnessKeeper/terraform-runkeeper we set the `stack` to `rk`, and in the Slack lunchbot Spine we set the stack to `lunchbot`.
- `${var.stack}` is defined globally in variables.tf
- `${var.env}` is defined in the `env-<environment>/<environment>.tfvars` files
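A minimal sketch of how these two variables are wired up, with illustrative values:

```hcl
# variables.tf - stack is defined globally for the whole repo
variable "stack" {
  default = "rk" # illustrative; each repo sets its own stack
}

# env is declared globally but gets its value per tier, e.g.
# env-development/development.tfvars would contain: env = "development"
variable "env" {}
```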
Use a local when:
- In the "normal cases" described in the docs (https://www.terraform.io/docs/configuration/locals.html), such as when calling a function
- When referencing a data source and the attribute requires some level of indexing into an array or hash
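A sketch of the data-source indexing case; the data source and local name here are just illustrative:

```hcl
# Data source whose attribute is a list
data "aws_availability_zones" "available" {}

locals {
  # Index into the returned list once, so resources can reference
  # local.primary_az instead of repeating the index lookup everywhere.
  primary_az = "${data.aws_availability_zones.available.names[0]}"
}
```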
As a best practice, try to keep the names of variables and resources to roughly 20 characters.
Place data sources at the top of main.tf when possible, or at the top of a resource-specific .tf file.
For example, when calling a data source specific to Consul in consul.tf, you should put that data source at the top of that file. If the data source gets used by multiple resources, place it in main.tf.
Terraform resource names should use underscores when declaring a new resource. Please note, this is somewhat different from the name argument on specific resources, which need to be unique and human readable.
In the below example, note that the resource name consul_cluster uses underscores, whereas the name argument uses a different convention. See below for the convention used on argument names.
```hcl
resource "aws_alb" "consul_cluster" {
  name = "tf-consul-cluster-${var.env}"
}
```
This is for resources that have a name argument that can be passed (for example, AWS ALB). These values need to be unique and human readable.
Where possible, use hyphens and camelCase combined with unique variables for an argument like name. We do this because we may need to create multiple resources under a single AWS account. The below example allows us to re-use the same block of code in dev, stage, and prod:
```hcl
resource "aws_alb" "consul_cluster" {
  name = "tf-consul-cluster-${var.env}"
}
```
We use, and publish, modules heavily. Wherever possible we should attempt to encapsulate and version code that is shared across multiple repos in modules.
If creating a new module, see https://www.terraform.io/docs/registry/modules/publish.html regarding naming of the module:
- The repository name must be terraform-PROVIDER-NAME where:
- PROVIDER is the primary provider to associate with the module
- NAME is a unique name for the module
- The name may contain hyphens.
- Example: terraform-aws-consul or terraform-google-vault
Modules should be broken out into separate repositories, versioned semantically, and the version declared explicitly for each environment.
For example, a module declaration in `env-development/ecs.tf` might look like this:
```hcl
module "infra-svc-pub" {
  source                      = "github.com/terraform-community-modules/tf_aws_ecs?ref=v5.1.0"
  ami                         = "${var.infra_svc_pub_ami}"
  name                        = "${var.stack}-${var.env}-infra-svc-pub"
  servers                     = "${var.cluster_size_infra_svc_pub}"
  instance_type               = "${var.instance_type_infra_svc_pub}"
  docker_storage_size         = "${var.docker_storage_size_infra_svc}"
  subnet_id                   = "${module.vpc.public_subnets}"
  allowed_cidr_blocks         = "${concat("${var.public_cidrs}", "${var.private_cidrs}")}"
  vpc_id                      = "${module.vpc.vpc_id}"
  key_name                    = "${var.aws_key_name}"
  dockerhub_token             = "${var.rk_devops_dockerhub_token}"
  dockerhub_email             = "${var.rk_devops_dockerhub_email}"
  additional_user_data_script = "${data.template_file.ecs_consul_agent_json.rendered}"
  region                      = "${data.aws_region.current.name}"
  iam_path                    = "/tf/${data.aws_region.current.name}/"
}
```
This way, different environments/tiers can make use of different versions of the same module without affecting each other or other declarations of the same module within any given environment.
Note, when developing on a submodule:
- Use a feature branch of the submodule
- Use a module source of `ref=<feature branch name>` in the dev environment for testing
- Make sure to use safe characters - i.e., use `_` not `/` notation - because the `<feature branch name>` will be in the Terraform module source URL
This way the versioned release process of the submodule is not blocked by the ongoing development if another change needs to be pushed through the submodule. When development is done, release a new semantic version of the submodule and use that new version number in dev again.
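For instance, the dev environment's declaration of the module above might temporarily point at a feature branch (the branch name here is hypothetical):

```hcl
module "infra-svc-pub" {
  # Branch name uses _ rather than /, since it becomes part of the source URL
  source = "github.com/terraform-community-modules/tf_aws_ecs?ref=my_feature_branch"
  # ... remaining arguments unchanged from the versioned declaration ...
}
```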
We use the Terraform S3 remote state backend for all of our repos. The Terraform configuration for state is initialized via the repo init scripts.
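A sketch of what an S3 backend configuration looks like; the bucket, key, and region values here are placeholders, since the repo init scripts generate the real ones:

```hcl
terraform {
  backend "s3" {
    bucket = "example-terraform-state"
    key    = "env-development/terraform.tfstate"
    region = "us-east-1"
  }
}
```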
GitHub repos should be named `terraform-<${var.stack}>`.
For example:
- terraform-runkeeper
- terraform-lunchbot
- terraform-smartling
- terraform-aciscs-services
Modules should be named following the HashiCorp guidelines of `terraform-<provider>-<topic>`.
For example:
- terraform-aws-consul
- terraform-aws-vault
Secrets should never be checked into repositories. If for some reason a secret is required in a Terraform Spine or Spike, it should be protected by strong encryption - for example, by using Amazon KMS, checking the encrypted string into the repo, and then using the KMS data source to decrypt the value of the secret.
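A sketch of the KMS approach using the `aws_kms_secrets` data source from the AWS provider (the secret name is illustrative and the ciphertext is a placeholder, not a real encrypted value):

```hcl
data "aws_kms_secrets" "example" {
  secret {
    name    = "db_password"
    # Base64 ciphertext produced by `aws kms encrypt`; placeholder value
    payload = "AQECAHg...base64-ciphertext..."
  }
}

# The decrypted value is then available as:
#   "${data.aws_kms_secrets.example.plaintext["db_password"]}"
```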
Ideally, secrets should be pulled from Vault and either injected into the environment, or written to a `_*.auto.tfvars` variables file that can be used during the Terraform run. We have helper functions in `direnv` (more info below) that are used to query Vault and populate variable files.
For example, to query Vault in order to populate a basic auth password for use by Consul, you would put this code into your `env-development/.envrc` file:
```shell
if get_vault_kv "secret/consul_htpasswd"; then
  echo "consul_htpasswd = \"${VAULT_KV}\"" > _consul_htpasswd.auto.tfvars
fi
```
Then each time you `cd` into the `env-development` directory, the `_consul_htpasswd.auto.tfvars` file is auto-populated with the current password from Vault.
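The generated file pairs with an ordinary variable declaration in the repo; a sketch (the description text is ours):

```hcl
# variables.tf - declared with no default, so the value must come from the
# auto-generated _consul_htpasswd.auto.tfvars file (or the environment)
variable "consul_htpasswd" {
  description = "Basic auth password for Consul, populated from Vault via direnv"
}
```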
When working locally, we use direnv to manage our local environment variables by sourcing in directory-specific `.env` files. Things to note:
- Variable files must match the `_*.auto.tfvars` pattern
- Direnv configuration files are hidden files named `.envrc` and are checked into the repository. As such they should never contain any secrets.
- Both `.env` files and `_*.auto.tfvars` files may contain secrets, and as such should not be checked into the repository.
Install the following via Homebrew:
- Direnv: `brew install direnv`
- Vault: `brew install vault`
- Consul: `brew install consul`
Wherever possible we attempt to use Terraform data sources for AWS. Using data sources allows for a looser coupling of the code bases, whereas reading the remote state of the Spine repo from a Spike adds additional considerations.
For example, if a Spike is deployed using Terraform version 0.9.0 and the Spine is deployed using 0.9.3, as long as the Spike uses only native TF data sources, the particular version of Terraform used by the Spine isn't a consideration, since the data sources query the AWS APIs directly.
However, the data sources often do not expose as much information back to Terraform as reading the remote state does. Both options are valid, but come with tradeoffs.
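A sketch of the two approaches side by side; the resource tags, bucket, and key values are illustrative:

```hcl
# Option 1: native AWS data source. Queries the AWS API directly, so the
# Spine's Terraform version is irrelevant to the Spike.
data "aws_vpc" "spine" {
  tags {
    Name = "${var.stack}-${var.env}-vpc"
  }
}

# Option 2: remote state. Exposes whatever outputs the Spine declares,
# but couples the Spike to the Spine's state.
data "terraform_remote_state" "spine" {
  backend = "s3"

  config {
    bucket = "example-terraform-state"
    key    = "env-${var.env}/terraform.tfstate"
    region = "us-east-1"
  }
}
```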
Some notes/thoughts on items to review and include in the doc:
- HCL Style Guide: https://github.com/hashicorp/hcl/blob/master/README.md#syntax
- Other reference style guide: https://github.com/bsnape/terraform-style-guide
We use CircleCI to push infrastructure changes. Changes can be tagged and deployed whenever a tag is pushed into the repo. We use semantic versioning to version and release the master branch of our repos, and use a number of tags to trigger Terraform to apply changes. Whenever possible, work out of a feature branch.
- Create a Pull Request, e.g. on master. Click Pull request.
- Compare master to branch. Enter a good comment.
- Add reviewer(s) for code review.
- Approve pull request.
- Merge Pull Request drop down - Pick Squash and Merge.
- Confirm squash merge.
- Delete branch.
To deploy changes to development, staging, or production you can push a tag or use the GitHub releases feature to deploy a release. The following tag styles are supported:
- `^deploy-dev-.+` - deploys changes from the tag into development, typically done from a feature branch
- `^deploy-staging-.+` - deploys changes from the tag into staging, typically done from a feature branch
- `^deploy-hotfix-.+` - deploys changes from a feature branch into production, for use when changes need to be deployed but are not yet merged to master
- `v[0-9]+(\.[0-9]+)*` - deploys changes from the tag into production, typically done from master
A few example tagged releases are:
deploy-dev-tfhartmann-1
deploy-dev-tfhartmann-2
deploy-staging-tfhartmann-1
deploy-staging-tfhartmann-2
v0.0.1
v0.0.2
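You can check a tag name against these patterns locally before pushing; a quick sketch using grep (we anchor the version pattern for an exact match, which the guide's pattern leaves implicit):

```shell
# Check a tag name against the CircleCI deploy patterns from this guide.
matches() { echo "$1" | grep -Eq "$2"; }

matches "deploy-dev-tfhartmann-1"     '^deploy-dev-.+'       && echo "deploys to development"
matches "deploy-staging-tfhartmann-1" '^deploy-staging-.+'   && echo "deploys to staging"
matches "v0.0.1"                      '^v[0-9]+(\.[0-9]+)*$' && echo "deploys to production"
matches "my-feature-branch"           '^deploy-dev-.+'       || echo "not a deploy tag"
```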
Our older pattern of deployment is being deprecated. We now use the master branch as our acceptance branch and use the convention `*-release` for release branches for infrastructure.
We have a two-phase process for adding changes to our Spine and Spikes. First, changes are merged into master with a "Squash and Merge", i.e. a rebase that squashes commits. Second, we open a PR and perform a Merge Commit into our release branches.
- Create a Pull Request
- Base production-release compare to master.
- Merge Pull Request drop down - Pick Create a merge commit.
- Confirm merge.
- Check CircleCI for results.
Sample CircleCI configuration file here