- 🌟 About the Project
- 🚀 Getting Started
- ⚒️ Advanced Configuration
- ♻️ Lifecycle
- 🧭 Roadmap
- 👋 Contributing
- ⚖️ License
- 💎 Acknowledgements
Hcloud Kubernetes is a Terraform module for deploying a fully declarative, managed Kubernetes cluster on Hetzner Cloud. It utilizes Talos, a secure, immutable, and minimal operating system specifically designed for Kubernetes, featuring a streamlined architecture with just 12 binaries and managed entirely through an API.
This project is committed to production-grade configuration and lifecycle management, ensuring all components are set up for high availability. It includes a curated selection of widely used and officially recognized Kubernetes components. If you encounter any issues, suboptimal settings, or missing elements, please file an issue to help us improve this project.
Tip
If you don't yet have a Hetzner account, feel free to use this Hetzner Cloud Referral Link to claim a €20 credit and support this project.
This setup includes several features for a seamless, best-practice Kubernetes deployment on Hetzner Cloud:
- Fully Declarative & Immutable: Utilize Talos Linux for a completely declarative and immutable Kubernetes setup on Hetzner Cloud.
- Cross-Architecture: Supports both AMD64 and ARM64 architectures, with integrated image upload to Hetzner Cloud.
- High Availability: Configured for production-grade high availability for all components, ensuring consistent and reliable system performance.
- Distributed Storage: Implements Longhorn for cloud-native block storage with snapshotting and automatic replica rebuilding.
- Autoscaling: Includes Cluster Autoscaler to dynamically adjust node counts based on workload demands, optimizing resource allocation.
- Plug-and-Play Kubernetes: Equipped with an optional Ingress Controller and Cert Manager, facilitating rapid workload deployment.
- Geo-Redundant Ingress: Supports high availability and massive scalability through geo-redundant Load Balancer pools.
- Dual-Stack Support: Employs Load Balancers with Proxy Protocol to efficiently route both IPv4 and IPv6 traffic to the Ingress Controller.
- Enhanced Security: Built with security as a priority, incorporating firewalls and encryption by default to protect your infrastructure.
- Automated Backups: Leverages Talos Backup with support for S3-compatible storage solutions like Hetzner's Object Storage.
This project includes commonly used and essential Kubernetes software, optimized for seamless integration with Hetzner Cloud.
- Talos Cloud Controller Manager (CCM): Manages node resources by updating them with cloud metadata, handling lifecycle deletions, and automatically approving node CSRs.
- Talos Backup: Automates etcd snapshots and S3 storage for backups in Talos Linux-based Kubernetes clusters.
- Hcloud Cloud Controller Manager (CCM): Manages the integration of Kubernetes clusters with Hetzner Cloud services, ensuring the update of node data, private network traffic control, and load balancer setup.
- Hcloud Container Storage Interface (CSI): Manages persistent storage in Kubernetes clusters using Hetzner Cloud Volumes, ensuring seamless storage integration and management.
- Longhorn: Delivers distributed block storage for Kubernetes, facilitating high availability and easy management of persistent volumes with features like snapshotting and automatic replica rebuilding.
- Cilium Container Network Interface (CNI): A high-performance CNI plugin that enhances and secures network connectivity and observability for container workloads through the use of eBPF technology in the Linux kernel.
- Ingress NGINX Controller: Provides a robust web routing and load balancing solution for Kubernetes, utilizing NGINX as a reverse proxy to manage traffic and enhance network performance.
- Cert Manager: Automates the management of certificates in Kubernetes, handling the issuance and renewal of certificates from various sources like Let's Encrypt, and ensures certificates are valid and updated.
- Cluster Autoscaler: Dynamically adjusts Kubernetes cluster size based on resource demands and node utilization, scaling nodes in or out to optimize cost and performance.
- Metrics Server: Collects and provides container resource metrics for Kubernetes, enabling features like autoscaling by interacting with Horizontal and Vertical Pod Autoscalers.
Talos Linux is a secure, minimal, and immutable OS for Kubernetes, removing SSH and shell access to reduce attack surfaces. Managed through a secure API with mTLS, Talos prevents configuration drift, enhancing both security and predictability. It follows NIST and CIS hardening standards, operates in memory, and is built to support modern, production-grade Kubernetes environments.
Firewall Protection: This module uses Hetzner Cloud Firewalls to manage external access to nodes. For internal pod-to-pod communication, support for Kubernetes Network Policies is provided through Cilium CNI.
Encryption in Transit: In this module, all pod network traffic is encrypted by default using WireGuard via Cilium CNI. It includes automatic key rotation and efficient in-kernel encryption, covering all traffic types.
Encryption at Rest: In this module, the STATE and EPHEMERAL partitions are encrypted by default with Talos Disk Encryption using LUKS2. Each node is secured with individual encryption keys derived from its unique `nodeID`.
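Because Cilium enforces Kubernetes Network Policies, standard `NetworkPolicy` resources can be applied once the cluster is running. Below is a minimal illustrative sketch; the namespace, names, and labels are hypothetical:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend   # hypothetical name
  namespace: sample-namespace       # hypothetical namespace
spec:
  # Select the pods this policy applies to
  podSelector:
    matchLabels:
      app: backend                  # hypothetical label
  policyTypes:
    - Ingress
  ingress:
    # Only allow traffic from frontend pods on the backend port
    - from:
        - podSelector:
            matchLabels:
              app: frontend         # hypothetical label
      ports:
        - protocol: TCP
          port: 8080
```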
- terraform to deploy Kubernetes on Hetzner Cloud
- packer to upload Talos Images to Hetzner Cloud
- talosctl to control the Talos Cluster
- kubectl to control Kubernetes (optional)
Important
Keep the CLI tools up to date. Ensure that `talosctl` matches your Talos version for compatibility, especially before a Talos upgrade.
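As a quick sketch, the installed tool versions can be checked before planning an upgrade (output formats vary between releases):

```bash
terraform version
packer version
# The client version should match the Talos version of the cluster
talosctl version --client
kubectl version --client
```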
Create a `kubernetes.tf` file with the module configuration:
module "kubernetes" {
source = "hcloud-k8s/kubernetes/hcloud"
version = "<version>"
cluster_name = "k8s"
hcloud_token = "<hcloud-token>"
# Export configs for Talos and Kube API access
cluster_kubeconfig_path = "kubeconfig"
cluster_talosconfig_path = "talosconfig"
# Optional Ingress Controller and Cert Manager
cert_manager_enabled = true
ingress_nginx_enabled = true
control_plane_nodepools = [
{ name = "control", type = "cax11", location = "fsn1", count = 3 }
]
worker_nodepools = [
{ name = "worker", type = "cax11", location = "fsn1", count = 3 }
]
}
Note
Each Control Plane node requires at least 4GB of memory and each Worker node at least 2GB. For High-Availability (HA), at least 3 Control Plane nodes and 3 Worker nodes are required.
Initialize Terraform and deploy the cluster:
```bash
terraform init --upgrade
terraform apply
```
Set config file locations:
```bash
export TALOSCONFIG=talosconfig
export KUBECONFIG=kubeconfig
```
Display cluster nodes:
```bash
talosctl get member
kubectl get nodes -o wide
```
Display all pods:
```bash
kubectl get pods -A
```
For more detailed information and examples, please refer to the module documentation.
To destroy the cluster, first disable the delete protection by setting:
```hcl
cluster_delete_protection = false
```
Apply this change before proceeding. Once the delete protection is disabled, you can tear down the cluster using the following Terraform commands:
```bash
terraform state rm 'module.kubernetes.talos_machine_configuration_apply.worker'
terraform state rm 'module.kubernetes.talos_machine_configuration_apply.control_plane'
terraform state rm 'module.kubernetes.talos_machine_secrets.this'
terraform destroy
```
Cluster Access
By default, the cluster is accessible over the public internet. The firewall is automatically configured to use the IPv4 address and /64 IPv6 CIDR of the machine running this module. To disable this automatic configuration, set the following variables to `false`:
```hcl
firewall_use_current_ipv4 = false
firewall_use_current_ipv6 = false
```
To manually specify source networks for the Talos API and Kube API, configure the `firewall_talos_api_source` and `firewall_kube_api_source` variables as follows:
```hcl
firewall_talos_api_source = [
  "1.2.3.0/32",
  "1:2:3::/64"
]
firewall_kube_api_source = [
  "1.2.3.0/32",
  "1:2:3::/64"
]
```
This allows explicit control over which networks can access your APIs, overriding the default behavior when set.
If your internal network is routed and accessible, you can directly access the cluster using internal IPs by setting:
cluster_access = "private"
For integrating Talos nodes with an internal network, configure a default route (`0.0.0.0/0`) in the Hetzner Network to point to your router or gateway. Additionally, add specific routes on the Talos nodes to encompass your entire network CIDR:
```hcl
talos_extra_routes = ["10.0.0.0/8"]

# Optionally, disable NAT for your globally routed CIDR
network_native_routing_cidr = "10.0.0.0/8"

# Optionally, use an existing Network
hcloud_network_id = 123456789
```
This setup ensures that the Talos nodes can route traffic appropriately across your internal network.
Optionally, a hostname can be configured to direct access to the Kubernetes API through a node IP, load balancer, or Virtual IP (VIP):
```hcl
kube_api_hostname = "kube-api.example.com"
```
For accessing the Kubernetes API from the public internet, choose one of the following options based on your needs:
- Use a Load Balancer (Recommended): Deploy a load balancer to manage API traffic, enhancing availability and load distribution.

  ```hcl
  kube_api_load_balancer_enabled = true
  ```

- Use a Virtual IP (Floating IP): A Floating IP is configured to automatically move between control plane nodes in case of an outage, ensuring continuous access to the Kubernetes API.

  ```hcl
  control_plane_public_vip_ipv4_enabled = true

  # Optionally, specify an existing Floating IP
  control_plane_public_vip_ipv4_id = 123456789
  ```
When accessing the Kubernetes API via an internal network, an internal Virtual IP (Alias IP) is utilized by default to route API requests within the network. This feature can be disabled with the following configuration:
```hcl
control_plane_private_vip_ipv4_enabled = false
```
To enhance internal availability, a load balancer can be used:
```hcl
kube_api_load_balancer_enabled = true
```
This setup ensures secure and flexible access to the Kubernetes API, accommodating different networking environments.
Cluster Autoscaler
The Cluster Autoscaler dynamically adjusts the number of nodes in a Kubernetes cluster based on demand, ensuring that there are enough nodes to run all pods and no unneeded nodes when the workload decreases.

Example `kubernetes.tf` snippet:
```hcl
# Configuration for cluster autoscaler node pools
cluster_autoscaler_nodepools = [
  {
    name     = "autoscaler"
    type     = "cax11"
    location = "fsn1"
    min      = 0
    max      = 6
    labels   = { "autoscaler-node" = "true" }
    taints   = [ "autoscaler-node=true:NoExecute" ]
  }
]
```
Optionally, pass additional Helm values to the cluster autoscaler configuration:
```hcl
cluster_autoscaler_helm_values = {
  extraArgs = {
    enforce-node-group-min-size   = true
    scale-down-delay-after-add    = "45m"
    scale-down-delay-after-delete = "4m"
    scale-down-unneeded-time      = "5m"
  }
}
```
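With the labels and taints defined above, only workloads that explicitly tolerate the `autoscaler-node` taint are scheduled onto autoscaled nodes, which is what triggers a scale-up. A minimal illustrative sketch of such a workload (names and image are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker            # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      # Pin the pods to the autoscaled node pool defined above
      nodeSelector:
        autoscaler-node: "true"
      # Tolerate the NoExecute taint set on autoscaler nodes
      tolerations:
        - key: "autoscaler-node"
          operator: "Equal"
          value: "true"
          effect: "NoExecute"
      containers:
        - name: worker
          image: busybox:1.36   # hypothetical image
          command: ["sleep", "infinity"]
```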
Egress Gateway
Cilium offers an Egress Gateway to ensure network compatibility with legacy systems and firewalls requiring fixed IPs. The use of Cilium Egress Gateway does not provide high availability and increases latency due to extra network hops and tunneling. Consider this configuration only as a last resort.
Example `kubernetes.tf` snippet:
```hcl
# Enable Cilium Egress Gateway
cilium_egress_gateway_enabled = true

# Define worker nodepools including an egress-specific node pool
worker_nodepools = [
  # ... (other node pool configurations)
  {
    name     = "egress"
    type     = "cax11"
    location = "fsn1"
    labels   = { "egress-node" = "true" }
    taints   = [ "egress-node=true:NoSchedule" ]
  }
]
```
Example Egress Gateway Policy:
```yaml
apiVersion: cilium.io/v2
kind: CiliumEgressGatewayPolicy
metadata:
  name: sample-egress-policy
spec:
  selectors:
    - podSelector:
        matchLabels:
          io.kubernetes.pod.namespace: sample-namespace
          app: sample-app
  destinationCIDRs:
    - "0.0.0.0/0"
  egressGateway:
    nodeSelector:
      matchLabels:
        egress-node: "true"
```
Please visit the Cilium documentation for more details.
Firewall Configuration
By default, a firewall is configured that can be extended with custom rules. If no egress rules are configured, outbound traffic remains unrestricted. However, inbound traffic is always restricted to mitigate the risk of exposing Talos nodes to the public internet, which could pose a serious security vulnerability.

Each rule is defined with the following properties:
- `description`: A brief description of the rule.
- `direction`: The direction of traffic (`in` for inbound, `out` for outbound).
- `source_ips`: A list of source IP addresses for inbound rules.
- `destination_ips`: A list of destination IP addresses for outbound rules.
- `protocol`: The protocol used (valid options: `tcp`, `udp`, `icmp`, `gre`, `esp`).
- `port`: The port number (required for `tcp` and `udp`, must not be specified for `icmp`, `gre`, and `esp`).
Example `kubernetes.tf` snippet:
```hcl
firewall_extra_rules = [
  {
    description = "Custom UDP Rule"
    direction   = "in"
    source_ips  = ["0.0.0.0/0", "::/0"]
    protocol    = "udp"
    port        = "12345"
  },
  {
    description = "Custom TCP Rule"
    direction   = "in"
    source_ips  = ["1.2.3.4", "1:2:3:4::"]
    protocol    = "tcp"
    port        = "8080-9000"
  },
  {
    description = "Allow ICMP"
    direction   = "in"
    source_ips  = ["0.0.0.0/0", "::/0"]
    protocol    = "icmp"
  }
]
```
For access to Talos and the Kubernetes API, please refer to the Cluster Access configuration section.
Ingress Load Balancer
The ingress controller uses a default load balancer service to manage external traffic. For geo-redundancy and high availability, `ingress_load_balancer_pools` can be configured as an alternative, replacing the default load balancer with the specified pool of load balancers. This setup distributes traffic from various locations across all targets in all regions.
Example `kubernetes.tf` configuration:
```hcl
ingress_load_balancer_pools = [
  {
    name     = "lb-nbg"
    location = "nbg1"
    type     = "lb11"
  },
  {
    name     = "lb-fsn"
    location = "fsn1"
    type     = "lb11"
  }
]
```
Configuring local traffic handling enhances network efficiency by reducing latency. Processing traffic closer to its source eliminates unnecessary routing delays, ensuring consistent performance for low-latency or region-sensitive applications.
Example `kubernetes.tf` configuration:
```hcl
ingress_nginx_kind                            = "DaemonSet"
ingress_nginx_service_external_traffic_policy = "Local"

ingress_load_balancer_pools = [
  {
    name          = "regional-lb-nbg"
    location      = "nbg1"
    local_traffic = true
  },
  {
    name          = "regional-lb-fsn"
    location      = "fsn1"
    local_traffic = true
  }
]
```
Key settings in this configuration:
- `local_traffic`: Limits load balancer targets to nodes in the same geographic location as the load balancer, reducing data travel distances and keeping traffic within the region.
- `ingress_nginx_service_external_traffic_policy` set to `Local`: Ensures external traffic is handled directly on the local node, avoiding extra network hops.
- `ingress_nginx_kind` set to `DaemonSet`: Deploys an ingress controller instance on every node, enabling requests to be handled locally for faster response times.
Topology-aware routing in ingress-nginx can optionally be enabled by setting the `ingress_nginx_topology_aware_routing` variable to `true`. This functionality routes traffic to the nearest upstream endpoints, enhancing efficiency for supported services. Note that this feature is only applicable to services that support topology-aware routing. For more information, refer to the Kubernetes documentation.
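Example `kubernetes.tf` snippet enabling this option:

```hcl
ingress_nginx_topology_aware_routing = true
```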
Network Segmentation
By default, this module calculates optimal subnets based on the provided network CIDR (`network_ipv4_cidr`). The network is segmented automatically as follows:

- 1st Quarter: Reserved for other uses such as classic VMs.
- 2nd Quarter:
  - 1st Half: Allocated for Node Subnets (`network_node_ipv4_cidr`)
  - 2nd Half: Allocated for Service IPs (`network_service_ipv4_cidr`)
- 3rd and 4th Quarters:
  - Full Span: Allocated for Pod Subnets (`network_pod_ipv4_cidr`)
Each Kubernetes node requires a `/24` subnet within `network_pod_ipv4_cidr`. To support this configuration, the optimal node subnet size (`network_node_ipv4_subnet_mask_size`) is calculated using the formula:

`32 - (24 - subnet_mask_size(network_pod_ipv4_cidr))`
With the default `10.0.0.0/16` network CIDR (`network_ipv4_cidr`), the following values are calculated:

- Node Subnet Size: `/25` (max. 128 nodes per subnet)
- Node Subnets: `10.0.64.0/19` (max. 64 subnets, each with a `/25`)
- Service IPs: `10.0.96.0/19` (max. 8192 services)
- Pod Subnet Size: `/24` (max. 256 pods per node)
- Pod Subnets: `10.0.128.0/17` (max. 128 nodes, each with a `/24`)
Please consider the following Hetzner Cloud limits:
- Up to 100 servers can be attached to a network.
- Up to 100 routes can be created per network.
- Up to 50 subnets can be created per network.
- A project can have up to 50 placement groups.
A `/16` network CIDR is sufficient to fully utilize Hetzner Cloud's scaling capabilities. It supports:

- Up to 100 nodes, each with its own `/24` Pod subnet route.
- Up to 50 nodepools, one nodepool per subnet, each with at least one placement group.
Here is a table with more example calculations:
| Network CIDR | Node Subnet Size | Node Subnets | Service IPs | Pod Subnets |
|---|---|---|---|---|
| 10.0.0.0/16 | /25 (128 IPs) | 10.0.64.0/19 (64) | 10.0.96.0/19 (8192) | 10.0.128.0/17 (128) |
| 10.0.0.0/17 | /26 (64 IPs) | 10.0.32.0/20 (64) | 10.0.48.0/20 (4096) | 10.0.64.0/18 (64) |
| 10.0.0.0/18 | /27 (32 IPs) | 10.0.16.0/21 (64) | 10.0.24.0/21 (2048) | 10.0.32.0/19 (32) |
| 10.0.0.0/19 | /28 (16 IPs) | 10.0.8.0/22 (64) | 10.0.12.0/22 (1024) | 10.0.16.0/20 (16) |
| 10.0.0.0/20 | /29 (8 IPs) | 10.0.4.0/23 (64) | 10.0.6.0/23 (512) | 10.0.8.0/21 (8) |
| 10.0.0.0/21 | /30 (4 IPs) | 10.0.2.0/24 (64) | 10.0.3.0/24 (256) | 10.0.4.0/22 (4) |
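For example, a smaller cluster can use one of the CIDRs from the table above; the node, service, and pod subnets are then derived automatically. A sketch, assuming the defaults otherwise:

```hcl
# A /17 network yields /26 node subnets, 10.0.48.0/20 for Service IPs,
# and 10.0.64.0/18 for Pod subnets (see table above)
network_ipv4_cidr = "10.0.0.0/17"
```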
Talos Backup
This module natively supports Hcloud Object Storage. Below is an example of how to configure backups with the MinIO Client (`mc`) and Hcloud Object Storage. While it is possible to create the bucket through the Hcloud Console, that method does not allow for the configuration of automatic retention policies.
Create an alias for the endpoint using the following command:
```bash
mc alias set <alias> \
  https://<location>.your-objectstorage.com \
  <access-key> <secret-key> \
  --api "s3v4" \
  --path "off"
```
Create a bucket with automatic retention policies to protect your backups:
```bash
mc mb --with-lock --region <location> <alias>/<bucket>
mc retention set GOVERNANCE 14d --default <alias>/<bucket>
```
Configure your `kubernetes.tf` file:
```hcl
talos_backup_s3_hcloud_url = "https://<bucket>.<location>.your-objectstorage.com"
talos_backup_s3_access_key = "<access-key>"
talos_backup_s3_secret_key = "<secret-key>"

# Optional: AGE X25519 Public Key for encryption
talos_backup_age_x25519_public_key = "<age-public-key>"

# Optional: Change schedule (cron syntax)
talos_backup_schedule = "0 * * * *"
```
For users of other object storage providers, configure `kubernetes.tf` as follows:
```hcl
talos_backup_s3_region   = "<region>"
talos_backup_s3_endpoint = "<endpoint>"
talos_backup_s3_bucket   = "<bucket>"
talos_backup_s3_prefix   = "<prefix>"

# Use path-style URLs (set true if required by your provider)
talos_backup_s3_path_style = true

# Access credentials
talos_backup_s3_access_key = "<access-key>"
talos_backup_s3_secret_key = "<secret-key>"

# Optional: AGE X25519 Public Key for encryption
talos_backup_age_x25519_public_key = "<age-public-key>"

# Optional: Change schedule (cron syntax)
talos_backup_schedule = "0 * * * *"
```
To recover from a snapshot, please refer to the Talos Disaster Recovery section in the Documentation.
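As a rough sketch, recovery generally means fetching the latest snapshot from the bucket and bootstrapping etcd from it on a control plane node; the alias, bucket, prefix, file name, and node IP below are placeholders, and the Talos documentation remains the authoritative procedure:

```bash
# Copy the most recent snapshot from the backup bucket (alias/bucket as configured above)
mc cp <alias>/<bucket>/<prefix>/<snapshot-file> ./db.snapshot

# If an AGE public key was configured, decrypt the snapshot first (e.g. with the age CLI)

# Bootstrap etcd from the snapshot on one control plane node
talosctl -n <control-plane-ip> bootstrap --recover-from=./db.snapshot
```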
The Talos Terraform Provider does not support declarative upgrades of Talos or Kubernetes versions. This module compensates for these limitations using `talosctl` to implement the required functionality. Any minor or major upgrade to Talos or Kubernetes will result in a major version change of this module. Please be aware that downgrades are typically neither supported nor tested.
Important
Before upgrading to the next major version of this module, ensure you are on the latest release of the current major version. Do not skip any major release upgrades.
| Hcloud K8s | K8s | Talos | Talos CCM | Hcloud CCM | Hcloud CSI | Longhorn | Cilium | Ingress NGINX | Cert Mgr. | Autoscaler |
|---|---|---|---|---|---|---|---|---|---|---|
| (2) | (1.32) | (1.9) | ? | ? | ? | ? | ? | ? | ? | ? |
| (1) | 1.31 | 1.8 | 1.8 | 1.21 | 2.10 | ? | (1.17) | (4.12) | 1.15 | 9.38 |
| 0 | 1.30 | 1.7 | 1.6 | 1.20 | 2.9 | 1.7.1 | 1.16 | 4.10.1 | 1.14 | 9.37 |
In this module, upgrades are conducted with care and conservatism. You will consistently receive the most tested and compatible releases of all components, avoiding the latest untested or incompatible releases that could disrupt your cluster.
Warning
Do not change any software versions in this project on your own. Each component is tailored to ensure compatibility with new Kubernetes releases. This project specifies versions that are supported and have been thoroughly tested to work together.
- Upgrade to Talos 1.8 and Kubernetes 1.31: Once all components have compatible versions, the upgrade can be performed.
- Integrate native IPv6 for pod traffic: Completion requires Hetzner's addition of IPv6 support to cloud networks, expected at the beginning of 2025 as announced at Hetzner Summit 2024.
Contributions are always welcome!
Distributed under the MIT License. See LICENSE for more information.
- Talos Linux for its impressively secure, immutable, and minimalistic Kubernetes distribution.
- Hetzner Cloud for offering excellent cloud infrastructure with robust Kubernetes integrations.
- Other projects like Kube-Hetzner and Terraform - Hcloud - Talos, where we’ve contributed and gained valuable insights into Kubernetes deployments on Hetzner.