All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
- removed hard coded image in cinder
- added timeouts for traefik (fixes harbor)
- disable FRR in metallb since we only use L2 announcements
- if ncsa_security, disable snap
- if ncsa_security, limit ssh hosts to ncsa only
This release allows creating RKE2 or K3S clusters as well as RKE1 clusters. RKE1 is deprecated and will stop being supported on July 31st, 2025. If you want to use either RKE2 or K3S you will need to change the network_plugin.
In version 3.5.0 the default network for RKE1 will be set to canal; please make sure to either upgrade or explicitly set the network to weave. In version 4.0.0 RKE1 will be removed.
- can use RKE2 or K3S clusters by setting kubernetes_version (leave blank to create RKE1 cluster)
- can specify the key to use for the cluster, and not create a new key for each cluster (openstack_ssh_key)
- renamed rke1 module to cluster module; until version 4.0.0 the rke1 module will be pushed alongside the cluster module.
- added commands to clean up default chrony sources
- removed rke2 module, this is now part of cluster module
- use curl https://ncsa.illinois.edu/ to see if network is alive
- healthmonitor/longhorn are now disabled by default
- missing secret/storageclass additional helm charts for manila
- ability to enable/disable permissions fix for acme
- can specify the region name when connecting to openstack
- added manila storage class
This removes the old variables for creating machines that were deprecated, and removes references to centos.
- removed all deprecated code, clusters are defined in cluster.json
- ability to set network. Default is weave to be compatible with previous version but this should be changed. Weave is EOL 12/31/2024
- canal (rancher default)
- calico
- flannel
- weave (deprecated)
- none
- ubuntu is an alias for ubuntu22 as an os type in cluster. This is in preparation for ubuntu 24.04.
- removed centos image reference.
- changed default priority for redirect to https to be 9999
- move metallb specific pieces from raw to metallb application
- traefik doesn't use persistent volumes if acme is not enabled
- Use apt-get instead of apt in node provisioning
- Parameterize OpenStack region name
- added pod-security on namespaces to work correctly (needed for talos)
- metallb
- cinder
- longhorn
- rancher monitoring
- cinder plugins volume for cacert uses /tmp folder (/etc is readonly for talos)
- cert-manager can now be installed
- nodes are labeled with ncsa.role and ncsa.flavor from cluster.json
- added option install_docker to disable Docker installation when provisioning nodes
- added option taiga_enabled to disable Taiga actions in node provisioning
- added option ncsa_security to install ncsa specific security options
- disable IPv6
- configure chrony for NCSA
- configure rsyslog for NCSA
- add qualys account
- Change in traefik from redirectTo to be redirectTo.port
- forgot to update the template
- added rancher monitoring chart, this can now be managed through argocd.
CRITICAL: versions 2.2.0 through 2.3.1 could result in all nodes in the cluster being deleted when the userdata changes.
- don't remove nodes when there are changes to userdata, key, availability zone, block_device
- fix broken cinder, missing v1.28.0 images
- point argocd to git.ncsa.illinois.edu instead of github
- allow specifying which machines can ssh to the controlplanes
- removed nodeports in securitygroup
- use /32 instead of /16 for rancher ips
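To illustrate why narrowing the rancher rule from /16 to /32 matters, here is a minimal sketch using Python's standard ipaddress module. The address 192.0.2.10 is a hypothetical placeholder from the documentation range, not an actual rancher IP.

```python
import ipaddress

# Hypothetical address for illustration only (TEST-NET-1 documentation range).
rancher_ip = "192.0.2.10"

# A /16 rule admits every host that shares the first two octets.
wide = ipaddress.ip_network(f"{rancher_ip}/16", strict=False)

# A /32 rule admits exactly one host.
narrow = ipaddress.ip_network(f"{rancher_ip}/32")

print(wide, wide.num_addresses)
print(narrow, narrow.num_addresses)
```

The /16 network covers 65536 addresses while the /32 covers a single one, so the security group only admits the listed hosts.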
In the next major update all backwards compatible code will be removed. Please migrate to the cluster_machine setup and set controlplane_count and worker_count to 0.
- This adds backwards compatibility to the stack; you still need to define the cluster machines
This is a breaking change. You will need to update your terraform code to use this new version. This is an example of the variable cluster_machine:
[
{
"name": "controlplane",
"role": "controlplane",
"count": 3,
"flavor": "gp.medium",
"os": "centos"
},
{
"name": "worker",
"count": 3,
"flavor": "gp.large",
"disk": 40,
"os": "centos"
}
]
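As a rough sanity check before applying, a definition like the one above can be summarized per role. This is only a sketch: the helper count_by_role is hypothetical, and treating a missing role as worker and a missing count as 1 are assumptions made for this example, not documented behavior of the terraform module.

```python
import json
from collections import Counter

# The cluster_machine example from this changelog entry.
CLUSTER_MACHINE = """
[
  {"name": "controlplane", "role": "controlplane", "count": 3,
   "flavor": "gp.medium", "os": "centos"},
  {"name": "worker", "count": 3, "flavor": "gp.large",
   "disk": 40, "os": "centos"}
]
"""


def count_by_role(machines, default_role="worker"):
    """Sum the node counts per role.

    Assumption for this sketch: entries without "role" fall back to
    default_role, and entries without "count" count as one node.
    """
    totals = Counter()
    for machine in machines:
        role = machine.get("role", default_role)
        totals[role] += machine.get("count", 1)
    return dict(totals)


machines = json.loads(CLUSTER_MACHINE)
totals = count_by_role(machines)
print(totals)
```

Under those assumptions the example describes three controlplane nodes and three worker nodes.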
- Can use ubuntu for OS
- Can have different types of machines (e.g. gpu and no cpu)
- Removed all variables to specify machines used in cluster
- Ability to set iprange that can access the kubapi (port 6443)
- disabled argocd deployment of monitoring since it never synchronizes in argocd
- ignore changes to os/flavor of the nodes
- monitoring is now managed in argocd, this will make it such that the latest version will be installed/upgraded
- removed the argocd-master flag, now all clusters are assumed to be external, including where argocd runs
- compute nodes in rke1 now set availability zone (default nova); availability zone is ignored for existing nodes.
- traefik has many major versions released, right now it is set to *
- update openstack-cinder-csi from 1.* to 2.*
- allow multiple nfs servers to be specified in charts/apps
- if an app is disabled, don't populate values
This is the first version. This has evolved and now works on the current setup of radiant. This is split in 3 pieces
- argocd : template to install argocd on server, this is probably not needed and is only used to install a central argocd.
- charts : this contains two charts
- healthmonitor : a simple monitor to see if services are up
- apps : the infrastructure components for a cluster
- ingresscontroller : traefik (v1, v2)
- storageclasses : cinder, longhorn and nfs
- sealedsecrets
- metallb (load balancer)
- raw (raw kubernetes, also used by metallb)
- terraform : creates the cluster in openstack (radiant)
- rke1 : leverages rancher, argocd and openstack to create a fully working kubernetes cluster.