This project covers all about K8s: the theory, how to deploy it, and so on.
- What Kubernetes provides
- Kubernetes architecture
- Master node (or Control plane)
- Kube API server
- etcd storage
- Scheduler
- Controller Manager
- Worker node
- Kubelet
- Kube Proxy
- Container Runtime
- Addons
- Master node (or Control plane)
- Let's go over the Kubernetes architecture once again
- In a Pod, can you have one or many containers?
- How will Pod 1 and Pod 6 (these Pods are on different nodes) interact?
- Deploy K8s using Minikube
K8s stands for Kubernetes.
Kubernetes is really not a replacement for the Docker engine. Kubernetes manages a cluster of Docker engines. And not only Docker: it can manage clusters of other container runtime environments like rkt (rocket).
(rkt, short for rocket, is an application container engine developed for modern production cloud-native environments. It features a pod-native approach, a pluggable execution environment, and a well-defined surface area that makes it ideal for integration with other systems. The core execution unit of rkt is the pod, a collection of one or more applications executing in a shared context (rkt's pods are synonymous with the pod concept in the K8s orchestration system). rkt allows users to apply different configurations (like isolation parameters) at both the pod level and the more granular per-application level. rkt's architecture means that each pod executes directly in the classic Unix process model (i.e., there is no central daemon), in a self-contained, isolated environment. rkt implements a modern, open, standard container format, the App Container (appc) spec, but can also execute other container images, like those created with Docker.)
- Service discovery and load balancing: you create a container, it gets automatically discovered by the load balancer, and it gets registered in the load balancer.
- Storage orchestration: K8s provides integration with lots of storage backends, like SAN, NAS, EBS volumes, and Ceph storage.
- Automated rollouts and rollbacks: easy to roll out a new image version, and also very easy to roll back if it's not working.
- Automatic bin packing: It's going to place your container on the right node where it gets the right resources based on the requirement.
- Self-healing: if a node goes down, K8s brings your containers back to life on a live node. Apart from that, your containers are also monitored. It works much like an Auto Scaling group: when an instance goes down, the Auto Scaling group launches a replacement. The self-healing capability is much faster than Auto Scaling.
- Secret and configuration management: you can manage configuration in the form of variables and volumes, and also secrets, which are base64-encoded values.
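To illustrate the secrets part, here is a minimal sketch of a Secret manifest; the name db-secret and the values are hypothetical, and the data fields hold base64-encoded strings (e.g., echo -n 'admin' | base64):

apiVersion: v1
kind: Secret
metadata:
  name: db-secret            # hypothetical name
type: Opaque
data:
  username: YWRtaW4=         # base64 of "admin"
  password: UEAkJHcwcmQ=     # base64 of "P@$$w0rd"

You would then mount it as a volume or inject it as environment variables in a Pod.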
Two main components: Master node and Worker node.
The master node is the one that manages the worker nodes. So you don't log into the worker nodes and run the containers; you tell the master node. You don't even log into the master node; you connect using some client. You give the master node the information about the containers you want to run (a .yaml file), and it takes action based on the requirement. There are four primary services: API server, Scheduler, Controller Manager, and etcd.
- It handles all the incoming and outgoing communication. When you want to send instructions to Kubernetes, the Kube API server receives them, and then it passes the information to other services like the Scheduler, etcd, and the Worker nodes.
- It is the component on the Master node that exposes the Kubernetes API. If you want, you can build your own tool that integrates with the Kubernetes API. There are many third-party tools available that integrate with your Kubernetes API, like monitoring agents, logging agents, and web dashboards.
- It is the front-end of the control plane and of the whole Kubernetes cluster.
- Admins connect to the Kube API server using the kubectl CLI.
- A web dashboard can be integrated with this API.
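For example, every kubectl call below goes through the Kube API server; this is just a quick sketch assuming you already have a working cluster and kubeconfig:

kubectl cluster-info          # prints the API server endpoint
kubectl get nodes             # the API server reads this state from etcd
kubectl get --raw /healthz    # queries the API server's health endpoint directly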
- Stores all the information about your Kubernetes cluster.
- The Kube API server stores and retrieves information from it.
- It holds all the runtime information.
- It should be backed up regularly, because if it fails, you lose the cluster's current data.
- Stores current state of everything in the cluster.
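A minimal backup sketch using etcdctl (v3 API); the endpoint and the certificate paths below are assumptions based on a typical Kubespray layout and may differ on your cluster:

ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  --cert=/etc/ssl/etcd/ssl/admin-master.pem \
  --key=/etc/ssl/etcd/ssl/admin-master-key.pem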
- The Scheduler schedules the container on the right node.
- It watches newly created Pods that have no node assigned, and selects a node for them to run on.
- Factors taken into account for scheduling decisions include (see the example manifest after this list):
- individual and collective resource requirements.
- hardware/software/policy constraints.
- affinity and anti-affinity specifications.
- data locality.
- inter-workload interference and deadlines.
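To make those factors concrete, here is a hedged sketch of a Pod spec that gives the Scheduler such hints; all names, labels, and values are made up for illustration:

apiVersion: v1
kind: Pod
metadata:
  name: scheduled-demo        # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.21
    resources:
      requests:               # resource requirements the Scheduler must satisfy
        cpu: "500m"
        memory: "256Mi"
  nodeSelector:               # hardware/software constraint: only nodes labeled disktype=ssd
    disktype: ssd
  affinity:
    podAntiAffinity:          # anti-affinity: avoid nodes already running app=demo Pods
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: demo
        topologyKey: kubernetes.io/hostname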
- Logically, each controller is a separate process.
- These controllers include:
- Node Controller: Responsible for noticing and responding when nodes go down.
- Replication Controller: Responsible for maintaining the correct number of Pods for every replication controller object in the system (see the sketch after this list).
- Endpoints Controller: Populates the Endpoints object (that is, joins Services & Pods).
- Service Account & Token Controllers: Manage authentication and authorization; they create default accounts and API access tokens for new namespaces.
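For instance, a ReplicationController object declares a desired replica count, and the controller loop keeps that many Pods alive; this is only a sketch with made-up names:

apiVersion: v1
kind: ReplicationController
metadata:
  name: web-rc                # hypothetical name
spec:
  replicas: 3                 # the controller keeps exactly 3 Pods running
  selector:
    app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.21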
The worker node is the one where the Docker engines are running. On a worker node, we have: Kubelet, Kube Proxy, and the Docker engine.
- An agent that runs on each node in the cluster. It makes sure that containers are running in a Pod. It listens to your Kubernetes master node's requests or commands.
- When the Scheduler decides that this Worker node is going to run the container, it assigns the responsibility to the Kubelet. The Kubelet then fetches your image and runs the container from it.
- Kube Proxy is a network proxy that runs on every node in the cluster.
- You can also set network rules, like security groups (e.g., rules that allow network communication to your Pods from inside or outside of your cluster). See the Service sketch below.
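Exposing a Pod outside the cluster usually goes through a Service; kube-proxy programs the forwarding rules that make this work. A hedged sketch, with hypothetical names and ports:

apiVersion: v1
kind: Service
metadata:
  name: web-svc               # hypothetical name
spec:
  type: NodePort              # reachable from outside via <NodeIP>:30080
  selector:
    app: web                  # routes traffic to Pods labeled app=web
  ports:
  - port: 80                  # Service port inside the cluster
    targetPort: 8080          # container port in the Pod
    nodePort: 30080           # static port kube-proxy opens on every node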
- Docker
- containerd
- cri-o, rktlet
- Kubernetes CRI (Container Runtime Interface)
These components are provided by third-party vendors who have some specialization in that area, like a better logging tool, a better monitoring tool, a web UI, or a DNS server.
- DNS
- webUI
- Container Resource Monitoring
- Cluster Level Logging
kubectl is the tool with which we connect to the Kubernetes master node. The master node has the API server, which enables the communication; the Scheduler, which decides on which node your container will run; the Controller Manager, which is responsible for monitoring the Worker nodes and your containers, and also for authentication and authorization; and etcd, which stores the current information. On the Worker node, you have the kubelet, the agent that does all the heavy lifting for the containers: it fetches the image, runs the containers, maps the volumes, and does all that stuff. Kube proxy is a network proxy: if you want to expose a Pod to the outside world, you can do it through kube proxy, and you can even set the network rules. And then there is the Docker engine, where your containers will be running; as you can see, containers are enclosed in a Pod. So what is a Pod?
- What is the relation between the Pod and the container? It's the same relation as a VM and the process running inside it: the VM provides all the resources to the process (RAM, network, CPU, storage, everything), and the process just uses them. Similarly, the Pod provides all the resources to a container. The container runs inside the Pod, so the container is like the process, and the Pod is like the VM. But there is no virtualization; it's isolation.
- Why does Kubernetes use Pods? Why not run containers directly? Because Kubernetes can use different container runtime environments like Docker, rkt (Rocket), or anything implementing the CRI. If you don't have the Pod, there is no abstraction. With the Pod, we have a standard set of commands and a standard set of configuration, and it doesn't matter what technology we are using behind the scenes. So if you're running a "process A" in the Pod, "process A" will be the container, which will be running on port 8080 (for example), and the Pod will give the IP address. You can access it by using the Pod IP and the port number of the container.
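A quick hedged illustration of that addressing model; the Pod name is made up, the image is a public sample app that happens to listen on 8080, and the IP you see will differ:

kubectl run process-a --image=gcr.io/google-samples/hello-app:1.0 --port=8080
kubectl get pod process-a -o wide     # shows the Pod IP, e.g. 10.233.64.5
curl http://<pod-ip>:8080/            # from inside the cluster: Pod IP + container port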
The Pod gives the resources to the containers. How many containers you run in a Pod really depends. Ideally, you see one container inside the Pod, named main; the other containers will be helper containers.
- In node 1, you have a Pod with one main container running inside it. One Pod - one container.
- In node 2, you have 3 containers in a Pod. The init container is a short-lived container: it starts, does some command executions, and then it is dead. Then the main container starts along with the sidecar container. If you have a sidecar container, its job is to help the main container (for example, streaming the logs; it could be a logging agent or a monitoring agent helping the main container). But at any given point of time, you should have only one main container running in the Pod; the other containers are helper containers (see the manifest sketch below). If you have MySQL and Tomcat, you are not going to run both in the same Pod; you'll run them in different Pods. -> Pods will be distributed across multiple Worker nodes.
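Here is a hedged sketch of such a Pod with an init container and a sidecar; the image names and commands are illustrative only:

apiVersion: v1
kind: Pod
metadata:
  name: main-with-helpers     # hypothetical name
spec:
  initContainers:
  - name: init                # short-lived: must finish before main starts
    image: busybox:1.35
    command: ["sh", "-c", "echo preparing... && sleep 2"]
  containers:
  - name: main                # the single main container in the Pod
    image: nginx:1.21
  - name: sidecar             # helper, e.g. a log-streaming agent
    image: busybox:1.35
    command: ["sh", "-c", "tail -f /dev/null"]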
On every node, you'll have a bridge0 (or a subnet, like a Local Area Network - LAN). bridge0 acts like a switch, so all the Pods running in the node will be able to communicate with each other. If you want to reach a Pod in a different node, bridge0 forwards the request to wg0, which acts like a router. wg0 routes it to the right node's router by looking at the IP address. Then the destination router receives the request and forwards it to the destination switch, and the switch sends it to the destination Pod. Anyway, every node will have a small private network, and all these private networks will be connected in one bigger network. This is the overlay network.
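Once you have a cluster (see the deployment section below), a hedged way to poke at this; the exact interface names depend on your CNI plugin and may differ from bridge0/wg0:

ip addr show                # on a node: look for the bridge and overlay interfaces
ip route                    # routes to other nodes' Pod subnets go via the overlay
kubectl get pods -o wide    # Pod IPs reveal which node's subnet each Pod is on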
- Install Git on your computer. (follow the link: https://git-scm.com/download/win)
- Open Windows PowerShell as an Administrator
- Clone the GitHub repository (follow the link: https://github.com/devopshydclub/vprofile-project):
cd ../..
cd Downloads
git clone https://github.com/devopshydclub/vprofile-project.git
- Requirement: 3 nodes (2 CPUs and 4GB RAM, or higher). This is up to you, but anyway, you should have at least 3 nodes, with 1 Master node and 2 Worker nodes. And you need another node to deploy K8s to the 3 nodes above.
- I'll use the CentOS 7.9 operating system for all 4 nodes; I rented these virtual machines from a cloud service provider. My nodes default to using the root user. Run this command first:
yum update -y
The nodes' information is shown below:
Deploy node: 1 vCPU and 1GB RAM - IP: 192.168.0.19
Target nodes:
Master node: 2vCPUs and 8GB RAM - IP: 192.168.0.48
Worker node 1: 2vCPUs and 8GB RAM - IP: 192.168.0.211
Worker node 2: 2vCPUs and 8GB RAM - IP: 192.168.0.54
- Enable SSH connections from the Deploy node to the others, and from the Master node to the Worker nodes.
- On all nodes, do the following (the Deploy node, for example):
Generate SSH key:
ssh-keygen
- From the Deploy node, copy the SSH key to the remaining nodes:
To Master node:
ssh-copy-id [email protected]
To Worker node 1:
ssh-copy-id [email protected]
To Worker node 2:
ssh-copy-id [email protected]
- From the Master node, copy the SSH key to the Worker nodes:
To Worker node 1:
ssh-copy-id [email protected]
To Worker node 2:
ssh-copy-id [email protected]
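To confirm passwordless SSH works before moving on, a quick check from the Deploy node (repeat with each target IP); it should print the remote hostname without asking for a password:

ssh [email protected] hostname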
Deploy
At the Deploy node, do the following:
- Git clone Kubespray, version 2.16:
yum install git -y
git clone https://github.com/kubernetes-sigs/kubespray.git --branch release-2.16
- Go to the folder:
cd kubespray/inventory/sample/
- Create and write the host.yaml file:
vi host.yaml
Content:
[all]
master ansible_host=192.168.0.48 ip=192.168.0.48
worker1 ansible_host=192.168.0.211 ip=192.168.0.211
worker2 ansible_host=192.168.0.54 ip=192.168.0.54

[kube_control_plane]
master

[etcd]
master

[kube_node]
worker1
worker2

[k8s_cluster:children]
kube_node
kube_control_plane
- Install Docker:
curl -fsSL https://get.docker.com/ | sh
systemctl start docker
- Run the Kubespray container with Docker:
docker run --rm -it --mount type=bind,source=/root/kubespray/inventory/sample/,dst=/kubespray/inventory quay.io/kubespray/kubespray:v2.16.0 bash
After running the above command, you are executing inside the container. Please note that the prompt will now look like "root@a3247643f8e2:/kubespray#".
Now, enable SSH connections from the container to the other nodes (except the Deploy node):
ssh-keygen
ssh-copy-id [email protected]
ssh-copy-id [email protected]
ssh-copy-id [email protected]
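Optionally, a hedged sanity check from inside the container that Ansible can reach every host before the long playbook run; each host should answer "pong":

ansible -i inventory/host.yaml all -m ping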
Run the final command:
ansible-playbook -i inventory/host.yaml cluster.yml
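The playbook can take quite a while. When it finishes, you can verify the cluster from the Master node; a hedged check, assuming Kubespray has set up kubectl and the admin kubeconfig there (otherwise point kubectl at /etc/kubernetes/admin.conf):

kubectl get nodes    # all 3 nodes should show STATUS Ready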