Building a K8s Cluster from 0 to 1 by Hand


1. Overview

K8s is an excellent cloud-native container orchestration system. Beyond learning how to use it and understanding its core concepts, abstractions, and key features, it is also worth mastering how to build a K8s cluster environment from 0 to 1. The community offers automated setup tools such as kubeadm, minikube, and kind, but these tools hide the runtime details of each component during setup. To really understand how the K8s components work and how they depend on each other, it is well worth learning how to build a K8s cluster by hand, from 0 to 1.

This article walks through building a K8s cluster environment from 0 to 1, step by step. Working through this process by hand gives us a much deeper grasp of how the core K8s components (etcd, kube-apiserver, kube-controller-manager, kube-scheduler, kubelet, kube-proxy, etc.) work and how they depend on each other.

(Figure: K8s architecture)

The setup process follows kubernetes-the-hard-way, which has been forked into this project.

2. Preparing the Machines (VMs)

  • master-1
  • master-2
  • master-3
  • worker-1
  • worker-2
  • lb

2.1 Creating the VMs in Bulk with Vagrant

# Create all machines from the Vagrantfile; provisioning takes about 10 minutes.
vagrant up

# Check the machine status
vagrant status

Current machine states:

master-1                  running (virtualbox)
master-2                  running (virtualbox)
master-3                  running (virtualbox)
lb                        running (virtualbox)
worker-1                  running (virtualbox)
worker-2                  running (virtualbox)

2.2 Logging in to the VMs

Option 1:
vagrant ssh master-1

Option 2:
ssh vagrant@192.168.56.11 -i .vagrant/machines/master-1/virtualbox/private_key

2.3 Accessing All Machines

Use master-1 as the administrative client and generate an ssh key:

ssh-keygen

cat .ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC/uhkQUvMUQV3y4SNav6L6TCjHjXy8Oz1y8SkCuJagxXuP+YnVngAfWLPcOEAPKM8+GcwLRgi8lZzKXqa720LnN8+8odsuJE0HSihT9rKsS/804BKN9ApXWkzbwdEJLBD4pjxiyh2mdTBwtptOkEGX28pLDKPMmhMxMgN4XKR5RB9cs5xkKreVMC0nxSXyRHmAmeyqU3rPzSpspg9tbw4q6LKSF4fPdVtJGDFJmz/+JNaEk75znVNFRhRh750PeIilVrsTsJ4vtP/QPxkd4KSz2hYzj1/DmfOZpLxJ8XKHgh5k7eYvrfhOwCu/n2Hvl+Eugslrh2FE+PB0CHyg0Ccx vagrant@master-1

Log in to each of the other VMs and append the key to the authorized_keys file:
echo "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC/uhkQUvMUQV3y4SNav6L6TCjHjXy8Oz1y8SkCuJagxXuP+YnVngAfWLPcOEAPKM8+GcwLRgi8lZzKXqa720LnN8+8odsuJE0HSihT9rKsS/804BKN9ApXWkzbwdEJLBD4pjxiyh2mdTBwtptOkEGX28pLDKPMmhMxMgN4XKR5RB9cs5xkKreVMC0nxSXyRHmAmeyqU3rPzSpspg9tbw4q6LKSF4fPdVtJGDFJmz/+JNaEk75znVNFRhRh750PeIilVrsTsJ4vtP/QPxkd4KSz2hYzj1/DmfOZpLxJ8XKHgh5k7eYvrfhOwCu/n2Hvl+Eugslrh2FE+PB0CHyg0Ccx vagrant@master-1" >> .ssh/authorized_keys

Verify the login:

ssh master-2

The first login requires confirmation (yes):

The authenticity of host 'master-2 (192.168.56.12)' can't be established.
ECDSA key fingerprint is SHA256:QS6cDbML1bWxCi37EpLBYjk2xm0AGQu9oIdNFz+/lBE.
Are you sure you want to continue connecting (yes/no)? yes

The login succeeds.
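To confirm in one go that passwordless ssh works from master-1 to every node you added the key to, a minimal sketch (host names as defined in the Vagrantfile) is:

for host in master-2 master-3 lb worker-1 worker-2; do
  ssh -o BatchMode=yes ${host} hostname
done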

3. Creating the TLS Certificates

3.1 Creating a Self-Signed CA

# Create private key for CA
openssl genrsa -out ca.key 2048

# Comment line starting with RANDFILE in /etc/ssl/openssl.cnf definition to avoid permission issues
sudo sed -i '0,/RANDFILE/{s/RANDFILE/\#&/}' /etc/ssl/openssl.cnf

# Create CSR using the private key
openssl req -new -key ca.key -subj "/CN=KUBERNETES-CA" -out ca.csr

# Self sign the csr using its own private key
openssl x509 -req -in ca.csr -signkey ca.key -CAcreateserial  -out ca.crt -days 1000
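Optionally, inspect the generated CA certificate to confirm its subject and validity period (a standard openssl check, not part of the original steps):

openssl x509 -in ca.crt -noout -subject -issuer -dates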

3.2 Creating the kubectl admin Certificate

# Generate private key for admin user
openssl genrsa -out admin.key 2048

# Generate CSR for admin user. Note the O (organization), which maps to the group system:masters.
openssl req -new -key admin.key -subj "/CN=admin/O=system:masters" -out admin.csr

# Sign certificate for admin user using CA servers private key
openssl x509 -req -in admin.csr -CA ca.crt -CAkey ca.key -CAcreateserial  -out admin.crt -days 1000

3.3 Creating the kube-controller-manager Certificate

openssl genrsa -out kube-controller-manager.key 2048

openssl req -new -key kube-controller-manager.key -subj "/CN=system:kube-controller-manager" -out kube-controller-manager.csr

openssl x509 -req -in kube-controller-manager.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out kube-controller-manager.crt -days 1000

3.4 Creating the kube-proxy Certificate

openssl genrsa -out kube-proxy.key 2048

openssl req -new -key kube-proxy.key -subj "/CN=system:kube-proxy" -out kube-proxy.csr

openssl x509 -req -in kube-proxy.csr -CA ca.crt -CAkey ca.key -CAcreateserial  -out kube-proxy.crt -days 1000

3.5 Creating the kube-scheduler Certificate

openssl genrsa -out kube-scheduler.key 2048

openssl req -new -key kube-scheduler.key -subj "/CN=system:kube-scheduler" -out kube-scheduler.csr

openssl x509 -req -in kube-scheduler.csr -CA ca.crt -CAkey ca.key -CAcreateserial  -out kube-scheduler.crt -days 1000

3.6 Creating the kube-apiserver Certificate

Create the CSR config openssl-apiserver.cnf:

cat > openssl-apiserver.cnf <<EOF
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster.local
IP.1 = 10.96.0.1
IP.2 = 192.168.56.11
IP.3 = 192.168.56.12
IP.4 = 192.168.56.13
IP.5 = 192.168.56.30
IP.6 = 127.0.0.1
EOF

Create the certificate:

openssl genrsa -out kube-apiserver.key 2048

openssl req -new -key kube-apiserver.key -subj "/CN=kube-apiserver" -out kube-apiserver.csr -config openssl-apiserver.cnf

openssl x509 -req -in kube-apiserver.csr -CA ca.crt -CAkey ca.key -CAcreateserial  -out kube-apiserver.crt -extensions v3_req -extfile openssl-apiserver.cnf -days 1000
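To double-check that the SAN entries from openssl-apiserver.cnf actually made it into the signed certificate (a quick optional check):

openssl x509 -in kube-apiserver.crt -noout -text | grep -A1 "Subject Alternative Name"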

3.7 Creating the etcd Certificate

cat > openssl-etcd.cnf <<EOF
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[alt_names]
IP.1 = 192.168.56.11
IP.2 = 192.168.56.12
IP.3 = 192.168.56.13
IP.4 = 127.0.0.1
EOF
openssl genrsa -out etcd-server.key 2048

openssl req -new -key etcd-server.key -subj "/CN=etcd-server" -out etcd-server.csr -config openssl-etcd.cnf

openssl x509 -req -in etcd-server.csr -CA ca.crt -CAkey ca.key -CAcreateserial  -out etcd-server.crt -extensions v3_req -extfile openssl-etcd.cnf -days 1000

3.8 Creating the service-account Certificate

openssl genrsa -out service-account.key 2048

openssl req -new -key service-account.key -subj "/CN=service-accounts" -out service-account.csr

openssl x509 -req -in service-account.csr -CA ca.crt -CAkey ca.key -CAcreateserial  -out service-account.crt -days 1000

3.9 Distributing Certificates to All Master Nodes

for instance in master-1 master-2 master-3; do
  scp ca.crt ca.key kube-apiserver.key kube-apiserver.crt \
    service-account.key service-account.crt \
    etcd-server.key etcd-server.crt \
    ${instance}:~/
done

4. Creating the kubeconfig Files

4.1 kube-proxy kubeconfig

LOADBALANCER_ADDRESS=192.168.56.30

{
  kubectl config set-cluster kubernetes-the-hard-way \
    --certificate-authority=ca.crt \
    --embed-certs=true \
    --server=https://${LOADBALANCER_ADDRESS}:6443 \
    --kubeconfig=kube-proxy.kubeconfig

  kubectl config set-credentials system:kube-proxy \
    --client-certificate=kube-proxy.crt \
    --client-key=kube-proxy.key \
    --embed-certs=true \
    --kubeconfig=kube-proxy.kubeconfig

  kubectl config set-context default \
    --cluster=kubernetes-the-hard-way \
    --user=system:kube-proxy \
    --kubeconfig=kube-proxy.kubeconfig

  kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
}
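Optionally, inspect the generated file to confirm the cluster, user, and context are wired up as expected (kubectl config view is standard; --minify shows only the current context):

kubectl config view --kubeconfig=kube-proxy.kubeconfig --minify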

4.2 kube-controller-manager kubeconfig

{
  kubectl config set-cluster kubernetes-the-hard-way \
    --certificate-authority=ca.crt \
    --embed-certs=true \
    --server=https://127.0.0.1:6443 \
    --kubeconfig=kube-controller-manager.kubeconfig

  kubectl config set-credentials system:kube-controller-manager \
    --client-certificate=kube-controller-manager.crt \
    --client-key=kube-controller-manager.key \
    --embed-certs=true \
    --kubeconfig=kube-controller-manager.kubeconfig

  kubectl config set-context default \
    --cluster=kubernetes-the-hard-way \
    --user=system:kube-controller-manager \
    --kubeconfig=kube-controller-manager.kubeconfig

  kubectl config use-context default --kubeconfig=kube-controller-manager.kubeconfig
}

4.3 kube-scheduler kubeconfig

{
  kubectl config set-cluster kubernetes-the-hard-way \
    --certificate-authority=ca.crt \
    --embed-certs=true \
    --server=https://127.0.0.1:6443 \
    --kubeconfig=kube-scheduler.kubeconfig

  kubectl config set-credentials system:kube-scheduler \
    --client-certificate=kube-scheduler.crt \
    --client-key=kube-scheduler.key \
    --embed-certs=true \
    --kubeconfig=kube-scheduler.kubeconfig

  kubectl config set-context default \
    --cluster=kubernetes-the-hard-way \
    --user=system:kube-scheduler \
    --kubeconfig=kube-scheduler.kubeconfig

  kubectl config use-context default --kubeconfig=kube-scheduler.kubeconfig
}

4.4 admin(kubectl) kubeconfig

{
  kubectl config set-cluster kubernetes-the-hard-way \
    --certificate-authority=ca.crt \
    --embed-certs=true \
    --server=https://127.0.0.1:6443 \
    --kubeconfig=admin.kubeconfig

  kubectl config set-credentials admin \
    --client-certificate=admin.crt \
    --client-key=admin.key \
    --embed-certs=true \
    --kubeconfig=admin.kubeconfig

  kubectl config set-context default \
    --cluster=kubernetes-the-hard-way \
    --user=admin \
    --kubeconfig=admin.kubeconfig

  kubectl config use-context default --kubeconfig=admin.kubeconfig
}

4.5 Configuring Data Encryption at Rest

For sensitive data such as Secrets, K8s by default only base64-encodes the data before storing it, and controls which users or applications can access it via RBAC.

However, once such sensitive data is persisted to the etcd backend without encryption, anyone with permission to read the etcd data can see the secrets in the clear. To close this gap, since v1.13 K8s supports EncryptionConfiguration so that the data stored in etcd is itself encrypted; even if someone obtains the etcd data, it cannot be decrypted, keeping sensitive data safe.

ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64)

cat > encryption-config.yaml <<EOF
kind: EncryptionConfig
apiVersion: v1
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: ${ENCRYPTION_KEY}
      - identity: {}
EOF
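A quick sanity check on the key (the aescbc provider expects a 32-byte key; this check is not part of the original steps):

echo -n "${ENCRYPTION_KEY}" | base64 -d | wc -c
# expect: 32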

Distribute it to the master nodes:

for instance in master-1 master-2 master-3; do
  scp encryption-config.yaml ${instance}:~/
done

4.6 Distributing kubeconfig Files to the Nodes

kube-proxy.kubeconfig goes to the worker nodes:

for instance in worker-1 worker-2; do
  scp kube-proxy.kubeconfig ${instance}:~/
done

The others go to the master nodes:

for instance in master-1 master-2 master-3; do
  scp admin.kubeconfig kube-controller-manager.kubeconfig kube-scheduler.kubeconfig ${instance}:~/
done

5. Deploying the Master Components

5.1 Deploying etcd

# Recommended: use the latest version
ETCD_VER=v3.4.20

wget -q --show-progress --https-only --timestamping \
  "https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz"

{
  tar -zxvf etcd-${ETCD_VER}-linux-amd64.tar.gz
  sudo cp etcd-${ETCD_VER}-linux-amd64/etcd* /usr/local/bin/
}

{
  sudo mkdir -p /etc/etcd /var/lib/etcd
  sudo cp ca.crt etcd-server.key etcd-server.crt /etc/etcd/
}

Start etcd via systemd:

INTERNAL_IP=$(ip addr show enp0s8 | grep "inet " | awk '{print $2}' | cut -d / -f 1)
ETCD_NAME=$(hostname -s)

# etcd.service
cat <<EOF | sudo tee /etc/systemd/system/etcd.service
[Unit]
Description=etcd
Documentation=https://github.com/coreos

[Service]
ExecStart=/usr/local/bin/etcd \\
  --name ${ETCD_NAME} \\
  --cert-file=/etc/etcd/etcd-server.crt \\
  --key-file=/etc/etcd/etcd-server.key \\
  --peer-cert-file=/etc/etcd/etcd-server.crt \\
  --peer-key-file=/etc/etcd/etcd-server.key \\
  --trusted-ca-file=/etc/etcd/ca.crt \\
  --peer-trusted-ca-file=/etc/etcd/ca.crt \\
  --peer-client-cert-auth \\
  --client-cert-auth \\
  --initial-advertise-peer-urls https://${INTERNAL_IP}:2380 \\
  --listen-peer-urls https://${INTERNAL_IP}:2380 \\
  --listen-client-urls https://${INTERNAL_IP}:2379,https://127.0.0.1:2379 \\
  --advertise-client-urls https://${INTERNAL_IP}:2379 \\
  --initial-cluster-token etcd-cluster-0 \\
  --initial-cluster master-1=https://192.168.56.11:2380,master-2=https://192.168.56.12:2380,master-3=https://192.168.56.13:2380 \\
  --initial-cluster-state new \\
  --data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF


{
  sudo systemctl daemon-reload
  sudo systemctl enable etcd
  sudo systemctl start etcd
}

Run the same deployment on all three master nodes to complete the etcd cluster setup.

Verify etcd health:

sudo ETCDCTL_API=3 etcdctl endpoint health \
  --endpoints=https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379 \
  --cacert=/etc/etcd/ca.crt \
  --cert=/etc/etcd/etcd-server.crt \
  --key=/etc/etcd/etcd-server.key \
  --write-out=table
  
+----------------------------+--------+-------------+-------+
|          ENDPOINT          | HEALTH |    TOOK     | ERROR |
+----------------------------+--------+-------------+-------+
| https://192.168.56.11:2379 |   true | 19.020094ms |       |
| https://192.168.56.13:2379 |   true | 21.171837ms |       |
| https://192.168.56.12:2379 |   true | 25.935113ms |       |
+----------------------------+--------+-------------+-------+
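The cluster membership can also be checked with the same TLS flags (a small optional check):

sudo ETCDCTL_API=3 etcdctl member list \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/ca.crt \
  --cert=/etc/etcd/etcd-server.crt \
  --key=/etc/etcd/etcd-server.key \
  --write-out=table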

5.2 Deploying kube-apiserver

Download the binaries for the specified version:

KUBE_BIN_VERSION=v1.25.0

wget -q --show-progress --https-only --timestamping \
  "https://storage.googleapis.com/kubernetes-release/release/${KUBE_BIN_VERSION}/bin/linux/amd64/kube-apiserver" \
  "https://storage.googleapis.com/kubernetes-release/release/${KUBE_BIN_VERSION}/bin/linux/amd64/kube-controller-manager" \
  "https://storage.googleapis.com/kubernetes-release/release/${KUBE_BIN_VERSION}/bin/linux/amd64/kube-scheduler" \
  "https://storage.googleapis.com/kubernetes-release/release/${KUBE_BIN_VERSION}/bin/linux/amd64/kubectl"
  
{
  chmod +x kube-apiserver kube-controller-manager kube-scheduler kubectl
  sudo cp kube-apiserver kube-controller-manager kube-scheduler kubectl /usr/local/bin/
}
{
  sudo mkdir -p /var/lib/kubernetes/

  sudo cp ca.crt ca.key kube-apiserver.crt kube-apiserver.key \
    service-account.key service-account.crt \
    etcd-server.key etcd-server.crt \
    encryption-config.yaml /var/lib/kubernetes/
}

Configure kube-apiserver to run via systemd:

INTERNAL_IP=$(ip addr show enp0s8 | grep "inet " | awk '{print $2}' | cut -d / -f 1)

cat <<EOF | sudo tee /etc/systemd/system/kube-apiserver.service
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-apiserver \\
  --advertise-address=${INTERNAL_IP} \\
  --allow-privileged=true \\
  --audit-log-maxage=30 \\
  --audit-log-maxbackup=3 \\
  --audit-log-maxsize=100 \\
  --audit-log-path=/var/log/audit.log \\
  --authorization-mode=Node,RBAC \\
  --bind-address=0.0.0.0 \\
  --client-ca-file=/var/lib/kubernetes/ca.crt \\
  --enable-admission-plugins=NodeRestriction,ServiceAccount \\
  --enable-bootstrap-token-auth=true \\
  --etcd-cafile=/var/lib/kubernetes/ca.crt \\
  --etcd-certfile=/var/lib/kubernetes/etcd-server.crt \\
  --etcd-keyfile=/var/lib/kubernetes/etcd-server.key \\
  --etcd-servers=https://192.168.56.11:2379,https://192.168.56.12:2379,https://192.168.56.13:2379 \\
  --event-ttl=1h \\
  --encryption-provider-config=/var/lib/kubernetes/encryption-config.yaml \\
  --kubelet-certificate-authority=/var/lib/kubernetes/ca.crt \\
  --kubelet-client-certificate=/var/lib/kubernetes/kube-apiserver.crt \\
  --kubelet-client-key=/var/lib/kubernetes/kube-apiserver.key \\
  --runtime-config=api/all=true \\
  --service-account-issuer=https://kubernetes.default.svc.cluster.local \\
  --service-account-key-file=/var/lib/kubernetes/service-account.crt \\
  --service-account-signing-key-file=/var/lib/kubernetes/service-account.key \\
  --service-cluster-ip-range=10.96.0.0/24 \\
  --service-node-port-range=30000-32767 \\
  --tls-cert-file=/var/lib/kubernetes/kube-apiserver.crt \\
  --tls-private-key-file=/var/lib/kubernetes/kube-apiserver.key \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

5.3 Deploying kube-controller-manager

sudo cp kube-controller-manager.kubeconfig /var/lib/kubernetes/

cat <<EOF | sudo tee /etc/systemd/system/kube-controller-manager.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-controller-manager \\
  --bind-address=0.0.0.0 \\
  --cluster-cidr=192.168.56.0/24 \\
  --cluster-name=kubernetes \\
  --cluster-signing-cert-file=/var/lib/kubernetes/ca.crt \\
  --cluster-signing-key-file=/var/lib/kubernetes/ca.key \\
  --kubeconfig=/var/lib/kubernetes/kube-controller-manager.kubeconfig \\
  --leader-elect=true \\
  --root-ca-file=/var/lib/kubernetes/ca.crt \\
  --service-account-private-key-file=/var/lib/kubernetes/service-account.key \\
  --service-cluster-ip-range=10.96.0.0/24 \\
  --use-service-account-credentials=true \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

5.4 Deploying kube-scheduler

sudo cp kube-scheduler.kubeconfig /var/lib/kubernetes/

cat <<EOF | sudo tee /etc/systemd/system/kube-scheduler.service
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-scheduler \\
  --kubeconfig=/var/lib/kubernetes/kube-scheduler.kubeconfig \\
  --bind-address=127.0.0.1 \\
  --leader-elect=true \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

5.5 Starting and Verifying the Master Components

{
  sudo systemctl daemon-reload
  sudo systemctl enable kube-apiserver kube-controller-manager kube-scheduler
  sudo systemctl start kube-apiserver kube-controller-manager kube-scheduler
}

Check the component logs:

sudo journalctl -u kube-apiserver

Check the component status:

kubectl get componentstatuses --kubeconfig admin.kubeconfig

NAME                 STATUS    MESSAGE              ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health": "true"}
etcd-1               Healthy   {"health": "true"}
etcd-2               Healthy   {"health": "true"}
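The API server health endpoint can also be queried directly; /healthz should be readable without authentication thanks to the default system:public-info-viewer RBAC binding (an optional check):

curl --cacert ca.crt https://127.0.0.1:6443/healthz
# expect: ok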

Run the same steps on all master nodes to complete the master component deployment.

6. Configuring the HAProxy LB

With multiple master kube-apiserver instances configured, an HAProxy LB can be set up for workers and clients (kubectl) to connect through; if one kube-apiserver node goes down, the cluster keeps running normally, improving HA.

sudo apt-get update && sudo apt-get install -y haproxy
cat <<EOF | sudo tee /etc/haproxy/haproxy.cfg 
frontend kubernetes
    bind 192.168.56.30:6443
    option tcplog
    mode tcp
    default_backend kubernetes-master-nodes

backend kubernetes-master-nodes
    mode tcp
    balance roundrobin
    option tcp-check
    server master-1 192.168.56.11:6443 check fall 3 rise 2
    server master-2 192.168.56.12:6443 check fall 3 rise 2
    server master-3 192.168.56.13:6443 check fall 3 rise 2
EOF
sudo service haproxy restart

Verify:

curl  https://192.168.56.30:6443/version -k

{
  "major": "1",
  "minor": "25",
  "gitVersion": "v1.25.0",
  "gitCommit": "a866cbe2e5bbaa01cfd5e969aa3e033f3282a8a2",
  "gitTreeState": "clean",
  "buildDate": "2022-08-23T17:38:15Z",
  "goVersion": "go1.19",
  "compiler": "gc",
  "platform": "linux/amd64"
}

7. Deploying the Worker Nodes

7.1 Approach 1: Manually Generated Certificates

On master-1:

cat > openssl-worker-1.cnf <<EOF
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = worker-1
IP.1 = 192.168.56.21
EOF

openssl genrsa -out worker-1.key 2048
openssl req -new -key worker-1.key -subj "/CN=system:node:worker-1/O=system:nodes" -out worker-1.csr -config openssl-worker-1.cnf
openssl x509 -req -in worker-1.csr -CA ca.crt -CAkey ca.key -CAcreateserial  -out worker-1.crt -extensions v3_req -extfile openssl-worker-1.cnf -days 1000

Configure the worker kubeconfig:

LOADBALANCER_ADDRESS=192.168.56.30

{
  kubectl config set-cluster kubernetes-the-hard-way \
    --certificate-authority=ca.crt \
    --embed-certs=true \
    --server=https://${LOADBALANCER_ADDRESS}:6443 \
    --kubeconfig=worker-1.kubeconfig

  kubectl config set-credentials system:node:worker-1 \
    --client-certificate=worker-1.crt \
    --client-key=worker-1.key \
    --embed-certs=true \
    --kubeconfig=worker-1.kubeconfig

  kubectl config set-context default \
    --cluster=kubernetes-the-hard-way \
    --user=system:node:worker-1 \
    --kubeconfig=worker-1.kubeconfig

  kubectl config use-context default --kubeconfig=worker-1.kubeconfig
}

Copy the files above to worker-1:

scp ca.crt worker-1.crt worker-1.key worker-1.kubeconfig worker-1:~/

On worker-1, download the binaries for the specified version:

KUBE_BIN_VERSION=v1.25.0

wget -q --show-progress --https-only --timestamping \
  https://storage.googleapis.com/kubernetes-release/release/${KUBE_BIN_VERSION}/bin/linux/amd64/kubectl \
  https://storage.googleapis.com/kubernetes-release/release/${KUBE_BIN_VERSION}/bin/linux/amd64/kube-proxy \
  https://storage.googleapis.com/kubernetes-release/release/${KUBE_BIN_VERSION}/bin/linux/amd64/kubelet

{
  chmod +x kubectl kube-proxy kubelet
  sudo mv kubectl kube-proxy kubelet /usr/local/bin/
}
sudo mkdir -p \
  /etc/cni/net.d \
  /opt/cni/bin \
  /var/lib/kubelet \
  /var/lib/kube-proxy \
  /var/lib/kubernetes \
  /var/run/kubernetes


{
  sudo mv ${HOSTNAME}.key ${HOSTNAME}.crt /var/lib/kubelet/
  sudo mv ${HOSTNAME}.kubeconfig /var/lib/kubelet/kubeconfig
  sudo mv ca.crt /var/lib/kubernetes/
}

Create the kubelet-config.yaml configuration file:

cat <<EOF | sudo tee /var/lib/kubelet/kubelet-config.yaml
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    enabled: true
  x509:
    clientCAFile: "/var/lib/kubernetes/ca.crt"
authorization:
  mode: Webhook
clusterDomain: "cluster.local"
clusterDNS:
  - "10.96.0.10"
resolvConf: "/run/systemd/resolve/resolv.conf"
runtimeRequestTimeout: "15m"
EOF

Configure the systemd unit:

cat <<EOF | sudo tee /etc/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=containerd.service
Requires=containerd.service runc

[Service]
ExecStart=/usr/local/bin/kubelet \\
  --config=/var/lib/kubelet/kubelet-config.yaml \\
  --kubeconfig=/var/lib/kubelet/kubeconfig \\
  --tls-cert-file=/var/lib/kubelet/${HOSTNAME}.crt \\
  --tls-private-key-file=/var/lib/kubelet/${HOSTNAME}.key \\
  --container-runtime-endpoint=unix:///run/containerd/containerd.sock \\
  --register-node=true \\
  --pod-manifest-path=/etc/kubernetes/manifests \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Create the kube-proxy-config.yaml configuration file:

sudo mv kube-proxy.kubeconfig /var/lib/kube-proxy/kubeconfig

cat <<EOF | sudo tee /var/lib/kube-proxy/kube-proxy-config.yaml
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
  kubeconfig: "/var/lib/kube-proxy/kubeconfig"
mode: "iptables"
clusterCIDR: "192.168.56.0/24"
EOF

Configure the systemd unit:

cat <<EOF | sudo tee /etc/systemd/system/kube-proxy.service
[Unit]
Description=Kubernetes Kube Proxy
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-proxy \\
  --config=/var/lib/kube-proxy/kube-proxy-config.yaml
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Start the services:

{
  sudo systemctl daemon-reload
  sudo systemctl enable kubelet kube-proxy
  sudo systemctl start kubelet kube-proxy
}

Check the component logs:

sudo journalctl -u kubelet
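Once kubelet is running (it needs the containerd runtime from section 7.2 before it can fully start), its local health endpoint should respond; 10248 is the default healthz port, as noted in the pitfalls section below:

curl http://127.0.0.1:10248/healthz
# expect: ok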

7.2 Deploying the CRI (containerd)

If docker is already installed:

sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
sudo systemctl restart containerd

If docker is not installed:

{
  CONTAINERD_VERSION=1.6.8
  RUNC_VERSION=1.1.4

  wget -q --show-progress --https-only --timestamping \
    https://github.com/containerd/containerd/releases/download/v${CONTAINERD_VERSION}/containerd-${CONTAINERD_VERSION}-linux-amd64.tar.gz \
    https://github.com/opencontainers/runc/releases/download/v${RUNC_VERSION}/runc.amd64


  sudo chmod +x runc.amd64
  sudo mv runc.amd64 /usr/local/bin/runc

  sudo tar -xzvf containerd-${CONTAINERD_VERSION}-linux-amd64.tar.gz -C /usr/local
}

Configure containerd.service:

cat <<EOF | sudo tee /etc/systemd/system/containerd.service
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target
Requires=runc

[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/containerd

Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=infinity
# Comment TasksMax if your systemd version does not supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity
OOMScoreAdjust=-999

[Install]
WantedBy=multi-user.target
EOF
{
  sudo systemctl daemon-reload
  sudo systemctl enable containerd
  sudo systemctl start containerd
}
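A quick check that containerd is up and its socket is reachable (ctr talks to /run/containerd/containerd.sock by default):

sudo ctr version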

Note: switch to the root user for these steps, otherwise many directories cannot be created due to insufficient permissions and kubelet will fail to start.

7.3 Approach 2: TLS Bootstrap with Automatic Certificate Generation and Renewal

On worker-2, download the binaries for the specified version:

KUBE_BIN_VERSION=v1.25.0

wget -q --show-progress --https-only --timestamping \
  https://storage.googleapis.com/kubernetes-release/release/${KUBE_BIN_VERSION}/bin/linux/amd64/kubectl \
  https://storage.googleapis.com/kubernetes-release/release/${KUBE_BIN_VERSION}/bin/linux/amd64/kube-proxy \
  https://storage.googleapis.com/kubernetes-release/release/${KUBE_BIN_VERSION}/bin/linux/amd64/kubelet

{
  chmod +x kubectl kube-proxy kubelet
  sudo mv kubectl kube-proxy kubelet /usr/local/bin/
}
sudo mkdir -p \
  /etc/cni/net.d \
  /opt/cni/bin \
  /var/lib/kubelet \
  /var/lib/kube-proxy \
  /var/lib/kubernetes \
  /var/run/kubernetes
  
sudo mv ca.crt /var/lib/kubernetes/

The following steps are all performed on master-1:

Create the bootstrap token Secret:

cat > bootstrap-token-07401b.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  # Name MUST be of form "bootstrap-token-<token id>"
  name: bootstrap-token-07401b
  namespace: kube-system

# Type MUST be 'bootstrap.kubernetes.io/token'
type: bootstrap.kubernetes.io/token
stringData:
  # Human readable description. Optional.
  description: "The default bootstrap token generated by 'kubeadm init'."

  # Token ID and secret. Required.
  token-id: 07401b
  token-secret: f395accd246ae52d

  # Expiration. Optional.
  expiration: 2023-09-30T03:22:11Z

  # Allowed usages.
  usage-bootstrap-authentication: "true"
  usage-bootstrap-signing: "true"

  # Extra groups to authenticate the token as. Must start with "system:bootstrappers:"
  auth-extra-groups: system:bootstrappers:worker
EOF


kubectl create -f bootstrap-token-07401b.yaml --kubeconfig=admin.kubeconfig

Create the CSR ClusterRoleBinding:

cat > csrs-for-bootstrapping.yaml <<EOF
# enable bootstrapping nodes to create CSR
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: create-csrs-for-bootstrapping
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:node-bootstrapper
  apiGroup: rbac.authorization.k8s.io
EOF


kubectl create -f csrs-for-bootstrapping.yaml --kubeconfig=admin.kubeconfig

Create the auto-approve CSR ClusterRoleBinding:

cat > auto-approve-csrs-for-group.yaml <<EOF
# Approve all CSRs for the group "system:bootstrappers"
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: auto-approve-csrs-for-group
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
  apiGroup: rbac.authorization.k8s.io
EOF


kubectl create -f auto-approve-csrs-for-group.yaml --kubeconfig=admin.kubeconfig

Create the auto-approve renewal ClusterRoleBinding:

cat > auto-approve-renewals-for-nodes.yaml <<EOF
# Approve renewal CSRs for the group "system:nodes"
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: auto-approve-renewals-for-nodes
subjects:
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
  apiGroup: rbac.authorization.k8s.io
EOF


kubectl create -f auto-approve-renewals-for-nodes.yaml --kubeconfig=admin.kubeconfig

The following steps are all performed on worker-2:

Create the kubelet bootstrap-kubeconfig:

cat <<EOF | sudo tee /var/lib/kubelet/bootstrap-kubeconfig
apiVersion: v1
clusters:
- cluster:
    certificate-authority: /var/lib/kubernetes/ca.crt
    server: https://192.168.56.30:6443
  name: bootstrap
contexts:
- context:
    cluster: bootstrap
    user: kubelet-bootstrap
  name: bootstrap
current-context: bootstrap
kind: Config
preferences: {}
users:
- name: kubelet-bootstrap
  user:
    token: 07401b.f395accd246ae52d
EOF

Create the kubelet config:

cat <<EOF | sudo tee /var/lib/kubelet/kubelet-config.yaml
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    enabled: true
  x509:
    clientCAFile: "/var/lib/kubernetes/ca.crt"
authorization:
  mode: Webhook
clusterDomain: "cluster.local"
clusterDNS:
  - "10.96.0.10"
resolvConf: "/run/systemd/resolve/resolv.conf"
runtimeRequestTimeout: "15m"
EOF

Create the kubelet systemd service:

cat <<EOF | sudo tee /etc/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=containerd.service
Requires=containerd.service runc

[Service]
ExecStart=/usr/local/bin/kubelet \\
  --bootstrap-kubeconfig="/var/lib/kubelet/bootstrap-kubeconfig" \\
  --config=/var/lib/kubelet/kubelet-config.yaml \\
  --kubeconfig=/var/lib/kubelet/kubeconfig \\
  --cert-dir=/var/lib/kubelet/pki/ \\
  --rotate-certificates=true \\
  --rotate-server-certificates=true \\
  --container-runtime-endpoint=unix:///run/containerd/containerd.sock \\
  --register-node=true \\
  --pod-manifest-path=/etc/kubernetes/manifests \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

7.4 Deploying kube-proxy

sudo mv kube-proxy.kubeconfig /var/lib/kube-proxy/kubeconfig

Create the kube-proxy config:

cat <<EOF | sudo tee /var/lib/kube-proxy/kube-proxy-config.yaml
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
  kubeconfig: "/var/lib/kube-proxy/kubeconfig"
mode: "iptables"
clusterCIDR: "192.168.56.0/24"
EOF

Create the kube-proxy systemd service:

cat <<EOF | sudo tee /etc/systemd/system/kube-proxy.service
[Unit]
Description=Kubernetes Kube Proxy
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-proxy \\
  --config=/var/lib/kube-proxy/kube-proxy-config.yaml
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

7.5 Starting kubelet and kube-proxy

Start kubelet and kube-proxy:

{
  sudo systemctl daemon-reload
  sudo systemctl enable kubelet kube-proxy
  sudo systemctl start kubelet kube-proxy
}

The CRI (containerd) deployment is the same as in Approach 1.
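With the bootstrap token and the ClusterRoleBindings in place, the kubelet client CSR should be created and auto-approved shortly after kubelet starts (the serving-certificate CSR from --rotate-server-certificates typically still needs manual approval). A quick check from master-1, assuming admin.kubeconfig is in the home directory:

kubectl get csr --kubeconfig admin.kubeconfig
kubectl get nodes --kubeconfig admin.kubeconfig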

8. Configuring kubectl

If --kubeconfig is not set, the config is generated at ${HOME}/.kube/config automatically.

On a master or worker node, configure a kubeconfig with admin privileges:

{
  KUBERNETES_LB_ADDRESS=192.168.56.30

  kubectl config set-cluster kubernetes-the-hard-way \
    --certificate-authority=ca.crt \
    --embed-certs=true \
    --server=https://${KUBERNETES_LB_ADDRESS}:6443

  kubectl config set-credentials admin \
    --client-certificate=admin.crt \
    --client-key=admin.key

  kubectl config set-context kubernetes-the-hard-way \
    --cluster=kubernetes-the-hard-way \
    --user=admin

  kubectl config use-context kubernetes-the-hard-way
}

Verify:

kubectl get no

NAME       STATUS     ROLES    AGE   VERSION
worker-1   NotReady   <none>   25h   v1.25.0
worker-2   NotReady   <none>   60m   v1.25.0

kubectl get componentstatus

Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
etcd-0               Healthy   {"health":"true"}
etcd-2               Healthy   {"health":"true"}
etcd-1               Healthy   {"health":"true"}
controller-manager   Healthy   ok
scheduler            Healthy   ok

9. Configuring the CNI Network

On all workers (worker-1/2):

9.1 Downloading the CNI Plugins

wget https://github.com/containernetworking/plugins/releases/download/v1.1.1/cni-plugins-linux-amd64-v1.1.1.tgz

sudo mkdir -p /opt/cni/bin
sudo tar -xzvf cni-plugins-linux-amd64-v1.1.1.tgz --directory /opt/cni/bin/

List the available CNI plugins:

ll /opt/cni/bin

total 63736
drwxrwxr-x 2 root root    4096 Mar  9  2022 ./
drwxr-xr-x 3 root root    4096 Sep 17 16:51 ../
-rwxr-xr-x 1 root root 3780654 Mar  9  2022 bandwidth*
-rwxr-xr-x 1 root root 4221977 Mar  9  2022 bridge*
-rwxr-xr-x 1 root root 9742834 Mar  9  2022 dhcp*
-rwxr-xr-x 1 root root 4345726 Mar  9  2022 firewall*
-rwxr-xr-x 1 root root 3811793 Mar  9  2022 host-device*
-rwxr-xr-x 1 root root 3241605 Mar  9  2022 host-local*
-rwxr-xr-x 1 root root 3922560 Mar  9  2022 ipvlan*
-rwxr-xr-x 1 root root 3295519 Mar  9  2022 loopback*
-rwxr-xr-x 1 root root 3959868 Mar  9  2022 macvlan*
-rwxr-xr-x 1 root root 3679140 Mar  9  2022 portmap*
-rwxr-xr-x 1 root root 4092460 Mar  9  2022 ptp*
-rwxr-xr-x 1 root root 3484284 Mar  9  2022 sbr*
-rwxr-xr-x 1 root root 2818627 Mar  9  2022 static*
-rwxr-xr-x 1 root root 3379564 Mar  9  2022 tuning*
-rwxr-xr-x 1 root root 3920827 Mar  9  2022 vlan*
-rwxr-xr-x 1 root root 3523475 Mar  9  2022 vrf*
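These plugin binaries alone do not configure pod networking; that still requires a network add-on (see the options below) or a config file under /etc/cni/net.d. Purely for illustration, a minimal bridge config might look like the following sketch (POD_CIDR is a hypothetical per-node pod subnet; skip this entirely if you install Weave Net or kube-router below):

POD_CIDR=10.200.1.0/24   # assumption: pick a unique pod subnet per node

cat <<EOF | sudo tee /etc/cni/net.d/10-bridge.conf
{
  "cniVersion": "1.0.0",
  "name": "bridge",
  "type": "bridge",
  "bridge": "cni0",
  "isGateway": true,
  "ipMasq": true,
  "ipam": {
    "type": "host-local",
    "ranges": [[{"subnet": "${POD_CIDR}"}]],
    "routes": [{"dst": "0.0.0.0/0"}]
  }
}
EOF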

9.2 CNI Option 1: Installing Weave Net

Only one CNI implementation needs to be installed; pick whichever option you prefer.

For more CNI implementations, see the documentation.

On master-1:

kubectl apply -f https://github.com/weaveworks/weave/releases/download/v2.8.1/weave-daemonset-k8s.yaml

serviceaccount/weave-net created
clusterrole.rbac.authorization.k8s.io/weave-net created
clusterrolebinding.rbac.authorization.k8s.io/weave-net created
role.rbac.authorization.k8s.io/weave-net created
rolebinding.rbac.authorization.k8s.io/weave-net created
daemonset.apps/weave-net created

Verify the network:

kubectl get po -A

NAMESPACE     NAME              READY   STATUS    RESTARTS       AGE
kube-system   weave-net-b6rgm   2/2     Running   1 (2m6s ago)   2m32s
kube-system   weave-net-rkjws   2/2     Running   1 (2m3s ago)   2m32s


kubectl get no
NAME       STATUS   ROLES    AGE    VERSION
worker-1   Ready    <none>   25h    v1.25.0
worker-2   Ready    <none>   100m   v1.25.0

9.3 CNI Option 2: Installing kube-router

CLUSTERCIDR=10.32.0.0/12 \
APISERVER=https://192.168.56.11:6443 \
sh -c 'curl https://raw.githubusercontent.com/cloudnativelabs/kube-router/master/daemonset/generic-kuberouter-all-features.yaml -o - | \
sed -e "s;%APISERVER%;$APISERVER;g" -e "s;%CLUSTERCIDR%;$CLUSTERCIDR;g"' | \
kubectl apply -f -

10. Configuring apiserver-to-kubelet Permissions

On master-1:

Create the ClusterRole:

cat <<EOF | kubectl apply --kubeconfig admin.kubeconfig -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kube-apiserver-to-kubelet
rules:
  - apiGroups:
      - ""
    resources:
      - nodes/proxy
      - nodes/stats
      - nodes/log
      - nodes/spec
      - nodes/metrics
    verbs:
      - "*"
EOF

Create the ClusterRoleBinding:

cat <<EOF | kubectl apply --kubeconfig admin.kubeconfig -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:kube-apiserver
  namespace: ""
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-apiserver-to-kubelet
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: kube-apiserver
EOF

Verify (using an already-running test pod, here called test-pod):

kubectl logs test-pod

pod running ok

kubectl exec -it test-pod -- sh

/ # env
KUBERNETES_SERVICE_PORT=443
KUBERNETES_PORT=tcp://10.96.0.1:443
HOSTNAME=test-pod
...

11. Configuring the DNS Service

On master-1:

Deploy CoreDNS:

kubectl apply -f https://raw.githubusercontent.com/astraw99/kubernetes-the-hard-way/master/deployments/coredns.yaml
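Check that the DNS pods and Service come up (assuming the manifest uses the usual k8s-app=kube-dns label and kube-dns Service name):

kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl get svc -n kube-system kube-dns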

Start a busybox:1.28 test container:

busybox:latest has issues; see docker-library/busybox#48.

kubectl run busybox --image=busybox:1.28 --command -- sleep 3600

Verify DNS:

kubectl exec -it busybox -- nslookup kubernetes


Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local

12. Deploying the Master Components as Static Pods

12.1 Installing kubelet and kube-proxy

Same as the manual approach above.

12.2 Preparing Certificates and kubeconfig Files

The generation steps are the same as the manual approach above.

sudo mkdir -p /etc/kubernetes/pki/etcd
sudo chmod 0655 -R /etc/kubernetes/pki/
sudo chmod 0775 -R /var/lib/etcd/

sudo cp kube-scheduler.kubeconfig /etc/kubernetes/scheduler.conf
sudo cp kube-controller-manager.kubeconfig /etc/kubernetes/controller-manager.conf

sudo cp ca.* service-account.key /etc/kubernetes/pki/
sudo cp encryption-config.yaml kube-apiserver.crt kube-apiserver.key service-account.crt /etc/kubernetes/pki/

12.3 Uploading the Manifests

Control-plane static pod manifests.

sudo mkdir -p /etc/kubernetes/manifests


ll /etc/kubernetes/manifests

etcd.yaml
kube-apiserver.yaml
kube-controller-manager.yaml
kube-scheduler.yaml

Afterwards, kubelet automatically creates the corresponding component pods, and editing a manifest triggers an automatic update.
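As a reference, a minimal sketch of what one of these manifests might look like, here kube-scheduler.yaml, modeled on the kubeadm layout (the image tag, mount paths, and flags are assumptions based on the files prepared in 12.2, not the exact manifests used here):

cat <<EOF | sudo tee /etc/kubernetes/manifests/kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-scheduler
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-scheduler
    image: registry.k8s.io/kube-scheduler:v1.25.0
    command:
    - kube-scheduler
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --leader-elect=true
    volumeMounts:
    - name: kubeconfig
      mountPath: /etc/kubernetes/scheduler.conf
      readOnly: true
  volumes:
  - name: kubeconfig
    hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: File
EOF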

Once kubelet is installed on the master nodes, they will show up in kubectl get node.

Master nodes can be given the corresponding role label via kubectl label no xxx node-role.kubernetes.io/master=true.

13. Testing

13.1 Verifying Secret Data Encryption in etcd

kubectl create secret generic kubernetes-the-hard-way --from-literal="mykey=mydata"
sudo ETCDCTL_API=3 etcdctl get \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/ca.crt \
  --cert=/etc/etcd/etcd-server.crt \
  --key=/etc/etcd/etcd-server.key \
  /registry/secrets/default/kubernetes-the-hard-way | hexdump -C
  
00000000  2f 72 65 67 69 73 74 72  79 2f 73 65 63 72 65 74  |/registry/secret|
00000010  73 2f 64 65 66 61 75 6c  74 2f 6b 75 62 65 72 6e  |s/default/kubern|
00000020  65 74 65 73 2d 74 68 65  2d 68 61 72 64 2d 77 61  |etes-the-hard-wa|
00000030  79 0a 6b 38 73 3a 65 6e  63 3a 61 65 73 63 62 63  |y.k8s:enc:aescbc|
00000040  3a 76 31 3a 6b 65 79 31  3a 8e ff fa ae 04 fc ed  |:v1:key1:.......|
00000050  c0 7d 34 0c f5 fc c8 89  c8 8f 95 79 2c 02 db 8c  |.}4........y,...|
00000060  8a 33 cb 23 ac eb 6d 6a  b1 0b 6d c4 2d 49 a1 99  |.3.#..mj..m.-I..|
00000070  82 08 cd f0 e3 96 1e 49  40 bd 9d 99 81 5e 82 49  |.......I@....^.I|
00000080  1a 5e b8 f8 a0 41 4f aa  41 63 2f 58 80 a1 97 54  |.^...AO.Ac/X...T|
00000090  32 01 80 0e b8 39 44 17  0c 58 90 ad 07 aa 00 9c  |2....9D..X......|
000000a0  fc ae 61 7f cb 3c 64 a3  60 b9 bb 93 88 03 7e 36  |..a..<d.`.....~6|
000000b0  ee 08 e4 9c b2 57 63 9d  da 5f b3 30 61 bb ae 18  |.....Wc.._.0a...|
000000c0  7a 4c 59 af 3e 29 fd 6c  ee 25 b9 b1 a3 01 11 c5  |zLY.>).l.%......|
000000d0  15 64 68 b3 f1 50 db cf  55 21 37 fc 04 03 24 13  |.dh..P..U!7...$.|
000000e0  52 37 ed 65 79 94 a3 f6  76 03 80 65 b7 d7 73 a8  |R7.ey...v..e..s.|
000000f0  e2 68 96 31 68 43 1b 1a  e8 61 30 87 61 ee a5 e6  |.h.1hC...a0.a...|
00000100  91 ed 20 1c 13 e1 26 0d  99 18 58 57 c5 f1 24 5c  |.. ...&...XW..$\|
00000110  22 4b c6 fd 45 39 6f ea  7b 29 77 94 81 aa 33 03  |"K..E9o.{)w...3.|
00000120  fa f9 e1 09 05 3d ef 8b  48 26 29 d5 81 49 29 d3  |.....=..H&)..I).|
00000130  ec e9 6b 74 ee ec 87 d6  65 a0 e2 f9 57 3f 82 e8  |..kt....e...W?..|
00000140  3c 20 40 6a 98 5b 48 02  29 5a dc 56 32 98 d0 96  |< @j.[H.)Z.V2...|
00000150  c3 e1 89 5c c0 f6 66 bb  88 0a                    |...\..f...|
0000015a

As shown, the Secret data is stored in etcd encrypted with the k8s:enc:aescbc:v1:key1 provider prefix.

13.2 deployment

kubectl create deploy nginx --image=nginx
kubectl get po -l app=nginx
NAME                    READY   STATUS    RESTARTS   AGE
nginx-76d6c9b8c-p94w5   1/1     Running   0          30s

13.3 service

kubectl expose deploy nginx --type=NodePort --port 80
PORT_NUMBER=$(kubectl get svc -l app=nginx -o jsonpath="{.items[0].spec.ports[0].nodePort}")

curl http://worker-1:$PORT_NUMBER

13.4 logs

POD_NAME=$(kubectl get pods -l app=nginx -o jsonpath="{.items[0].metadata.name}")

kubectl logs $POD_NAME

13.5 exec

kubectl exec -it $POD_NAME -- nginx -v

nginx version: nginx/1.23.1

14. Pitfalls

14.1 etcd Certificate Permissions

sudo chmod 0655 -R /etc/kubernetes/pki/

14.2 etcd Data Directory Permissions

sudo chmod 0775 -R /var/lib/etcd/

14.3 Ports Already in Use

  • etcd: 2379, 2380 (peer), 2381 (healthz)
  • kube-controller-manager: 10257
  • kube-scheduler: 10259
  • kubelet: 10250, 10248 (healthz)
  • kube-apiserver: 6443
  • kube-proxy: 10249, 10256 (healthz)
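When a component fails to start, checking whether its port is already held by another process helps narrow things down (ss from iproute2; a simple optional check):

sudo ss -lntp | grep -E '6443|2379|2380|10250|10257|10259'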

14.4 Weave Net Fails to Start on Worker Nodes

kube-system   weave-net-4xswv                    2/2     Running            0                 20m
kube-system   weave-net-5dz5g                    1/2     CrashLoopBackOff   8 (3m24s ago)     20m
kube-system   weave-net-k84gc                    2/2     Running            0                 14m
kube-system   weave-net-klxg8                    1/2     CrashLoopBackOff   7 (97s ago)       12m
kube-system   weave-net-v88tz                    2/2     Running            0                 20m
kubectl logs -nkube-system   weave-net-5dz5g


Defaulted container "weave" out of: weave, weave-npc, weave-init (init)
FATA: 2022/10/07 07:27:41.109478 [kube-peers] Could not get peers: Get "https://10.96.0.1:443/api/v1/nodes": dial tcp 10.96.0.1:443: connect: connection refused
Failed to get peers

Fix:

Get the kubernetes Service ClusterIP (10.96.0.1):

k get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP        22d

On the worker nodes:

route add 10.96.0.1 gw 192.168.56.11

14.5 CRI Pitfalls

Without --container-runtime-endpoint configured:

Sep 15 13:20:23 worker-1 kubelet[8014]: I0915 13:20:23.019025    8014 state_mem.go:36] "Initialized new in-memory state store"
Sep 15 13:20:23 worker-1 kubelet[8014]: I0915 13:20:23.019075    8014 util_unix.go:104] "Using this endpoint is deprecated, please consider using full URL format" endpoint="" URL="unix://"
Sep 15 13:20:23 worker-1 kubelet[8014]: I0915 13:20:23.019875    8014 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/var/lib/kubernetes/ca.crt"
Sep 15 13:20:23 worker-1 kubelet[8014]: W0915 13:20:23.020243    8014 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {
Sep 15 13:20:23 worker-1 kubelet[8014]:   "Addr": "",
Sep 15 13:20:23 worker-1 kubelet[8014]:   "ServerName": "",
Sep 15 13:20:23 worker-1 kubelet[8014]:   "Attributes": null,
Sep 15 13:20:23 worker-1 kubelet[8014]:   "BalancerAttributes": null,
Sep 15 13:20:23 worker-1 kubelet[8014]:   "Type": 0,
Sep 15 13:20:23 worker-1 kubelet[8014]:   "Metadata": null
Sep 15 13:20:23 worker-1 kubelet[8014]: }. Err: connection error: desc = "transport: Error while dialing dial unix: missing address"
Sep 15 13:20:23 worker-1 kubelet[8014]: E0915 13:20:23.020961    8014 run.go:74] "command failed" err="failed to run Kubelet: unable to determine runtime API version: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix: missing address\""
Sep 15 13:20:23 worker-1 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Sep 15 13:20:23 worker-1 systemd[1]: kubelet.service: Failed with result 'exit-code'.

Setting --container-runtime-endpoint=unix:///var/run/docker.sock (the docker socket is not a CRI endpoint, so this also fails):

Sep 15 14:55:42 worker-1 kubelet[16034]: I0915 14:55:42.281052   16034 state_mem.go:36] "Initialized new in-memory state store"
Sep 15 14:55:42 worker-1 kubelet[16034]: W0915 14:55:42.281598   16034 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {
Sep 15 14:55:42 worker-1 kubelet[16034]:   "Addr": "/var/run/docker.sock",
Sep 15 14:55:42 worker-1 kubelet[16034]:   "ServerName": "/var/run/docker.sock",
Sep 15 14:55:42 worker-1 kubelet[16034]:   "Attributes": null,
Sep 15 14:55:42 worker-1 kubelet[16034]:   "BalancerAttributes": null,
Sep 15 14:55:42 worker-1 kubelet[16034]:   "Type": 0,
Sep 15 14:55:42 worker-1 kubelet[16034]:   "Metadata": null
Sep 15 14:55:42 worker-1 kubelet[16034]: }. Err: write unix @->/var/run/docker.sock: write: broken pipe
Sep 15 14:55:42 worker-1 kubelet[16034]: E0915 14:55:42.281662   16034 run.go:74] "command failed" err="failed to run Kubelet: unable to determine runtime API version: rpc error: code = Unavailable desc = write unix @->/var/run/docker.sock: write: broken pipe"
Sep 15 14:55:42 worker-1 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Sep 15 14:55:42 worker-1 systemd[1]: kubelet.service: Failed with result 'exit-code'.
The correct endpoint is the containerd socket; compare the two sockets and note that the docker socket only speaks the docker HTTP API:

srw-rw---- 1 root docker 0 Sep 16 07:18 /var/run/docker.sock=
srw-rw---- 1 root root 0 Sep 12 13:12 /run/containerd/containerd.sock=

echo -e "GET /containers/json HTTP/1.0\r\n" | nc -U /var/run/docker.sock
curl --unix-socket /var/run/docker.sock http://localhost/images/json

15. Summary

By walking through building a K8s cluster environment from 0 to 1 by hand, we gain a much deeper understanding of how the core K8s components (etcd, kube-apiserver, kube-controller-manager, kube-scheduler, kubelet, kube-proxy, etc.) work and how they depend on each other. In summary:

  • The Master control-plane components can be deployed in two ways: as systemd processes or as static pods;
  • Worker nodes can be set up in two ways: with manually generated certificates, or via TLS bootstrap with automatic certificate generation/renewal;
  • Sensitive data in etcd can be encrypted at rest by configuring EncryptionConfiguration;
  • Multiple master kube-apiserver instances can be load-balanced with HAProxy to improve HA;
  • Interface specifications such as CNI and CRI have multiple plugin implementations, which can be chosen and deployed as needed.

PS: For more content, follow k8s-club.

References

  1. K8s architecture
  2. kubernetes-the-hard-way
  3. YouTube: Install Kubernetes Cluster from Scratch
  4. K8s static pod
  5. Container Runtimes