iaun

Published on 2026-01-22

Kubernetes Learning 01 - Installing a Cluster

Versions & Component Choices

  • OS: Ubuntu 24.04

  • Kubernetes: v1.35

  • Container runtime: containerd (2.2.x)

  • Network plugin: Calico (v3.31.x)

  • Installation method: kubeadm

Prerequisites

  1. Disable swap on all nodes

  2. Make sure the nodes can reach each other over their private network IPs (recommended)
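Step 1 above can be done with `swapoff -a` plus a one-line edit to /etc/fstab so swap stays off after a reboot. A minimal sketch — demonstrated here on a throwaway copy of fstab so nothing on the machine is modified; on a real node run the same two commands as root against /etc/fstab:

```shell
# On a real node (as root):
#   swapoff -a                          # turn swap off immediately
#   sed -i '/ swap / s/^/#/' /etc/fstab # comment out swap entries persistently
# Demonstration on a throwaway copy of fstab:
fstab="$(mktemp)"
cat > "$fstab" <<'EOF'
UUID=abcd-1234 /        ext4 defaults 0 1
/swap.img      none     swap sw       0 0
EOF
sed -i '/ swap / s/^/#/' "$fstab"   # any line with a literal " swap " field gets a leading #
grep '^#' "$fstab"
```

The sed pattern assumes the swap entry contains a literal ` swap ` field, which is the usual fstab layout.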

Installation

On all nodes

  1. Install the required packages

sudo apt-get update
sudo install -m 0755 -d /etc/apt/keyrings
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
  2. Download the public signing key for the Kubernetes package repository

curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.35/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
  3. Add the Kubernetes package repository

echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.35/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
  4. Download the public signing key for the Docker (containerd) package repository

sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
  5. Add the Docker (containerd) package repository

sudo tee /etc/apt/sources.list.d/docker.sources <<EOF
Types: deb
URIs: https://download.docker.com/linux/ubuntu
Suites: $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}")
Components: stable
Signed-By: /etc/apt/keyrings/docker.asc
EOF
  6. Install containerd.io, kubelet, kubeadm and kubectl, and pin their versions

sudo apt-get update
sudo apt-get install -y containerd.io kubelet kubeadm kubectl
sudo apt-mark hold containerd.io kubelet kubeadm kubectl
  7. Configure containerd

  • Write out containerd's default configuration file

containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
  • Find the following section and change SystemdCgroup to true

          [plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc.options]
            SystemdCgroup = true
  • Configure the registry section: change config_path

    [plugins.'io.containerd.cri.v1.images'.registry]
      config_path = '/etc/containerd/certs.d'
  • Configure containerd registry mirrors (if you host your own mirror, change the mirrors_host variable)

mirrors_host="m.daocloud.io"

sudo mkdir -p /etc/containerd/certs.d/docker.io

cat <<EOF | sudo tee /etc/containerd/certs.d/docker.io/hosts.toml > /dev/null
server = "https://docker.io"

[host."https://hub.${mirrors_host}"]
  capabilities = ["pull", "resolve"]

EOF

sudo mkdir -p /etc/containerd/certs.d/registry.k8s.io
cat <<EOF | sudo tee /etc/containerd/certs.d/registry.k8s.io/hosts.toml > /dev/null
server = "https://registry.k8s.io"

[host."https://k8s.${mirrors_host}"]
  capabilities = ["pull", "resolve", "push"]
EOF

sudo mkdir -p /etc/containerd/certs.d/docker.elastic.co
cat <<EOF | sudo tee /etc/containerd/certs.d/docker.elastic.co/hosts.toml > /dev/null
server = "https://docker.elastic.co"

[host."https://elastic.${mirrors_host}"]
  capabilities = ["pull", "resolve", "push"]
EOF

sudo mkdir -p /etc/containerd/certs.d/gcr.io
cat <<EOF | sudo tee /etc/containerd/certs.d/gcr.io/hosts.toml > /dev/null
server = "https://gcr.io"

[host."https://gcr.${mirrors_host}"]
  capabilities = ["pull", "resolve", "push"]
EOF

sudo mkdir -p /etc/containerd/certs.d/ghcr.io
cat <<EOF | sudo tee /etc/containerd/certs.d/ghcr.io/hosts.toml > /dev/null
server = "https://ghcr.io"

[host."https://ghcr.${mirrors_host}"]
  capabilities = ["pull", "resolve", "push"]
EOF

sudo mkdir -p /etc/containerd/certs.d/k8s.gcr.io
cat <<EOF | sudo tee /etc/containerd/certs.d/k8s.gcr.io/hosts.toml > /dev/null
server = "https://k8s.gcr.io"

[host."https://k8s-gcr.${mirrors_host}"]
  capabilities = ["pull", "resolve", "push"]
EOF

sudo mkdir -p /etc/containerd/certs.d/mcr.microsoft.com
cat <<EOF | sudo tee /etc/containerd/certs.d/mcr.microsoft.com/hosts.toml > /dev/null
server = "https://mcr.microsoft.com"

[host."https://mcr.${mirrors_host}"]
  capabilities = ["pull", "resolve", "push"]
EOF

sudo mkdir -p /etc/containerd/certs.d/nvcr.io
cat <<EOF | sudo tee /etc/containerd/certs.d/nvcr.io/hosts.toml > /dev/null
server = "https://nvcr.io"

[host."https://nvcr.${mirrors_host}"]
  capabilities = ["pull", "resolve", "push"]
EOF

sudo mkdir -p /etc/containerd/certs.d/quay.io
cat <<EOF | sudo tee /etc/containerd/certs.d/quay.io/hosts.toml > /dev/null
server = "https://quay.io"

[host."https://quay.${mirrors_host}"]
  capabilities = ["pull", "resolve", "push"]
EOF

sudo mkdir -p /etc/containerd/certs.d/registry.jujucharms.com
cat <<EOF | sudo tee /etc/containerd/certs.d/registry.jujucharms.com/hosts.toml > /dev/null
server = "https://registry.jujucharms.com"

[host."https://jujucharms.${mirrors_host}"]
  capabilities = ["pull", "resolve", "push"]
EOF

sudo mkdir -p /etc/containerd/certs.d/rocks.canonical.com
cat <<EOF | sudo tee /etc/containerd/certs.d/rocks.canonical.com/hosts.toml > /dev/null
server = "https://rocks.canonical.com"

[host."https://rocks-canonical.${mirrors_host}"]
  capabilities = ["pull", "resolve", "push"]
EOF
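The eleven near-identical blocks above can also be generated by a single loop. A sketch: it writes to a scratch directory so it can run without root — on a real node point certs_dir at /etc/containerd/certs.d. Note this sketch uses only the pull/resolve capabilities for every registry, whereas the blocks above also list push for some mirrors:

```shell
# Sketch: generate every hosts.toml in one loop instead of repeating the
# heredoc per registry. certs_dir is a scratch directory here; on a real
# node set it to /etc/containerd/certs.d and run as root.
certs_dir="$(mktemp -d)"
mirrors_host="m.daocloud.io"

# Each line is: upstream-registry mirror-prefix
while read -r registry prefix; do
  mkdir -p "${certs_dir}/${registry}"
  cat > "${certs_dir}/${registry}/hosts.toml" <<EOF
server = "https://${registry}"

[host."https://${prefix}.${mirrors_host}"]
  capabilities = ["pull", "resolve"]
EOF
done <<'LIST'
docker.io hub
registry.k8s.io k8s
docker.elastic.co elastic
gcr.io gcr
ghcr.io ghcr
k8s.gcr.io k8s-gcr
mcr.microsoft.com mcr
nvcr.io nvcr
quay.io quay
registry.jujucharms.com jujucharms
rocks.canonical.com rocks-canonical
LIST

ls "${certs_dir}"
```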
  • Restart containerd

sudo systemctl restart containerd
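The two manual edits from step 7 (SystemdCgroup and config_path) can also be applied non-interactively with sed before restarting containerd. A sketch, demonstrated on a minimal excerpt so it is safe to run anywhere; the patterns assume the stock containerd 2.x default config, so verify the result on the real /etc/containerd/config.toml afterwards:

```shell
# Minimal excerpt standing in for /etc/containerd/config.toml.
cat > config-excerpt.toml <<'EOF'
[plugins.'io.containerd.cri.v1.images'.registry]
  config_path = ''
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc.options]
  SystemdCgroup = false
EOF

# Apply both edits in one pass; run the same sed against the real file as root.
sed -i \
  -e "s|SystemdCgroup = false|SystemdCgroup = true|" \
  -e "s|config_path = ''|config_path = '/etc/containerd/certs.d'|" \
  config-excerpt.toml

cat config-excerpt.toml
```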
  8. Configure kernel parameters

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sudo sysctl --system

On the control-plane (master) node only

  1. Pull the images (optional)

sudo kubeadm config images pull
  2. Initialize the Kubernetes control plane

Where:

  • 192.168.128.1 is the control-plane node's IP; the other nodes must be able to reach the control-plane node at this address

  • 192.168.0.0/22 is the CIDR from which pod IPs are allocated

  • --ignore-preflight-errors=Mem is only needed when the node has less than 1700 MB of memory

sudo kubeadm init \
  --apiserver-advertise-address=192.168.128.1 \
  --pod-network-cidr=192.168.0.0/22 \
  --ignore-preflight-errors=Mem

When you see output like the following, the installation has succeeded:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join ***:6443 --token *** \
        --discovery-token-ca-cert-hash sha256:***

Record the kubeadm join command above; you will need it later when joining the worker nodes.

  3. Copy the kubeconfig to start using the cluster

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

On worker nodes only

Run the following command to join the cluster:

kubeadm join ***:6443 --token *** \
        --discovery-token-ca-cert-hash sha256:***

Installing the network plugin

  1. Download Calico (v3.31.3 is used as an example; other versions are available from the Calico releases)

wget https://raw.githubusercontent.com/projectcalico/calico/v3.31.3/manifests/calico.yaml
  2. Open the file and find the following section

            # - name: CALICO_IPV4POOL_CIDR
            #   value: "192.168.0.0/16"
  • Uncomment it and change the value to the pod-network-cidr used when initializing the cluster, e.g.:

            - name: CALICO_IPV4POOL_CIDR
              value: "192.168.0.0/22"
  3. (Optional) Set the MTU

  • Find the following setting

  veth_mtu: ""
  • Change it to

  veth_mtu: "1380"
  4. Apply Calico

kubectl apply -f calico.yaml
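The CIDR and MTU edits above can also be scripted with sed. A sketch, demonstrated on an excerpt of the manifest; run the same sed commands against the real calico.yaml and double-check the resulting indentation, since the patterns assume the default v3.31 manifest layout:

```shell
# Excerpt standing in for the relevant lines of calico.yaml.
cat > calico-excerpt.yaml <<'EOF'
  veth_mtu: ""
            # - name: CALICO_IPV4POOL_CIDR
            #   value: "192.168.0.0/16"
EOF

# Uncomment the pool CIDR (shifting both lines left by two columns keeps
# them aligned), set it to this cluster's pod-network-cidr, and set the MTU.
sed -i \
  -e 's|# - name: CALICO_IPV4POOL_CIDR|- name: CALICO_IPV4POOL_CIDR|' \
  -e 's|#   value: "192.168.0.0/16"|  value: "192.168.0.0/22"|' \
  -e 's|veth_mtu: ""|veth_mtu: "1380"|' \
  calico-excerpt.yaml

cat calico-excerpt.yaml
```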

(Optional) Changing node IPs

If the nodes are networked over a VPN overlay and pods initiate connections through the wrong interface, you need to change the node IPs. The symptom looks like this:

# kubectl exec -it busybox-test-vt2l5 -- sh
error: Internal error occurred: error sending request: Post "https://172.17.19.245:10250/exec/default/busybox-test-vt2l5/busybox?command=sh&input=1&output=1&tty=1": dial tcp 172.17.19.245:10250: i/o timeout

Checking the nodes shows that each INTERNAL-IP is the cloud provider's NIC IP rather than the VPN IP, so the control plane cannot reach the other nodes at these INTERNAL-IPs:

# kubectl get node -o wide
NAME      STATUS   ROLES           AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
node1     Ready    control-plane   17h   v1.35.0   172.16.5.74     <none>        Ubuntu 24.04.3 LTS   6.8.0-90-generic   containerd://2.2.1
node2     Ready    <none>          17h   v1.35.0   172.17.19.245   <none>        Ubuntu 24.04.3 LTS   6.8.0-40-generic   containerd://2.2.1
node3     Ready    <none>          17h   v1.35.0   10.1.8.2        <none>        Ubuntu 24.04.3 LTS   6.8.0-90-generic   containerd://2.2.1

Fix (run on every node):

  1. Edit the /etc/default/kubelet file

vim /etc/default/kubelet
  2. Add the node-ip argument (use each node's own VPN IP; the value below is for this example's control-plane node)

KUBELET_EXTRA_ARGS="--node-ip=192.168.128.1"
  3. Restart kubelet

systemctl restart kubelet

After the change, list the nodes again; every INTERNAL-IP has been updated:

# kubectl get node -owide
NAME      STATUS   ROLES           AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
node1     Ready    control-plane   17h   v1.35.0   192.168.128.1   <none>        Ubuntu 24.04.3 LTS   6.8.0-90-generic   containerd://2.2.1
node2     Ready    <none>          17h   v1.35.0   192.168.128.3   <none>        Ubuntu 24.04.3 LTS   6.8.0-40-generic   containerd://2.2.1
node3     Ready    <none>          17h   v1.35.0   192.168.128.2   <none>        Ubuntu 24.04.3 LTS   6.8.0-90-generic   containerd://2.2.1

Trying to connect to the pod again now succeeds:

# kubectl exec -it busybox-test-vt2l5 -- sh
/ #

