版本 & 组件选择
系统:Ubuntu 24.04
Kubernetes:v1.35
容器运行时:containerd (2.2.x)
网络组件:Calico (v3.31.x)
安装方式:kubeadm
前提条件
禁用所有节点的 swap
节点间的内网 IP 互相可访问 (推荐)
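A minimal sketch of disabling swap, assuming swap is configured through /etc/fstab (adjust if your system uses systemd swap units instead):
# Turn swap off now, and comment out swap entries so it stays off after reboot
sudo swapoff -a
sudo sed -ri 's/^([^#].*\sswap\s.*)$/#\1/' /etc/fstab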
Installation
On all nodes
Install the required packages
sudo apt-get update
sudo install -m 0755 -d /etc/apt/keyrings
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
Download the public signing key for the Kubernetes package repository
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.35/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
Add the Kubernetes apt repository
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.35/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
Download the public signing key for the Docker (containerd) package repository
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
Add the Docker (containerd) apt repository
sudo tee /etc/apt/sources.list.d/docker.sources <<EOF
Types: deb
URIs: https://download.docker.com/linux/ubuntu
Suites: $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}")
Components: stable
Signed-By: /etc/apt/keyrings/docker.asc
EOF
Install containerd.io, kubelet, kubeadm, and kubectl, then pin their versions
sudo apt-get update
sudo apt-get install -y containerd.io kubelet kubeadm kubectl
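If you need a specific patch release rather than the newest one in the v1.35 stream, you can list the available package versions and install one explicitly. A sketch; the version string 1.35.0-1.1 below is illustrative, take the real one from the madison output:
# List versions published in the configured repository
apt-cache madison kubeadm
# Install an explicit version (illustrative version string)
sudo apt-get install -y kubelet=1.35.0-1.1 kubeadm=1.35.0-1.1 kubectl=1.35.0-1.1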
sudo apt-mark hold containerd.io kubelet kubeadm kubectl
Configure containerd
Write out containerd's default configuration file
containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
Find the following section and set SystemdCgroup to true
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc.options]
SystemdCgroup = true
Configure the registry: set config_path
[plugins.'io.containerd.cri.v1.images'.registry]
config_path = '/etc/containerd/certs.d'
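If you prefer not to edit the file by hand, both changes can also be applied with sed. A sketch; it assumes each string appears exactly once, as in the stock default config:
sudo sed -i "s/SystemdCgroup = false/SystemdCgroup = true/" /etc/containerd/config.toml
sudo sed -i "s|config_path = ''|config_path = '/etc/containerd/certs.d'|" /etc/containerd/config.toml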
Configure containerd registry mirrors (if you have a self-hosted mirror, change the mirrors_host variable)
mirrors_host="m.daocloud.io"
sudo mkdir -p /etc/containerd/certs.d/docker.io
cat <<EOF | sudo tee /etc/containerd/certs.d/docker.io/hosts.toml > /dev/null
server = "https://docker.io"
[host."https://hub.${mirrors_host}"]
capabilities = ["pull", "resolve"]
EOF
sudo mkdir -p /etc/containerd/certs.d/registry.k8s.io
cat <<EOF | sudo tee /etc/containerd/certs.d/registry.k8s.io/hosts.toml > /dev/null
server = "https://registry.k8s.io"
[host."https://k8s.${mirrors_host}"]
capabilities = ["pull", "resolve", "push"]
EOF
sudo mkdir -p /etc/containerd/certs.d/docker.elastic.co
cat <<EOF | sudo tee /etc/containerd/certs.d/docker.elastic.co/hosts.toml > /dev/null
server = "https://docker.elastic.co"
[host."https://elastic.${mirrors_host}"]
capabilities = ["pull", "resolve", "push"]
EOF
sudo mkdir -p /etc/containerd/certs.d/gcr.io
cat <<EOF | sudo tee /etc/containerd/certs.d/gcr.io/hosts.toml > /dev/null
server = "https://gcr.io"
[host."https://gcr.${mirrors_host}"]
capabilities = ["pull", "resolve", "push"]
EOF
sudo mkdir -p /etc/containerd/certs.d/ghcr.io
cat <<EOF | sudo tee /etc/containerd/certs.d/ghcr.io/hosts.toml > /dev/null
server = "https://ghcr.io"
[host."https://ghcr.${mirrors_host}"]
capabilities = ["pull", "resolve", "push"]
EOF
sudo mkdir -p /etc/containerd/certs.d/k8s.gcr.io
cat <<EOF | sudo tee /etc/containerd/certs.d/k8s.gcr.io/hosts.toml > /dev/null
server = "https://k8s.gcr.io"
[host."https://k8s-gcr.${mirrors_host}"]
capabilities = ["pull", "resolve", "push"]
EOF
sudo mkdir -p /etc/containerd/certs.d/mcr.microsoft.com
cat <<EOF | sudo tee /etc/containerd/certs.d/mcr.microsoft.com/hosts.toml > /dev/null
server = "https://mcr.microsoft.com"
[host."https://mcr.${mirrors_host}"]
capabilities = ["pull", "resolve", "push"]
EOF
sudo mkdir -p /etc/containerd/certs.d/nvcr.io
cat <<EOF | sudo tee /etc/containerd/certs.d/nvcr.io/hosts.toml > /dev/null
server = "https://nvcr.io"
[host."https://nvcr.${mirrors_host}"]
capabilities = ["pull", "resolve", "push"]
EOF
sudo mkdir -p /etc/containerd/certs.d/quay.io
cat <<EOF | sudo tee /etc/containerd/certs.d/quay.io/hosts.toml > /dev/null
server = "https://quay.io"
[host."https://quay.${mirrors_host}"]
capabilities = ["pull", "resolve", "push"]
EOF
sudo mkdir -p /etc/containerd/certs.d/registry.jujucharms.com
cat <<EOF | sudo tee /etc/containerd/certs.d/registry.jujucharms.com/hosts.toml > /dev/null
server = "https://registry.jujucharms.com"
[host."https://jujucharms.${mirrors_host}"]
capabilities = ["pull", "resolve", "push"]
EOF
sudo mkdir -p /etc/containerd/certs.d/rocks.canonical.com
cat <<EOF | sudo tee /etc/containerd/certs.d/rocks.canonical.com/hosts.toml > /dev/null
server = "https://rocks.canonical.com"
[host."https://rocks-canonical.${mirrors_host}"]
capabilities = ["pull", "resolve", "push"]
EOF
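The per-registry blocks above all follow one pattern, so they can also be generated with a loop. A sketch covering a subset of the registries with pull/resolve capabilities; extend the map (and the capabilities list) to match the full set above:
declare -A mirror_prefix=(
  [docker.io]=hub [registry.k8s.io]=k8s [gcr.io]=gcr [ghcr.io]=ghcr
  [quay.io]=quay [mcr.microsoft.com]=mcr [nvcr.io]=nvcr
)
for registry in "${!mirror_prefix[@]}"; do
  sudo mkdir -p "/etc/containerd/certs.d/${registry}"
  # hosts.toml pointing the registry at its mirror host
  cat <<EOF | sudo tee "/etc/containerd/certs.d/${registry}/hosts.toml" > /dev/null
server = "https://${registry}"
[host."https://${mirror_prefix[$registry]}.${mirrors_host}"]
  capabilities = ["pull", "resolve"]
EOF
done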
Restart containerd
sudo systemctl restart containerd
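To confirm the service came back up and the edits were picked up, you can dump the merged configuration (a quick optional check):
sudo systemctl status containerd --no-pager
sudo containerd config dump | grep -E 'SystemdCgroup|config_path'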
Configure kernel parameters
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sudo sysctl --system
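A quick check that the parameters are active (optional):
sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables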
On the control-plane node only
Pull the images in advance (optional)
kubeadm config images pull
Initialize the Kubernetes cluster
Where:
192.168.128.1 is the control-plane node's IP; the other nodes must be able to reach the control plane at this address
192.168.0.0/22 is the CIDR from which pod IPs are allocated
--ignore-preflight-errors=Mem is only needed when the node has less than 1700 MB of memory
kubeadm init \
--apiserver-advertise-address=192.168.128.1 \
--pod-network-cidr=192.168.0.0/22 \
--ignore-preflight-errors=Mem
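The same flags can also be expressed as a kubeadm configuration file, which is easier to keep in version control. A sketch using the kubeadm.k8s.io/v1beta4 API; check kubeadm config print init-defaults for the exact schema of your kubeadm version:
cat <<EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.128.1
---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
networking:
  podSubnet: 192.168.0.0/22
EOF
kubeadm init --config kubeadm-config.yaml --ignore-preflight-errors=Mem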
When you see output similar to the following, the installation has succeeded:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join ***:6443 --token *** \
--discovery-token-ca-cert-hash sha256:***
Save the kubeadm join command above; you will need it when setting up the worker nodes
Copy the kubeconfig to start using the cluster
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
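You can verify kubectl works at this point; the node will typically report NotReady until the network add-on is installed below:
kubectl get nodes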
On worker nodes only
Run the following command to join the cluster:
kubeadm join ***:6443 --token *** \
--discovery-token-ca-cert-hash sha256:***
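If the token from kubeadm init has expired by the time you add a worker (tokens are valid for 24 hours by default), print a fresh join command on the control-plane node:
kubeadm token create --print-join-command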
Install the network add-on
Download Calico (v3.31.3 as an example; other versions are available from the project's releases)
wget https://raw.githubusercontent.com/projectcalico/calico/v3.31.3/manifests/calico.yaml
Open the file and find the following section
# - name: CALICO_IPV4POOL_CIDR
#   value: "192.168.0.0/16"
Uncomment it and set the value to the pod-network-cidr used when initializing the cluster, e.g.:
- name: CALICO_IPV4POOL_CIDR
  value: "192.168.0.0/22"
(Optional) Set the MTU
Find the following setting
veth_mtu: ""
Change it to a value that fits under your network's effective MTU (the example below assumes a VPN underlay), e.g.:
veth_mtu: "1380"
Apply Calico
kubectl apply -f calico.yaml
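Once the Calico pods are Running, the nodes should turn Ready. A quick way to watch the rollout:
kubectl get pods -n kube-system -w
kubectl get nodes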
(Optional) Change the node IP
If the nodes are joined over a VPN overlay and pods initiate connections through the wrong interface, you need to change the node IPs. The symptom looks like this:
# kubectl exec -it busybox-test-vt2l5 -- sh
error: Internal error occurred: error sending request: Post "https://172.17.19.245:10250/exec/default/busybox-test-vt2l5/busybox?command=sh&input=1&output=1&tty=1": dial tcp 172.17.19.245:10250: i/o timeout
Checking the nodes shows that INTERNAL-IP is the cloud provider NIC's IP instead of the VPN IP, so the control plane cannot reach the other nodes at the INTERNAL-IP values below
# kubectl get node -o wide
NAME    STATUS   ROLES           AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
node1   Ready    control-plane   17h   v1.35.0   172.16.5.74     <none>        Ubuntu 24.04.3 LTS   6.8.0-90-generic   containerd://2.2.1
node2   Ready    <none>          17h   v1.35.0   172.17.19.245   <none>        Ubuntu 24.04.3 LTS   6.8.0-40-generic   containerd://2.2.1
node3   Ready    <none>          17h   v1.35.0   10.1.8.2        <none>        Ubuntu 24.04.3 LTS   6.8.0-90-generic   containerd://2.2.1
Fix (run on all nodes):
Edit the /etc/default/kubelet file
vim /etc/default/kubelet
Add the node-ip argument (use each node's own VPN IP; 192.168.128.1 below is the control-plane node's)
KUBELET_EXTRA_ARGS="--node-ip=192.168.128.1"
Restart kubelet
sudo systemctl restart kubelet
After the change, list the nodes again; every node's INTERNAL-IP has been updated:
# kubectl get node -owide
NAME    STATUS   ROLES           AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
node1   Ready    control-plane   17h   v1.35.0   192.168.128.1   <none>        Ubuntu 24.04.3 LTS   6.8.0-90-generic   containerd://2.2.1
node2   Ready    <none>          17h   v1.35.0   192.168.128.3   <none>        Ubuntu 24.04.3 LTS   6.8.0-40-generic   containerd://2.2.1
node3   Ready    <none>          17h   v1.35.0   192.168.128.2   <none>        Ubuntu 24.04.3 LTS   6.8.0-90-generic   containerd://2.2.1
Try connecting to a pod again; it now works:
# kubectl exec -it busybox-test-vt2l5 -- sh
/ #