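1. Introduction
[Applies when] 1. the deployment environment has no Internet access; 2. one server that can reach the Internet is available (the [networked node]). Not having one is fine too: the required resources are also provided in this article.
This article deploys with kubeadm, the tool released by the official community for quickly deploying a Kubernetes cluster. With it, a cluster can be brought up with two commands: kubeadm init and kubeadm join.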
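2. Environment preparation
2.1 Software environment
Software            Version
Operating system    CentOS 7
Docker              19.03.13
K8s                 1.23.0
2.2 Servers
Minimum hardware: 2 CPU cores, 2 GB RAM, 20 GB disk.
Name      IP
master    192.168.18.134
node1     192.168.18.135
node2     192.168.18.136
The master node needs at least 2 CPUs; otherwise kubeadm init fails with: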
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
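The CPU requirement can be checked before running kubeadm init. The following is a minimal sketch, not part of the original procedure; the function names meets_minimum and check_cpu_count are hypothetical, and it assumes nproc (from coreutils) is available on the node.

```shell
# Hypothetical helper: succeeds when the measured value ($1)
# reaches the required minimum ($2).
meets_minimum() {
    [ "$1" -ge "$2" ]
}

# Hypothetical wrapper: compare this host's CPU count with the
# 2-CPU minimum that the kubeadm preflight check enforces.
check_cpu_count() {
    cpus="$(nproc)"
    if meets_minimum "$cpus" 2; then
        echo "OK: $cpus CPUs available"
    else
        echo "FAIL: $cpus CPU(s) available, kubeadm init requires 2" >&2
        return 1
    fi
}
```

Calling check_cpu_count on the master before step 6 reports the problem early, without waiting for the preflight phase to fail.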
2.3 Disable the firewall
[root@localhost ~]# systemctl stop firewalld
[root@localhost ~]# systemctl disable firewalld
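2.4 Disable SELinux
# Permanent, takes effect after a reboot
[root@localhost ~]# sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
# Temporary, takes effect immediately
[root@localhost ~]# setenforce 0
# Check the SELinux status (Permissive right after setenforce 0; Disabled once the config change has taken effect after a reboot)
[root@localhost ~]# getenforce
Disabled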
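The sed command in this step only matches a config file that currently says SELINUX=enforcing. As a hedged alternative, the sketch below rewrites the SELINUX= line whatever its current value is; the function name set_selinux_disabled is hypothetical, and it assumes GNU sed, as shipped with CentOS 7.

```shell
# Hypothetical helper: force SELINUX=disabled in the given config file
# (defaults to /etc/selinux/config), regardless of the current mode.
set_selinux_disabled() {
    config_file="${1:-/etc/selinux/config}"
    # Only the line that starts with "SELINUX=" is rewritten;
    # SELINUXTYPE= and comment lines are left untouched.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$config_file"
}
```

Running set_selinux_disabled with no argument edits /etc/selinux/config; as above, the change only becomes permanent after a reboot.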
2.5 Disable swap
# Permanent, takes effect after a reboot
[root@localhost ~]# vi /etc/fstab
...
# Find the line below and comment it out (it defines the swap partition and mounts it at boot)
# /dev/mapper/centos-swap swap swap defaults 0 0
...
# Temporary, takes effect immediately
[root@localhost ~]# swapoff -a
# Confirm that all Swap values are now 0
[root@localhost ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:           2117         253        1351           9         512        1704
Swap:             0           0           0
If swap is not disabled, kubeadm init fails with:
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
# systemctl status kubelet
# journalctl -xeu kubelet
..."Failed to run kubelet" err="failed to run Kubelet: running with swap on is not sup
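The manual /etc/fstab edit in step 2.5 can also be done non-interactively, which helps when the same change has to be repeated on every node. This is a sketch under assumptions: the function name disable_swap_in_fstab is hypothetical, GNU sed is assumed, and swap entries are assumed to have "swap" as a whitespace-delimited field, as in the line shown above.

```shell
# Hypothetical helper: comment out every active swap entry in an fstab file
# (defaults to /etc/fstab) and keep a backup copy with a .bak suffix.
disable_swap_in_fstab() {
    fstab_file="${1:-/etc/fstab}"
    # Lines that are already commented start with '#' and are skipped,
    # so running the function twice does not add a second '#'.
    sed -i.bak '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab_file"
}
```

After disable_swap_in_fstab has run, swapoff -a still has to be executed to turn swap off for the current boot.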
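2.6 Configure hosts
Add the host entries on the master:
cat >> /etc/hosts << EOF
192.168.18.134 master
192.168.18.135 node1
192.168.18.136 node2
EOF
2.7 Pass bridged IPv4 traffic to the iptables chains
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
# Apply the settings
sysctl --system
3. Install Docker
3.1 Download the Docker packages [networked node]
Download the packages needed for the installation on a machine that has Internet access.
Resource download: [Docker packages]
# Configure the docker-ce repository
wget -P /etc/yum.repos.d/ https://download.docker.com/linux/centos/docker-ce.repo
# List all available Docker versions
yum list docker-ce --showduplicates
# Download Docker and its dependencies into ~/dockerPackges
yum install --downloadonly --downloaddir ~/dockerPackges docker-ce-19.03.13 docker-ce-cli-19.03.13
3.2 Install Docker
On the master and on every node, create the directory ~/k8s/docker and upload the downloaded Docker packages to it.
cd ~/k8s/docker
# Install
yum install ./*.rpm
3.3 Start Docker
# Start Docker
systemctl start docker
# Enable Docker at boot
systemctl enable docker
# Show information about the Docker service
docker info
4. Install the Kubernetes components [all nodes]
kubeadm: a tool that initializes the cluster and bootstraps new nodes into it.
kubelet: the agent that runs on every node of the cluster and makes sure the containers are running.
kubectl: the Kubernetes command-line tool, used to manage the cluster.
kubeadm and kubelet are installed on every node, while kubectl is usually installed only on the machines from which you intend to run management commands.
4.1 Download the Kubernetes components [networked node]
The required components are kubeadm, kubelet and kubectl, and their versions must match. Download them on a machine with Internet access, in the same way as Docker above.
Resource download: [Kubernetes components]
# Add the Aliyun Kubernetes yum repository
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Download the components and their dependencies into ~/k8sPackges
yum install --downloadonly --downloaddir ~/k8sPackges kubelet-1.23.0 kubeadm-1.23.0 kubectl-1.23.0
4.2 Install the components
On every node, create the directory ~/k8s/kubernetes and upload the downloaded component packages to it.
cd ~/k8s/kubernetes
# Install
yum install ./*.rpm
# Enable kubelet at boot first
systemctl enable kubelet
4.3 Switch Docker's cgroup driver to systemd
# Edit /etc/docker/daemon.json so that it contains the following option (if the file does not exist yet, this is its full content)
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
# Restart Docker
systemctl daemon-reload
systemctl restart docker
5. Pull and import the images required by kubeadm [networked node]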
kubeadm needs several images when it runs, so they have to be prepared in advance.
Resource download: [kubeadm images]
5.1 Pull the images locally
# pull
[root@repo ~]# vim pull_images.sh
#!/bin/bash
images=(
    registry.aliyuncs.com/google_containers/kube-apiserver:v1.23.0
    registry.aliyuncs.com/google_containers/kube-controller-manager:v1.23.0
    registry.aliyuncs.com/google_containers/kube-scheduler:v1.23.0
    registry.aliyuncs.com/google_containers/kube-proxy:v1.23.0
    registry.aliyuncs.com/google_containers/pause:3.6
    registry.aliyuncs.com/google_containers/etcd:3.5.1-0
    registry.aliyuncs.com/google_containers/coredns:v1.8.6
)
for pullimageName in "${images[@]}"; do
    docker pull "$pullimageName"
done
[root@repo ~]# chmod +x pull_images.sh
[root@repo ~]# ./pull_images.sh
# List the pulled images
[root@repo ~]# docker images
5.2 Export the local images
# save
[root@repo ~]# vim save_images.sh
#!/bin/bash
images=(
    registry.aliyuncs.com/google_containers/kube-apiserver:v1.23.0
    registry.aliyuncs.com/google_containers/kube-controller-manager:v1.23.0
    registry.aliyuncs.com/google_containers/kube-scheduler:v1.23.0
    registry.aliyuncs.com/google_containers/kube-proxy:v1.23.0
    registry.aliyuncs.com/google_containers/pause:3.6
    registry.aliyuncs.com/google_containers/etcd:3.5.1-0
    registry.aliyuncs.com/google_containers/coredns:v1.8.6
)
for imageName in "${images[@]}"; do
    # Third path component without the tag, e.g. kube-apiserver
    key=$(echo "$imageName" | awk -F '/' '{print $3}' | awk -F ':' '{print $1}')
    docker save -o "$key.tar" "$imageName"
done
[root@repo ~]# chmod +x save_images.sh
[root@repo ~]# ./save_images.sh
[root@repo ~]# ll
total 755536
-rw------- 1 root root  46967296 Sep 4 02:37 coredns.tar
-rw------- 1 root root 293936128 Sep 4 02:37 etcd.tar
-rw------- 1 root root 136559616 Sep 4 02:36 kube-apiserver.tar
-rw------- 1 root root 126385152 Sep 4 02:37 kube-controller-manager.tar
-rw------- 1 root root 114243584 Sep 4 02:37 kube-proxy.tar
-rw------- 1 root root  54864896 Sep 4 02:37 kube-scheduler.tar
-rw------- 1 root root    692736 Sep 4 02:37 pause.tar
5.3 Import on the deployment nodes [all nodes]
# Upload the images exported on the networked node to every node of the cluster
[root@master ~]# ll
total 755536
-rw------- 1 root root  46967296 Sep 4 02:37 coredns.tar
-rw------- 1 root root 293936128 Sep 4 02:37 etcd.tar
-rw------- 1 root root 136559616 Sep 4 02:36 kube-apiserver.tar
-rw------- 1 root root 126385152 Sep 4 02:37 kube-controller-manager.tar
-rw------- 1 root root 114243584 Sep 4 02:37 kube-proxy.tar
-rw------- 1 root root  54864896 Sep 4 02:37 kube-scheduler.tar
-rw------- 1 root root    692736 Sep 4 02:37 pause.tar
# load
# Write the load script:
[root@master ~]# vim load_images.sh
#!/bin/bash
images=(
    kube-apiserver
    kube-controller-manager
    kube-scheduler
    kube-proxy
    pause
    etcd
    coredns
)
for imageName in "${images[@]}"; do
    docker load -i "${imageName}.tar"
done
[root@master ~]# chmod +x load_images.sh
[root@master ~]# ./load_images.sh
[root@master ~]# docker images
REPOSITORY                                                        TAG       IMAGE ID       CREATED         SIZE
registry.aliyuncs.com/google_containers/kube-apiserver            v1.23.0   9ca5fafbe8dc   2 weeks ago     135MB
registry.aliyuncs.com/google_containers/kube-proxy                v1.23.0   71b9bf9750e1   2 weeks ago     112MB
registry.aliyuncs.com/google_containers/kube-controller-manager   v1.23.0   91a4a0d5de4e   2 weeks ago     125MB
registry.aliyuncs.com/google_containers/kube-scheduler            v1.23.0   d5c0efb802d9   2 weeks ago     53.5MB
registry.aliyuncs.com/google_containers/etcd                      3.5.1-0   25f8c7f3da61   10 months ago   293MB
registry.aliyuncs.com/google_containers/coredns                   v1.8.6    a4ca41631cc7   11 months ago   46.8MB
registry.aliyuncs.com/google_containers/pause                     3.6       6270bb605e12   12 months ago   683kB
6. Initialize the master node [run on master]
6.1 kubeadm init
[root@master ~]# kubeadm init \
  --apiserver-advertise-address=192.168.18.134 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.23.0 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16
--apiserver-advertise-address: the address the cluster advertises for the API server.
--image-repository: the image registry. The default registry k8s.gcr.io cannot be reached from mainland China, so a mirror is specified here; for an offline installation the images must already have been pulled and loaded (section 5).
--kubernetes-version: the K8s version, matching the packages installed above.
--service-cidr: the cluster-internal virtual network used for Service IPs, the unified entry point for reaching Pods.
--pod-network-cidr: the Pod network.
When initialization completes, it prints a "kubeadm join ..." command. Save it; the nodes use it to join the master.
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
# Run these later (section 6.2)
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
# Record this; it is used to add nodes to the cluster (valid for 24 hours)
kubeadm join 192.168.18.134:6443 --token 6m4wt4.y90169m53e6nen8d \
  --discovery-token-ca-cert-hash sha256:0ea734ba54d630659ed78463d0f38fc6c407fabe9c8a0d41913b626160981402
6.2 Copy the K8s credentials file
[root@master ~]# mkdir -p $HOME/.kube
[root@master ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@master ~]# chown $(id -u):$(id -g) $HOME/.kube/config
6.3 Check the node information (verification)
Because no network plugin has been deployed yet, the node is still NotReady.
[root@master ~]# kubectl get nodes
NAME     STATUS     ROLES                  AGE     VERSION
master   NotReady   control-plane,master   6m46s   v1.23.0
7. Join the worker nodes to the cluster [worker nodes]
7.1 Create a token
A token is valid for 24 hours by default. Once it has expired, a new one has to be created, which a single command can do.
# Show the token information
[root@master ~]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
h42j8v.muli4y1asv6cwsgv 23h 2024-03-07T09:12:04Z authentication,signing The default bootstrap token generated by 'kubeadm init'. system:bootstrappers:kubeadm:default-node-token
# Regenerate the join token on the master, then copy the printed command and run it on the worker nodes
[root@master ~]# kubeadm token create --print-join-command
kubeadm join 192.168.18.134:6443 --token h9g5rn.y07uajj3d9r3v5hh --discovery-token-ca-cert-hash sha256:cfb734386ee0d27d4864900648c3eaf0e2f84b1e9f98d04b483ad9e702653c9e
7.2 Add a new node to the cluster
Run the "kubeadm join ..." command printed by kubeadm init (or the one regenerated in 7.1, using the token and hash from your own output).
[root@node1 ~]# kubeadm join 192.168.18.134:6443 --token 8y4nd8.ww9f2npklyebtjqp \
  --discovery-token-ca-cert-hash sha256:c5f01fe144020785cb82b53bcda3b64c2fb8d955af3ca863b8c31d9980c32023
...
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
7.3 Check the node information
Because no network plugin has been deployed yet, the nodes are still NotReady.
[root@master ~]# kubectl get nodes
NAME     STATUS     ROLES                  AGE   VERSION
node1    NotReady
node2    NotReady
master   NotReady   control-plane,master   10m   v1.23.0
8. Install a network plugin (choose one)
A network plugin is a required component; Flannel and Calico are common choices. Cloud vendors usually have their own implementation built on their VPC.
Note that only one network plugin should be installed. This article uses Flannel and recommends it; Calico may run into kernel-version problems on CentOS.
8.1 Flannel plugin (optional)
Resource download: [flannel plugin]
8.1.1 Look up the installation method
See the Flannel project page, https://github.com/coreos/flannel, for the installation method:
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
8.1.2 Download the yml file
Download kube-flannel.yml on a machine with network access.
Either download it directly in a browser: https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
Or on the networked server:
[root@repo ~]# wget https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
Upload the downloaded kube-flannel.yml to the master node of the cluster.
8.1.3 Download the images
# See which images are needed [master node]
[root@master ~]# grep image kube-flannel.yml
image: docker.io/flannel/flannel:v0.23.0
image: docker.io/flannel/flannel-cni-plugin:v1.2.0
# pull: pull the images [networked node]
docker pull docker.io/flannel/flannel:v0.23.0
docker pull docker.io/flannel/flannel-cni-plugin:v1.2.0
# save: export the images [networked node]
docker save -o flannel_v0.23.0.tar flannel/flannel:v0.23.0
docker save -o flannel-cni-plugin_v1.2.0.tar flannel/flannel-cni-plugin:v1.2.0
8.1.4 Import the images [all nodes]
Upload the exported images to the cluster nodes.
# load: import the images on every node of the cluster
docker load -i flannel_v0.23.0.tar
docker load -i flannel-cni-plugin_v1.2.0.tar
8.1.5 Install Flannel [master node]
[root@master ~]# kubectl apply -f kube-flannel.yml
namespace/kube-flannel created
serviceaccount/flannel created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
# Check the node information; once the network plugin is deployed the status changes to Ready
[root@master ~]# kubectl get node
NAME     STATUS   ROLES    AGE     VERSION
master   Ready    master   5h46m   v1.23.0
node1    Ready
node2    Ready
# Check the pods
[root@master ~]# kubectl get pods -n kube-system
8.2 Calico plugin (optional)
Resource download: [calico plugin]
8.2.1 Download calico.yaml [networked node]
[root@repo ~]# wget --no-check-certificate https://docs.projectcalico.org/manifests/calico.yaml
8.2.2 Download the images [networked node]
Which images to download:
[root@repo ~]# grep image: calico.yaml
image: calico/cni:v3.22.1
image: calico/cni:v3.22.1
image: calico/node:v3.22.1
image: calico/node:v3.22.1
image: calico/kube-controllers:v3.22.1
Start the download:
# pull: pull the images
[root@repo ~]# vim pull_calico_images.sh
#!/bin/bash
images=(
    docker.io/calico/cni:v3.22.1
    docker.io/calico/pod2daemon-flexvol:v3.22.1
    docker.io/calico/node:v3.22.1
    docker.io/calico/kube-controllers:v3.22.1
)
for pullimageName in "${images[@]}"; do
    docker pull "$pullimageName"
done
[root@repo ~]# chmod +x pull_calico_images.sh
[root@repo ~]# ./pull_calico_images.sh
# save: export the images [networked node]
[root@repo ~]# vim save_calico_images.sh
#!/bin/bash
images=(
    docker.io/calico/cni:v3.22.1
    docker.io/calico/pod2daemon-flexvol:v3.22.1
    docker.io/calico/node:v3.22.1
    docker.io/calico/kube-controllers:v3.22.1
)
for imageName in "${images[@]}"; do
    # Third path component without the tag, e.g. cni
    key=$(echo "$imageName" | awk -F '/' '{print $3}' | awk -F ':' '{print $1}')
    docker save -o "$key.tar" "$imageName"
done
[root@repo ~]# chmod +x save_calico_images.sh
[root@repo ~]# ./save_calico_images.sh
8.2.3 Import the images [all nodes]
Upload the downloaded calico.yaml to the master node of the cluster.
Upload the images exported on the networked node to all nodes of the cluster.
[root@master ~]# vim load_calico_images.sh
#!/bin/bash
images=(
    cni
    kube-controllers
    node
    pod2daemon-flexvol
)
for imageName in "${images[@]}"; do
    docker load -i "${imageName}.tar"
done
[root@master ~]# chmod +x load_calico_images.sh
[root@master ~]# ./load_calico_images.sh
8.2.4 Edit calico.yaml [master node]
1. Change the Pod network in calico.yaml to the range given to kubeadm init with --pod-network-cidr.
# Look up the Pod network
[root@master ~]# grep "cluster-cidr=" /etc/kubernetes/manifests/kube-controller-manager.yaml
- --cluster-cidr=10.244.0.0/16
2. Specify the network interface.
[root@master ~]# vim calico.yaml
# Find the following content and modify it
# no effect. This should fall within `--cluster-cidr`.
- name: CALICO_IPV4POOL_CIDR    # uncomment
  value: "10.244.0.0/16"        # uncomment and update the address
# Disable file logging so `kubectl logs` works.
- name: CALICO_DISABLE_FILE_LOGGING
  value: "true"
# Specify the network interface, otherwise creating pods reports an error
# Find this entry
- name: CLUSTER_TYPE
  value: "k8s,bgp"
# and add the following below it
- name: IP_AUTODETECTION_METHOD
  value: "interface=ens33"      # ens33 is the name of the local network interface
If the interface is not specified, creating a pod fails with this error:
network: error getting ClusterInformation: connection is unauthorized: Unauthorized
8.2.5 Install Calico [master node]
[root@master ~]# kubectl apply -f calico.yaml
# Check the node information
[root@master ~]# kubectl get node
# Check the pods
[root@master ~]# kubectl get pods -n kube-system
9. Deploy an nginx service [test]
# Create an nginx deployment
[root@master ~]# kubectl create deployment nginx --image=nginx
# Expose port 80
[root@master ~]# kubectl expose deployment nginx --port=80 --type=NodePort
# Show the pod and service information
[root@master ~]# kubectl get pods,svc
NAME READY STATUS RESTARTS AGE
service/nginx NodePort 10.99.0.96
# Port 80 inside the container is mapped to external port 30868 (exposed outside the cluster)
# nginx can now be reached on port 30868
[root@master ~]# curl http://192.168.18.134:30868
It can also be opened in a browser: http://masterIP:30868, http://node1IP:30868, http://node2IP:30868
10. Pod statuses
Status                                   Description
CrashLoopBackOff                         The container exited and kubelet is restarting it
InvalidImageName                         The image name cannot be resolved
ImageInspectError                        The image cannot be verified
ErrImageNeverPull                        Policy forbids pulling the image
ImagePullBackOff                         The image pull is being retried
RegistryUnavailable                      The image registry cannot be reached
ErrImagePull                             Generic image pull error
CreateContainerConfigError               The container configuration used by kubelet cannot be created
CreateContainerError                     Creating the container failed
m.internalLifecycle.PreStartContainer    Running a hook failed
RunContainerError                        Starting the container failed
PostStartHookError                       Running a hook failed
ContainersNotInitialized                 The containers have not finished initializing
ContainersNotReady                       The containers are not ready yet
ContainerCreating                        The container is being created
PodInitializing                          The pod is initializing
DockerDaemonNotReady                     Docker has not fully started yet
NetworkPluginNotReady                    The network plugin has not fully started yet
Evicted                                  The pod was evicted: when a node becomes abnormal, Kubernetes evicts the pods on that node. Mostly seen when resources run short.
FAQ
kubeadm init errors
Error 1: [ERROR Port-6443]
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-6443]: Port 6443 is in use
[ERROR Port-10259]: Port 10259 is in use
[ERROR Port-10257]: Port 10257 is in use
...
Fix:
# Reset kubeadm
[root@master ~]# kubeadm reset
Error 2: It seems like the kubelet isn't running or healthy
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[WARNING IsDockerSystemdCheck]
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
Fix:
Both messages above (the failed kubelet health check and the IsDockerSystemdCheck warning) appear because Docker's cgroup driver is cgroupfs. Kubernetes uses systemd as its cgroup driver by default, so Docker's Cgroup Driver has to be set to systemd as well.
# Show Docker's Cgroup Driver
[root@master ~]# docker info | grep -E "Cgroup Driver"
Cgroup Driver: cgroupfs
# Set the Cgroup Driver to systemd
[root@master ~]# vim /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
# Restart Docker
[root@master ~]# systemctl restart docker
# Show Docker's Cgroup Driver again
[root@master ~]# docker info | grep -E "Cgroup Driver"
Cgroup Driver: systemd
# Then reset kubeadm
[root@master ~]# kubeadm reset
Errors when adding a node to the cluster
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Fix:
This happens because the node had already been joined to a cluster before.
[root@node1 ~]# kubeadm reset
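The cgroup driver cause behind error 2 can be ruled out before kubeadm init is run at all. The sketch below is one possible pre-check and not part of the original procedure; the function name daemon_json_uses_systemd is hypothetical, and it only inspects the file contents, so Docker still has to be restarted for a change to take effect.

```shell
# Hypothetical helper: succeeds when the given Docker daemon config
# (defaults to /etc/docker/daemon.json) sets the systemd cgroup driver.
daemon_json_uses_systemd() {
    daemon_json="${1:-/etc/docker/daemon.json}"
    # A missing file means Docker runs with its built-in default driver.
    [ -f "$daemon_json" ] && grep -q 'native\.cgroupdriver=systemd' "$daemon_json"
}
```

On each node, a call such as daemon_json_uses_systemd || echo "fix /etc/docker/daemon.json first" flags nodes that would otherwise trigger the IsDockerSystemdCheck warning.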