How to Change the IP Address of a Kubernetes Node?
- Preface
- Environment
- Procedure
- Master node
- Worker nodes
- Recommended approach
- Summary
Preface
My local Kubernetes environment runs on virtual machines without fixed IPs, and at some point the node IPs changed. The simplest fix, of course, would be to pin the nodes back to their previous addresses, but I stubbornly decided to change the cluster's IP addresses instead. It turned out to be far from simple, and I hit plenty of pitfalls along the way.
Environment
First, the previous environment:
```bash
$ cat /etc/hosts
192.168.0.111 master1
192.168.0.109 node1
192.168.0.110 node2
```

The new IP addresses:
```bash
$ cat /etc/hosts
192.168.0.106 master1
192.168.0.101 node1
192.168.0.105 node2
```

So we need to update the IP address of every node.
Procedure
First, update /etc/hosts on every node with the new addresses (a sketch follows after the tip below).
Tip: before touching any file, it is strongly recommended to back it up first.
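As a minimal sketch of that step (the IP pairs are the ones from the environment above; the commands assume they are run as root on each node):

```bash
# Back up first, as recommended above.
cp /etc/hosts /etc/hosts.bak

# Rewrite the three node entries from the old addresses to the new ones.
sed -i \
  -e 's/192.168.0.111/192.168.0.106/' \
  -e 's/192.168.0.109/192.168.0.101/' \
  -e 's/192.168.0.110/192.168.0.105/' \
  /etc/hosts

# Sanity check: each hostname should now resolve to its new address.
getent hosts master1 node1 node2
```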
Master node
1. Back up the /etc/kubernetes directory.
```bash
$ cp -Rf /etc/kubernetes/ /etc/kubernetes-bak
```

2. Replace the APIServer address in every configuration file under /etc/kubernetes.
```bash
$ cd /etc/kubernetes
$ oldip=192.168.0.111
$ newip=192.168.0.106
# see what currently references the old IP
$ find . -type f | xargs grep $oldip
# replace the IP address
$ find . -type f | xargs sed -i "s/$oldip/$newip/"
# check the updated files
$ find . -type f | xargs grep $newip
```

3. Identify the certificates in /etc/kubernetes/pki that use the old IP address as an alt name.
```bash
$ cd /etc/kubernetes/pki
$ for f in $(find -name "*.crt"); do openssl x509 -in $f -text -noout > $f.txt; done
$ grep -Rl $oldip .
$ for f in $(find -name "*.crt"); do rm $f.txt; done
```
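To inspect a single certificate instead of dumping every one to a text file, the SANs can also be printed directly. This is a sketch that assumes OpenSSL 1.1.1 or newer (the -ext flag is not available in older releases):

```bash
# Print only the Subject Alternative Names of the apiserver certificate.
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -ext subjectAltName
# At this point the output should still list the old IP, 192.168.0.111.
```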
4. Find the ConfigMaps in the kube-system namespace that reference the old IP.

```bash
# list all ConfigMaps in the kube-system namespace
$ configmaps=$(kubectl -n kube-system get cm -o name | awk '{print $1}' | cut -d '/' -f 2)
# dump each ConfigMap manifest
$ dir=$(mktemp -d)
$ for cf in $configmaps; do kubectl -n kube-system get cm $cf -o yaml > $dir/$cf.yaml; done
# find the ConfigMaps that contain the old IP
$ grep -Hn $dir/* -e $oldip
# then edit these ConfigMaps, replacing the old IP with the new one
$ kubectl -n kube-system edit cm kubeadm-config
$ kubectl -n kube-system edit cm kube-proxy
```

This step is extremely important. I skipped it the first time around, and the Flannel CNI would not start, repeatedly logging errors like the following:
```bash
$ kubectl logs -f kube-flannel-ds-pspzf -n kube-system
I0512 14:46:26.044229       1 main.go:205] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[ens33] ifaceRegex:[] ipMasq:true subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true}
W0512 14:46:26.044617       1 client_config.go:614] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
E0512 14:46:56.142921       1 main.go:222] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-pspzf': Get "https://10.96.0.1:443/api...
```

In essence, Flannel simply could not reach the apiserver. I troubleshot for quite a while before thinking to check the kube-proxy logs, which contained the following error:
```bash
E0512 14:53:03.260817       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.EndpointSlice: failed to list *v1.EndpointSlice: Get "https://192.168.0.111:6443/apis/discovery.k8s.io/v1/endpointslices?labelSelector=%21service.kubernetes.io%2Fheadless%2C%21service.kubernetes.io%2Fservice-proxy-name&limit=500&resourceVersion=0": dial tcp 192.168.0.111:6443: connect: no route to host
```

This happens because the apiserver address in the kube-proxy ConfigMap was still the old IP, so it absolutely must be replaced with the new one.
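If you would rather not edit the ConfigMaps interactively, a non-interactive sketch is shown below; it round-trips each manifest through sed using the $oldip/$newip variables set earlier. Review the result of the final grep before trusting it:

```bash
# Rewrite the two ConfigMaps that embed the apiserver address.
for cf in kubeadm-config kube-proxy; do
  kubectl -n kube-system get cm "$cf" -o yaml \
    | sed "s/$oldip/$newip/g" \
    | kubectl apply -f -
done

# Confirm no ConfigMap in kube-system still references the old IP.
kubectl -n kube-system get cm -o yaml | grep "$oldip" || echo "old IP gone"
```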
5. Delete the certificates and private keys that the grep in step 3 identified, then regenerate them.
```bash
$ cd /etc/kubernetes/pki
$ rm apiserver.crt apiserver.key
$ kubeadm init phase certs apiserver
$ rm etcd/peer.crt etcd/peer.key
$ kubeadm init phase certs etcd-peer
```

Of course, you can also regenerate all certificates at once:
```bash
$ kubeadm init phase certs all
```
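To confirm the regeneration worked, repeating the SAN check sketched under step 3 should now show the new IP (again assuming OpenSSL 1.1.1+):

```bash
# Both regenerated certs should now carry the new IP as a SAN.
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -ext subjectAltName
openssl x509 -in /etc/kubernetes/pki/etcd/peer.crt -noout -ext subjectAltName
# Expected: 192.168.0.106 listed instead of 192.168.0.111.
```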
6. Generate new kubeconfig files.

```bash
$ cd /etc/kubernetes
$ rm -f admin.conf kubelet.conf controller-manager.conf scheduler.conf
$ kubeadm init phase kubeconfig all
I0513 15:33:34.404780   52280 version.go:255] remote version is much newer: v1.24.0; falling back to: stable-1.22
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
# overwrite the default kubeconfig file
$ cp /etc/kubernetes/admin.conf $HOME/.kube/config
```
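Before restarting anything, it is worth confirming that the regenerated kubeconfig files really point at the new address:

```bash
# Every file should report the new apiserver endpoint.
grep -H 'server:' /etc/kubernetes/*.conf
# Expected: server: https://192.168.0.106:6443 in each of the four files.
```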
7. Restart the kubelet.

```bash
$ systemctl restart containerd
$ systemctl restart kubelet
```

If everything went well, the Kubernetes cluster should be reachable again:
```bash
$ kubectl get nodes
NAME      STATUS     ROLES                  AGE   VERSION
master1   Ready      control-plane,master   48d   v1.22.8
node1     NotReady   <none>                 48d   v1.22.8
node2     NotReady   <none>                 48d   v1.22.8
```

Worker nodes
Although the cluster is reachable again, the worker nodes are in NotReady state. Let's check the kubelet logs on node2:
```bash
$ journalctl -u kubelet -f
......
May 13 15:47:55 node2 kubelet[1194]: E0513 15:47:55.470896    1194 kubelet.go:2412] "Error getting node" err="node \"node2\" not found"
May 13 15:47:55 node2 kubelet[1194]: E0513 15:47:55.531695    1194 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://192.168.0.111:6443/api/v1/services?limit=500&resourceVersion=0": dial tcp 192.168.0.111:6443: connect: no route to host
May 13 15:47:55 node2 kubelet[1194]: E0513 15:47:55.571958    1194 kubelet.go:2412] "Error getting node" err="node \"node2\" not found"
May 13 15:47:55 node2 kubelet[1194]: E0513 15:47:55.673379    1194 kubelet.go:2412] "Error getting node" err="node \"node2\" not found"
```

The kubelet is still trying to reach the old APIServer address. So where exactly is that address configured? We can inspect the kubelet's startup parameters with the following command:
```bash
$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Fri 2022-05-13 14:37:31 CST; 1h 13min ago
     Docs: https://kubernetes.io/docs/
 Main PID: 1194 (kubelet)
    Tasks: 15
   Memory: 126.9M
   CGroup: /system.slice/kubelet.service
           └─1194 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kub...

May 13 15:51:08 node2 kubelet[1194]: E0513 15:51:08.787677    1194 kubelet.go:2412] "Error getting node" err="node \"node2... found"
May 13 15:51:08 node2 kubelet[1194]: E0513 15:51:08.888194    1194 kubelet.go:2412] "Error getting node" err="node \"node2... found"
......
```

Its core configuration file is /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf, whose contents are shown below:
```bash
$ cat /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
```

The KUBELET_KUBECONFIG_ARGS setting, --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf, references two configuration files: bootstrap-kubelet.conf and kubelet.conf. The first one does not exist:
```bash
$ cat /etc/kubernetes/bootstrap-kubelet.conf
cat: /etc/kubernetes/bootstrap-kubelet.conf: No such file or directory
```

The second is a normal kubeconfig file that specifies the APIServer address, and sure enough it still points at the old IP:
```bash
$ cat /etc/kubernetes/kubelet.conf
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <......>
    server: https://192.168.0.111:6443
  name: default-cluster
contexts:
- context:
    cluster: default-cluster
    namespace: default
    user: default-auth
  name: default-context
current-context: default-context
kind: Config
preferences: {}
users:
- name: default-auth
  user:
    client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
    client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
```

The obvious first instinct is to change the APIServer address here to the new IP, but that by itself is not enough: the related certificates are still the old ones and need to be regenerated as well.
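Incidentally, the stale endpoint can also be read straight out of the kubeconfig without paging through the whole file:

```bash
# Print only the cluster server field from the kubelet kubeconfig.
kubectl config view --kubeconfig=/etc/kubernetes/kubelet.conf \
  -o jsonpath='{.clusters[0].cluster.server}'
# Prints https://192.168.0.111:6443 here -- still the old address.
```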
So how do we regenerate the file? First, back up the kubelet working directory:
```bash
$ cp /etc/kubernetes/kubelet.conf /etc/kubernetes/kubelet.conf.bak
$ cp -rf /var/lib/kubelet/ /var/lib/kubelet-bak
```

Delete the kubelet client certificates:
```bash
$ rm /var/lib/kubelet/pki/kubelet-client*
```

Then, on the master1 node (the node that holds the /etc/kubernetes/pki/ca.key file), generate a new kubelet.conf:
```bash
# on master1
$ kubeadm kubeconfig user --org system:nodes --client-name system:node:node2 --config kubeadm.yaml > kubelet.conf
```

Then copy kubelet.conf to /etc/kubernetes/kubelet.conf on node2 (a copy sketch follows below), restart the kubelet on node2, and wait for /var/lib/kubelet/pki/kubelet-client-current.pem to be recreated.
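For the copy step, a sketch assuming root SSH access from master1 to node2:

```bash
# Run on master1: ship the freshly generated kubeconfig to node2.
scp kubelet.conf root@node2:/etc/kubernetes/kubelet.conf
ssh root@node2 chmod 600 /etc/kubernetes/kubelet.conf
```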
```bash
$ systemctl restart kubelet
# after the restart, wait for the kubelet client certificate to be regenerated
$ ll /var/lib/kubelet/pki/
total 12
-rw------- 1 root root 1106 May 13 16:32 kubelet-client-2022-05-13-16-32-35.pem
lrwxrwxrwx 1 root root   59 May 13 16:32 kubelet-client-current.pem -> /var/lib/kubelet/pki/kubelet-client-2022-05-13-16-32-35.pem
-rw-r--r-- 1 root root 2229 Mar 26 14:39 kubelet.crt
-rw------- 1 root root 1675 Mar 26 14:39 kubelet.key
```

Finally, it is best to manually edit kubelet.conf so that it points at the rotated kubelet client certificate, replacing the client-certificate-data and client-key-data fields with the path /var/lib/kubelet/pki/kubelet-client-current.pem:
```yaml
client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
```

Restart the kubelet once more, and node2 should switch to Ready. Configure node1 the same way.
```bash
$ kubectl get nodes
NAME      STATUS   ROLES                  AGE   VERSION
master1   Ready    control-plane,master   48d   v1.22.8
node1     Ready    <none>                 48d   v1.22.8
node2     Ready    <none>                 48d   v1.22.8
```

Recommended approach
The procedure above gets the job done, but it requires some understanding of the certificates involved. There is also a much simpler alternative. First, stop the kubelet and back up the directories we are about to modify:
```bash
$ systemctl stop kubelet
$ mv /etc/kubernetes /etc/kubernetes-bak
$ mv /var/lib/kubelet/ /var/lib/kubelet-bak
```

Keep the pki certificate directory:
```bash
$ mkdir -p /etc/kubernetes
$ cp -r /etc/kubernetes-bak/pki /etc/kubernetes
$ rm /etc/kubernetes/pki/{apiserver.*,etcd/peer.*}
rm: remove regular file ‘/etc/kubernetes/pki/apiserver.crt’? y
rm: remove regular file ‘/etc/kubernetes/pki/apiserver.key’? y
rm: remove regular file ‘/etc/kubernetes/pki/etcd/peer.crt’? y
rm: remove regular file ‘/etc/kubernetes/pki/etcd/peer.key’? y
```

Now re-initialize the control-plane node with the command below. The crucial point is to reuse the existing etcd data directory: the --ignore-preflight-errors=DirAvailable--var-lib-etcd flag tells kubeadm to accept the pre-existing etcd data.
```bash
$ kubeadm init --config kubeadm.yaml --ignore-preflight-errors=DirAvailable--var-lib-etcd
[init] Using Kubernetes version: v1.22.8
[preflight] Running pre-flight checks
	[WARNING DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Using existing ca certificate authority
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [api.k8s.local kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master1] and IPs [10.96.0.1 192.168.0.106]
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost master1] and IPs [192.168.0.106 127.0.0.1 ::1]
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 12.003599 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master1 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node master1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.0.106:6443 --token abcdef.0123456789abcdef \
	--discovery-token-ca-cert-hash sha256:27993cae9c76d18a1b82b800182c4c7ebc7a704ba1093400ed886f65e709ec04
```

This is almost identical to a normal cluster initialization; the only difference is the extra --ignore-preflight-errors=DirAvailable--var-lib-etcd flag, which means the previous etcd data is reused. We can then verify that the APIServer address has changed to the new IP:
```bash
$ cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
cp: overwrite ‘/root/.kube/config’? y
$ kubectl cluster-info
Kubernetes control plane is running at https://192.168.0.106:6443
CoreDNS is running at https://192.168.0.106:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
```

For the worker nodes, we can simply reset them and rejoin the cluster:
```bash
# on the worker node
$ kubeadm reset
```

After the reset, join the cluster again:

```bash
# on the worker node
$ kubeadm join 192.168.0.106:6443 --token abcdef.0123456789abcdef \
	--discovery-token-ca-cert-hash sha256:27993cae9c76d18a1b82b800182c4c7ebc7a704ba1093400ed886f65e709ec04
```
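The join command above reuses the token printed by kubeadm init; bootstrap tokens expire after 24 hours by default, so if yours is gone, a fresh join command can be generated on the control-plane node:

```bash
# Run on master1: mint a new bootstrap token and print the full join command.
kubeadm token create --print-join-command
```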
This approach is much simpler than the previous one. Once everything is done, the cluster is healthy again:

```bash
$ kubectl get nodes
NAME      STATUS   ROLES                  AGE     VERSION
master1   Ready    control-plane,master   48d     v1.22.8
node1     Ready    <none>                 48d     v1.22.8
node2     Ready    <none>                 4m50s   v1.22.8
```

Summary
For Kubernetes cluster nodes it is best to use static IP addresses, so that IP changes can never disrupt the cluster. If static IPs are not an option, it is strongly recommended to also sign the apiserver certificate for a custom domain name: when the IP changes, you only need to re-point that domain at the new address. This takes a single setting in the kubeadm ClusterConfiguration, apiServer.certSANs, as shown below:
```yaml
apiVersion: kubeadm.k8s.io/v1beta3
apiServer:
  timeoutForControlPlane: 4m0s
  certSANs:
  - api.k8s.local
  - master1
  - 192.168.0.106
kind: ClusterConfiguration
......
```

Add every address that needs to be signed to certSANs. Here we added an extra api.k8s.local entry, so if the IP changes later we can simply map that domain to the new address. Likewise, if you want to reach the cluster through a public IP, add that external IP so it is covered by the certificate as well.
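To illustrate the idea, a quick check after re-initializing with the extra SAN; the /etc/hosts entry here stands in for a real DNS record, and 192.168.0.106 is simply the current apiserver IP:

```bash
# Map the custom domain to the current apiserver address (normally done in DNS).
echo '192.168.0.106 api.k8s.local' >> /etc/hosts

# Because api.k8s.local is now a SAN in the serving certificate, this
# should succeed without any certificate errors.
kubectl --server=https://api.k8s.local:6443 get nodes
```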