Official documentation reference:
https://kubernetes.io/docs/setup/independent/install-kubeadm/#verify-the-mac-address-and-product-uuid-are-unique-for-every-node
kubeadm init configuration file parameter reference:
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/
Five hosts running the latest CentOS 7.
The etcd cluster runs on the three master nodes.
Calico is used as the network plugin.

Hostname                         IP                    Description       Components
k8s-company01-master01 ~ 03      172.16.4.201 ~ 203    3 master nodes    keepalived, haproxy, etcd, kubelet, kube-apiserver
k8s-company01-worker001 ~ 002    172.16.4.204 ~ 205    2 worker nodes    kubelet
k8s-company01-lb                 172.16.4.200          keepalived VIP    -
1. Make sure each VM has a unique MAC address and product UUID.
   (Check the UUID with: cat /sys/class/dmi/id/product_uuid)
2. Disable swap.
   (Run: swapoff -a; sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab)
3. Disable SELinux (out of habit) and set the timezone: timedatectl set-timezone Asia/Shanghai. Optional: echo "Asia/Shanghai" > /etc/timezone
4. Sync the clock (etcd is very sensitive to clock skew): ntpdate asia.pool.ntp.org
   (Add to crontab: 8 * * * * /usr/sbin/ntpdate asia.pool.ntp.org && /sbin/hwclock --systohc)
5. Run yum update to the latest packages and reboot so the new kernel takes effect.

Note: disabling SELinux:
setenforce 0
sed -i --follow-symlinks "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
sed -i --follow-symlinks "s/^SELINUX=permissive/SELINUX=disabled/g" /etc/selinux/config

Disable firewalld. If it is left on, many components outside Kubernetes run into network problems later, and tracking them down one by one is tedious; since this cluster sits on an internal network, we simply turn it off.
systemctl stop firewalld.service
systemctl disable firewalld.service

Configure the hostnames (adjust them to your actual environment).
Set the hostname on each of the five hosts:
hostnamectl set-hostname k8s-company01-master01
hostnamectl set-hostname k8s-company01-master02
hostnamectl set-hostname k8s-company01-master03
hostnamectl set-hostname k8s-company01-worker001
hostnamectl set-hostname k8s-company01-worker002

Add the following to /etc/hosts on all five hosts:
cat >> /etc/hosts <<EOF
172.16.4.201 k8s-company01-master01.skymobi.cn k8s-company01-master01
172.16.4.202 k8s-company01-master02.skymobi.cn k8s-company01-master02
172.16.4.203 k8s-company01-master03.skymobi.cn k8s-company01-master03
172.16.4.200 k8s-company01-lb.skymobi.cn k8s-company01-lb
172.16.4.204 k8s-company01-worker001.skymobi.cn k8s-company01-worker001
172.16.4.205 k8s-company01-worker002.skymobi.cn k8s-company01-worker002
EOF

yum install wget git jq psmisc vim net-tools tcping bash-completion -y
yum update -y && reboot
# Rebooting not only activates the newly upgraded kernel, it also lets services that reference the hostname pick up the new hostname
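A quick post-reboot sanity check (an optional sketch, not part of the original procedure) confirms that SELinux, swap, the hostname, and the new kernel are in the expected state:

# optional sanity check after the reboot
getenforce                # expect: Disabled
free -m | grep -i swap    # expect: 0 total / 0 used
hostnamectl status        # confirm the new static hostname
uname -r                  # confirm the upgraded kernel is running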
Install a CRI on every node. We use Docker here; Kubernetes 1.12+ recommends Docker 18.06, but 18.06 has a root privilege-escalation vulnerability, so we use the latest version, 18.09.5.
Installation reference: https://kubernetes.io/docs/setup/cri/
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install -y docker-ce-18.09.5 docker-ce-cli-18.09.5
mkdir /etc/docker
cat > /etc/docker/daemon.json << EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {"max-size": "100m"},
  "storage-driver": "overlay2",
  "storage-opts": ["overlay2.override_kernel_check=true"]
}
EOF
mkdir -p /etc/systemd/system/docker.service.d
systemctl daemon-reload
systemctl enable docker.service
systemctl restart docker
Pin the Docker version so it is not accidentally upgraded to a different major version later:
yum -y install yum-plugin-versionlock
yum versionlock docker-ce docker-ce-cli
yum versionlock list

# Note: to unlock, run:
# yum versionlock delete docker-ce docker-ce-cli
cat << EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
vm.overcommit_memory = 1
vm.panic_on_oom = 0
fs.may_detach_mounts = 1
fs.inotify.max_user_watches = 89100
fs.file-max = 52706963
fs.nr_open = 52706963
net.netfilter.nf_conntrack_max = 2310720
EOF
modprobe br_netfilter
sysctl --system
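A quick check (optional sketch) that the bridge and forwarding settings actually took effect; if the net.bridge.* keys are missing, the br_netfilter module is not loaded:

sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
# all three should print "= 1"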
cat << EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
exclude=kube*
EOF

Also install ipvsadm; kube-proxy will later run in IPVS mode.
(cri-tools-1.12.0 and kubernetes-cni-0.7.5 are two related dependency packages)
yum install -y kubelet-1.14.1 kubeadm-1.14.1 kubectl-1.14.1 cri-tools-1.12.0 kubernetes-cni-0.7.5 ipvsadm --disableexcludes=kubernetes
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack_ipv4
modprobe br_netfilter
cat << EOF >> /etc/rc.d/rc.local
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack_ipv4
modprobe br_netfilter
EOF
chmod +x /etc/rc.d/rc.local
lsmod | grep ip_vs
DOCKER_CGROUPS=$(docker info | grep 'Cgroup' | cut -d' ' -f3)
echo $DOCKER_CGROUPS
cat > /etc/sysconfig/kubelet << EOF
KUBELET_EXTRA_ARGS="--cgroup-driver=$DOCKER_CGROUPS --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.1"
EOF
systemctl enable --now kubelet
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
Run the following on all three master nodes. HAProxy listens on port 16443 and proxies to the Kubernetes apiservers on port 6443; adjust the hostnames and IPs at the end of the config accordingly, e.g. server k8s-company01-master01 172.16.4.201:6443.
# Pull the haproxy image (the small alpine variant)
docker pull reg01.sky-mobi.com/k8s/haproxy:1.9.1-alpine
mkdir /etc/haproxy
cat >/etc/haproxy/haproxy.cfg<<EOF
global
    log 127.0.0.1 local0 err
    maxconn 30000
    uid 99
    gid 99
    #daemon
    nbproc 1
    pidfile haproxy.pid

defaults
    mode http
    log 127.0.0.1 local0 err
    maxconn 30000
    retries 3
    timeout connect 5s
    timeout client 30s
    timeout server 30s
    timeout check 2s

listen admin_stats
    mode http
    bind 0.0.0.0:1080
    log 127.0.0.1 local0 err
    stats refresh 30s
    stats uri /haproxy-status
    stats realm Haproxy\ Statistics
    stats auth admin:skymobik8s
    stats hide-version
    stats admin if TRUE

frontend k8s-https
    bind 0.0.0.0:16443
    mode tcp
    #maxconn 30000
    default_backend k8s-https

backend k8s-https
    mode tcp
    balance roundrobin
    server k8s-company01-master01 172.16.4.201:6443 weight 1 maxconn 1000 check inter 2000 rise 2 fall 3
    server k8s-company01-master02 172.16.4.202:6443 weight 1 maxconn 1000 check inter 2000 rise 2 fall 3
    server k8s-company01-master03 172.16.4.203:6443 weight 1 maxconn 1000 check inter 2000 rise 2 fall 3
EOF

# Start haproxy
docker run -d --name k8s-haproxy \
-v /etc/haproxy:/usr/local/etc/haproxy:ro \
-p 16443:16443 \
-p 1080:1080 \
--restart always \
reg01.sky-mobi.com/k8s/haproxy:1.9.1-alpine

# Check whether it started. Connection errors in its logs are expected at this point, because the kube-apiserver port 6443 is not up yet.
docker ps

# If the setup above failed and you need to clean up and retry:
docker stop k8s-haproxy
docker rm k8s-haproxy
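A rough way to confirm haproxy is listening (a sketch; netstat comes from net-tools and tcping was installed earlier):

netstat -lntp | grep -E '16443|1080'                           # both ports should be in LISTEN state
tcping 127.0.0.1 16443                                         # should connect even before the apiservers exist
curl -u admin:skymobik8s http://127.0.0.1:1080/haproxy-status  # stats page defined in haproxy.cfg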
Configure keepalived on all three masters
# Pull the keepalived image
docker pull reg01.sky-mobi.com/k8s/keepalived:2.0.10

# Start keepalived; adjust the interface name and IPs as needed
# eth0 is the NIC on the 172.16.4.0/24 network in this setup (if yours differs, change it to your own interface name; usage reference: https://github.com/osixia/docker-keepalived/tree/v2.0.10)
# Keep the password to at most 8 characters; with skymobik8s only the first 8 characters end up in the VRRP packets: addrs: k8s-master-lb auth "skymobik"
# KEEPALIVED_PRIORITY: set 200 on the MASTER node and 150 on the BACKUP nodes
docker run --net=host --cap-add=NET_ADMIN \
-e KEEPALIVED_ROUTER_ID=55 \
-e KEEPALIVED_INTERFACE=eth0 \
-e KEEPALIVED_VIRTUAL_IPS="#PYTHON2BASH:['172.16.4.200']" \
-e KEEPALIVED_UNICAST_PEERS="#PYTHON2BASH:['172.16.4.201','172.16.4.202','172.16.4.203']" \
-e KEEPALIVED_PASSWORD=skyk8stx \
-e KEEPALIVED_PRIORITY=150 \
--name k8s-keepalived \
--restart always \
-d reg01.sky-mobi.com/k8s/keepalived:2.0.10

# Check the logs
# You should see two nodes become BACKUP and one become MASTER
docker logs k8s-keepalived
# If the logs show "received an invalid passwd!", something else on the network uses the same ROUTER_ID; change the ROUTER_ID.

# Ping test from any master
ping -c 4 <virtual-IP>

# If the setup above failed and you need to clean up and retry:
docker stop k8s-keepalived
docker rm k8s-keepalived
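To see which node currently holds the VIP (a sketch; keepalived adds the VIP as a secondary address on the configured interface, eth0 here):

ip addr show eth0 | grep 172.16.4.200   # only the current MASTER should list the VIP
ping -c 4 172.16.4.200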
https://kubernetes.io/docs/setup/independent/high-availability/
Run on the first master, k8s-master01:
# Note: change the VIP hostname in controlPlaneEndpoint: "k8s-company01-lb:16443" to match your environment
cat << EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.14.1
# add the available imageRepository in china
imageRepository: reg01.sky-mobi.com/k8s/k8s.gcr.io
controlPlaneEndpoint: "k8s-company01-lb:16443"
networking:
  podSubnet: "10.254.0.0/16"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
ipvs:
  minSyncPeriod: 1s
  syncPeriod: 10s
mode: ipvs
EOF
kubeadm-config parameter reference:
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-config/
Pre-pull the images:
kubeadm config images pull --config kubeadm-config.yaml
Initialize master01:
kubeadm init --config=kubeadm-config.yaml --experimental-upload-certs
Pay attention to the messages printed at the beginning and resolve every WARNING they report.
To start over, run kubeadm reset, clear the iptables and IPVS rules as the prompts suggest (see the sketch below), and then restart the docker service.
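The cleanup that kubeadm reset hints at looks roughly like this (a sketch; check it against the actual reset output before running, since it flushes all iptables and IPVS rules on the node):

kubeadm reset
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
ipvsadm --clear
systemctl restart docker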
After it reports success, record all of the join parameters printed at the end; they are needed when the remaining nodes join (valid for two hours; one command is for joining master nodes, the other for worker nodes).
You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join k8s-company01-lb:16443 --token fp0x6g.cwuzedvtwlu1zg1f \
    --discovery-token-ca-cert-hash sha256:5d4095bc9e4e4b5300abe5a25afe1064f32c1ddcecc02a1f9b0aeee7710c3383 \
    --experimental-control-plane --certificate-key b56be86f65e73d844bb60783c7bd5d877fe20929296a3e254854d3b623bb86f7

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --experimental-upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

  kubeadm join k8s-company01-lb:16443 --token fp0x6g.cwuzedvtwlu1zg1f \
    --discovery-token-ca-cert-hash sha256:5d4095bc9e4e4b5300abe5a25afe1064f32c1ddcecc02a1f9b0aeee7710c3383
Remember to run the following so kubectl can access the cluster:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config

# If you skip this, you will see the following error:
# [root@k8s-master01 ~]# kubectl -n kube-system get pod
# The connection to the server localhost:8080 was refused - did you specify the right host or port?
When checking the cluster status, it is fine for coredns to be Pending: the network plugin is not installed yet.
# Sample output for reference
[root@k8s-master01 ~]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-56c9dc7946-5c5z2 0/1 Pending 0 34m
coredns-56c9dc7946-thqwd 0/1 Pending 0 34m
etcd-k8s-master01 1/1 Running 2 34m
kube-apiserver-k8s-master01 1/1 Running 2 34m
kube-controller-manager-k8s-master01 1/1 Running 1 33m
kube-proxy-bl9c6 1/1 Running 2 34m
kube-scheduler-k8s-master01 1/1 Running 1 34m
Join master02 and master03 to the cluster
# Use the join parameters generated earlier to add master02 and master03 to the cluster
# (--experimental-control-plane joins them as control-plane nodes)
kubeadm join k8s-company01-lb:16443 --token fp0x6g.cwuzedvtwlu1zg1f \
    --discovery-token-ca-cert-hash sha256:5d4095bc9e4e4b5300abe5a25afe1064f32c1ddcecc02a1f9b0aeee7710c3383 \
    --experimental-control-plane --certificate-key b56be86f65e73d844bb60783c7bd5d877fe20929296a3e254854d3b623bb86f7

# If the join parameters were not recorded, or have expired, see the sketch after this block or refer to:
http://wiki.sky-mobi.com:8090/pages/viewpage.action?pageId=9079715

# After a node joins successfully, set up kubectl access on it as well:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
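If the join command was lost or the token expired, it can be regenerated on master01 (a sketch based on the kubeadm 1.14 subcommands referenced in the init output above):

# print a fresh worker join command
kubeadm token create --print-join-command
# re-upload the control-plane certificates and print a new certificate key
kubeadm init phase upload-certs --experimental-upload-certs
# for a control-plane join, append to the printed join command:
#   --experimental-control-plane --certificate-key <new certificate key>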
Install the Calico network plugin (run on master01)
Reference:
https://docs.projectcalico.org/v3.6/getting-started/kubernetes/installation/calico

Download the yaml file (version v3.6.1 here; the file comes from the official https://docs.projectcalico.org/v3.6/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/typha/calico.yaml, with the pod CIDR, replicas, and image addresses modified).

# For use outside the data center (access restricted; the company's own public address)
curl http://111.1.17.135/yum/scripts/k8s/calico_v3.6.1.yaml -O
# For use inside the data center
curl http://192.168.160.200/yum/scripts/k8s/calico_v3.6.1.yaml -O

## Edit the yaml file: change the pod network CIDR to match podSubnet in kubeadm-config.yaml.
##
## export POD_CIDR="10.254.0.0/16" ; sed -i -e "s?192.168.0.0/16?$POD_CIDR?g" calico.yaml
## Change replicas to 3 for production use (the default is 1).
## The image addresses were also changed; the images are hosted on reg01.sky-mobi.com.

# Allow pods to be scheduled onto the master nodes (running this on master01 is enough)
[root@k8s-company01-master01 ~]# kubectl taint nodes --all node-role.kubernetes.io/master-
node/k8s-company01-master01 untainted
node/k8s-company01-master02 untainted
node/k8s-company01-master03 untainted

# Install calico (to uninstall: kubectl delete -f calico_v3.6.1.yaml)
[root@k8s-company01-master01 ~]# kubectl apply -f calico_v3.6.1.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
service/calico-typha created
deployment.apps/calico-typha created
poddisruptionbudget.policy/calico-typha created
daemonset.extensions/calico-node created
serviceaccount/calico-node created
deployment.extensions/calico-kube-controllers created
serviceaccount/calico-kube-controllers created

# At this point all pods are coming up normally
[root@k8s-company01-master01 ~]# kubectl -n kube-system get pod
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-749f7c8df8-knlx4 0/1 Running 0 20s
calico-kube-controllers-749f7c8df8-ndf55 0/1 Running 0 20s
calico-kube-controllers-749f7c8df8-pqxlx 0/1 Running 0 20s
calico-node-4txj7 0/1 Running 0 21s
calico-node-9t2l9 0/1 Running 0 21s
calico-node-rtxlj 0/1 Running 0 21s
calico-typha-646cdc958c-7j948 0/1 Pending 0 21s
coredns-56c9dc7946-944nt 0/1 Running 0 4m9s
coredns-56c9dc7946-nh2sk 0/1 Running 0 4m9s
etcd-k8s-company01-master01 1/1 Running 0 3m26s
etcd-k8s-company01-master02 1/1 Running 0 2m52s
etcd-k8s-company01-master03 1/1 Running 0 110s
kube-apiserver-k8s-company01-master01 1/1 Running 0 3m23s
kube-apiserver-k8s-company01-master02 1/1 Running 0 2m53s
kube-apiserver-k8s-company01-master03 1/1 Running 1 111s
kube-controller-manager-k8s-company01-master01 1/1 Running 1 3m28s
kube-controller-manager-k8s-company01-master02 1/1 Running 0 2m52s
kube-controller-manager-k8s-company01-master03 1/1 Running 0 56s
kube-proxy-8wm4v 1/1 Running 0 4m9s
kube-proxy-vvdrl 1/1 Running 0 2m53s
kube-proxy-wnctx 1/1 Running 0 2m2s
kube-scheduler-k8s-company01-master01 1/1 Running 1 3m18s
kube-scheduler-k8s-company01-master02 1/1 Running 0 2m52s
kube-scheduler-k8s-company01-master03 1/1 Running 0 55s

# All master nodes are in Ready state
[root@k8s-company01-master01 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-company01-master01 Ready master 4m48s v1.14.1
k8s-company01-master02 Ready master 3m12s v1.14.1
k8s-company01-master03 Ready master 2m21s v1.14.1

# We once saw coredns restarting constantly; it recovered after firewalld was turned off, and stayed healthy even after firewalld was turned back on...
Join the two worker nodes to the cluster (do the base configuration described earlier first and install docker, kubeadm, etc.)
# The only difference from joining a master is that the --experimental-control-plane flag is omitted
kubeadm join k8s-company01-lb:16443 --token fp0x6g.cwuzedvtwlu1zg1f \
    --discovery-token-ca-cert-hash sha256:5d4095bc9e4e4b5300abe5a25afe1064f32c1ddcecc02a1f9b0aeee7710c3383

# If the join parameters were not recorded, or have expired, see:
http://wiki.sky-mobi.com:8090/pages/viewpage.action?pageId=9079715

# Output on success:
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

### Run the kubectl get nodes command on any master node.
[root@k8s-company01-master01 ~]# kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-749f7c8df8-knlx4 1/1 Running 1 5m2s 10.254.28.66 k8s-company01-master02 <none> <none>
calico-kube-controllers-749f7c8df8-ndf55 1/1 Running 4 5m2s 10.254.31.67 k8s-company01-master03 <none> <none>
calico-kube-controllers-749f7c8df8-pqxlx 1/1 Running 4 5m2s 10.254.31.66 k8s-company01-master03 <none> <none>
calico-node-4txj7 1/1 Running 0 5m3s 172.16.4.203 k8s-company01-master03 <none> <none>
calico-node-7fqwh 1/1 Running 0 68s 172.16.4.205 k8s-company01-worker002 <none> <none>
calico-node-9t2l9 1/1 Running 0 5m3s 172.16.4.201 k8s-company01-master01 <none> <none>
calico-node-rkfxj 1/1 Running 0 86s 172.16.4.204 k8s-company01-worker001 <none> <none>
calico-node-rtxlj 1/1 Running 0 5m3s 172.16.4.202 k8s-company01-master02 <none> <none>
calico-typha-646cdc958c-7j948 1/1 Running 0 5m3s 172.16.4.204 k8s-company01-worker001 <none> <none>
coredns-56c9dc7946-944nt 0/1 CrashLoopBackOff 4 8m51s 10.254.28.65 k8s-company01-master02 <none> <none>
coredns-56c9dc7946-nh2sk 0/1 CrashLoopBackOff 4 8m51s 10.254.31.65 k8s-company01-master03 <none> <none>
etcd-k8s-company01-master01 1/1 Running 0 8m8s 172.16.4.201 k8s-company01-master01 <none> <none>
etcd-k8s-company01-master02 1/1 Running 0 7m34s 172.16.4.202 k8s-company01-master02 <none> <none>
etcd-k8s-company01-master03 1/1 Running 0 6m32s 172.16.4.203 k8s-company01-master03 <none> <none>
kube-apiserver-k8s-company01-master01 1/1 Running 0 8m5s 172.16.4.201 k8s-company01-master01 <none> <none>
kube-apiserver-k8s-company01-master02 1/1 Running 0 7m35s 172.16.4.202 k8s-company01-master02 <none> <none>
kube-apiserver-k8s-company01-master03 1/1 Running 1 6m33s 172.16.4.203 k8s-company01-master03 <none> <none>
kube-controller-manager-k8s-company01-master01 1/1 Running 1 8m10s 172.16.4.201 k8s-company01-master01 <none> <none>
kube-controller-manager-k8s-company01-master02 1/1 Running 0 7m34s 172.16.4.202 k8s-company01-master02 <none> <none>
kube-controller-manager-k8s-company01-master03 1/1 Running 0 5m38s 172.16.4.203 k8s-company01-master03 <none> <none>
kube-proxy-8wm4v 1/1 Running 0 8m51s 172.16.4.201 k8s-company01-master01 <none> <none>
kube-proxy-k8rng 1/1 Running 0 68s 172.16.4.205 k8s-company01-worker002 <none> <none>
kube-proxy-rqnkv 1/1 Running 0 86s 172.16.4.204 k8s-company01-worker001 <none> <none>
kube-proxy-vvdrl 1/1 Running 0 7m35s 172.16.4.202 k8s-company01-master02 <none> <none>
kube-proxy-wnctx 1/1 Running 0 6m44s 172.16.4.203 k8s-company01-master03 <none> <none>
kube-scheduler-k8s-company01-master01 1/1 Running 1 8m 172.16.4.201 k8s-company01-master01 <none> <none>
kube-scheduler-k8s-company01-master02 1/1 Running 0 7m34s 172.16.4.202 k8s-company01-master02 <none> <none>
kube-scheduler-k8s-company01-master03 1/1 Running 0 5m37s 172.16.4.203 k8s-company01-master03 <none> <none>

[root@k8s-company01-master01 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-company01-master01 Ready master 9m51s v1.14.1
k8s-company01-master02 Ready master 8m15s v1.14.1
k8s-company01-master03 Ready master 7m24s v1.14.1
k8s-company01-worker001 Ready <none> 2m6s v1.14.1
k8s-company01-worker002 Ready <none> 108s v1.14.1

[root@k8s-company01-master01 ~]# kubectl get csr
NAME AGE REQUESTOR CONDITION
csr-94f5v 8m27s system:bootstrap:fp0x6g Approved,Issued
csr-g9tbg 2m19s system:bootstrap:fp0x6g Approved,Issued
csr-pqr6l 7m49s system:bootstrap:fp0x6g Approved,Issued
csr-vwtqq 2m system:bootstrap:fp0x6g Approved,Issued
csr-w486d 10m system:node:k8s-company01-master01 Approved,Issued

[root@k8s-company01-master01 ~]# kubectl get componentstatuses
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
Install metrics-server for simple monitoring, e.g. the kubectl top nodes command.
Without it, kubectl top fails:

[root@k8s-master03 ~]# kubectl top nodes
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)

We install it with helm here:
Install helm (run on master01):
wget http://192.168.160.200/yum/scripts/k8s/helm-v2.13.1-linux-amd64.tar.gz
or
wget http://111.1.17.135/yum/scripts/k8s/helm-v2.13.1-linux-amd64.tar.gz
tar xvzf helm-v2.13.1-linux-amd64.tar.gz
mv linux-amd64/helm /usr/local/bin/helm
helm help

Run on every node:
yum install -y socat

Use the Microsoft mirror (the Alibaba mirror has not been updated for a long time!):
helm init --client-only --stable-repo-url http://mirror.azure.cn/kubernetes/charts/
helm repo add incubator http://mirror.azure.cn/kubernetes/charts-incubator/
helm repo update
helm init --service-account tiller --upgrade -i registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.13.1 --tiller-tls-cert /etc/kubernetes/ssl/tiller001.pem --tiller-tls-key /etc/kubernetes/ssl/tiller001-key.pem --tls-ca-cert /etc/kubernetes/ssl/ca.pem --tiller-namespace kube-system --stable-repo-url http://mirror.azure.cn/kubernetes/charts/ --service-account tiller --history-max 200
Grant Tiller RBAC permissions (run on master01)
# Helm's server-side component, Tiller, is a Deployment in the kube-system namespace; it talks to the kube-apiserver to create and delete applications in Kubernetes.
# Since Kubernetes 1.6 the API server enforces RBAC authorization. The default Tiller deployment has no authorized ServiceAccount, so its requests to the API server get rejected. We therefore grant the Tiller deployment explicit permissions.

# Create the Kubernetes service account and role binding
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'

# Check whether the authorization succeeded
[root@k8s-company01-master01 ~]# kubectl -n kube-system get pods|grep tiller
tiller-deploy-7bf47568d4-42wf5 1/1 Running 0 17s

[root@k8s-company01-master01 ~]# helm version
Client: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}

[root@k8s-company01-master01 ~]# helm repo list
NAME URL
stable http://mirror.azure.cn/kubernetes/charts/
local http://127.0.0.1:8879/charts
incubator http://mirror.azure.cn/kubernetes/charts-incubator/

## To replace a repository, remove the old one first
#helm repo remove stable
## Then add the new repository addresses
#helm repo add stable http://mirror.azure.cn/kubernetes/charts/
#helm repo add incubator http://mirror.azure.cn/kubernetes/charts-incubator/
#helm repo update
Install metrics-server with helm (run on master01, since helm is only installed there)
# Create metrics-server-custom.yaml
cat >> metrics-server-custom.yaml <<EOF
image:
  repository: reg01.sky-mobi.com/k8s/gcr.io/google_containers/metrics-server-amd64
  tag: v0.3.1
args:
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
EOF

# Install metrics-server (-n here sets the release name)
[root@k8s-master01 ~]# helm install stable/metrics-server -n metrics-server --namespace kube-system --version=2.5.1 -f metrics-server-custom.yaml

[root@k8s-company01-master01 ~]# kubectl get pod -n kube-system | grep metrics
metrics-server-dcbdb9468-c5f4n 1/1 Running 0 21s

# After saving the yaml file and exiting, the old metrics-server pod is destroyed automatically and a new one is started. Once the new pod is up, wait a minute or two and kubectl top will return results:
[root@k8s-company01-master01 ~]# kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-company01-master01 404m 5% 1276Mi 4%
k8s-company01-master02 493m 6% 1240Mi 3%
k8s-company01-master03 516m 6% 1224Mi 3%
k8s-company01-worker001 466m 0% 601Mi 0%
k8s-company01-worker002 244m 0% 516Mi 0%
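If kubectl top still returns no data after a couple of minutes, it is worth checking that the metrics API was registered (a small sketch):

kubectl get apiservice v1beta1.metrics.k8s.io   # AVAILABLE should be True
kubectl top pod -n kube-system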
Install prometheus-operator with helm
# For easier management, create a dedicated namespace, monitoring; all Prometheus Operator components are deployed into it.
kubectl create namespace monitoring

## Customize the prometheus-operator values
# helm fetch stable/prometheus-operator --version=5.0.3 --untar
# cat prometheus-operator/values.yaml | grep -v '#' | grep -v ^$ > prometheus-operator-custom.yaml
# Keep only the sections where we override images, plus the https settings for scraping etcd, for example:
# Reference: https://fengxsong.github.io/2018/05/30/Using-helm-to-manage-prometheus-operator/
cat >> prometheus-operator-custom.yaml << EOF
## prometheus-operator/values.yaml
alertmanager:
  service:
    nodePort: 30503
    type: NodePort
  alertmanagerSpec:
    image:
      repository: reg01.sky-mobi.com/k8s/quay.io/prometheus/alertmanager
      tag: v0.16.1
prometheusOperator:
  image:
    repository: reg01.sky-mobi.com/k8s/quay.io/coreos/prometheus-operator
    tag: v0.29.0
    pullPolicy: IfNotPresent
  configmapReloadImage:
    repository: reg01.sky-mobi.com/k8s/quay.io/coreos/configmap-reload
    tag: v0.0.1
  prometheusConfigReloaderImage:
    repository: reg01.sky-mobi.com/k8s/quay.io/coreos/prometheus-config-reloader
    tag: v0.29.0
  hyperkubeImage:
    repository: reg01.sky-mobi.com/k8s/k8s.gcr.io/hyperkube
    tag: v1.12.1
    pullPolicy: IfNotPresent
prometheus:
  service:
    nodePort: 30504
    type: NodePort
  prometheusSpec:
    image:
      repository: reg01.sky-mobi.com/k8s/quay.io/prometheus/prometheus
      tag: v2.7.1
    secrets: [etcd-client-cert]
kubeEtcd:
  serviceMonitor:
    scheme: https
    insecureSkipVerify: false
    serverName: ""
    caFile: /etc/prometheus/secrets/etcd-client-cert/ca.crt
    certFile: /etc/prometheus/secrets/etcd-client-cert/healthcheck-client.crt
    keyFile: /etc/prometheus/secrets/etcd-client-cert/healthcheck-client.key
## prometheus-operator/charts/grafana/values.yaml
grafana:
  service:
    nodePort: 30505
    type: NodePort
  image:
    repository: reg01.sky-mobi.com/k8s/grafana/grafana
    tag: 6.0.2
  sidecar:
    image: reg01.sky-mobi.com/k8s/kiwigrid/k8s-sidecar:0.0.13
## prometheus-operator/charts/kube-state-metrics/values.yaml
kube-state-metrics:
  image:
    repository: reg01.sky-mobi.com/k8s/k8s.gcr.io/kube-state-metrics
    tag: v1.5.0
## prometheus-operator/charts/prometheus-node-exporter/values.yaml
prometheus-node-exporter:
  image:
    repository: reg01.sky-mobi.com/k8s/quay.io/prometheus/node-exporter
    tag: v0.17.0
EOF

## Note: the grafana, kube-state-metrics, and prometheus-node-exporter sections above correspond to the sub-chart values files (added following the charts directory layout:)
#[root@k8s-master01 ~]# ll prometheus-operator/charts/
#total 0
#drwxr-xr-x 4 root root 114 Apr 1 00:48 grafana
#drwxr-xr-x 3 root root 96 Apr 1 00:18 kube-state-metrics
#drwxr-xr-x 3 root root 110 Apr 1 00:20 prometheus-node-exporter

# Create the secret holding the etcd client certificates:
kubectl -n monitoring create secret generic etcd-client-cert --from-file=/etc/kubernetes/pki/etcd/ca.crt --from-file=/etc/kubernetes/pki/etcd/healthcheck-client.crt --from-file=/etc/kubernetes/pki/etcd/healthcheck-client.key

helm install stable/prometheus-operator --version=5.0.3 --name=monitoring --namespace=monitoring -f prometheus-operator-custom.yaml

## To delete everything and start over, remove the release by name (monitoring) with helm:
#helm del --purge monitoring
#kubectl delete crd prometheusrules.monitoring.coreos.com
#kubectl delete crd servicemonitors.monitoring.coreos.com
#kubectl delete crd alertmanagers.monitoring.coreos.com

# To reinstall, do not delete the previous release first; installing again over it may fail, just use upgrade:
helm upgrade monitoring stable/prometheus-operator --version=5.0.3 --namespace=monitoring -f prometheus-operator-custom.yaml

[root@k8s-company01-master01 ~]# kubectl -n monitoring get pod
NAME READY STATUS RESTARTS AGE
alertmanager-monitoring-prometheus-oper-alertmanager-0 2/2 Running 0 29m
monitoring-grafana-7dd5cf9dd7-wx8mz 2/2 Running 0 29m
monitoring-kube-state-metrics-7d98487cfc-t6qqw 1/1 Running 0 29m
monitoring-prometheus-node-exporter-fnvp9 1/1 Running 0 29m
monitoring-prometheus-node-exporter-kczcq 1/1 Running 0 29m
monitoring-prometheus-node-exporter-m8kf6 1/1 Running 0 29m
monitoring-prometheus-node-exporter-mwc4g 1/1 Running 0 29m
monitoring-prometheus-node-exporter-wxmt8 1/1 Running 0 29m
monitoring-prometheus-oper-operator-7f96b488f6-2j7h5 1/1 Running 0 29m
prometheus-monitoring-prometheus-oper-prometheus-0 3/3 Running 1 28m

[root@k8s-company01-master01 ~]# kubectl get svc -n monitoring
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-operated ClusterIP None <none> 9093/TCP,6783/TCP 31m
monitoring-grafana NodePort 10.109.159.105 <none> 80:30579/TCP 32m
monitoring-kube-state-metrics ClusterIP 10.100.31.235 <none> 8080/TCP 32m
monitoring-prometheus-node-exporter ClusterIP 10.109.119.13 <none> 9100/TCP 32m
monitoring-prometheus-oper-alertmanager NodePort 10.105.171.135 <none> 9093:31309/TCP 32m
monitoring-prometheus-oper-operator ClusterIP 10.98.135.170 <none> 8080/TCP 32m
monitoring-prometheus-oper-prometheus NodePort 10.96.15.36 <none> 9090:32489/TCP 32m
prometheus-operated ClusterIP None <none> 9090/TCP 31m

# Check for unexpected alerts; the first entry on the alerts page, Watchdog, is a normal always-firing alert used to verify that alerting works.
http://172.16.4.200:32489/alerts
http://172.16.4.200:32489/targets

# The following would install kubernetes-dashboard; it is of limited use, so we skip it in production for now:
#helm install --name=kubernetes-dashboard stable/kubernetes-dashboard --version=1.4.0 --namespace=kube-system --set image.repository=reg01.sky-mobi.com/k8s/k8s.gcr.io/kubernetes-dashboard-amd64,image.tag=v1.10.1,rbac.clusterAdminRole=true

# Heapster was removed in Kubernetes 1.13 (https://github.com/kubernetes/heapster/blob/master/docs/deprecation.md); metrics-server and Prometheus are recommended instead.
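Grafana itself is reachable on the NodePort shown in the service listing above (80:30579 here, i.e. http://172.16.4.200:30579). A sketch for fetching the login credentials; the secret name monitoring-grafana follows the grafana chart's <release>-grafana convention and is an assumption, not captured output:

kubectl -n monitoring get secret monitoring-grafana -o jsonpath='{.data.admin-user}' | base64 -d; echo     # assumed secret name
kubectl -n monitoring get secret monitoring-grafana -o jsonpath='{.data.admin-password}' | base64 -d; echo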
Summary
A three-master HA control plane built with kubeadm 1.14.1, haproxy, and keepalived; Calico provides pod networking, and helm adds metrics-server and prometheus-operator for monitoring.