OpenShift 4.4 Offline Installation with Static IPs: Initial Installation
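The previous article prepared the offline resources needed to install OCP: the installation images, all sample Image Streams, and all Red Hat Operators in OperatorHub. This article installs the OCP (OpenShift Container Platform) cluster itself, covering DNS resolution, load balancer configuration, Ignition config file generation, and cluster deployment.
Several files are used during OCP installation: the installation configuration file, the Kubernetes manifests, and the Ignition config files (specific to each machine type). The installation configuration file is converted into Kubernetes manifests, which are then wrapped into Ignition config files. The installer uses these Ignition config files to create the OpenShift cluster. The installer prunes the original installation configuration files when it runs, so back them up before installing.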
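1. Installation process
Installing OCP requires a bootstrap machine that can reach all OCP nodes. The bootstrap machine starts a temporary control plane, which brings up the rest of the OCP cluster and is then destroyed. The bootstrap machine boots from an Ignition config file that describes how to create the OCP cluster. The Ignition config files generated by the installer contain certificates that expire after 24 hours, so the cluster installation must finish before they expire.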
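To keep an eye on that 24-hour window, a small helper can turn the time the Ignition files were generated into a deadline. This is only a sketch: the function names are made up for illustration, and treating the generation time as the start of the validity period is an assumption (the certificates themselves remain the authoritative source).

```shell
#!/bin/sh
# Hypothetical helpers (not part of openshift-install).

# Certificate lifetime stated above: 24 hours.
CERT_LIFETIME_SECONDS=$((24 * 60 * 60))

# ignition_deadline EPOCH
# Prints the epoch second at which certificates created at EPOCH expire.
ignition_deadline() {
    echo $(( $1 + CERT_LIFETIME_SECONDS ))
}

# hours_left CREATED_EPOCH NOW_EPOCH
# Prints the whole hours remaining, or 0 once the deadline has passed.
hours_left() {
    remaining=$(( $(ignition_deadline "$1") - $2 ))
    if [ "$remaining" -lt 0 ]; then
        remaining=0
    fi
    echo $(( remaining / 3600 ))
}

# Example: files generated 5 hours ago.
now=$(date +%s)
echo "hours left: $(hours_left $((now - 5 * 3600)) "$now")"
```

In practice the creation time could be taken from the modification time of bootstrap.ign once that file exists.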
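Bootstrapping the cluster involves the following steps:
- The bootstrap machine boots and starts hosting the resources that the master nodes need in order to boot.
- The master nodes fetch those resources from the bootstrap machine and finish booting.
- The master nodes use the bootstrap machine to form an etcd cluster.
- The bootstrap machine starts a temporary Kubernetes control plane using the new etcd cluster.
- The temporary control plane schedules the production control plane onto the master nodes.
- The temporary control plane shuts down and hands control to the production control plane.
- The bootstrap machine injects OCP components into the production control plane.
- The installer shuts down the bootstrap machine.
Once bootstrapping completes, the OCP cluster is deployed. The cluster then downloads and configures the remaining components needed for day-to-day operation, including creating compute nodes and installing additional services through Operators.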
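2. Prepare server resources
The servers are planned as follows:
- Three control plane nodes, running etcd, the control plane components, and the infrastructure (infra) components.
- Two compute nodes, running the actual workloads.
- One bootstrap machine, which performs the installation and can be deleted once the cluster is deployed.
- One bastion node, used to prepare the offline resources described in the previous article and to host DNS and the load balancer.
- One registry node, hosting the private image registry Quay.
| Role | OS | Hostname | vCPU | Memory | Storage | IP | FQDN |
|---|---|---|---|---|---|---|---|
| Registry node | RHEL 7.6 | registry | 4 | 8GB | 150GB | 192.168.57.70 | registry.openshift4.example.com |
| Bastion node | RHEL 7.6 | bastion | 4 | 16GB | 120GB | 192.168.57.60 | bastion.openshift4.example.com |
| Bootstrap machine | RHCOS | bootstrap | 4 | 16GB | 120GB | 192.168.57.61 | bootstrap.openshift4.example.com |
| Control plane | RHCOS | master1 | 4 | 16GB | 120GB | 192.168.57.62 | master1.openshift4.example.com |
| Control plane | RHCOS | master2 | 4 | 16GB | 120GB | 192.168.57.63 | master2.openshift4.example.com |
| Control plane | RHCOS | master3 | 4 | 16GB | 120GB | 192.168.57.64 | master3.openshift4.example.com |
| Compute node | RHCOS or RHEL 7.6 | worker1 | 2 | 8GB | 120GB | 192.168.57.65 | worker1.openshift4.example.com |
| Compute node | RHCOS or RHEL 7.6 | worker2 | 2 | 8GB | 120GB | 192.168.57.66 | worker2.openshift4.example.com |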
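3. Firewall configuration
Next, look at the ports each node uses.
Ports that must be open between all nodes (compute and control plane):
| Protocol | Port | Purpose |
|---|---|---|
| ICMP | N/A | Network connectivity tests |
| TCP | 9000-9999 | Host-level service ports, including 9100-9101 used by the node exporter and 9099 used by the Cluster Version Operator |
| TCP | 10250-10259 | Default ports reserved by Kubernetes |
| TCP | 10256 | openshift-sdn |
| UDP | 4789 | VXLAN or GENEVE traffic |
| UDP | 6081 | VXLAN or GENEVE traffic |
| UDP | 9000-9999 | Host-level service ports, including 9100-9101 used by the node exporter |
| UDP | 30000-32767 | Kubernetes NodePort |
Ports that the control plane must open to the other nodes:
| Protocol | Port | Purpose |
|---|---|---|
| TCP | 2379-2380 | etcd server and peer ports |
| TCP | 6443 | Kubernetes API |
In addition, configure two layer-4 load balancers, one exposing the cluster API and one exposing Ingress:
| Port | Backend machines | Internal | External | Description |
|---|---|---|---|---|
| 6443 | Bootstrap machine and control plane. After the bootstrap machine has initialized the cluster control plane, remove it from the load balancer manually | x | x | Kubernetes API server |
| 22623 | Bootstrap machine and control plane. After the bootstrap machine has initialized the cluster control plane, remove it from the load balancer manually | x | | Machine Config server |
| 443 | Machines running the Ingress Controller (router) | x | x | HTTPS traffic |
| 80 | Machines running the Ingress Controller (router) | x | x | HTTP traffic |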
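4. Configure DNS
According to the official documentation, an OCP cluster on user-provisioned infrastructure (UPI) needs the DNS records below. In each record, <cluster_name> is the cluster name and <base_domain> is the cluster base domain specified in install-config.yaml:
| Component | Record | Description |
|---|---|---|
| Kubernetes API | api.<cluster_name>.<base_domain>. | Must point to the load balancer for the control plane nodes. Must be resolvable by clients outside the cluster and by all nodes in the cluster. |
| Kubernetes API | api-int.<cluster_name>.<base_domain>. | Must point to the load balancer for the control plane nodes. Must be resolvable by all nodes in the cluster. |
| Routes | *.apps.<cluster_name>.<base_domain>. | Wildcard record pointing to the load balancer whose backends are the nodes running the Ingress router (the compute nodes by default). Must be resolvable by clients outside the cluster and by all nodes in the cluster. |
| etcd | etcd-<index>.<cluster_name>.<base_domain>. | OCP requires a record for each etcd instance that points to the control plane node running it. Instances are distinguished by <index>, which starts at 0 and ends at n-1, where n is the number of control plane nodes. Must be resolvable by all nodes in the cluster. |
| etcd | _etcd-server-ssl._tcp.<cluster_name>.<base_domain>. | etcd serves on port 2380 on each control plane node, so create one SRV record per etcd node with priority 0, weight 10, and port 2380. |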
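As a quick checklist, the names in the table above can be generated for a concrete cluster. The helper below is a sketch with a made-up function name; it only prints names and does not query or create anything.

```shell
#!/bin/sh
# required_dns_names CLUSTER_NAME BASE_DOMAIN MASTER_COUNT
# Hypothetical helper: prints every DNS name from the table, one per line.
required_dns_names() {
    cluster=$1
    base=$2
    masters=$3
    zone="$cluster.$base"

    echo "api.$zone."
    echo "api-int.$zone."
    echo "*.apps.$zone."

    # etcd-<index> runs from 0 to n-1, where n is the number of masters.
    index=0
    while [ "$index" -lt "$masters" ]; do
        echo "etcd-$index.$zone."
        index=$((index + 1))
    done

    echo "_etcd-server-ssl._tcp.$zone."
}

# The cluster used in this article: openshift4.example.com with 3 masters.
required_dns_names openshift4 example.com 3
```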
There are many ways to deploy DNS. I recommend CoreDNS, the de facto standard in the cloud-native world. Because SRV records are needed here, CoreDNS is used together with its etcd plugin. All of the following steps are run on the bastion node.
First install and start etcd with yum:

    $ yum install -y etcd
    $ systemctl enable etcd --now

Then download the CoreDNS binary:

    $ wget https://github.com/coredns/coredns/releases/download/v1.6.9/coredns_1.6.9_linux_amd64.tgz
    $ tar zxvf coredns_1.6.9_linux_amd64.tgz
    $ mv coredns /usr/local/bin

Create the systemd unit file. The heredoc delimiter is quoted so that the shell does not expand $MAINPID while writing the file:

    $ cat > /etc/systemd/system/coredns.service <<'EOF'
    [Unit]
    Description=CoreDNS DNS server
    Documentation=https://coredns.io
    After=network.target

    [Service]
    PermissionsStartOnly=true
    LimitNOFILE=1048576
    LimitNPROC=512
    CapabilityBoundingSet=CAP_NET_BIND_SERVICE
    AmbientCapabilities=CAP_NET_BIND_SERVICE
    NoNewPrivileges=true
    User=coredns
    WorkingDirectory=~
    ExecStart=/usr/local/bin/coredns -conf=/etc/coredns/Corefile
    ExecReload=/bin/kill -SIGUSR1 $MAINPID
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target
    EOF

Create the coredns user:

    $ useradd coredns -s /sbin/nologin

Create the CoreDNS configuration file (the directory has to exist first):

    $ mkdir -p /etc/coredns
    $ cat > /etc/coredns/Corefile <<'EOF'
    .:53 {  # listen on TCP and UDP port 53
        template IN A apps.openshift4.example.com {
            match .*apps\.openshift4\.example\.com  # regular expression matched against the queried name
            answer "{{ .Name }} 60 IN A 192.168.57.60"  # DNS answer
            fallthrough
        }
        etcd {  # enable the etcd plugin; a zone may follow, e.g. etcd test.com {
            path /skydns  # path in etcd, /skydns by default; all DNS records are stored under it
            endpoint http://localhost:2379  # etcd endpoints, separated by spaces
            fallthrough  # if the zone matches but no record can be generated, pass the request to the next plugin
            # tls CERT KEY CACERT  # optional: etcd client certificate settings
        }
        prometheus  # metrics plugin
        cache 160
        loadbalance  # round-robin the order of DNS records
        forward . 192.168.57.1
        log  # print query logs
    }
    EOF

The template plugin provides the wildcard resolution.
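The etcd plugin configured above looks records up under the /skydns path, using keys built from the labels of the DNS name in reverse order. The sketch below shows that mapping; skydns_key is a hypothetical helper name, not a CoreDNS or etcd command.

```shell
#!/bin/sh
# skydns_key FQDN
# Hypothetical helper: prints the etcd key under /skydns for a DNS name.
# The body runs in a subshell so the IFS and globbing changes do not leak.
skydns_key() (
    fqdn=${1%.}   # drop a trailing dot, if any
    key=""
    IFS=.         # split the name on dots
    set -f        # keep labels such as "*" from being glob-expanded
    for label in $fqdn; do
        key="/$label$key"   # prepend, so the labels end up reversed
    done
    printf '/skydns%s\n' "$key"
)

skydns_key api.openshift4.example.com
skydns_key _etcd-server-ssl._tcp.openshift4.example.com
```

When several records share one name, as with the SRV entries later in this article, each one is stored under its own sub-key (x1, x2, x3) beneath the key printed here.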
Start CoreDNS and enable it at boot:

    $ systemctl enable coredns --now

Verify wildcard resolution (dig is provided by the bind-utils package):

    $ yum install -y bind-utils
    $ dig +short apps.openshift4.example.com @127.0.0.1
    192.168.57.60
    $ dig +short x.apps.openshift4.example.com @127.0.0.1
    192.168.57.60

Add the remaining DNS records:

    $ alias etcdctlv3='ETCDCTL_API=3 etcdctl'
    $ etcdctlv3 put /skydns/com/example/openshift4/api '{"host":"192.168.57.60","ttl":60}'
    $ etcdctlv3 put /skydns/com/example/openshift4/api-int '{"host":"192.168.57.60","ttl":60}'

    $ etcdctlv3 put /skydns/com/example/openshift4/etcd-0 '{"host":"192.168.57.62","ttl":60}'
    $ etcdctlv3 put /skydns/com/example/openshift4/etcd-1 '{"host":"192.168.57.63","ttl":60}'
    $ etcdctlv3 put /skydns/com/example/openshift4/etcd-2 '{"host":"192.168.57.64","ttl":60}'

    $ etcdctlv3 put /skydns/com/example/openshift4/_tcp/_etcd-server-ssl/x1 '{"host":"etcd-0.openshift4.example.com","ttl":60,"priority":0,"weight":10,"port":2380}'
    $ etcdctlv3 put /skydns/com/example/openshift4/_tcp/_etcd-server-ssl/x2 '{"host":"etcd-1.openshift4.example.com","ttl":60,"priority":0,"weight":10,"port":2380}'
    $ etcdctlv3 put /skydns/com/example/openshift4/_tcp/_etcd-server-ssl/x3 '{"host":"etcd-2.openshift4.example.com","ttl":60,"priority":0,"weight":10,"port":2380}'

    # In addition, add a record for each node's hostname
    $ etcdctlv3 put /skydns/com/example/openshift4/bootstrap '{"host":"192.168.57.61","ttl":60}'
    $ etcdctlv3 put /skydns/com/example/openshift4/master1 '{"host":"192.168.57.62","ttl":60}'
    $ etcdctlv3 put /skydns/com/example/openshift4/master2 '{"host":"192.168.57.63","ttl":60}'
    $ etcdctlv3 put /skydns/com/example/openshift4/master3 '{"host":"192.168.57.64","ttl":60}'
    $ etcdctlv3 put /skydns/com/example/openshift4/worker1 '{"host":"192.168.57.65","ttl":60}'
    $ etcdctlv3 put /skydns/com/example/openshift4/worker2 '{"host":"192.168.57.66","ttl":60}'
    $ etcdctlv3 put /skydns/com/example/openshift4/registry '{"host":"192.168.57.70","ttl":60}'

Verify DNS resolution:

    $ dig +short api.openshift4.example.com @127.0.0.1
    192.168.57.60
    $ dig +short api-int.openshift4.example.com @127.0.0.1
    192.168.57.60

    $ dig +short etcd-0.openshift4.example.com @127.0.0.1
    192.168.57.62
    $ dig +short etcd-1.openshift4.example.com @127.0.0.1
    192.168.57.63
    $ dig +short etcd-2.openshift4.example.com @127.0.0.1
    192.168.57.64

    $ dig +short -t SRV _etcd-server-ssl._tcp.openshift4.example.com @127.0.0.1
    10 33 2380 etcd-0.openshift4.example.com.
    10 33 2380 etcd-1.openshift4.example.com.
    10 33 2380 etcd-2.openshift4.example.com.

    $ dig +short bootstrap.openshift4.example.com @127.0.0.1
    192.168.57.61
    $ dig +short master1.openshift4.example.com @127.0.0.1
    192.168.57.62
    $ dig +short master2.openshift4.example.com @127.0.0.1
    192.168.57.63
    $ dig +short master3.openshift4.example.com @127.0.0.1
    192.168.57.64
    $ dig +short worker1.openshift4.example.com @127.0.0.1
    192.168.57.65
    $ dig +short worker2.openshift4.example.com @127.0.0.1
    192.168.57.66

5. Configure load balancing
For load balancing I chose Envoy. First prepare the configuration files:
Bootstrap

    # /etc/envoy/envoy.yaml
    node:
      id: node0
      cluster: cluster0
    dynamic_resources:
      lds_config:
        path: /etc/envoy/lds.yaml
      cds_config:
        path: /etc/envoy/cds.yaml
    admin:
      access_log_path: "/dev/stdout"
      address:
        socket_address:
          address: "0.0.0.0"
          port_value: 15001

LDS

    # /etc/envoy/lds.yaml
    version_info: "0"
    resources:
    - "@type": type.googleapis.com/envoy.config.listener.v3.Listener
      name: listener_openshift-api-server
      address:
        socket_address:
          address: 0.0.0.0
          port_value: 6443
      filter_chains:
      - filters:
        - name: envoy.tcp_proxy
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
            stat_prefix: openshift-api-server
            cluster: openshift-api-server
            access_log:
              name: envoy.access_loggers.file
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
                path: /dev/stdout
    - "@type": type.googleapis.com/envoy.config.listener.v3.Listener
      name: listener_machine-config-server
      address:
        socket_address:
          address: "::"
          ipv4_compat: true
          port_value: 22623
      filter_chains:
      - filters:
        - name: envoy.tcp_proxy
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
            stat_prefix: machine-config-server
            cluster: machine-config-server
            access_log:
              name: envoy.access_loggers.file
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
                path: /dev/stdout
    - "@type": type.googleapis.com/envoy.config.listener.v3.Listener
      name: listener_ingress-http
      address:
        socket_address:
          address: "::"
          ipv4_compat: true
          port_value: 80
      filter_chains:
      - filters:
        - name: envoy.tcp_proxy
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
            stat_prefix: ingress-http
            cluster: ingress-http
            access_log:
              name: envoy.access_loggers.file
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
                path: /dev/stdout
    - "@type": type.googleapis.com/envoy.config.listener.v3.Listener
      name: listener_ingress-https
      address:
        socket_address:
          address: "::"
          ipv4_compat: true
          port_value: 443
      filter_chains:
      - filters:
        - name: envoy.tcp_proxy
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
            stat_prefix: ingress-https
            cluster: ingress-https
            access_log:
              name: envoy.access_loggers.file
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
                path: /dev/stdout

CDS

    # /etc/envoy/cds.yaml
    version_info: "0"
    resources:
    - "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
      name: openshift-api-server
      connect_timeout: 1s
      type: strict_dns
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: openshift-api-server
        endpoints:
        - lb_endpoints:
          - endpoint:
              address:
                socket_address:
                  address: 192.168.57.61
                  port_value: 6443
          - endpoint:
              address:
                socket_address:
                  address: 192.168.57.62
                  port_value: 6443
          - endpoint:
              address:
                socket_address:
                  address: 192.168.57.63
                  port_value: 6443
          - endpoint:
              address:
                socket_address:
                  address: 192.168.57.64
                  port_value: 6443
    - "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
      name: machine-config-server
      connect_timeout: 1s
      type: strict_dns
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: machine-config-server
        endpoints:
        - lb_endpoints:
          - endpoint:
              address:
                socket_address:
                  address: 192.168.57.61
                  port_value: 22623
          - endpoint:
              address:
                socket_address:
                  address: 192.168.57.62
                  port_value: 22623
          - endpoint:
              address:
                socket_address:
                  address: 192.168.57.63
                  port_value: 22623
          - endpoint:
              address:
                socket_address:
                  address: 192.168.57.64
                  port_value: 22623
    - "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
      name: ingress-http
      connect_timeout: 1s
      type: strict_dns
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: ingress-http
        endpoints:
        - lb_endpoints:
          - endpoint:
              address:
                socket_address:
                  address: 192.168.57.65
                  port_value: 80
          - endpoint:
              address:
                socket_address:
                  address: 192.168.57.66
                  port_value: 80
    - "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
      name: ingress-https
      connect_timeout: 1s
      type: strict_dns
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: ingress-https
        endpoints:
        - lb_endpoints:
          - endpoint:
              address:
                socket_address:
                  address: 192.168.57.65
                  port_value: 443
          - endpoint:
              address:
                socket_address:
                  address: 192.168.57.66
                  port_value: 443

If the configuration is unclear, see my e-book, the Envoy Chinese Guide (Envoy 中文指南).
Start Envoy:

    $ podman run -d --restart=always --name envoy --net host -v /etc/envoy:/etc/envoy envoyproxy/envoy

6. Installation preparation
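Generate an SSH private key and add it to the agent
During installation, debugging and disaster recovery are carried out from the bastion node, so an SSH key must be configured there; ssh-agent uses it when the installer runs.
With this key you can log in from the bastion node to the master nodes as the core user. When the cluster is deployed, the matching public key is added to the core user's ~/.ssh/authorized_keys list.
Create the key as follows:
① Create an SSH key without a passphrase:

    $ ssh-keygen -t rsa -b 4096 -N '' -f ~/.ssh/new_rsa

② Start the ssh-agent process as a background task:

    $ eval "$(ssh-agent -s)"

③ Add the SSH private key to ssh-agent:

    $ ssh-add ~/.ssh/new_rsa

A later step of the installation asks for an SSH public key; use the public key created here, new_rsa.pub.
Get the installer
For an online installation, the installer would also have to be downloaded to the bastion node. This is an offline installation, and the installer was already extracted in the previous article, so there is nothing to download.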
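Create the installation configuration file
First create an installation directory to hold the files needed for installation:

    $ mkdir /ocpinstall

Write a custom install-config.yaml and save it in /ocpinstall. The file must be named install-config.yaml. Its contents:

    apiVersion: v1
    baseDomain: example.com
    compute:
    - hyperthreading: Enabled
      name: worker
      replicas: 0
    controlPlane:
      hyperthreading: Enabled
      name: master
      replicas: 3
    metadata:
      name: openshift4
    networking:
      clusterNetwork:
      - cidr: 10.128.0.0/14
        hostPrefix: 23
      networkType: OpenShiftSDN
      serviceNetwork:
      - 172.30.0.0/16
    platform:
      none: {}
    fips: false
    pullSecret: '{"auths": ...}'
    sshKey: 'ssh-rsa ...'
    additionalTrustBundle: |
      -----BEGIN CERTIFICATE-----
      (omitted; note that every line of the certificate must be indented by two spaces)
      -----END CERTIFICATE-----
    imageContentSources:
    - mirrors:
      - registry.openshift4.example.com/ocp4/openshift4
      source: quay.io/openshift-release-dev/ocp-release
    - mirrors:
      - registry.openshift4.example.com/ocp4/openshift4
      source: quay.io/openshift-release-dev/ocp-v4.0-art-dev

- baseDomain: All DNS records internal to OpenShift must be subdomains of this base domain and include the cluster name.
- compute: Compute node configuration. This is an array, so each element must start with a hyphen (-).
- hyperthreading: Enabled turns on simultaneous multithreading (hyperthreading). It is enabled by default and improves the performance of the machine's cores. If you disable it, disable it on both the control plane and the compute nodes.
- compute.replicas: Number of compute nodes. Because the compute nodes are created manually here, set it to 0.
- controlPlane.replicas: Number of control plane nodes. It must match the number of etcd nodes; this article uses 3 for high availability.
- metadata.name: The cluster name, that is, <cluster_name> in the DNS records above.
- cidr: The IP address range from which Pod IPs are allocated. It must not overlap with the physical network.
- hostPrefix: The subnet prefix length assigned to each node. For example, with hostPrefix set to 23, each node receives a /23 subnet of the given cidr, which allows $2^{32-23} - 2 = 510$ Pod IP addresses.
- serviceNetwork: The address pool for Service IPs. Only one can be set.
- pullSecret: The pull secret used in the previous article. It can be compressed to a single line with cat /root/pull-secret.json | jq -c .
- sshKey: The public key created above, shown by cat ~/.ssh/new_rsa.pub.
- additionalTrustBundle: The trusted certificate of the private Quay registry, shown on the registry node by cat /data/quay/config/ssl.cert.
- imageContentSources: Taken from the output of the earlier oc adm release mirror command.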
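The hostPrefix arithmetic can be checked directly in the shell. The sketch below uses two hypothetical helper names; the second calculation (how many node subnets fit in the cluster CIDR) is not stated in the text above and is added here only as the natural companion figure.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the clusterNetwork arithmetic.

# pods_per_node HOST_PREFIX
# Pod IPs in one node subnet: 2^(32 - hostPrefix) - 2.
pods_per_node() {
    echo $(( (1 << (32 - $1)) - 2 ))
}

# node_subnets CIDR_PREFIX HOST_PREFIX
# Per-node subnets that fit in the cluster CIDR: 2^(hostPrefix - cidrPrefix).
node_subnets() {
    echo $(( 1 << ($2 - $1) ))
}

# Values from the install-config.yaml in this article: cidr /14, hostPrefix 23.
echo "Pod IPs per node: $(pods_per_node 23)"
echo "Node subnets:     $(node_subnets 14 23)"
```

With the values used here this gives 510 Pod IPs per node and room for 512 node subnets.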
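Back up the installation configuration file so that it can be reused later:

    $ cd /ocpinstall
    $ cp install-config.yaml install-config.yaml.20200604

Create the Kubernetes manifests
install-config.yaml is deleted once the Kubernetes manifests have been created, so be sure to back it up first!
Create the Kubernetes manifest files:

    $ openshift-install create manifests --dir=/ocpinstall

Edit manifests/cluster-scheduler-02-config.yml and set mastersSchedulable to false to keep Pods from being scheduled onto the control plane nodes.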
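The manifest edit can also be scripted. The following is a minimal sketch, assuming the file contains a single line beginning with mastersSchedulable:; disable_master_scheduling is a hypothetical helper name.

```shell
#!/bin/sh
# disable_master_scheduling FILE
# Hypothetical helper: rewrites the mastersSchedulable line in FILE to "false",
# keeping the original indentation. It writes to a temp file instead of using
# "sed -i" so that it behaves the same with GNU and BSD sed.
disable_master_scheduling() {
    file=$1
    tmp="$file.tmp.$$"
    sed 's/^\([[:space:]]*mastersSchedulable:\).*/\1 false/' "$file" > "$tmp" &&
        mv "$tmp" "$file"
}

# Apply it to the manifest path used in this article, if the file exists.
manifest=/ocpinstall/manifests/cluster-scheduler-02-config.yml
if [ -f "$manifest" ]; then
    disable_master_scheduling "$manifest"
    grep mastersSchedulable "$manifest"
fi
```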
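Create the Ignition config files
install-config.yaml is also deleted once the Ignition config files have been created, so be sure to back it up first!

    $ cp install-config.yaml.20200604 install-config.yaml
    $ openshift-install create ignition-configs --dir=/ocpinstall

Note that restoring install-config.yaml at this point may cause the installer to regenerate the manifests and discard the mastersSchedulable edit made above. If that edit matters, skip the cp and run only the second command.
Generated files:

    ├── auth
    │   ├── kubeadmin-password
    │   └── kubeconfig
    ├── bootstrap.ign
    ├── master.ign
    ├── metadata.json
    └── worker.ign

Prepare an HTTP server. Nginx is used here:

    $ yum install -y nginx

Edit the Nginx configuration file /etc/nginx/nginx.conf and change the port to 8080, because the load balancer already occupies port 80. Then start Nginx:

    $ systemctl enable nginx --now

Copy the Ignition config files to the ignition directory of the HTTP server:

    $ mkdir /usr/share/nginx/html/ignition
    $ cp -r *.ign /usr/share/nginx/html/ignition/

Get the RHCOS BIOS file
Download the BIOS file used for bare-metal installation and place it in the Nginx directory:

    $ mkdir /usr/share/nginx/html/install
    $ wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.4/latest/rhcos-4.4.3-x86_64-metal.x86_64.raw.gz -O /usr/share/nginx/html/install/rhcos-4.4.3-x86_64-metal.x86_64.raw.gz

Get the RHCOS ISO file
Download the RHCOS ISO file locally from mirror.openshift.com/pub/openshi…, then upload it to vSphere as follows:
① Log in to vSphere and click "Storage".
② Select a datastore, then choose "Upload Files" in the right-hand pane.
③ Select the ISO file just downloaded and upload it to the ESXi host.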
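7. Install the cluster
Bootstrap
Now the actual cluster installation begins. First create the bootstrap virtual machine: choose "Red Hat Enterprise Linux 7 (64-Bit)" as the operating system, attach the ISO uploaded earlier, set CPU, memory, and disk according to the server table above, and power it on. Then proceed as follows:
① On the RHCOS Installer screen, press Tab to edit the boot parameters.
② After the default option coreos.inst=yes, append the following. Copy and paste is not available in the console, so check the input carefully before pressing Enter:

    ip=192.168.57.61::192.168.57.1:255.255.255.0:bootstrap.openshift4.example.com:ens192:none nameserver=192.168.57.60 coreos.inst.install_dev=sda coreos.inst.image_url=http://192.168.57.60:8080/install/rhcos-4.4.3-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://192.168.57.60:8080/ignition/bootstrap.ign

The ip=... argument has the form ip=$IPADDRESS::$DEFAULTGW:$NETMASK:$HOSTNAMEFQDN:$IFACE:none.
③ If the installation fails, the system drops into an emergency shell. Check that networking and DNS resolution work; if they do, the cause is usually a typo in the parameters above. Run reboot to leave the shell and start again from step ①.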
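Since the boot parameters have to be typed by hand for every node, it can help to print them beforehand and read them off a second screen. The helper below is a sketch under this article's assumptions (install disk sda, the HTTP server at 192.168.57.60:8080, and the RHCOS 4.4.3 image name); rhcos_boot_args is a hypothetical function name.

```shell
#!/bin/sh
# Values taken from this article; adjust them for another environment.
HTTP_BASE="http://192.168.57.60:8080"
RHCOS_IMAGE="rhcos-4.4.3-x86_64-metal.x86_64.raw.gz"

# rhcos_boot_args IP GATEWAY NETMASK FQDN IFACE DNS ROLE
# Hypothetical helper: prints the kernel arguments to append for one node.
# ROLE selects the Ignition file: bootstrap, master, or worker.
rhcos_boot_args() {
    printf 'ip=%s::%s:%s:%s:%s:none nameserver=%s ' "$1" "$2" "$3" "$4" "$5" "$6"
    printf 'coreos.inst.install_dev=sda '
    printf 'coreos.inst.image_url=%s/install/%s ' "$HTTP_BASE" "$RHCOS_IMAGE"
    printf 'coreos.inst.ignition_url=%s/ignition/%s.ign\n' "$HTTP_BASE" "$7"
}

# Arguments for the bootstrap machine.
rhcos_boot_args 192.168.57.61 192.168.57.1 255.255.255.0 \
    bootstrap.openshift4.example.com ens192 192.168.57.60 bootstrap
```

Calling it again with each node's IP, hostname, and role (master or worker) gives the lines needed for the remaining machines.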
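After a successful installation, log in to the bootstrap node from the bastion node with ssh -i ~/.ssh/new_rsa core@192.168.57.61, then verify:
- That the network configuration matches what you set:
- hostname
- ip route
- cat /etc/resolv.conf
- That the bootstrap services started successfully:
- podman ps shows whether the services are running as containers
- ss -tulnp shows whether ports 6443 and 22623 are listening.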
Here is a brief look at how the bootstrap node starts. It first runs some containers with podman, and the temporary control plane is then started from inside them. That temporary control plane itself runs in containers managed by CRI-O, which is a little convoluted. The commands make it clearer:

    $ podman ps -a --no-trunc --sort created --format "{{.Command}}"
    start --tear-down-early=false --asset-dir=/assets --required-pods=openshift-kube-apiserver/kube-apiserver,openshift-kube-scheduler/openshift-kube-scheduler,openshift-kube-controller-manager/kube-controller-manager,openshift-cluster-version/cluster-version-operator
    /usr/bin/grep -oP Managed /manifests/0000_12_etcd-operator_01_operator.cr.yaml
    [... the identical grep line repeats many times; repeats omitted ...]
    render --dest-dir=/assets/cco-bootstrap --cloud-credential-operator-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:244ab9d0fcf7315eb5c399bd3fa7c2e662cf23f87f625757b13f415d484621c3
    bootstrap --etcd-ca=/assets/tls/etcd-ca-bundle.crt --etcd-metric-ca=/assets/tls/etcd-metric-ca-bundle.crt --root-ca=/assets/tls/root-ca.crt --kube-ca=/assets/tls/kube-apiserver-complete-client-ca-bundle.crt --config-file=/assets/manifests/cluster-config.yaml --dest-dir=/assets/mco-bootstrap --pull-secret=/assets/manifests/openshift-config-secret-pull-secret.yaml --etcd-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:aba3c59eb6d088d61b268f83b034230b3396ce67da4f6f6d49201e55efebc6b2 --kube-client-agent-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8eb481214103d8e0b5fe982ffd682f838b969c8ff7d4f3ed4f83d4a444fb841b --machine-config-operator-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:31dfdca3584982ed5a82d3017322b7d65a491ab25080c427f3f07d9ce93c52e2 --machine-config-oscontent-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b397960b7cc14c2e2603111b7385c6e8e4b0f683f9873cd9252a789175e5c4e1 --infra-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d7862a735f492a18cb127742b5c2252281aa8f3bd92189176dd46ae9620ee68a --keepalived-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a882a11b55b2fc41b538b59bf5db8e4cfc47c537890e4906fe6bf22f9da75575 --coredns-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b25b8b2219e8c247c088af93e833c9ac390bc63459955e131d89b77c485d144d --mdns-publisher-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:dea1fcb456eae4aabdf5d2d5c537a968a2dafc3da52fe20e8d99a176fccaabce --haproxy-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7064737dd9d0a43de7a87a094487ab4d7b9e666675c53cf4806d1c9279bd6c2e --baremetal-runtimecfg-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:715bc48eda04afc06827189883451958d8940ed8ab6dd491f602611fe98a6fba --cloud-config-file=/assets/manifests/cloud-provider-config.yaml --cluster-etcd-operator-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9f7a02df3a5d91326d95e444e2e249f8205632ae986d6dccc7f007ec65c8af77
    render --prefix=cluster-ingress- --output-dir=/assets/ingress-operator-manifests
    /usr/bin/cluster-kube-scheduler-operator render --manifest-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:187b9d29fea1bde9f1785584b4a7bbf9a0b9f93e1323d92d138e61c861b6286c --asset-input-dir=/assets/tls --asset-output-dir=/assets/kube-scheduler-bootstrap --config-output-file=/assets/kube-scheduler-bootstrap/config
    /usr/bin/cluster-kube-controller-manager-operator render --manifest-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:187b9d29fea1bde9f1785584b4a7bbf9a0b9f93e1323d92d138e61c861b6286c --asset-input-dir=/assets/tls --asset-output-dir=/assets/kube-controller-manager-bootstrap --config-output-file=/assets/kube-controller-manager-bootstrap/config --cluster-config-file=/assets/manifests/cluster-network-02-config.yml
    /usr/bin/cluster-kube-apiserver-operator render --manifest-etcd-serving-ca=etcd-ca-bundle.crt --manifest-etcd-server-urls=https://localhost:2379 --manifest-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:187b9d29fea1bde9f1785584b4a7bbf9a0b9f93e1323d92d138e61c861b6286c --manifest-operator-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:718ca346d5499cccb4de98c1f858c9a9a13bbf429624226f466c3ee2c14ebf40 --asset-input-dir=/assets/tls --asset-output-dir=/assets/kube-apiserver-bootstrap --config-output-file=/assets/kube-apiserver-bootstrap/config --cluster-config-file=/assets/manifests/cluster-network-02-config.yml
    /usr/bin/cluster-config-operator render --config-output-file=/assets/config-bootstrap/config --asset-input-dir=/assets/tls --asset-output-dir=/assets/config-bootstrap
    /usr/bin/cluster-etcd-operator render --etcd-ca=/assets/tls/etcd-ca-bundle.crt --etcd-metric-ca=/assets/tls/etcd-metric-ca-bundle.crt --manifest-etcd-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:aba3c59eb6d088d61b268f83b034230b3396ce67da4f6f6d49201e55efebc6b2 --etcd-discovery-domain=test.example.com --manifest-cluster-etcd-operator-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9f7a02df3a5d91326d95e444e2e249f8205632ae986d6dccc7f007ec65c8af77 --manifest-setup-etcd-env-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:31dfdca3584982ed5a82d3017322b7d65a491ab25080c427f3f07d9ce93c52e2 --manifest-kube-client-agent-image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8eb481214103d8e0b5fe982ffd682f838b969c8ff7d4f3ed4f83d4a444fb841b --asset-input-dir=/assets/tls --asset-output-dir=/assets/etcd-bootstrap --config-output-file=/assets/etcd-bootstrap/config --cluster-config-file=/assets/manifests/cluster-network-02-config.yml
    render --output-dir=/assets/cvo-bootstrap --release-image=registry.openshift4.example.com/ocp4/openshift4@sha256:4a461dc23a9d323c8bd7a8631bed078a9e5eec690ce073f78b645c83fb4cdf74
    /usr/bin/grep -oP Managed /manifests/0000_12_etcd-operator_01_operator.cr.yaml

    $ crictl pods
    POD ID          CREATED         STATE   NAME                                                                  NAMESPACE                             ATTEMPT
    17a978b9e7b1e   3 minutes ago   Ready   bootstrap-kube-apiserver-bootstrap.openshift4.example.com             kube-system                           24
    8a0f79f38787a   3 minutes ago   Ready   bootstrap-kube-scheduler-bootstrap.openshift4.example.com             kube-system                           4
    1a707da797173   3 minutes ago   Ready   bootstrap-kube-controller-manager-bootstrap.openshift4.example.com    kube-system                           4
    0461d2caa2753   3 minutes ago   Ready   cloud-credential-operator-bootstrap.openshift4.example.com            openshift-cloud-credential-operator   4
    ab6519286f65a   3 minutes ago   Ready   bootstrap-cluster-version-operator-bootstrap.openshift4.example.com   openshift-cluster-version             2
    457a7a46ec486   8 hours ago     Ready   bootstrap-machine-config-operator-bootstrap.openshift4.example.com    default                               0
    e4df49b4d36a1   8 hours ago     Ready   etcd-bootstrap-member-bootstrap.openshift4.example.com                openshift-etcd                        0

If everything checks out, continue with the steps below while watching the log with journalctl -b -f -u bootkube.service.
The default user on RHCOS is core. To get root privileges, run sudo su (no password is required).
Master
The control plane nodes are handled the same way: create the virtual machine, then edit the boot parameters, changing them to:

    ip=192.168.57.62::192.168.57.1:255.255.255.0:master1.openshift4.example.com:ens192:none nameserver=192.168.57.60 coreos.inst.install_dev=sda coreos.inst.image_url=http://192.168.57.60:8080/install/rhcos-4.4.3-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://192.168.57.60:8080/ignition/master.ign

After a successful installation the control plane node reboots once, after which it can likewise be reached from the bastion node with the SSH key.
Repeat the same steps to create the other two control plane nodes, adjusting the boot parameters (IP and hostname). Do not create the compute nodes yet. First run the following command on the bastion node to finish creating the production control plane:

    $ openshift-install --dir=/ocpinstall wait-for bootstrap-complete --log-level=debug
    DEBUG OpenShift Installer 4.4.5
    DEBUG Built from commit 15eac3785998a5bc250c9f72101a4a9cb767e494
    INFO Waiting up to 20m0s for the Kubernetes API at https://api.openshift4.example.com:6443...
    INFO API v1.17.1 up
    INFO Waiting up to 40m0s for bootstrapping to complete...
    DEBUG Bootstrap status: complete
    INFO It is now safe to remove the bootstrap resources

Once the message It is now safe to remove the bootstrap resources appears, remove the bootstrap machine from the load balancer. With Envoy, as used here, it is enough to delete the bootstrap machine's endpoints from cds.yaml and reload.
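Removing the bootstrap machine means deleting every endpoint block for 192.168.57.61 from cds.yaml, in both the API server and the Machine Config server clusters. The sketch below does that with awk and then swaps the new file in with mv, because Envoy's file-based dynamic configuration is, to my knowledge, picked up when the file is replaced by a move rather than edited in place. remove_endpoint is a hypothetical helper, and it assumes every endpoint block has exactly the five-line shape used in this article's cds.yaml.

```shell
#!/bin/sh
# remove_endpoint FILE IP
# Hypothetical helper: deletes every five-line "- endpoint:" block whose
# socket address equals IP, then replaces FILE with a single mv.
remove_endpoint() (
    file=$1
    ip=$2
    tmp="$file.tmp.$$"
    awk -v ip="$ip" '
        # Print the buffered block unless it matched the address to drop.
        function emit_block() {
            if (held && !drop) printf "%s", buf
            held = 0; drop = 0; buf = ""; n = 0
        }
        # A new endpoint block starts here: begin buffering.
        /^[ \t]*- endpoint:[ \t]*$/ { emit_block(); held = 1 }
        held {
            buf = buf $0 "\n"; n++
            line = $0
            gsub(/^[ \t]+|[ \t]+$/, "", line)
            if (line == "address: " ip) drop = 1
            if (n == 5) emit_block()
            next
        }
        { print }
        END { emit_block() }
    ' "$file" > "$tmp" && mv "$tmp" "$file"
)

# Usage on the bastion node (not run here):
#   remove_endpoint /etc/envoy/cds.yaml 192.168.57.61
```

Keeping a copy of the original cds.yaml before running this makes it easy to compare the result by eye.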
Watch the log on the bootstrap node:

    $ journalctl -b -f -u bootkube.service
    ...
    Jun 05 00:24:12 bootstrap.openshift4.example.com bootkube.sh[12571]: I0605 00:24:12.108179 1 waitforceo.go:67] waiting on condition EtcdRunningInCluster in etcd CR /cluster to be True.
    Jun 05 00:24:21 bootstrap.openshift4.example.com bootkube.sh[12571]: I0605 00:24:21.595680 1 waitforceo.go:67] waiting on condition EtcdRunningInCluster in etcd CR /cluster to be True.
    Jun 05 00:24:26 bootstrap.openshift4.example.com bootkube.sh[12571]: I0605 00:24:26.250214 1 waitforceo.go:67] waiting on condition EtcdRunningInCluster in etcd CR /cluster to be True.
    Jun 05 00:24:26 bootstrap.openshift4.example.com bootkube.sh[12571]: I0605 00:24:26.306421 1 waitforceo.go:67] waiting on condition EtcdRunningInCluster in etcd CR /cluster to be True.
    Jun 05 00:24:29 bootstrap.openshift4.example.com bootkube.sh[12571]: I0605 00:24:29.097072 1 waitforceo.go:64] Cluster etcd operator bootstrapped successfully
    Jun 05 00:24:29 bootstrap.openshift4.example.com bootkube.sh[12571]: I0605 00:24:29.097306 1 waitforceo.go:58] cluster-etcd-operator bootstrap etcd
    Jun 05 00:24:29 bootstrap.openshift4.example.com podman[16531]: 2020-06-05 00:24:29.120864426 +0000 UTC m=+17.965364064 container died 77971b6ca31755a89b279fab6f9c04828c4614161c2e678c7cba48348e684517 (image=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9f7a02df3a5d91326d95e444e2e249f8205632ae986d6dccc7f007ec65c8af77, name=recursing_cerf)
    Jun 05 00:24:29 bootstrap.openshift4.example.com bootkube.sh[12571]: bootkube.service complete

Worker
The compute nodes are handled the same way: create the virtual machine, then edit the boot parameters, changing them to:

    ip=192.168.57.65::192.168.57.1:255.255.255.0:worker1.openshift4.example.com:ens192:none nameserver=192.168.57.60 coreos.inst.install_dev=sda coreos.inst.image_url=http://192.168.57.60:8080/install/rhcos-4.4.3-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://192.168.57.60:8080/ignition/worker.ign

After a successful installation the compute node also reboots once, after which it can likewise be reached from the bastion node with the SSH key.
Repeat the same steps to create the other compute nodes, adjusting the boot parameters (IP and hostname).
Log in to the cluster
You can log in to the cluster as the default system user by exporting the cluster kubeconfig file. The kubeconfig file, created during OCP installation, holds the cluster information that the CLI uses to connect a client to the right cluster and API server.

    $ mkdir ~/.kube
    $ cp /ocpinstall/auth/kubeconfig ~/.kube/config
    $ oc whoami
    system:admin

Approve CSRs
When nodes are added to the cluster, two pending certificate signing requests (CSRs) are generated for each node. Confirm that these CSRs have been approved, or approve them yourself if necessary.

    $ oc get node
    NAME                             STATUS     ROLES           AGE     VERSION
    master1.openshift4.example.com   Ready      master,worker   6h25m   v1.17.1
    master2.openshift4.example.com   Ready      master,worker   6h39m   v1.17.1
    master3.openshift4.example.com   Ready      master,worker   6h15m   v1.17.1
    worker1.openshift4.example.com   NotReady   worker          5h8m    v1.17.1
    worker2.openshift4.example.com   NotReady   worker          5h9m    v1.17.1

The output lists all the nodes that were created. Review the pending CSRs and make sure that every node added to the cluster has a client request and a server request in the Pending or Approved state. Approve a CSR in the Pending state, where xxx is the name of the CSR:

    $ oc adm certificate approve xxx

Or approve all pending CSRs at once:

    $ oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve

Automatic Operator initialization
After the control plane has initialized, confirm that all Operators are available, that is, that the AVAILABLE column is True for every Operator:

    $ oc get clusteroperators
    NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
    authentication                             4.4.5     True        False         False      150m
    cloud-credential                           4.4.5     True        False         False      7h7m
    cluster-autoscaler                         4.4.5     True        False         False      6h12m
    console                                    4.4.5     True        False         False      150m
    csi-snapshot-controller                    4.4.5     True        False         False      6h13m
    dns                                        4.4.5     True        False         False      6h37m
    etcd                                       4.4.5     True        False         False      6h19m
    image-registry                             4.4.5     True        False         False      6h12m
    ingress                                    4.4.5     True        False         False      150m
    insights                                   4.4.5     True        False         False      6h13m
    kube-apiserver                             4.4.5     True        False         False      6h15m
    kube-controller-manager                    4.4.5     True        False         False      6h36m
    kube-scheduler                             4.4.5     True        False         False      6h36m
    kube-storage-version-migrator              4.4.5     True        False         False      6h36m
    machine-api                                4.4.5     True        False         False      6h37m
    machine-config                             4.4.5     True        False         False      6h36m
    marketplace                                4.4.5     True        False         False      6h12m
    monitoring                                 4.4.5     True        False         False      6h6m
    network                                    4.4.5     True        False         False      6h39m
    node-tuning                                4.4.5     True        False         False      6h38m
    openshift-apiserver                        4.4.5     True        False         False      6h14m
    openshift-controller-manager               4.4.5     True        False         False      6h12m
    openshift-samples                          4.4.5     True        False         False      6h11m
    operator-lifecycle-manager                 4.4.5     True        False         False      6h37m
    operator-lifecycle-manager-catalog         4.4.5     True        False         False      6h37m
    operator-lifecycle-manager-packageserver   4.4.5     True        False         False      6h15m
    service-ca                                 4.4.5     True        False         False      6h38m
    service-catalog-apiserver                  4.4.5     True        False         False      6h38m
    service-catalog-controller-manager         4.4.5     True        False         False      6h39m
    storage                                    4.4.5     True        False         False      6h12m

If any Operator is unhealthy, diagnose and fix the problem before continuing.
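With thirty rows to scan, a filter that prints only the problem Operators saves some squinting. The following is a sketch: unhealthy_operators is a hypothetical helper that reads the tabular output on stdin and assumes every row has all six columns filled in.

```shell
#!/bin/sh
# unhealthy_operators
# Hypothetical helper: reads "oc get clusteroperators" output on stdin and
# prints the name of every operator whose AVAILABLE column is not True or
# whose DEGRADED column is not False. Assumes the column order
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE with no empty cells.
unhealthy_operators() {
    awk 'NR > 1 && ($3 != "True" || $5 != "False") { print $1 }'
}

# Example with canned output. On a live cluster, pipe in the real command:
#   oc get clusteroperators | unhealthy_operators
unhealthy_operators <<'EOF'
NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication   4.4.5     True        False         False      150m
console          4.4.5     False       True          False      2m
monitoring       4.4.5     True        False         True       6h6m
EOF
```

For the canned input this prints console and monitoring; empty output means every Operator looks healthy.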
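Finish the installation
As the last step, complete the cluster installation by running:

    $ openshift-install --dir=/ocpinstall wait-for install-complete --log-level=debug

Note the Web Console URL, user name, and password printed at the end. If you forget the password, it can be read from /ocpinstall/auth/kubeadmin-password.
To access the Web Console from a local machine, add these hosts entries:

    192.168.57.60 console-openshift-console.apps.openshift4.example.com
    192.168.57.60 oauth-openshift.apps.openshift4.example.com

Open https://console-openshift-console.apps.openshift4.example.com in a browser and log in with the user name and password printed above. After the first login, this notice appears:

    You are logged in as a temporary administrative user. Update the Cluster OAuth configuration to allow others to log in.

A custom administrator account can be set up with htpasswd as follows:
① Create the password file: htpasswd -c -B -b users.htpasswd admin xxxxx
② Download users.htpasswd to the local machine.
③ In the Web Console, open Global Configuration, find OAuth and open it, then add an Identity Provider of type HTPasswd and upload users.htpasswd.
④ Log out of the current user, making sure you land on the login page that lists the identity providers. Choose htpasswd, then log in with the user name and password created earlier.
If the page shown after logging out goes straight to a user name and password form, it is still authenticating against kube:admin. If the provider choice does not appear, enter the Web Console URL manually to be redirected to it.
⑤ After logging in, the Administrator menu appears to be available, but opening a page such as OAuth Details still reports:

    oauths.config.openshift.io "cluster" is forbidden: User "admin" cannot get resource "oauths" in API group "config.openshift.io" at the cluster scope

So grant the user the cluster administrator role:

    $ oc adm policy add-cluster-role-to-user cluster-admin admin

To delete the default account, run:

    $ oc -n kube-system delete secrets kubeadmin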