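Corosync + Pacemaker
Web High Availability with Corosync + Pacemaker 2015-04-12 23:38:19
Tags: CoroSync, pacemaker. Original work. Reposting is permitted, provided that the original source, the author information, and this notice are indicated with a hyperlink; otherwise legal liability will be pursued. http://lu2yu.blog.51cto.com/10009517/1631677

I. Introduction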
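Corosync grew out of OpenAIS, the open implementation of the AIS cluster framework interface specification, and can be regarded as the part of OpenAIS that handles cluster messaging. Its later development clearly went beyond what was originally envisioned, and more and more vendors have adopted Corosync as the basis of their cluster solutions. Red Hat's RHCS cluster suite, for example, is built on Corosync.
Corosync provides only the message layer and has no CRM of its own; Pacemaker is generally used for resource management.
Corosync and Pacemaker can be combined in two ways: ① Pacemaker runs as a Corosync plugin, or ② Pacemaker runs as independent daemons.
In this article Pacemaker runs as a plugin.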
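The difference between the two modes comes down to the ver value in the service stanza that loads Pacemaker. A minimal sketch, assuming the Corosync 1.x plugin setup used later in this article; pcmk_service_stanza is a helper name made up for illustration. With ver: 0 the plugin starts the Pacemaker daemons itself, while with ver: 1 they run independently and have to be started after Corosync (for example with service pacemaker start).

```shell
#!/bin/sh
# Print the corosync.conf "service" stanza for a given plugin version.
#   ver 0: the plugin starts the Pacemaker daemons itself (used in this article)
#   ver 1: Pacemaker runs as independent daemons started after Corosync
# pcmk_service_stanza is a hypothetical helper, not part of Corosync.
pcmk_service_stanza() {
    ver="$1"
    case "$ver" in
        0|1) ;;
        *) echo "ver must be 0 or 1" >&2; return 1 ;;
    esac
    printf 'service {\n  ver:  %s\n  name: pacemaker\n}\n' "$ver"
}

pcmk_service_stanza 0
```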
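
II. Configuring web high availability
    0. Prerequisites
        ① Time synchronization, mutual SSH trust, hostname resolution through /etc/hosts, and node names that match the output of uname -n (details omitted)
        ② Install the httpd service and keep it from starting at boot (the cluster will manage it)
        ③ Stop NetworkManager and disable it at boot; enable the network service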
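Prerequisite ① can be checked mechanically. The sketch below tests whether the name reported by uname -n has an entry in /etc/hosts; host_in_file is a hypothetical helper, and the check covers only the hosts file, not DNS.

```shell
#!/bin/sh
# Check that a node name has an entry in a hosts file.
# host_in_file NAME [FILE] -> exit status 0 if NAME appears as a hostname
# or alias on a non-comment line of FILE (default /etc/hosts).
host_in_file() {
    name="$1"
    file="${2:-/etc/hosts}"
    awk -v n="$name" '
        /^[ \t]*#/ { next }                 # skip comment lines
        { for (i = 2; i <= NF; i++) {       # field 1 is the IP address
              if ($i ~ /^#/) break          # stop at a trailing comment
              if ($i == n) found = 1 } }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# The cluster identifies nodes by `uname -n`, so that name must resolve.
if host_in_file "$(uname -n)"; then
    echo "$(uname -n): found in /etc/hosts"
else
    echo "$(uname -n): missing from /etc/hosts" >&2
fi
```

Running the same check on every node, for every node name, catches the most common cause of nodes that start but never see each other.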

    1. Install Corosync, Pacemaker, and crmsh
        # yum install corosync pacemaker
Files in the corosync package:
        [root@node1 ~]# rpm -ql corosync
        /etc/corosync              # Corosync configuration directory
        /etc/corosync/corosync.conf.example    # sample Corosync configuration
        /etc/corosync/corosync.conf.example.udpu
        /etc/corosync/service.d
        /etc/corosync/uidgid.d
        /etc/dbus-1/system.d/corosync-signals.conf
        /etc/rc.d/init.d/corosync            # Corosync init script
        /etc/rc.d/init.d/corosync-notifyd
        /usr/bin/corosync-blackbox
        /usr/libexec/lcrso
        /usr/libexec/lcrso/coroparse.lcrso
        /usr/libexec/lcrso/objdb.lcrso
        /usr/libexec/lcrso/quorum_testquorum.lcrso
        /usr/libexec/lcrso/quorum_votequorum.lcrso
        /usr/libexec/lcrso/service_cfg.lcrso
        /usr/libexec/lcrso/service_confdb.lcrso
        /usr/libexec/lcrso/service_cpg.lcrso
        /usr/libexec/lcrso/service_evs.lcrso
        /usr/libexec/lcrso/service_pload.lcrso
        /usr/libexec/lcrso/vsf_quorum.lcrso
        /usr/libexec/lcrso/vsf_ykd.lcrso
        /usr/sbin/corosync
        /usr/sbin/corosync-cfgtool
        /usr/sbin/corosync-cpgtool
        /usr/sbin/corosync-fplay
        /usr/sbin/corosync-keygen            # generates the key for message-layer communication between nodes
        /usr/sbin/corosync-notifyd
        /usr/sbin/corosync-objctl
        /usr/sbin/corosync-pload
        /usr/sbin/corosync-quorumtool
        /usr/share/doc/corosync-1.4.1
        /var/lib/corosync
        /var/log/cluster
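        # yum install -y crmsh-1.2.6-4.el6.x86_64.rpm pssh-2.3.1-2.el6.x86_64.rpm      ### crmsh depends on pssh
Starting with Pacemaker 1.1.8, crm was split off into an independent project called crmsh. In other words, installing Pacemaker no longer provides the crm command; to manage cluster resources with it, crmsh has to be installed separately.
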
    2. Configure Corosync
        [root@node1 corosync]# cat /etc/corosync/corosync.conf
        ## Whether to remain compatible with older versions
        compatibility: whitetank

        totem {
                ## Version number; cannot be changed
                version: 2
                ## Authentication; very CPU-intensive when used with aisexec
                secauth: off
                ## Number of threads, chosen according to the number of CPUs and cores
                threads: 0
                interface {
                        ringnumber: 0
                        ## Network address of the heartbeat network to bind to
                        bindnetaddr: 192.168.192.0
                        ## Multicast address
                        mcastaddr: 226.94.1.1
                        ## Multicast port
                        mcastport: 5405
                        ## TTL for multicast packets
                        ttl: 1
                }
        }

        logging {
                fileline: off
                ## Whether to log to standard error
                to_stderr: no
                ## Whether to log to a file
                to_logfile: yes
                ## Whether to log to syslog
                to_syslog: yes
                ## Log file location
                logfile: /var/log/cluster/corosync.log
                debug: off
                ## Whether to print timestamps; helps locate errors but costs some CPU
                timestamp: on
                logger_subsys {
                        subsys: AMF
                        debug: off
                }
        }

        ## Start Pacemaker as a Corosync plugin
        service {
          ver:  0
          name: pacemaker
        }

        ## Identity Corosync runs as; it manages services, so it needs root
        aisexec {
          user: root
          group:  root
        }

        amf {
                mode: disabled
        }
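bindnetaddr takes the network address of the heartbeat interface, not the node's own IP. As an illustration, it can be derived by ANDing the address with the netmask; the helper network_addr is hypothetical, and the 255.255.255.0 netmask is an assumption (the article does not state it).

```shell
#!/bin/sh
# Compute the network address (IP AND netmask) to use as bindnetaddr.
# network_addr is a hypothetical helper; the /24 netmask below is an assumption.
network_addr() {
    ip="$1"; mask="$2"
    old_ifs="$IFS"; IFS=.
    set -- $ip $mask            # split both dotted quads into 8 fields
    IFS="$old_ifs"
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

network_addr 192.168.192.208 255.255.255.0   # node1's address as seen in its logs
```

For node1 this prints 192.168.192.0, the value used for bindnetaddr in this article. Because every node on the same subnet yields the same result, the identical corosync.conf can be copied to all nodes.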
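
    3. Configure Corosync authentication (authkey)
When corosync-keygen generates the key it reads random data from /dev/random. On a freshly installed system that has seen little activity there may not be enough entropy, in which case the tool stops at a prompt like this:
        [root@node1 corosync]# corosync-keygen
        Corosync Cluster Engine Authentication key generator.
        Gathering 1024 bits for key from /dev/random.
        Press keys on your keyboard to generate entropy.
        Press keys on your keyboard to generate entropy (bits = 240).
Workaround: type random keys on the local console, or generate system activity, for example by installing and uninstalling packages.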
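Before running corosync-keygen it can help to look at how much entropy the kernel currently has. A small sketch, assuming a Linux kernel that exposes /proc/sys/kernel/random/entropy_avail, as the EL6 systems in this article should; entropy_ok is a hypothetical helper, and newer kernels report this value differently, so treat the threshold as illustrative.

```shell
#!/bin/sh
# Report whether the kernel entropy pool holds enough bits for corosync-keygen.
# entropy_ok [BITS] [FILE]: BITS defaults to 1024 (the amount the prompt asks
# for); FILE defaults to the Linux proc entry and is a parameter only so the
# function can be exercised against a test file.
entropy_ok() {
    need="${1:-1024}"
    file="${2:-/proc/sys/kernel/random/entropy_avail}"
    [ -r "$file" ] || return 2
    avail="$(cat "$file")"
    echo "entropy available: $avail bits (need $need)"
    [ "$avail" -ge "$need" ]
}

entropy_ok 1024 || echo "pool is low: generate local activity before running corosync-keygen"
```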
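
The generated key, /etc/corosync/authkey, has mode 400 by default. The key only takes effect when secauth is set to on; with secauth: off, as in the configuration above, cluster traffic is not authenticated.

    4. Copy the configuration file (and authkey) to the other cluster nodes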
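Step 4 could look like the following sketch. It assumes the second node is reachable as node2 over the SSH trust set up in the prerequisites and that both files live in /etc/corosync; push_corosync_files is a hypothetical helper, and by default it only prints the command.

```shell
#!/bin/sh
# Copy the Corosync configuration and key to the other node(s).
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run.
# push_corosync_files is a hypothetical helper name.
push_corosync_files() {
    for node in "$@"; do
        # -p preserves the file modes, so authkey stays 0400 on the target
        cmd="scp -p /etc/corosync/corosync.conf /etc/corosync/authkey $node:/etc/corosync/"
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "$cmd"
        else
            $cmd || return 1
        fi
    done
}

push_corosync_files node2
```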
    5. Start Corosync
        [root@node1 corosync]# service corosync start
        Starting Corosync Cluster Engine (corosync):               [  OK  ]
        [root@node1 corosync]# ssh node2 'service corosync start'
        Starting Corosync Cluster Engine (corosync): [  OK  ]

Check that the Corosync engine started successfully:
        [root@node1 corosync]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/messages
        Oct 19 19:21:21 node1 corosync[2360]:   [MAIN  ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
        Oct 19 19:21:21 node1 corosync[2360]:   [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Check that the initial membership notifications were sent:
        [root@node1 corosync]# grep  TOTEM  /var/log/messages
        Oct 19 19:21:21 node1 corosync[2360]:   [TOTEM ] Initializing transport (UDP/IP Multicast).
        Oct 19 19:21:21 node1 corosync[2360]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
        Oct 19 19:21:22 node1 corosync[2360]:   [TOTEM ] The network interface [192.168.192.208] is now up.
        Oct 19 19:21:23 node1 corosync[2360]:   [TOTEM ] Process pause detected for 1264 ms, flushing membership messages.
        Oct 19 19:21:23 node1 corosync[2360]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Check whether any errors were produced during startup. The two messages shown here only warn that the Pacemaker plugin is deprecated in this environment in favor of CMAN; they do not stop the cluster from starting:
        [root@node1 corosync]# grep ERROR: /var/log/messages | grep -v unpack_resources
        Oct 19 19:21:22 node1 corosync[2360]:   [pcmk  ] ERROR: process_ais_conf: You have configured a cluster using the Pacemaker plugin for Corosync. The plugin is not supported in this environment and will be removed very soon.
        Oct 19 19:21:22 node1 corosync[2360]:   [pcmk  ] ERROR: process_ais_conf:  Please see Chapter 8 of 'Clusters from Scratch' (http://www.clusterlabs.org/doc) for details on using Pacemaker with CMAN
Check that Pacemaker started normally:
        [root@node1 corosync]# grep pcmk_startup /var/log/messages
        Oct 19 19:21:22 node1 corosync[2360]:   [pcmk  ] info: pcmk_startup: CRM: Initialized
        Oct 19 19:21:22 node1 corosync[2360]:   [pcmk  ] Logging: Initialized pcmk_startup
        Oct 19 19:21:22 node1 corosync[2360]:   [pcmk  ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
        Oct 19 19:21:23 node1 corosync[2360]:   [pcmk  ] info: pcmk_startup: Service: 9
        Oct 19 19:21:23 node1 corosync[2360]:   [pcmk  ] info: pcmk_startup: Local hostname: node1.yu.com
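The four greps above can be folded into a single check. A sketch only: check_corosync_log is a hypothetical helper, and the patterns are taken from the log lines shown in this section, so they may need adjusting for other Corosync versions.

```shell
#!/bin/sh
# Summarize the startup checks against a syslog file.
# check_corosync_log [FILE]: FILE defaults to /var/log/messages.
check_corosync_log() {
    log="${1:-/var/log/messages}"
    rc=0
    for pattern in \
        "Corosync Cluster Engine" \
        "Successfully read main configuration file" \
        "A processor joined or left the membership" \
        "pcmk_startup: CRM: Initialized"
    do
        if grep -q -e "$pattern" "$log" 2>/dev/null; then
            echo "ok:      $pattern"
        else
            echo "missing: $pattern"
            rc=1
        fi
    done
    # Count ERROR lines, ignoring the STONITH complaints from unpack_resources
    errors="$(grep 'ERROR:' "$log" 2>/dev/null | grep -v -c unpack_resources)"
    echo "ERROR lines: $errors"
    return "$rc"
}

check_corosync_log /var/log/messages || echo "some startup checks did not pass"
```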
If any of these checks fail, the usual suspects are iptables and SELinux.
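If the firewall is the culprit, the heartbeat ports need to be opened. Corosync uses two UDP ports, mcastport and mcastport - 1, so with the mcastport of 5405 used in this article that is 5404 and 5405. The sketch below only prints a candidate rule; corosync_fw_rule is a hypothetical helper, and where the rule belongs in an existing ruleset is left to the reader.

```shell
#!/bin/sh
# Print the iptables rule that admits Corosync traffic for a given mcastport.
# Corosync uses two UDP ports: mcastport and mcastport - 1.
# corosync_fw_rule is a hypothetical helper; it only prints, it changes nothing.
corosync_fw_rule() {
    port="${1:-5405}"
    echo "iptables -I INPUT -p udp -m multiport --dports $((port - 1)),$port -j ACCEPT"
}

corosync_fw_rule 5405
# SELinux can be checked with `getenforce`; `setenforce 0` switches it to
# permissive mode until the next reboot, which is enough to test whether it
# is the cause.
```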

    6. Manage resources with crmsh
crmsh supports command completion and is interactive; help is available at every level.
A quick tour of crmsh:
        [root@node1 ~]# crm      <-- enter crmsh
        crm(live)# help      ## show help

        This is crm shell, a Pacemaker command line interface.

        Available commands:

            cib              manage shadow CIBs                  ## CIB management
            resource         resources management                ## resource management
            configure        CRM cluster configuration           ## CRM configuration: resource stickiness, resource types, constraints, etc.
            node             nodes management                    ## node management
            options          user preferences                    ## user preferences
            history          CRM cluster history                 ## CRM history
            site             Geo-cluster support                 ## geo-cluster support
            ra               resource agents information center  ## resource agent information
            status           show cluster status                 ## show cluster status
            help,?           show help (help topics for list of topics)  ## show help
            end,cd,up        go back one level                   ## go back one level
            quit,bye,exit    exit the program                    ## quit

        crm(live)# configure             <-- enter configure mode
        crm(live)configure# show     ## show the current configuration
        node node1.yu.com
        node node2.yu.com
        property $id="cib-bootstrap-options" \
            dc-version="1.1.8-7.el6-394e906" \
            cluster-infrastructure="classic openais (with plugin)" \
            expected-quorum-votes="2"
        crm(live)configure# verify    ## check the configuration; it fails because no STONITH resource is defined, so STONITH can be disabled
           error: unpack_resources:     Resource start-up disabled since no STONITH resources have been defined
           error: unpack_resources:     Either configure some or disable STONITH with the stonith-enabled option
           error: unpack_resources:     NOTE: Clusters with shared data need STONITH to ensure data integrity
        Errors found during check: config not valid
          -V may provide more details

        crm(live)configure# property stonith-enabled=false   ## disable STONITH, then verify again: no errors
        crm(live)configure# verify
        crm(live)configure# commit  ## commit the configuration
        crm(live)configure# cd
        crm(live)# ra   <-- enter RA (resource agent) mode
        crm(live)ra# help
        This level contains commands which show various information about
        the installed resource agents. It is available both at the top
        level and at the `configure` level.

        Available commands:

            classes          list classes and providers          ## list RA classes
            list             list RA for a class (and provider)  ## list the RAs of a class (or provider)
            meta,info        show meta data for a RA             ## show RA details
            providers        show providers for a RA and a class ## show the providers of a given RA and class
            help,?           show help (help topics for list of topics)
            end,cd,up        go back one level
            quit,bye,exit    exit the program

        crm(live)ra# classes
        lsb
        ocf / heartbeat pacemaker redhat
        service
        stonith
        crm(live)ra# list ocf pacemaker
        ClusterMon    Dummy         HealthCPU     HealthSMART   Stateful      SysInfo       SystemHealth  controld      o2cb      ping          pingd
        crm(live)ra# info ocf:heartbeat:IPaddr
        Manages virtual IPv4 addresses (portable version) (ocf:heartbeat:IPaddr)

        This script manages IP alias IP addresses
        It can add an IP alias, or remove one.

        Parameters (* denotes required, [] the default):

        ip* (string): IPv4 address
            The IPv4 address to be configured in dotted quad notation, for example
            "192.168.192.208".

        nic (string, [eth0]): Network interface
            The base network interface on which the IP address will be brought
            online.
        …… (rest omitted) ……
        crm(live)ra# cd
        crm(live)# status  <-- show cluster status
        Last updated: Sun Oct 20 22:06:16 2013
        Last change: Sun Oct 20 21:58:46 2013 via cibadmin on node1.yu.com
        Stack: classic openais (with plugin)
        Current DC: node2.yu.com - partition with quorum
        Version: 1.1.8-7.el6-394e906
        2 Nodes configured, 2 expected votes
        0 Resources configured.

        Online: [ node1.yu.com node2.yu.com ]

Quorum:
In a two-node cluster the number of votes is even. When the heartbeat link fails (split brain), each node holds only one of the two votes, so neither side reaches quorum and the default quorum policy stops the cluster's services. To avoid this, either make the vote count odd (for example with a ping node or a qdisk, as covered in an earlier post) or change the default quorum policy to ignore.
        crm(live)configure# property no-quorum-policy=ignore
        crm(live)configure# show
        node node1.yu.com
        node node2.yu.com
        property $id="cib-bootstrap-options" \
            dc-version="1.1.8-7.el6-394e906" \
            cluster-infrastructure="classic openais (with plugin)" \
            expected-quorum-votes="2" \
            stonith-enabled="false" \
            no-quorum-policy="ignore"
        crm(live)configure# commit
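The arithmetic behind the two-node problem, as a sketch: a partition has quorum only when it holds more than half of the expected votes. The function names are made up for illustration, and the plain majority rule shown here ignores any special-case handling a particular stack may add.

```shell
#!/bin/sh
# Quorum needs more than half of the expected votes: floor(n / 2) + 1.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum VOTES_SEEN EXPECTED_VOTES -> exit status 0 if the partition has quorum
has_quorum() {
    [ "$1" -ge "$(quorum_needed "$2")" ]
}

for n in 2 3; do
    echo "expected votes: $n, needed for quorum: $(quorum_needed "$n")"
done
has_quorum 1 2 || echo "one surviving node out of 2 has no quorum"
has_quorum 2 3 && echo "two surviving nodes out of 3 keep quorum"
```

With two expected votes both votes are needed, so losing either node (or the link between them) costs quorum; with three, any single failure is tolerated.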
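
Resources moving back and forth (keeping a resource from moving again once the failed node recovers):
When a failure occurs, resources migrate to a healthy node. When the failed node recovers, the resources may move back to it, which is not always desirable: on a busy service, for example, every extra move is another interruption. Resource stickiness prevents this:
        crm(live)configure# property default-resource-stickiness=INFINITY
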
    7. Example: making httpd highly available
        ## Configure the floating IP
        crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=192.168.192.222
        crm(live)configure# commit

        ## Configure httpd
        crm(live)configure# primitive web lsb:httpd \
                op monitor interval="30s" timeout="20s" on-fail="restart" \
                meta target-role="Started"
        ## Monitor every 30s with a 20s timeout; on failure the resource is restarted, and if the restart fails it may be started on another node
        crm(live)configure# commit

        ## Configure NFS to provide the web pages
        crm(live)configure# primitive webstore ocf:heartbeat:Filesystem \
            params device="192.168.192.196:/webdocs" fstype="nfs" directory="/var/www/html" \
            op start timeout="60" interval="0" \
            op stop timeout="60" interval="0" \
            op monitor interval="60" timeout="60" on-fail="standby"

        ## By default resources are spread evenly across the nodes; defining webip, webstore, and web (httpd) as a group keeps them running on the same node
        crm(live)configure# group web_srv webip webstore web
        crm(live)configure# commit

        [root@node1 corosync]# crm
        crm(live)# status
        Last updated: Sun Apr 12 08:31:35 2015
        Last change: Sun Apr 12 08:17:47 2015 via cibadmin on node1.yu.com
        Stack: classic openais (with plugin)
        Current DC: node1.yu.com - partition with quorum
        Version: 1.1.8-7.el6-394e906
        2 Nodes configured, 2 expected votes
        5 Resources configured.

        Online: [ node1.yu.com node2.yu.com ]

         Resource Group: web_srv
             webip      (ocf::heartbeat:IPaddr):        Started node2.yu.com
             web        (lsb:httpd):    Started node2.yu.com
             webstore       (ocf::heartbeat:Filesystem):    Started node2.yu.com

This completes the HA setup for httpd.
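The same configuration can also be applied non-interactively. A sketch, assuming the installed crmsh supports configure load update (worth checking with crm configure help load); the file location and the helper write_web_cib are assumptions, and by default the script only writes the file and prints the command.

```shell
#!/bin/sh
# Write the resources from this section to one file and load them in one step.
# DRY_RUN=1 (the default here) only prints the crm command.
# The file path and the helper name write_web_cib are assumptions.
write_web_cib() {
    cat > "$1" <<'EOF'
primitive webip ocf:heartbeat:IPaddr params ip=192.168.192.222
primitive web lsb:httpd \
    op monitor interval="30s" timeout="20s" on-fail="restart" \
    meta target-role="Started"
primitive webstore ocf:heartbeat:Filesystem \
    params device="192.168.192.196:/webdocs" fstype="nfs" directory="/var/www/html" \
    op start timeout="60" interval="0" \
    op stop timeout="60" interval="0" \
    op monitor interval="60" timeout="60" on-fail="standby"
group web_srv webip webstore web
property stonith-enabled=false no-quorum-policy=ignore
EOF
}

cib_file="${TMPDIR:-/tmp}/web_srv.crm"
write_web_cib "$cib_file"
if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "crm configure load update $cib_file"
else
    crm configure load update "$cib_file"
fi
```

Keeping the definitions in a file makes it easy to review them or keep them under version control before they reach the live cluster.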

    8. Without the group: combining the resources through constraints
Prerequisite: the primitives webip, web (httpd), and webstore are already configured.
colocation: constrains whether resources run on the same node
        crm(live)configure# colocation webservice_with_webstore_with_webip inf: web webstore webip
order: constrains the order in which resources are started and stopped
        crm(live)configure# order webip_before_webstore_before_httpd  webip  webstore  web

Results:

Failover test: kill httpd with pkill httpd and check whether the cluster brings it back up automatically.
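A small polling helper makes this test repeatable. wait_until is a hypothetical name; the commented usage line assumes the EL6 service command and is not run here.

```shell
#!/bin/sh
# Retry a command until it succeeds or the attempts run out.
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
    attempts="$1"; delay="$2"; shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            echo "succeeded on attempt $i"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "gave up after $attempts attempts"
    return 1
}

# After `pkill httpd` on the active node, the 30s monitor should notice the
# failure and restart the service. On a cluster node one might run:
#   wait_until 12 5 service httpd status
wait_until 3 0 true   # trivial demonstration with a command that always succeeds
```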

Simulate a node failure and check whether the switchover succeeds.
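One gentle way to simulate a node failure is to put the active node into standby. A sketch, assuming the node names used in this article and crmsh's node standby and node online commands; run and failover_test are hypothetical helpers, and by default nothing is executed, the commands are only printed.

```shell
#!/bin/sh
# Simulate a node failure by putting the active node in standby, then bring it back.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 on a real cluster.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

failover_test() {
    node="$1"
    run crm node standby "$node"   # resources should move to the other node
    run crm status
    run crm node online "$node"    # with INFINITY stickiness they should stay where they are
    run crm status
}

failover_test node2.yu.com
```

Comparing the two status outputs shows first whether the resources moved off the standby node and then whether stickiness kept them from moving back.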