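DRBD + Heartbeat + NFS Configuration Notes on CentOS 6.3
-------------- Preamble --------------
First, thanks to Jiu Ge (酒哥) for his book "Building Highly Available Linux Servers" (构建高可用Linux服务器). Reading it and working through its configuration examples made the DRBD + Heartbeat + NFS design much clearer to me.
Put simply, DRBD is RAID-1 over the network. There are usually two or more nodes; the disk partition prepared on each node is mapped to a local DRBD block device, and the DRBD devices on the nodes are kept in sync with each other over the network.
Heartbeat makes DRBD more available. When a node fails, it automatically switches the DRBD device over to the backup node and runs the scripts that rebind the virtual IP, promote the DRBD device, mount the disk, and start NFS. All of this happens among the back-end nodes, and front-end users only ever talk to Heartbeat's virtual IP, so the switch is transparent to them.
One last gripe: installing via yum turned out to be a real trap. From now on, unless there is a reason not to, I will install from source packages.
--------------- Getting started ---------------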
OS:                    CentOS 6.3 x64 (kernel 2.6.32)
DRBD:                  DRBD-8.4.3
Heartbeat:             from the EPEL repository (a real pain)
NFS:                   bundled with the OS
Heartbeat VIP:         192.168.7.90
node1 DRBD+Heartbeat:  192.168.7.88 (drbd1.example.com)
node2 DRBD+Heartbeat:  192.168.7.89 (drbd2.example.com)
(node1) marks steps for the primary node only
(node2) marks steps for the secondary node only
(node1,node2) marks steps for both nodes
I. DRBD configuration: see http://showerlee.blog.51cto.com/2047005/1211963
II. Heartbeat configuration
This picks up from the system environment and installation described in the DRBD post above.
1. Install Heartbeat (CentOS 6.3 does not ship a Heartbeat package by default, so it has to come from a third-party repository) (node1,node2)
# wget ftp://mirror.switch.ch/pool/1/mirror/scientificlinux/6rolling/i386/os/Packages/epel-release-6-5.noarch.rpm
# rpm -ivUh epel-release-6-5.noarch.rpm
# yum --enablerepo=epel install heartbeat -y
2. Configure Heartbeat
(node1)
# vi /etc/ha.d/ha.cf
---------------
# Logging
logfile         /var/log/ha-log
logfacility     local0
# Heartbeat interval (seconds)
keepalive       2
# Time without heartbeats before the peer is declared dead (seconds)
deadtime        5
# IP address of the peer node
ucast           eth0 192.168.7.89
# off: when a failed node comes back, the resources stay on the node that currently holds them instead of failing back
auto_failback   off
# Cluster nodes
node            drbd1.example.com drbd2.example.com
---------------
(node2)
# vi /etc/ha.d/ha.cf
---------------
# Logging
logfile         /var/log/ha-log
logfacility     local0
# Heartbeat interval (seconds)
keepalive       2
# Time without heartbeats before the peer is declared dead (seconds)
deadtime        5
# IP address of the peer node
ucast           eth0 192.168.7.88
# off: when a failed node comes back, the resources stay on the node that currently holds them instead of failing back
auto_failback   off
# Cluster nodes
node            drbd1.example.com drbd2.example.com
---------------
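The names on the node line have to match what uname -n prints on each machine, otherwise Heartbeat does not recognise the host as a cluster member. A minimal sketch of a check for that follows; the helper names ha_cf_nodes and ha_cf_has_node are made up for this example and are not part of Heartbeat.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not Heartbeat tools.

# Print every hostname listed on "node" lines of a ha.cf file, one per line.
ha_cf_nodes() {
    awk '$1 == "node" { for (i = 2; i <= NF; i++) print $i }' "$1"
}

# Succeed if hostname $2 is declared as a node in the ha.cf file $1.
ha_cf_has_node() {
    ha_cf_nodes "$1" | grep -qxF -- "$2"
}

# Usage on a cluster node:
#   ha_cf_has_node /etc/ha.d/ha.cf "$(uname -n)" || echo "hostname not listed in ha.cf"
```

Running the usage line on both nodes before starting Heartbeat catches a hostname mismatch early.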
Edit the authentication file shared by the two nodes: (node1,node2)
# vi /etc/ha.d/authkeys
--------------
auth 1
1 crc
--------------
# chmod 600 /etc/ha.d/authkeys
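The crc method only guards against corrupted packets and involves no shared secret. Heartbeat's authkeys file also accepts sha1 and md5 with a key, so on a network that is not fully trusted something like the following may be preferable. This is a sketch under that assumption; render_authkeys is a hypothetical helper and the key generation is just one way to obtain a random string.

```shell
#!/bin/bash
# Sketch only: function name and key handling are illustrative.

# Print authkeys content that uses sha1 with the given shared secret.
render_authkeys() {
    local secret="$1"
    [ -n "$secret" ] || { echo "empty secret" >&2; return 1; }
    printf 'auth 1\n1 sha1 %s\n' "$secret"
}

# Usage (run once, then copy the identical file to both nodes):
#   secret=$(head -c 32 /dev/urandom | md5sum | cut -d' ' -f1)
#   render_authkeys "$secret" > /etc/ha.d/authkeys
#   chmod 600 /etc/ha.d/authkeys
```

The file must be identical on node1 and node2, and it still needs mode 600.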
Edit the cluster resource file: (node1,node2)
# vi /etc/ha.d/haresources
--------------
drbd1.example.com IPaddr::192.168.7.90/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/data::ext4 killnfsd
--------------
Note: the scripts named in this file, such as IPaddr and Filesystem, live in /etc/ha.d/resource.d/. You can also put your own service scripts there (for example mysql or www) and add the script name to /etc/ha.d/haresources, so that the service is started along with Heartbeat.
IPaddr::192.168.7.90/24/eth0: the IPaddr script brings up the floating VIP
drbddisk::r0: the drbddisk script promotes DRBD resource r0 to Primary on takeover and demotes it to Secondary on release
Filesystem::/dev/drbd0::/data::ext4: the Filesystem script mounts and unmounts the disk
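The order of the entries on the haresources line matters: Heartbeat starts the resources from left to right when a node takes over and stops them from right to left when it releases them. The sketch below just prints the two orders for a given line; start_order and stop_order are hypothetical helper names.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical.

# Print the resources of a haresources line in start order (left to right).
# The first field is the preferred node name, so it is skipped.
start_order() {
    printf '%s\n' "$1" | awk '{ for (i = 2; i <= NF; i++) print $i }'
}

# Print the same resources in stop order (right to left).
stop_order() {
    printf '%s\n' "$1" | awk '{ for (i = NF; i >= 2; i--) print $i }'
}

# Usage:
#   start_order "$(grep -v '^#' /etc/ha.d/haresources | head -n 1)"
```

For the line above, the VIP comes up first and killnfsd runs last on takeover; on release killnfsd is called first and the VIP goes down last.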
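Create the killnfsd script, which restarts the NFS service: (node1,node2)
Note: this is because, after the NFS service switches nodes, the directory shared over NFS has to be mounted again, otherwise errors are reported (still to be verified)
# vi /etc/ha.d/resource.d/killnfsd
-----------------
killall -9 nfsd; /etc/init.d/nfs restart;exit 0
-----------------
Make it executable:
# chmod 755 /etc/ha.d/resource.d/killnfsd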
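Create the DRBD resource script drbddisk: (node1,node2)
Note:
This is another big pitfall. Anyone unfamiliar with Heartbeat's directory layout can get stuck here for good, because a default yum install of Heartbeat does not create a drbddisk script in /etc/ha.d/resource.d/, and the file cannot be found anywhere else on the system after installation either.
In my case the virtual IP could not be pinged after Heartbeat started. Looking through /var/log/ha-log, I eventually found this line:
ERROR: Cannot locate resource script drbddisk
I then checked /etc/ha.d/resource.d/ and found that there really was no drbddisk script. I finally found the code via Google, created the script, and the test passed:
# vi /etc/ha.d/resource.d/drbddisk
-----------------------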
#!/bin/bash
#
# This script is intended to be used as resource script by heartbeat
#
# Copyright 2003-2008 LINBIT Information Technologies
# Philipp Reisner, Lars Ellenberg
#
###
DEFAULTFILE="/etc/default/drbd"
DRBDADM="/sbin/drbdadm"
if [ -f $DEFAULTFILE ]; then
  . $DEFAULTFILE
fi
if [ "$#" -eq 2 ]; then
  RES="$1"
  CMD="$2"
else
  RES="all"
  CMD="$1"
fi
## EXIT CODES
# since this is a "legacy heartbeat R1 resource agent" script,
# exit codes actually do not matter that much as long as we conform to
#  http://wiki.linux-ha.org/HeartbeatResourceAgent
# but it does not hurt to conform to lsb init-script exit codes,
# where we can.
#  http://refspecs.linux-foundation.org/LSB_3.1.0/
#  LSB-Core-generic/LSB-Core-generic/iniscrptact.html
####
drbd_set_role_from_proc_drbd()
{
    local out
    if ! test -e /proc/drbd; then
        ROLE="Unconfigured"
        return
    fi
    dev=$( $DRBDADM sh-dev $RES )
    minor=${dev#/dev/drbd}
    if [[ $minor = *[!0-9]* ]] ; then
        # sh-minor is only supported since drbd 8.3.1
        minor=$( $DRBDADM sh-minor $RES )
    fi
    if [[ -z $minor ]] || [[ $minor = *[!0-9]* ]] ; then
        ROLE=Unknown
        return
    fi
    if out=$(sed -ne "/^ *$minor: cs:/ { s/:/ /g; p; q; }" /proc/drbd); then
        set -- $out
        ROLE=${5%/**}
        : ${ROLE:=Unconfigured} # if it does not show up
    else
        ROLE=Unknown
    fi
}
case "$CMD" in
    start)
        # try several times, in case heartbeat deadtime
        # was smaller than drbd ping time
        try=6
        while true; do
            $DRBDADM primary $RES && break
            let "--try" || exit 1 # LSB generic error
            sleep 1
        done
        ;;
    stop)
        # heartbeat (haresources mode) will retry failed stop
        # for a number of times in addition to this internal retry.
        try=3
        while true; do
            $DRBDADM secondary $RES && break
            # We used to lie here, and pretend success for anything != 11,
            # to avoid the reboot on failed stop recovery for "simple
            # config errors" and such. But that is incorrect.
            # Don't lie to your cluster manager.
            # And don't do config errors...
            let --try || exit 1 # LSB generic error
            sleep 1
        done
        ;;
    status)
        if [ "$RES" = "all" ]; then
            echo "A resource name is required for status inquiries."
            exit 10
        fi
        ST=$( $DRBDADM role $RES )
        ROLE=${ST%/**}
        case $ROLE in
            Primary|Secondary|Unconfigured)
                # expected
                ;;
            *)
                # unexpected. whatever...
                # If we are unsure about the state of a resource, we need to
                # report it as possibly running, so heartbeat can, after failed
                # stop, do a recovery by reboot.
                # drbdsetup may fail for obscure reasons, e.g. if /var/lock/ is
                # suddenly readonly.  So we retry by parsing /proc/drbd.
                drbd_set_role_from_proc_drbd
        esac
        case $ROLE in
            Primary)
                echo "running (Primary)"
                exit 0 # LSB status "service is OK"
                ;;
            Secondary|Unconfigured)
                echo "stopped ($ROLE)"
                exit 3 # LSB status "service is not running"
                ;;
            *)
                # NOTE the "running" in below message.
                # this is a "heartbeat" resource script,
                # the exit code is _ignored_.
                echo "cannot determine status, may be running ($ROLE)"
                exit 4 #  LSB status "service status is unknown"
                ;;
        esac
        ;;
    *)
        echo "Usage: drbddisk [resource] {start|stop|status}"
        exit 1
        ;;
esac
exit 0
-----------------------
Make it executable:
# chmod 755 /etc/ha.d/resource.d/drbddisk
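To catch a missing resource script before Heartbeat trips over it, the names in haresources can be compared against the script directories. A minimal sketch follows; missing_resource_scripts is a hypothetical helper, and it does not handle the shorthand where a bare IP address stands in for IPaddr.

```shell
#!/bin/bash
# Sketch only: function name and arguments are made up for this example.

# missing_resource_scripts HARESOURCES DIR...
# Print each resource script named in HARESOURCES that is not an
# executable file in any of the given directories.
missing_resource_scripts() {
    local haresources="$1"
    shift
    # Skip comments and blank lines, drop the node name (field 1), and keep
    # only the script name in front of the first "::".
    awk '/^[ \t]*#/ || NF == 0 { next }
         { for (i = 2; i <= NF; i++) { split($i, p, "::"); print p[1] } }' \
        "$haresources" | sort -u |
    while read -r name; do
        found=no
        for dir in "$@"; do
            [ -x "$dir/$name" ] && found=yes
        done
        [ "$found" = yes ] || echo "$name"
    done
}

# Usage:
#   missing_resource_scripts /etc/ha.d/haresources /etc/ha.d/resource.d /etc/init.d
```

As far as I know Heartbeat also looks in /etc/init.d for resource scripts, which is why the usage line passes both directories; treat that as an assumption to confirm on your system.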
Start the Heartbeat service on both nodes, node1 first: (node1,node2)
# service heartbeat start
# chkconfig heartbeat on
If the virtual IP 192.168.7.90 now answers ping, the configuration works.
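Ping only shows that some node answers for the VIP. To see which node actually holds it, the address list on each node can be inspected. A small sketch, assuming the iproute2 ip command is available; holds_vip is a hypothetical helper that reads the output of ip -o -4 addr show on stdin.

```shell
#!/bin/bash
# Sketch only: helper name is hypothetical.

# holds_vip VIP
# Read "ip -o -4 addr show" output on stdin and succeed if VIP is configured.
holds_vip() {
    awk -v vip="$1" '
        { for (i = 2; i <= NF; i++) {
              split($i, a, "/")
              if ($(i - 1) == "inet" && a[1] == vip) found = 1
          } }
        END { exit !found }'
}

# Usage on a cluster node:
#   ip -o -4 addr show | holds_vip 192.168.7.90 && echo "this node holds the VIP"
```

In normal operation exactly one of the two nodes should report that it holds the address.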
III. Configure NFS: (node1,node2)
# vi /etc/exports
-----------------
/data        *(rw,no_root_squash)
-----------------
Restart the NFS services:
# service rpcbind restart
# service nfs restart
# chkconfig rpcbind on
# chkconfig nfs off
NFS is deliberately not started at boot, because the /etc/ha.d/resource.d/killnfsd script is what starts it.
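IV. Final tests
Mount the virtual IP 192.168.7.90 from a separate Linux client. A successful mount means NFS + DRBD + Heartbeat is working.
# mount -t nfs 192.168.7.90:/data /tmp
# df -h
---------------
......
192.168.7.90:/data   1020M   34M  934M   4% /tmp
---------------
Testing the availability of DRBD + Heartbeat + NFS:
1. Copy a file into the mounted /tmp directory, restart the DRBD service on the primary in the middle of the transfer, and watch what happens.
In my test the transfer resumed where it had left off.
2. Reboot the Primary host under normal conditions, then check whether DRBD on it returns to Primary, whether clients can still use the mount, whether previously written files are still there, and whether new files can be written.
In my test it recovered normally: the client did not need to remount the NFS share, the earlier data was intact, and new files could be written right away.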
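To put a number on how long clients stall during such a test, one option is to write a timestamp to the NFS mount once per second and look for the largest gap afterwards. This is only a sketch of that idea; write_probe, max_gap and the probe file name are hypothetical.

```shell
#!/bin/bash
# Sketch only: function names and the probe file are hypothetical.

# Append the current epoch second to FILE once per second, COUNT times.
# Run on the client against a file under the NFS mount, e.g. /tmp/probe.log.
write_probe() {
    local file="$1" count="$2" i=0
    while [ "$i" -lt "$count" ]; do
        date +%s >> "$file"
        sleep 1
        i=$((i + 1))
    done
}

# Print the largest gap, in seconds, between consecutive timestamps in FILE.
max_gap() {
    awk 'NR > 1 && $1 - prev > max { max = $1 - prev }
         { prev = $1 }
         END { print max + 0 }' "$1"
}

# Usage on the client:
#   write_probe /tmp/probe.log 120 &    # then trigger the failure on the primary
#   wait; max_gap /tmp/probe.log
```

A write that blocks while the share is unavailable shows up as a gap larger than one second, so the result roughly approximates the outage as seen by the client.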
3. If the Primary host has to be shut down for repair because of a hardware fault or some other reason, the Secondary must be promoted to Primary. How is that done by hand?
If the machine can still boot, follow the steps below. If it cannot, force-promote the Secondary to Primary; once the failed machine boots again, repair any split-brain that has occurred.
First unmount the NFS directory on the client:
# umount /tmp
(node1)
Unmount the DRBD device:
# service nfs stop
# umount /data
Demote:
# drbdadm secondary r0
Check the status; the node has been demoted:
# service drbd status
-----------------
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd1.example.com, 2013-05-27 20:45:19
m:res  cs         ro                   ds                 p  mounted  fstype
0:r0   Connected  Secondary/Secondary  UpToDate/UpToDate  C
-----------------
(node2)
Promote:
# drbdadm primary r0
Check the status; the node has been promoted:
# service drbd status
----------------
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.example.com, 2013-05-27 20:49:06
m:res  cs         ro                 ds                 p  mounted  fstype
0:r0   Connected  Primary/Secondary  UpToDate/UpToDate  C
----------------
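When these checks are repeated often, it can help to pull the connection state and roles out of the status output instead of reading it by eye. A minimal sketch, assuming the line format printed by service drbd status in this post (DRBD 8.4.3); drbd_status_fields is a hypothetical helper.

```shell
#!/bin/bash
# Sketch only: helper name is hypothetical; the line format is taken from the
# "service drbd status" output shown in this post.

# drbd_status_fields RES
# Read "service drbd status" output on stdin and print, for resource RES,
# "<connection state> <local role> <peer role>".
drbd_status_fields() {
    awk -v res="$1" '
        { n = split($1, id, ":") }
        n == 2 && id[2] == res {
            split($3, role, "/")
            print $2, role[1], role[2]
        }'
}

# Usage on a cluster node:
#   service drbd status | drbd_status_fields r0
```

After the promotion above, node2 would be expected to report Connected with a local role of Primary and a peer role of Secondary.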
The DRBD directory is not mounted yet, so let Heartbeat mount it:
Note: if the Heartbeat log shows this error during the restart:
ERROR: glib: ucast: error binding socket. Retrying: Permission denied
check whether SELinux has been disabled.
# service heartbeat restart
# service drbd status
-----------------------
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.example.com, 2013-05-27 20:49:06
m:res  cs         ro                 ds                 p  mounted  fstype
0:r0   Connected  Primary/Secondary  UpToDate/UpToDate  C  /data    ext4
------------------------
Heartbeat has mounted the DRBD directory successfully.
Repeat the NFS mount test on the client:
# mount -t nfs 192.168.7.90:/data /tmp
# ll /tmp
------------------
1  10  2  2222  3  4  5  6  7  8  9  lost+found  orbit-root
------------------
Reboot the host that was just promoted, and check the status after the reboot:
# service drbd status
------------------------
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.example.com, 2013-05-27 20:49:06
m:res  cs            ro               ds                 p  mounted  fstype
0:r0   WFConnection  Primary/Unknown  UpToDate/DUnknown  C  /data    ext4
------------------------
Heartbeat mounted the DRBD directory, DRBD carried on seamlessly on the backup node, and the client using the NFS mount point noticed nothing of the failure.
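4. Finally, once the machine that went down is back to normal, will it grab the Primary role again?
After the reboot it does not take the resources back; the primary and secondary roles have to be switched by hand.
Note: this is controlled by the following parameter in /etc/ha.d/ha.cf:
--------------------
auto_failback   off
--------------------
It means that after the failed server recovers, the new primary keeps the resources and the old primary gives them up.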
5. None of the tests above used Heartbeat for automatic failover. When the production DRBD primary goes down, does the backup node take over immediately and seamlessly? Do Heartbeat + DRBD actually deliver high availability?
First mount the NFS share on the client:
# mount -t nfs 192.168.7.90:/data /tmp
a. Stop the heartbeat service on the primary node node1. Does the standby node node2 take over?
(node1)
# service drbd status
----------------------------
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd1.example.com, 2013-05-27 20:45:19
m:res  cs         ro                 ds                 p  mounted  fstype
0:r0   Connected  Primary/Secondary  UpToDate/UpToDate  C  /data    ext4
----------------------------
# service heartbeat stop
(node2)
# service drbd status
----------------------------------------
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.example.com, 2013-05-27 20:49:06
m:res  cs         ro                 ds                 p  mounted  fstype
0:r0   Connected  Primary/Secondary  UpToDate/UpToDate  C  /data    ext4
-----------------------------------------
The standby took over seamlessly. Check whether the client can still use the NFS share:
# cd /tmp
# touch test01
# ls test01
------------------
test01
------------------
Test passed.
b. Crash the primary node (hard power-off). Does the standby node node2 take over?
(node1)
Force a shutdown by cutting power to the node1 virtual machine.
(node2)
# service drbd status
-------------------------------
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.example.com, 2013-05-27 20:49:06
m:res  cs            ro               ds                 p  mounted  fstype
0:r0   WFConnection  Primary/Unknown  UpToDate/DUnknown  C  /data    ext4
-------------------------------
The standby took over seamlessly. Check whether the client can still use the NFS share:
# cd /tmp
# touch test02
# ls test02
------------------
test02
------------------
Once node1 has booted again, check the DRBD status:
# service drbd status
------------------------------
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.example.com, 2013-05-27 20:49:06
m:res  cs         ro                 ds                 p  mounted  fstype
0:r0   Connected  Primary/Secondary  UpToDate/UpToDate  C  /data    ext4
-------------------------------
node1 is back online and UpToDate. Test passed.
Note: there is some chance that node2 fails to take over when the heartbeat service on node1 is stopped, so this setup carries a certain maintenance cost. I am not running it in production myself; before going live, run plenty of simulated failure drills.
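For repeated failure drills it helps to script the "did the standby take over in time" check instead of watching by hand. The following is a rough sketch of a polling helper; wait_until and the 30-second limit in the usage line are hypothetical choices, not values from the tests above.

```shell
#!/bin/bash
# Sketch only: helper name and timings are hypothetical.

# wait_until LIMIT COMMAND [ARGS...]
# Re-run COMMAND about once per second until it succeeds or roughly LIMIT
# seconds of waiting have passed. Returns 0 on success, 1 on timeout.
wait_until() {
    local limit="$1" waited=0
    shift
    while ! "$@"; do
        [ "$waited" -ge "$limit" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage on the client during a drill, right after powering off the primary:
#   wait_until 30 ping -c 1 -W 1 192.168.7.90 && echo "VIP is back" || echo "no takeover"
```

Running the drill many times with a check like this gives a feel for how often the takeover fails, which is the risk the note above warns about.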
------- All done ----------
Reference: Jiu Ge (酒哥), "Building Highly Available Linux Servers" (构建高可用Linux服务器)
This article comes from the "一路向北" blog; please keep this attribution: http://showerlee.blog.51cto.com/2047005/1212185
Reposted from: https://blog.51cto.com/lucifer119/1222533