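Redis Ramblings
Redis Sentinel is the high-availability solution for Redis. It was officially introduced in Redis 2.8.
With the earlier master-slave replication setup, if the master failed you had to promote a slave to master by hand, point the remaining slaves at the new master, and update the master address used by the application. The whole process required manual intervention.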

Sentinel configuration parameters
    # bind 127.0.0.1 192.168.1.1
    # protected-mode no
    port 26379
    # sentinel announce-ip <ip>
    # sentinel announce-port <port>
    dir /tmp
    sentinel monitor mymaster 127.0.0.1 6379 2
    # sentinel auth-pass <master-name> <password>
    sentinel down-after-milliseconds mymaster 30000
    sentinel parallel-syncs mymaster 1
    sentinel failover-timeout mymaster 180000
    # sentinel notification-script mymaster /var/redis/notify.sh
    # sentinel client-reconfig-script mymaster /var/redis/reconfig.sh
    sentinel deny-scripts-reconfig yes
Where:
dir: sets Sentinel's working directory.
sentinel monitor mymaster 127.0.0.1 6379 2: the trailing 2 is the quorum, that is, the number of Sentinels that must agree (it is a count, not a weight). At least two Sentinels must consider the master subjectively down before it is judged objectively down. The usual recommendation is half the number of Sentinels plus one. The quorum also affects Sentinel leader election: to elect a leader, at least max(quorum, num(sentinels) / 2 + 1) Sentinels must take part in the election.
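The two rules above can be written down as a small sketch. The function names are made up for illustration and are not part of Redis; the arithmetic simply follows the formula quoted in the text.

```python
def voters_needed_for_leader(quorum: int, num_sentinels: int) -> int:
    """Sentinels that must take part in a leader election: max(quorum, majority)."""
    majority = num_sentinels // 2 + 1
    return max(quorum, majority)


def is_objectively_down(sdown_reports: int, quorum: int) -> bool:
    """The master is objectively down once at least `quorum` Sentinels report it subjectively down."""
    return sdown_reports >= quorum


# Print the election requirement for a few (quorum, total Sentinels) combinations.
for quorum, total in [(2, 3), (2, 5), (4, 5)]:
    print(quorum, total, voters_needed_for_leader(quorum, total))
```

With three Sentinels and quorum 2, both thresholds coincide; with five Sentinels and quorum 2, the majority (3) becomes the stricter requirement for the election.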

sentinel down-after-milliseconds mymaster 30000: each Sentinel periodically sends PING to determine whether the Redis nodes and the other Sentinels are reachable.
If no valid reply is received from the master within the specified time, the Sentinel marks it as subjectively down. Note that this parameter is used not only for the master but also for the slaves of that master and for the other Sentinels. The default is 30000 ms (30 s).
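As a rough model of the subjective-down check (a sketch only; the function and parameter names are assumptions, and the real Sentinel tracks more state than a single timestamp):

```python
def is_subjectively_down(last_valid_reply_ms: int, now_ms: int,
                         down_after_ms: int = 30000) -> bool:
    """True when no valid reply has been seen for longer than down_after_ms.

    last_valid_reply_ms and now_ms are timestamps in milliseconds taken from
    the same clock; down_after_ms mirrors down-after-milliseconds.
    """
    return now_ms - last_valid_reply_ms > down_after_ms
```

The same check would be applied by a Sentinel to the master, to each slave, and to every other Sentinel it knows about.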

sentinel parallel-syncs mymaster 1: how many slaves may be repointed to the new master at the same time during a failover. If this value is set high, replication still does not block the master, but many slaves syncing with the new master at once increases its network and disk I/O load.
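To get a feel for the trade-off, the following sketch estimates how many rounds of reconfiguration are needed for a given parallel-syncs value. It is a simplified model with a hypothetical function name: it assumes every round takes a full batch and ignores the failover-timeout fallback.

```python
import math


def sync_rounds(num_slaves: int, parallel_syncs: int) -> int:
    """Rounds needed to repoint all slaves when at most parallel_syncs move at once."""
    if parallel_syncs < 1:
        raise ValueError("parallel_syncs must be at least 1")
    return math.ceil(num_slaves / parallel_syncs)
```

A low value spreads the load on the new master over more rounds; a high value finishes sooner but has more slaves syncing at the same time.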

sentinel failover-timeout mymaster 180000: the failover timeout. The default is 180000 milliseconds, i.e. 3 minutes. Note that this is not the total duration of a failover; it applies to several different stages of a failover:
    # Specifies the failover timeout in milliseconds. It is used in many ways:
    #
    # - The time needed to re-start a failover after a previous failover was
    #   already tried against the same master by a given Sentinel, is two
    #   times the failover timeout.
    #
    # - The time needed for a slave replicating to a wrong master according
    #   to a Sentinel current configuration, to be forced to replicate
    #   with the right master, is exactly the failover timeout (counting since
    #   the moment a Sentinel detected the misconfiguration).
    #
    # - The time needed to cancel a failover that is already in progress but
    #   did not produced any configuration change (SLAVEOF NO ONE yet not
    #   acknowledged by the promoted slave).
    #
    # - The maximum time a failover in progress waits for all the slaves to be
    #   reconfigured as slaves of the new master. However even after this time
    #   the slaves will be reconfigured by the Sentinels anyway, but not with
    #   the exact parallel-syncs progression as specified.
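The first two uses in the comment are simple multiples of the configured value, which makes the millisecond unit easy to check. A small sketch, with illustrative key names that are not Redis identifiers:

```python
def failover_timings(failover_timeout_ms: int = 180000) -> dict:
    """Derive, in seconds, the intervals the sentinel.conf comment describes."""
    timeout_s = failover_timeout_ms / 1000
    return {
        # The configured value itself: 180000 ms is 180 s, i.e. 3 minutes.
        "failover_timeout_s": timeout_s,
        # A new failover against the same master is retried after twice the timeout.
        "retry_same_master_after_s": 2 * timeout_s,
        # A slave replicating from the wrong master is corrected after one timeout.
        "fix_wrong_master_after_s": timeout_s,
    }
```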

sentinel notification-script: defines the notification script. When Sentinel raises an event at WARNING level, it calls this script with two arguments: the event type and the event description.
sentinel client-reconfig-script: when the master changes, Sentinel calls the script defined by this parameter with the following arguments: <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
Scripts must follow certain rules.
    # SCRIPTS EXECUTION
    #
    # sentinel notification-script and sentinel reconfig-script are used in order
    # to configure scripts that are called to notify the system administrator
    # or to reconfigure clients after a failover. The scripts are executed
    # with the following rules for error handling:
    #
    # If script exits with "1" the execution is retried later (up to a maximum
    # number of times currently set to 10).
    #
    # If script exits with "2" (or an higher value) the script execution is
    # not retried.
    #
    # If script terminates because it receives a signal the behavior is the same
    # as exit code 1.
    #
    # A script has a maximum running time of 60 seconds. After this limit is
    # reached the script is terminated with a SIGKILL and the execution retried.
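A client-reconfig-script could be sketched in Python along these lines. This is only an illustration of the argument order and the exit-code rules quoted above: the class, the function names, and the choice to write a single log line are all assumptions, and a real script would update whatever the clients read their master address from.

```python
import sys
from dataclasses import dataclass

# Exit codes as described in the SCRIPTS EXECUTION comment.
EXIT_OK = 0        # success
EXIT_RETRY = 1     # Sentinel retries the script later
EXIT_NO_RETRY = 2  # Sentinel does not retry


@dataclass
class ReconfigEvent:
    master_name: str
    role: str
    state: str
    from_ip: str
    from_port: int
    to_ip: str
    to_port: int


def parse_args(argv):
    """Parse the seven positional arguments passed to the script."""
    if len(argv) != 7:
        raise ValueError("expected 7 arguments, got %d" % len(argv))
    name, role, state, from_ip, from_port, to_ip, to_port = argv
    return ReconfigEvent(name, role, state, from_ip, int(from_port), to_ip, int(to_port))


def main(argv, out=sys.stdout) -> int:
    try:
        event = parse_args(argv)
    except ValueError:
        # Malformed input will not improve on a retry.
        return EXIT_NO_RETRY
    try:
        out.write("%s: master moved from %s:%d to %s:%d\n" % (
            event.master_name, event.from_ip, event.from_port,
            event.to_ip, event.to_port))
    except OSError:
        # A transient I/O problem is worth retrying.
        return EXIT_RETRY
    return EXIT_OK
```

To use it as a standalone script, the file would end with `sys.exit(main(sys.argv[1:]))`, and it should finish well within the 60-second limit.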
sentinel deny-scripts-reconfig: when set to yes, notification-script and client-reconfig-script cannot be changed at runtime with SENTINEL SET.

Common Sentinel operations
- PING This command simply returns PONG.
- SENTINEL masters Show a list of monitored masters and their state.
- SENTINEL master <master name> Show the state and info of the specified master.
- SENTINEL slaves <master name> Show a list of slaves for this master, and their state.
- SENTINEL sentinels <master name> Show a list of sentinel instances for this master, and their state.
- SENTINEL get-master-addr-by-name <master name> Return the ip and port number of the master with that name. If a failover is in progress or terminated successfully for this master it returns the address and port of the promoted slave.
- SENTINEL reset <pattern> This command will reset all the masters with matching name. The pattern argument is a glob-style pattern. The reset process clears any previous state in a master (including a failover in progress), and removes every slave and sentinel already discovered and associated with the master.
- SENTINEL failover <master name> Force a failover as if the master was not reachable, and without asking for agreement to other Sentinels (however a new version of the configuration will be published so that the other Sentinels will update their configurations).
- SENTINEL ckquorum <master name> Check if the current Sentinel configuration is able to reach the quorum needed to failover a master, and the majority needed to authorize the failover. This command should be used in monitoring systems to check if a Sentinel deployment is ok.
- SENTINEL flushconfig Force Sentinel to rewrite its configuration on disk, including the current Sentinel state. Normally Sentinel rewrites the configuration every time something changes in its state (in the context of the subset of the state which is persisted on disk across restart). However sometimes it is possible that the configuration file is lost because of operation errors, disk failures, package upgrade scripts or configuration managers. In those cases a way to force Sentinel to rewrite the configuration file is handy. This command works even if the previous configuration file is completely missing.
sentinel masters
Shows the state of the monitored masters.
    127.0.0.1:26379> sentinel masters
    1)  1) "name"
        2) "mymaster"
        3) "ip"
        4) "127.0.0.1"
        5) "port"
        6) "6379"
        7) "runid"
        8) "6ab2be5db3a37c10f2473c8fb9daed147a32df3e"
        9) "flags"
       10) "master"
       11) "link-pending-commands"
       12) "0"
       13) "link-refcount"
       14) "1"
       15) "last-ping-sent"
       16) "0"
       17) "last-ok-ping-reply"
       18) "639"
       19) "last-ping-reply"
       20) "639"
       21) "down-after-milliseconds"
       22) "30000"
       23) "info-refresh"
       24) "2075"
       25) "role-reported"
       26) "master"
       27) "role-reported-time"
       28) "759682"
       29) "config-epoch"
       30) "0"
       31) "num-slaves"
       32) "2"
       33) "num-other-sentinels"
       34) "2"
       35) "quorum"
       36) "2"
       37) "failover-timeout"
       38) "180000"
       39) "parallel-syncs"
       40) "1"
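The reply is a flat list that alternates field names and values, so a monitoring script would typically fold it into a dictionary first. A minimal sketch, assuming the reply has already been read into a Python list of strings; both function names are hypothetical, and the flag names `s_down` and `o_down` are what Sentinel reports for subjectively and objectively down instances.

```python
from typing import Dict, List


def pairs_to_dict(flat: List[str]) -> Dict[str, str]:
    """Turn ["name", "mymaster", "ip", "127.0.0.1", ...] into a dict."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of elements")
    return dict(zip(flat[0::2], flat[1::2]))


def looks_healthy(info: Dict[str, str]) -> bool:
    """True if the flags field lists neither s_down nor o_down."""
    flags = info["flags"].split(",")
    return "s_down" not in flags and "o_down" not in flags
```

The same helper works for each element returned by sentinel slaves and sentinel sentinels, since they use the same alternating layout.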
The state of a single master can also be viewed on its own:
    sentinel master mymaster

sentinel slaves mymaster
Shows the state of the slaves of a given master.
    127.0.0.1:26379> sentinel slaves mymaster
    1)  1) "name"
        2) "127.0.0.1:6380"
        3) "ip"
        4) "127.0.0.1"
        5) "port"
        6) "6380"
        7) "runid"
        8) "983b87fd070c7f052b26f5135bbb30fdeb170a54"
        9) "flags"
       10) "slave"
       11) "link-pending-commands"
       12) "0"
       13) "link-refcount"
       14) "1"
       15) "last-ping-sent"
       16) "0"
       17) "last-ok-ping-reply"
       18) "178"
       19) "last-ping-reply"
       20) "178"
       21) "down-after-milliseconds"
       22) "30000"
       23) "info-refresh"
       24) "6160"
       25) "role-reported"
       26) "slave"
       27) "role-reported-time"
       28) "489019"
       29) "master-link-down-time"
       30) "0"
       31) "master-link-status"
       32) "ok"
       33) "master-host"
       34) "127.0.0.1"
       35) "master-port"
       36) "6379"
       37) "slave-priority"
       38) "100"
       39) "slave-repl-offset"
       40) "70375"
    2)  1) "name"
        2) "127.0.0.1:6381"
        3) "ip"
        4) "127.0.0.1"
        5) "port"
        6) "6381"
        7) "runid"
        8) "b88059cce9104dd4e0366afd6ad07a163dae8b15"
        9) "flags"
       10) "slave"
       11) "link-pending-commands"
       12) "0"
       13) "link-refcount"
       14) "1"
       15) "last-ping-sent"
       16) "0"
       17) "last-ok-ping-reply"
       18) "178"
       19) "last-ping-reply"
       20) "178"
       21) "down-after-milliseconds"
       22) "30000"
       23) "info-refresh"
       24) "2918"
       25) "role-reported"
       26) "slave"
       27) "role-reported-time"
       28) "489019"
       29) "master-link-down-time"
       30) "0"
       31) "master-link-status"
       32) "ok"
       33) "master-host"
       34) "127.0.0.1"
       35) "master-port"
       36) "6379"
       37) "slave-priority"
       38) "100"
       39) "slave-repl-offset"
       40) "71040"
sentinel sentinels mymaster
Shows the state of the other Sentinels.
    127.0.0.1:26379> sentinel sentinels mymaster
    1)  1) "name"
        2) "738ccbddaa0d4379d89a147613d9aecfec765bcb"
        3) "ip"
        4) "127.0.0.1"
        5) "port"
        6) "26381"
        7) "runid"
        8) "738ccbddaa0d4379d89a147613d9aecfec765bcb"
        9) "flags"
       10) "sentinel"
       11) "link-pending-commands"
       12) "0"
       13) "link-refcount"
       14) "1"
       15) "last-ping-sent"
       16) "0"
       17) "last-ok-ping-reply"
       18) "475"
       19) "last-ping-reply"
       20) "475"
       21) "down-after-milliseconds"
       22) "30000"
       23) "last-hello-message"
       24) "79"
       25) "voted-leader"
       26) "?"
       27) "voted-leader-epoch"
       28) "0"
    2)  1) "name"
        2) "7251bb129ca373ad0d8c7baf3b6577ae2593079f"
        3) "ip"
        4) "127.0.0.1"
        5) "port"
        6) "26380"
        7) "runid"
        8) "7251bb129ca373ad0d8c7baf3b6577ae2593079f"
        9) "flags"
       10) "sentinel"
       11) "link-pending-commands"
       12) "0"
       13) "link-refcount"
       14) "1"
       15) "last-ping-sent"
       16) "0"
       17) "last-ok-ping-reply"
       18) "475"
       19) "last-ping-reply"
       20) "475"
       21) "down-after-milliseconds"
       22) "30000"
       23) "last-hello-message"
       24) "985"
       25) "voted-leader"
       26) "?"
       27) "voted-leader-epoch"
       28) "0"
sentinel get-master-addr-by-name <master name>
Returns the IP address and port of the master named <master name>. If a failover is in progress, the address of the new master is returned.
    127.0.0.1:26379> sentinel get-master-addr-by-name mymaster
    1) "127.0.0.1"
    2) "6379"
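This is the command a client uses to find the current master. The sketch below issues it over a plain socket using the Redis wire protocol (RESP) and only the Python standard library. The function names are made up for illustration, the single `recv` call is a simplification that assumes the short reply arrives in one read, and treating a null reply as "unknown master" is an assumption; production code would normally rely on a client library with Sentinel support.

```python
import socket
from typing import Optional, Tuple


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def parse_addr_reply(reply: bytes) -> Optional[Tuple[str, int]]:
    """Parse the two-element reply; return None for a null reply."""
    lines = reply.split(b"\r\n")
    if lines[0] in (b"*-1", b"$-1"):
        return None
    if lines[0] != b"*2" or len(lines) < 5:
        raise ValueError("unexpected reply: %r" % reply)
    # Layout: *2, $<len>, <ip>, $<len>, <port>
    return lines[2].decode(), int(lines[4])


def get_master_addr(host: str, port: int, master_name: str,
                    timeout: float = 1.0) -> Optional[Tuple[str, int]]:
    """Ask one Sentinel for the current address of master_name."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(encode_command("SENTINEL", "get-master-addr-by-name", master_name))
        reply = conn.recv(4096)
    return parse_addr_reply(reply)
```

Against the setup shown above, `get_master_addr("127.0.0.1", 26379, "mymaster")` would be the call to make; a client would usually try each known Sentinel in turn until one answers.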
sentinel reset <pattern>
Resets the configuration of every master whose name matches <pattern> (glob style).
If a slave goes down, it is still managed by Sentinel, so once it recovers it rejoins the previous replication topology, even if its configuration file has no slaveof option. Likewise, if the master goes down, after a restart it joins the previous replication topology as a slave by default.
Often, though, we really do want to remove the old master or a slave, and this is where sentinel reset comes in. It resets the configuration based on the current state of the master (they'll refresh the list of slaves within the next 10 seconds, only adding the ones listed as correctly replicating from the current master INFO output). The key point is that slaves not in a normal state are dropped from the current configuration, so a dropped node does not automatically rejoin the previous replication topology after it recovers (note that slaveof must be removed from its configuration file at that point).
Note that this command only affects the Sentinel it is sent to. To remove a node, run reset on every Sentinel.
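Because the reset has to be repeated on each Sentinel, it is convenient to generate the commands from a list of addresses. The helper below only builds the `redis-cli` argument lists; its name and the address list are assumptions for illustration, using the three Sentinel ports that appear in this article.

```python
from typing import List, Tuple


def reset_commands(sentinels: List[Tuple[str, int]], pattern: str) -> List[List[str]]:
    """Build one `redis-cli ... sentinel reset <pattern>` command per Sentinel."""
    return [
        ["redis-cli", "-h", host, "-p", str(port), "sentinel", "reset", pattern]
        for host, port in sentinels
    ]


# Example: the three local Sentinels used in this article.
SENTINELS = [("127.0.0.1", 26379), ("127.0.0.1", 26380), ("127.0.0.1", 26381)]
for command in reset_commands(SENTINELS, "mymaster"):
    print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` to execute it, running them one Sentinel at a time.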

sentinel failover <master name>
Forces a failover of the master named <master name>. Unlike a regular failover, it does not require agreement from the other Sentinels.

sentinel ckquorum <master name>
Checks whether the number of currently reachable Sentinels is enough to reach <quorum>, and to reach the majority needed to authorize a failover.
    127.0.0.1:26379> sentinel ckquorum mymaster
    OK 3 usable Sentinels. Quorum and failover authorization can be reached
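Since this command is meant for monitoring, a check script mainly needs to tell an OK reply from anything else and pull out the usable-Sentinel count. A minimal sketch, with a hypothetical function name; it treats any reply that does not start with OK as a failure rather than relying on specific error text.

```python
import re
from typing import Optional, Tuple

_USABLE = re.compile(r"(\d+) usable Sentinels")


def parse_ckquorum(reply: str) -> Tuple[bool, Optional[int]]:
    """Return (ok, usable_sentinels) for a SENTINEL ckquorum reply string."""
    ok = reply.startswith("OK")
    match = _USABLE.search(reply)
    usable = int(match.group(1)) if match else None
    return ok, usable
```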
sentinel flushconfig
Forces Sentinel to write its configuration to disk. Sentinel itself does this frequently; developers and operators only need the command when the configuration file has been damaged or lost for external reasons (for example, disk failure).

sentinel remove <master name>
Stops the current Sentinel from monitoring the master named <master name>.
    [root@slowtech redis-4.0.11]# grep -Ev "^#|^$" sentinel_26379.conf
    port 26379
    dir "/tmp"
    sentinel myid 2467530fa249dbbc435c50fbb0dc2a4e766146f8
    sentinel deny-scripts-reconfig yes
    sentinel monitor mymaster 127.0.0.1 6381 2
    sentinel config-epoch mymaster 12
    sentinel leader-epoch mymaster 0
    sentinel known-slave mymaster 127.0.0.1 6380
    sentinel known-slave mymaster 127.0.0.1 6379
    sentinel known-sentinel mymaster 127.0.0.1 26381 738ccbddaa0d4379d89a147613d9aecfec765bcb
    sentinel known-sentinel mymaster 127.0.0.1 26380 7251bb129ca373ad0d8c7baf3b6577ae2593079f
    sentinel current-epoch 12
    [root@slowtech redis-4.0.11]# redis-cli -p 26379
    127.0.0.1:26379> sentinel remove mymaster
    OK
    127.0.0.1:26379> quit
    [root@slowtech redis-4.0.11]# grep -Ev "^#|^$" sentinel_26379.conf
    port 26379
    dir "/tmp"
    sentinel myid 2467530fa249dbbc435c50fbb0dc2a4e766146f8
    sentinel deny-scripts-reconfig yes
    sentinel current-epoch 12
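The before-and-after comparison can also be done programmatically by listing the `sentinel monitor` lines in the configuration file. The parser below is a small sketch with a made-up function name; it handles only the simple one-directive-per-line layout shown above.

```python
from typing import Dict, Tuple


def monitored_masters(conf_text: str) -> Dict[str, Tuple[str, int, int]]:
    """Map each monitored master name to (ip, port, quorum)."""
    masters = {}
    for line in conf_text.splitlines():
        parts = line.split()
        # Expected shape: sentinel monitor <name> <ip> <port> <quorum>
        if len(parts) == 6 and parts[0] == "sentinel" and parts[1] == "monitor":
            _, _, name, ip, port, quorum = parts
            masters[name] = (ip, int(port), int(quorum))
    return masters
```

Run against the file before the remove it should list mymaster, and afterwards it should return an empty dictionary, because the command drops the monitor line along with the related known-slave and known-sentinel entries.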
Summary