11gR2 RAC: resolving a node going down because of a time synchronization problem
In a lab environment running 11.2.0.4 RAC, one node went down. A check of the logs turned up the following messages in the octssd log:
2016-01-17 23:15:20.564: [    CTSS][1175029504]ctsscomm_recv_cb2: Receive incoming message event. Msgtype [3].
2016-01-17 23:15:20.564: [    CTSS][1175029504]ctsscomm_recv_cb4_2: Receive active version change msg. Old active version [186647552] New active version [186647552].
2016-01-17 23:15:20.564: [    CTSS][1175029504]ctsscomm_recv_cb2: Receive incoming message event. Msgtype [2].
2016-01-17 23:15:20.564: [    CTSS][1175029504]ctssslave_msg_handler4_1: Waiting for slave_sync_with_master to finish sync process. sync_state[3].
2016-01-17 23:15:20.564: [    CTSS][1168725760]ctssslave_swm2_3: Received time sync message from master.
2016-01-17 23:15:20.565: [    CTSS][1168725760]ctssslave_swm: sendtime{sec[1453043718], usec[550689]}, receivetime{sec[1453043720], usec[564960]}.
2016-01-17 23:15:20.565: [    CTSS][1168725760]ctssslave_swm: The RTT of sync msg [2014271] is too large for time sync to be accurate. Recommends retry. Returns [17].
2016-01-17 23:15:20.565: [    CTSS][1168725760]ctssslave_swm: Received from master (mode [0x8c] nodenum [1] hostname [jason1] )
2016-01-17 23:15:20.565: [    CTSS][1168725760]ctsselect_monitor_steysync_mode: Failed in clsctssslave_sync_with_master [17]. Retries [0/3].
2016-01-17 23:15:20.565: [    CTSS][1168725760]ctssslave_swm1_1: Waiting for last time sync process to finish. sync_state[6].
2016-01-17 23:15:20.565: [    CTSS][1175029504]ctssslave_msg_handler4_3: slave_sync_with_master finished sync process. Exiting clsctssslave_msg_handler
2016-01-17 23:15:20.565: [    CTSS][1168725760]ctssslave_swm1_2: Ready to initiate new time sync process.
2016-01-17 23:15:20.565: [    CTSS][1168725760]ctssslave_swm2_1: Waiting for time sync message from master. sync_state[2].
2016-01-17 23:15:20.566: [    CTSS][1175029504]ctsscomm_recv_cb2: Receive incoming message event. Msgtype [2].
2016-01-17 23:15:20.566: [    CTSS][1175029504]ctssslave_msg_handler4_1: Waiting for slave_sync_with_master to finish sync process. sync_state[3].
2016-01-17 23:15:20.566: [    CTSS][1168725760]ctssslave_swm2_3: Received time sync message from master.
2016-01-17 23:15:20.566: [    CTSS][1168725760]ctssslave_swm: The magnitude [733548803120 usec] of the offset [733548803120 usec] is larger than [86400000000 usec] sec which is the CTSS limit.
2016-01-17 23:15:20.566: [    CTSS][1168725760]ctsselect_monitor_steysync_mode: Failed in clsctssslave_sync_with_master [12]: Time offset is too much to be corrected
2016-01-17 23:15:20.566: [    CTSS][1175029504]ctssslave_msg_handler4_3: slave_sync_with_master finished sync process. Exiting clsctssslave_msg_handler
2016-01-17 23:15:21.287: [    CTSS][1190360832]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xd0], offset[733548803 ms]}, length=[8].
2016-01-17 23:15:21.287: [    CTSS][1168725760]ctsselect_monitor_steysync_mode: CTSS daemon exiting [12].
2016-01-17 23:15:21.287: [    CTSS][1168725760]CTSS daemon aborting
2016-01-17 23:15:22.290: [    CTSS][1190360832]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xd0], offset[733548803 ms]}, length=[8].
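The numbers in this log can be sanity-checked with plain integer arithmetic. The following is only an illustrative sketch: `usec_to_dhms`, `exceeds_ctss_limit` and `rtt_usec` are hypothetical helper names, not Oracle tools, and the limit is simply the value printed in the log (86400000000 usec, i.e. one day).

```shell
# Limit as printed in the octssd log: 86400000000 usec = 1 day.
CTSS_LIMIT_USEC=86400000000

# Convert a microsecond offset to "Dd Hh Mm Ss" (sub-second part is dropped).
usec_to_dhms() {
    local secs=$(( $1 / 1000000 ))
    echo "$(( secs / 86400 ))d $(( secs % 86400 / 3600 ))h $(( secs % 3600 / 60 ))m $(( secs % 60 ))s"
}

# Exit status 0 when the magnitude of the offset is above the limit.
exceeds_ctss_limit() {
    local off=$1
    [ "${off#-}" -gt "$CTSS_LIMIT_USEC" ]
}

# Receive time minus send time, in microseconds.
# Arguments: send_sec send_usec recv_sec recv_usec
rtt_usec() {
    echo $(( ($3 - $1) * 1000000 + ($4 - $2) ))
}

usec_to_dhms 733548803120
exceeds_ctss_limit 733548803120 && echo "offset exceeds the CTSS limit"
rtt_usec 1453043718 550689 1453043720 564960
```

With the values from the log, the offset works out to 8d 11h 45m 48s, well above the one-day limit, and the send/receive timestamps reproduce the reported RTT of 2014271 usec.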
Checking the system time on the two servers showed:
jason1:~ # date
Sat Jan  9 11:37:18 CST 2016
jason2:~ # date
Sun Jan 17 23:23:12 CST 2016
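One way to put a number on the skew is to compare epoch seconds taken on each host. This is a minimal sketch, assuming passwordless ssh between the nodes; `skew_seconds` is a hypothetical helper, and the hostname in the comment is the one from the prompts above.

```shell
# Absolute difference between two epoch-second readings.
skew_seconds() {
    local d=$(( $1 - $2 ))
    echo "${d#-}"
}

# Example with two made-up readings 60 seconds apart:
skew_seconds 1000 1060

# On a live system (assumption: run on jason1, ssh to jason2 works):
#   skew_seconds "$(date +%s)" "$(ssh jason2 date +%s)"
```

The ssh round trip adds some latency of its own, so a result of a second or two is not meaningful; a skew of days, as in this case, is unmistakable.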
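The clocks on the two servers differ by about 8 days, while Oracle's time synchronization service will only correct an offset of up to 1 day (the 86400000000 usec limit shown in the log). An 8-day gap is far beyond that limit, so one node was evicted from the cluster, and because the clocks were still out of sync, the node also failed when it tried to rejoin the cluster after restarting. Making the time consistent on both servers therefore resolves the node-down problem. First shut down the cluster, then set both nodes to the same, current time, and finally start the cluster again or reboot both servers. After that the problem is resolved.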
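The original post does not list the exact commands, so the sequence below is only a sketch of how the steps above might look. It assumes the commands are run as root on each node with the Grid Infrastructure `bin` directory on `PATH`. The wrapper functions `run` and `fix_clock_and_restart` and the timestamp are hypothetical, and by default the script only prints the commands instead of executing them.

```shell
# Print each command by default; set DRY_RUN=0 to actually execute (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Argument: the correct current time, in a format accepted by `date -s`.
fix_clock_and_restart() {
    local target_time=$1
    run crsctl stop crs           # stop the clusterware stack on this node
    run date -s "$target_time"    # set the system clock
    run hwclock --systohc         # copy the system time to the hardware clock
    run crsctl start crs          # start the clusterware stack again
    run crsctl check ctss         # report the CTSS state once the stack is up
}

# Illustrative timestamp only; use the real current time.
fix_clock_and_restart "2016-01-17 23:30:00"
```

Running the same steps on both nodes before starting the stack on either keeps the clocks consistent when CTSS comes back up.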
Reference: http://blog.itpub.net/4227/viewspace-695164/
Reposted from: https://blog.51cto.com/369day/1737280