An Oracle crash caused by cleaning up Apache shared memory
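My platform is Red Hat AS 3 with Oracle 9204.
The other applications on it are Apache, Resin and so on.
I had noticed earlier that once Apache has been running for a long time, it starts failing with errors about insufficient shared memory. The exact messages are: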
[Fri Apr 13 06:00:03 2007] [error] shm.create(): error creating shm 2 No such file or directory
[Fri Apr 13 06:00:03 2007] [error] shm.create(): error creating shm /home/apache/logs/shm.file
[Fri Apr 13 06:00:03 2007] [warn] pid file /home/apache/logs/httpd.pid overwritten -- Unclean shutdown of previous Apache run?
[Fri Apr 13 06:00:03 2007] [emerg] (28)No space left on device: Couldn't create accept lock
So I wrote a script that periodically checks for this and cleans up, and it had always worked well.
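For the "check" half, here is a minimal sketch of one way such a test could look. It is not the script I actually ran. It assumes the usual cause of Apache's "(28) No space left on device: Couldn't create accept lock" message, namely that the kernel has run out of semaphore arrays, and it compares the number of arrays in use against the SEMMNI limit (the fourth value in /proc/sys/kernel/sem). The function name and the percentage threshold are made up for illustration.

```shell
# Succeed (exit 0) when the number of semaphore arrays in use has reached
# the given percentage of the limit, otherwise exit 1.
# Reads `ipcs -s` style output on stdin.
#   $1 = limit on the number of arrays (SEMMNI)
#   $2 = threshold in percent (hypothetical tuning knob)
sem_arrays_near_limit() {
    awk -v limit="$1" -v pct="$2" '
        $1 ~ /^0x/ { n++ }                      # count data rows only
        END { exit !(n * 100 >= limit * pct) }  # exit 0 when at or over threshold
    '
}

# Possible use from cron (not executed here):
#   semmni=$(awk "{ print \$4 }" /proc/sys/kernel/sem)
#   ipcs -s | sem_arrays_near_limit "$semmni" 90 && echo "semaphore arrays nearly exhausted"
```

A check like this only tells you that arrays are running out. It says nothing about which of them are safe to remove.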
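A while ago we launched a small new application, also Apache-based. There was nowhere else to put it, so it went onto the Oracle machine, where it had been running fine.
This morning I got a message that this new Apache application had stopped. I opened the log and it was the shared memory problem again. Without a second thought I ran my old script on that system and restarted Apache. OK, the system was working again.
A few minutes later the real trouble started: I was told the Oracle service was down. I checked right away. ps -ef | grep oracle showed that the Oracle processes were all gone, and the alert log contained the following: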
Errors in file /opt/oracle/admin/sc1/bdump/sc1_reco_5195.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
Fri Apr 13 10:10:46 2007
Errors in file /opt/oracle/admin/sc1/bdump/sc1_smon_5193.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
Fri Apr 13 10:10:46 2007
RECO: terminating instance due to error 27157
Fri Apr 13 10:10:46 2007
Errors in file /opt/oracle/admin/sc1/udump/sc1_ora_23824.trc:
ORA-27153: wait operation failed
ORA-27300: OS system dependent operation:semop failed with status: 22
ORA-27301: OS failure message: Invalid argument
ORA-27302: failure occurred at: sskgpwwait2
Fri Apr 13 10:10:46 2007
Errors in file /opt/oracle/admin/sc1/bdump/sc1_lgwr_5189.trc:
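So it was a system-level problem that had brought Oracle down. Thinking back to what I had just done, I suspected I had also wiped Oracle's shared memory by mistake. Fortunately the database started normally. After bringing it up, I checked the shared memory: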
[root@oracle]# ipcs -s
------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 4849664    nobody     600        1
0x00000000 4882433    nobody     600        1
0x00000000 4915202    nobody     600        1
0x00000000 4947971    nobody     600        1
0x00000000 4980740    nobody     600        1
0xbeae576c 5111813    oracle     640        201
0xbeae576d 5144582    oracle     640        201
0xbeae576e 5177351    oracle     640        201
0xbeae576f 5210120    oracle     640        201
0xbeae5770 5242889    oracle     640        201
0x00000000 5275658    nobody     600        1
0x00000000 5308427    nobody     600        1
0x00000000 5341196    nobody     600        1
0x00000000 5373965    nobody     600        1
0x00000000 5406734    nobody     600        1
0x00000000 5439503    nobody     600        1
0x00000000 5472272    nobody     600        1
0x00000000 5505041    nobody     600        1
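Sure enough, Oracle's entries are in there, and my script never checked who owned what it was removing. (Strictly speaking, ipcs -s lists semaphore arrays rather than shared memory segments, which matches the semop "Identifier removed" errors above.) To delete only the ones that belong to the apache user, you can do this, substituting the account Apache actually runs as (in the listing above it is nobody):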
ipcs -s | grep apache | perl -e 'while (<STDIN>) {@a=split(/\s+/); print `ipcrm sem $a[1]`}'
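A slightly more defensive variant, as a sketch only. It matches the owner column exactly instead of grepping the whole line, and by default it prints the ipcrm commands rather than running them. The function names and the DRY_RUN variable are my own invention for illustration. `ipcrm -s` is the newer util-linux spelling of the `ipcrm sem` form used above, so check which one your system accepts.

```shell
# Print the semid of every semaphore array whose owner column is exactly $1.
# Reads `ipcs -s` style output on stdin (columns: key semid owner perms nsems).
sem_ids_for_owner() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { print $2 }'
}

# Remove (or, by default, only list) the semaphore arrays owned by user $1.
# DRY_RUN is a hypothetical switch: any value other than 0 just echoes.
remove_sems_for_owner() {
    ipcs -s | sem_ids_for_owner "$1" | while read -r id; do
        if [ "${DRY_RUN:-1}" = "0" ]; then
            ipcrm -s "$id"
        else
            echo "would run: ipcrm -s $id"
        fi
    done
}
```

Running remove_sems_for_owner nobody only prints what would be removed. Running it as DRY_RUN=0 remove_sems_for_owner nobody actually removes the arrays. Either way, entries owned by oracle are never touched unless you ask for that owner by name.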