[Oracle Operations Engineer's Notes] How to tell from trace files whether a statement ran in parallel
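A customer reported that although a parallel hint was clearly specified, OEM showed the statement as not running in parallel, and supplied a screenshot.
The customer's SQL statement looks like this: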
INSERT/*+ parallel(4) */ INTO TAB001_WORK SELECT/*+ FULL(USR002) */
USR002.IM_PRO_CD, USR002.IM_NO, USR002.PS_DATE, USR002.YY_MM,
So what is really happening? Rather than argue without evidence, let's take a parallel execution trace.
SQL> alter session set events 'trace[px_messaging] disk highest';
SQL> alter session set events 'trace[px_control] disk highest';
SQL> alter session set events 'trace[px_scheduler] disk high';
SQL> alter session set events 'trace[sql_compiler.*] disk medium';
SQL> (run the problem SQL statement)
SQL> exit;
This produces several trace files in one go.
MBL12_ora_92961_test1.trc
MBL11_p000_33866_test1.trc
MBL11_p001_32473_test1.trc
MBL11_p002_32741_test1.trc
......
The first file is the QC (query coordinator) trace. The others should be the traces of the parallel slave processes it started.
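As a rough illustration of sorting the trace files into the QC file and the slave files, the following Python sketch classifies files by name. The naming pattern `<SID>_<process>_<OS pid>_<identifier>.trc` is an assumption inferred only from the file names listed above, and treating an `ora` process as the QC only holds for a session traced the way this one was. The function name `classify_trace_file` is hypothetical.

```python
import re

# Assumed pattern, inferred from the file names in this article only:
#   <SID>_<process>_<OS pid>[_<trace identifier>].trc
# "ora" is a foreground (here: QC) process, "pNNN" a parallel slave.
TRACE_NAME = re.compile(
    r"^(?P<sid>[^_]+)_(?P<proc>ora|p[0-9a-z]{3})_(?P<pid>\d+)(?:_(?P<ident>.+))?\.trc$"
)

def classify_trace_file(name):
    """Return (role, sid, os_pid) for a trace file name, or None if it does not match."""
    m = TRACE_NAME.match(name)
    if m is None:
        return None
    proc = m.group("proc")
    role = "QC" if proc == "ora" else "slave " + proc.upper()
    return role, m.group("sid"), int(m.group("pid"))

files = [
    "MBL12_ora_92961_test1.trc",
    "MBL11_p000_33866_test1.trc",
    "MBL11_p001_32473_test1.trc",
    "MBL11_p002_32741_test1.trc",
]
for f in files:
    print(f, "->", classify_trace_file(f))
```

With the names above, the QC file belongs to SID MBL12 while the listed slave files belong to MBL11, which matches the trace further down, where slaves are acquired on both instances.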
The analysis should start mainly from the QC file:
Automatic degree of parallelism (ADOP)
**************************
kkfdtParallel: parallel is possible (no statement type restrictions)
kkfdIsAutoDopSupported:Yes, ctxoct:2, boostrap SQL?:FALSE, remote?:FALSE, stmt?:FALSE.
Automatic degree of parallelism is enabled for this statement in hint mode.
kkopqSetForceParallelProperties: Hint:yes
Query: compute:no? forced:yes forceDop:4
DDLDML : compute:no? forced:yes forceDop:4
kkopqSetDopReason: Reason why we chose this DOP is: hint. <<<<< This shows that, because of the hint, parallel processing was recognized as required.
hint forces parallelism with dop=4
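To pull the DOP decision out of a large QC trace without reading it by eye, a small Python sketch such as the one below can search for the two lines shown above. It assumes the message texts appear exactly as in this excerpt, which may differ between Oracle versions. `dop_decision` is a hypothetical helper name.

```python
import re

def dop_decision(trace_text):
    """Return (reason, dop) found in the compiler trace; parts not found are None."""
    # Both message formats are copied from the excerpt above (assumed stable).
    reason = re.search(
        r"kkopqSetDopReason: Reason why we chose this DOP is: (\w+)", trace_text
    )
    dop = re.search(r"forces parallelism with dop=(\d+)", trace_text)
    return (
        reason.group(1) if reason else None,
        int(dop.group(1)) if dop else None,
    )

sample = """\
kkopqSetForceParallelProperties: Hint:yes
Query: compute:no? forced:yes forceDop:4
kkopqSetDopReason: Reason why we chose this DOP is: hint.
hint forces parallelism with dop=4
"""
print(dop_decision(sample))
```

If either part comes back as `None`, the statement may not have been considered for parallel execution at all, or the trace wording may differ from what is assumed here.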
Further down, the trace reads:
2018-04-24 09:36:23.759351*:PX_Messaging:kxfp.c@18652:kxfpclinfo():load balancing disabled due to single PQ running (non-QA mode)on local inst 2 (total # inst: 2).
(default: 0) inst target is 40
number of active slaves on the instance: 0,
number of active slaves but available to use: 0 <<<<< At the start, no slave resources are available
Next, we can see the QC trying to acquire slave processes:
2018-04-24 09:36:23.760357*:PX_Messaging:kxfp.c@11588:kxfpg1srv(): trying to get slave P000 on instance 1 for q=0x80d9eac90
2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11631:kxfpg1srv(): slave P000 is remote (inst=1)
2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11669:kxfpg1srv(): - acquired dp=(nil)
2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11104:kxfpg1sg(): Got It. 1 so far.
2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11588:kxfpg1srv(): trying to get slave P001 on instance 1 for q=0x80d9eac90
2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11631:kxfpg1srv(): slave P001 is remote (inst=1)
2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11669:kxfpg1srv(): - acquired dp=(nil)
2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11104:kxfpg1sg(): Got It. 2 so far.
jStart=0 jEnd=60 jIncr=1 isGV=0 i=1 instno=2 kxfpilthno=2
2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11588:kxfpg1srv(): trying to get slave P000 on instance 2 for q=0x80d9eac90
2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11686:kxfpg1srv(): slave P000 is local
2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11716:kxfpg1srv(): found dp=0x85a9c9c68 flg=18
2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11747:kxfpg1srv(): local slave already started.. sid = 0, iid = 2
2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11104:kxfpg1sg(): Got It. 3 so far.
2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11588:kxfpg1srv(): trying to get slave P001 on instance 2 for q=0x80d9eac90
2018-04-24 09:36:23.761444*:PX_Messaging:kxfp.c@11686:kxfpg1srv(): slave P001 is local
2018-04-24 09:36:23.761444*:PX_Messaging:kxfp.c@11716:kxfpg1srv(): found dp=0x85a9c9cf8 flg=18
2018-04-24 09:36:23.761444*:PX_Messaging:kxfp.c@11747:kxfpg1srv(): local slave already started.. sid = 1, iid = 2
2018-04-24 09:36:23.761444*:PX_Messaging:kxfp.c@11104:kxfpg1sg(): Got It. 4 so far.
<<<<< The messages above show the QC starting to acquire parallel slave processes and succeeding one by one.
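The acquisition sequence above can be summarized mechanically. The following Python sketch lists which slaves were requested and the highest "Got It. N so far." count seen. It relies only on the two message formats visible in this excerpt. Whether the counter restarts for each slave set is not confirmed here, so the maximum is reported rather than a total. `acquisition_progress` is a hypothetical name.

```python
import re

# Message formats copied from the trace excerpt above (assumed stable).
TRYING = re.compile(r"trying to get slave (P\w+) on instance (\d+)")
GOT_IT = re.compile(r"kxfpg1sg\(\): Got It\. (\d+) so far\.")

def acquisition_progress(lines):
    """Return (requested, highest_count): requested is a list of (slave, instance)."""
    requested = []
    highest = 0
    for line in lines:
        m = TRYING.search(line)
        if m:
            requested.append((m.group(1), int(m.group(2))))
        m = GOT_IT.search(line)
        if m:
            highest = max(highest, int(m.group(1)))
    return requested, highest

sample = [
    "2018-04-24 09:36:23.760357*:PX_Messaging:kxfp.c@11588:kxfpg1srv(): trying to get slave P000 on instance 1 for q=0x80d9eac90",
    "2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11104:kxfpg1sg(): Got It. 1 so far.",
    "2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11588:kxfpg1srv(): trying to get slave P001 on instance 1 for q=0x80d9eac90",
    "2018-04-24 09:36:23.760927*:PX_Messaging:kxfp.c@11104:kxfpg1sg(): Got It. 2 so far.",
]
print(acquisition_progress(sample))
```

A request with no matching "Got It" line would be the first place to look if the statement had been downgraded or run serially.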
Further down, we can see that the requested parallel processes were indeed obtained:
2018-04-24 09:36:23.774484*:PX_Messaging:kxfp.c@11536:kxfpg1sg(): got 4 servers (sync), errors=0x0 returning <<<<< This also shows that 4 parallel processes were indeed acquired
Acquired 8 slaves on 2 instances avg height=2 #set=2 qser=14923778 <<<<< Presumably 8 parallel processes were acquired because there are two slave sets (2 sets x DOP 4)
P000 inst 1 spid 33866
P001 inst 1 spid 32473
P000 inst 2 spid 81205
P001 inst 2 spid 88210
P002 inst 1 spid 32741
P003 inst 1 spid 38214
P002 inst 2 spid 82324
P003 inst 2 spid 88129
2018-04-24 09:36:23.774484*:PX_Messaging:kxfp.c@10729:kxfpgsg():
Instance(servers):
inst=1 #slvs=4
inst=2 #slvs=4
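The summary line and the per-slave listing can be cross-checked against each other. The Python sketch below parses the "Acquired ... slaves" line and counts slaves per instance, assuming the layout shown in this excerpt. `slave_allocation` is a hypothetical helper, not an Oracle tool.

```python
import re
from collections import Counter

# Formats taken from the excerpt above; other versions may print them differently.
SUMMARY = re.compile(r"Acquired (\d+) slaves on (\d+) instances .*#set=(\d+)")
SLAVE = re.compile(r"^\s*(P\w{3}) inst (\d+) spid (\d+)\s*$")

def slave_allocation(lines):
    """Return ((slaves, instances, sets) or None, {instance: slave_count})."""
    summary = None
    per_instance = Counter()
    for line in lines:
        m = SUMMARY.search(line)
        if m:
            summary = tuple(int(g) for g in m.groups())
            continue
        m = SLAVE.match(line)
        if m:
            per_instance[int(m.group(2))] += 1
    return summary, dict(per_instance)

sample = [
    "Acquired 8 slaves on 2 instances avg height=2 #set=2 qser=14923778",
    "P000 inst 1 spid 33866",
    "P001 inst 1 spid 32473",
    "P000 inst 2 spid 81205",
    "P001 inst 2 spid 88210",
    "P002 inst 1 spid 32741",
    "P003 inst 1 spid 38214",
    "P002 inst 2 spid 82324",
    "P003 inst 2 spid 88129",
]
summary, per_instance = slave_allocation(sample)
print(summary, per_instance)
# Consistency check: the listed slaves should add up to the summary total.
print(sum(per_instance.values()) == summary[0])
```

For this trace the per-instance counts agree with the `inst=1 #slvs=4` and `inst=2 #slvs=4` lines, and the spids for instance 1 match the process IDs in the slave trace file names.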
Next, we can see the QC communicating with each parallel slave set:
      QC enabled kgl EXPRESS bit on slave for SQL:
----- Current SQL Statement for this session (sql_id=cay1xmzv2mtyz) -----
INSERT/*+parallel(4) */ INTO TAB001_WORK SELECT/*+ FULL(USR002) */ USR002.IM_PRO_CD, USR002.IM_NO, USR002.PS_DATE, USR002.YY_MM, USR002.YEAR,
.....
       size:  40 aligned:  40 total:9104 rem:7008 to:   4
      Sending parse to slave set 1:  <<<<< QC communicates with slave set 1
         User sqllen sent from QC = 7317
......
2018-04-24 09:36:23.793626*:PX_Messaging:kxfp.c@5138:kxfpqrsnd():
Deliver to qref=0x85e94cd8 (points at 2.-1) msg=0x7bff77fd0 flg=0x1
 qref: qrser=14923778 qrseq=5 qrflg=0x1 fmh=0x2 state=00010
 msg:  mhser=14923778 mhseq=6 mhst=3 mhty=6 from=0x85e974c8
qref flow mode now 0x1
Receiver qref ending state=11010
after rsnd/qsnd for qref=0x85e974c8 state=10011 slm=0 bn=1 b0=0x7bff4dfb8 b1=(nil)
after buf swap qref=0x85e974c8 state=00000 slm=0 bn=1 b0=(nil) b1=0x7bff4dfb8
sender qref ending state=00000
kxfxcp1                                                        [      50/     0]
      Sending parse to nprocs:4 slave_set:2 <<<<< QC communicates with slave set 2
2018-04-24 09:36:23.793843*:PX_Messaging:kxfp.c@4131:kxfprigeb():
Get buffer on q=0x80d9eac90 qref=0x85e9dbf0 server=1.2 flags0x0 qstat=10000
Got buffer on qref=0x85e9dbf0 qrser=14923778 qrseq=4 mh=0x7bff59fb8 fmh=a1 qstat=10011
......
       QC enabled kgl EXPRESS bit on slave for SQL:
----- Current SQL Statement for this session (sql_id=cay1xmzv2mtyz) -----
INSERT/*+ parallel(4) */ INTO TAB001_WORK SELECT/*+ FULL(USR002) */ USR002.IM_PRO_CD,
......
At line 61394 of the trace:
       QC sends top nobj#:-1 ikc:0 #parts:1048576 flg:0
kxfxpnd                                                        [      60/     0]
       size:  40 aligned:  40 total:8984 rem:7128 to:   4
       kxfxpoeobjv:65535, kxfxpoPMax:1048576, kxfxponobj:-1
       QC sends top nobj#:-1 ikc:0 #parts:1048576 flg:0
kxfxpnd                                                        [      60/     0]
       size:  40 aligned:  40 total:9024 rem:7088 to:   4
       kxfxpoeobjv:65535, kxfxpoPMax:1048576, kxfxponobj:-1
       QC sends top nobj#:-1 ikc:0 #parts:1048576 flg:0
kxfxpnd                                                        [      60/     0]
       size:  40 aligned:  40 total:9064 rem:7048 to:   4
       kxfxpoeobjv:65535, kxfxpoPMax:1048576, kxfxponobj:-1
       QC sends top nobj#:-1 ikc:0 #parts:1048576 flg:0
kxfxpnd                                                        [      60/     0]
       size:  40 aligned:  40 total:9104 rem:7008 to:   4
Finally, we can see that communication between the QC and the slave processes ends and the slaves are released:
2018-04-24 09:36:26.599729*:PX_Messaging:kxfp.c@21306:kxfpIDNDeregister(): removing link for qc 0x80d9eac90 sess 423 <<<<< Communication between the QC and the slave processes ends
removing link 0x87544b030 for qc 0x80d9eac90 on list 7
......
2018-04-24 09:36:26.608002*:PX_Messaging:kxfp.c@6477:kxfpGatherSlaveStats(begin):
q=0x80d9eac90 qser=14923778
2018-04-24 09:36:26.608002*:PX_Messaging:kxfp.c@6572:kxfpGatherSlaveStats(end):
2018-04-24 09:36:26.608002*:PX_Messaging:kxfp.c@3102:kxfpqsod_qc_sod(): all slaves released qser=14923778 <<<<< All slave processes have finished their work and been reclaimed
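To confirm that the release message is present and to see roughly how long the slaves were held, the timestamps on the PX_Messaging lines can be compared. The Python sketch below does this for the first timestamped line and the "all slaves released" line. It assumes the timestamp layout seen in this excerpt and tolerates the missing spaces that copy-and-paste sometimes introduces. `px_lifetime_seconds` is a hypothetical name.

```python
import re
from datetime import datetime

# Timestamp prefix as it appears in this trace, e.g. "2018-04-24 09:36:23.760357*:"
STAMP = re.compile(r"(\d{4}-\d{2}-\d{2}) ?(\d{2}:\d{2}:\d{2}\.\d{6})\*:")
RELEASED = re.compile(r"all\s*slaves\s*released")

def px_lifetime_seconds(lines):
    """Seconds from the first timestamped line to the release line, or None."""
    first = released = None
    for line in lines:
        m = STAMP.search(line)
        if m is None:
            continue
        ts = datetime.strptime(m.group(1) + " " + m.group(2), "%Y-%m-%d %H:%M:%S.%f")
        if first is None:
            first = ts
        if RELEASED.search(line):
            released = ts
    if first is None or released is None:
        return None
    return (released - first).total_seconds()

sample = [
    "2018-04-24 09:36:23.760357*:PX_Messaging:kxfp.c@11588:kxfpg1srv(): trying to get slave P000 on instance 1 for q=0x80d9eac90",
    "2018-04-24 09:36:26.608002*:PX_Messaging:kxfp.c@3102:kxfpqsod_qc_sod(): all slaves released qser=14923778",
]
print(px_lifetime_seconds(sample))
```

A return value of `None` means no release line was found in the lines supplied, which is worth investigating before concluding anything about the run.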
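From the above, it is clear that the work was indeed carried out in parallel.
So why does OEM report no parallelism and show single-process execution? The initial assessment is that OEM is displaying it incorrectly. Further analysis has to be handed over to the OEM team to investigate from the OEM side.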