大数据下union all的视图分页解决方案
最近的一個項目需要用到大數據分頁,問題還是比較復雜的,下面分享一些自己的解決方法,與大家一起學習。
在對于union all的視圖中,如果采用傳統的采用rownum分頁方式的時候,如
WHERE rownum < M)
WHERE linenum >=N
因為,Oracle執行到linenum >=N的時候,將不知所措,導致執行計劃亂掉。如,假定bwm_users就是一個union all的視圖。
代碼如下:
select *
from mv_bmw_users_db1
union all
select? *
from mv_bmw_users_db2
如果我們在該視圖上執行如下操作,可以看到
SQL> select * from
? 2? (select rownum linenum,id,nick from
? 3? (select id,nick from bmw_users? where nick ='test' order by id)
? 4? where rownum < 50)
? 5? where linenum >=1;
Execution Plan
----------------------------------------------------------
?? 0????? SELECT STATEMENT Optimizer=CHOOSE (Cost=20385 Card=49 Bytes=2401)
?? 1??? 0?? VIEW (Cost=20385 Card=49 Bytes=2401)
?? 2??? 1???? COUNT (STOPKEY)
?? 3??? 2?????? VIEW (Cost=20385 Card=1728633 Bytes=62230788)
?? 4??? 3???????? SORT (ORDER BY STOPKEY) (Cost=20385 Card=1728633 Bytes=62230788)
?? 5??? 4?????????? VIEW OF 'BMW_USERS' (Cost=9278 Card=1728633 Bytes=62230788)
?? 6??? 5???????????? UNION-ALL
?? 7??? 6?????????????? TABLE ACCESS (FULL) OF 'MV_BMW_USERS_DB1' (Cost=4639 Card=864090 Bytes=38884050)
?? 8??? 6?????????????? TABLE ACCESS (FULL) OF 'MV_BMW_USERS_DB2' (Cost=4639 Card=864543 Bytes=38904435)
Statistics
----------------------------------------------------------
????????? 0? recursive calls
????????? 0? db block gets
????? 97298? consistent gets
????? 20770? physical reads
????????? 0? redo size
??????? 518? bytes sent via SQL*Net to client
??????? 504? bytes received via SQL*Net from client
????????? 2? SQL*Net roundtrips to/from client
????????? 1? sorts (memory)
????????? 0? sorts (disk)
????????? 1? rows processed
?????????
一個非常簡單的查詢,在nick上是有索引的,而且表與索引都是分析過的,居然是全表掃描,耗費非常大的資源,這個時候,Oracle已經不能正確的判斷使用索引了,所以錯誤的使用了全表,從統計信息也可以看到,該查詢產生了大量的cr讀與磁盤讀。這些在前出塞項目里用到,這個時候,就是強行指定hint也不能改變oracle的執行計劃,當然,這樣是行不通的,我們必須找到一個行之有效的辦法。
這樣的問題怎么解決呢?有兩個辦法,一個是仍然使用union all語句在查詢中,直接查詢基表而不是視圖。如以上語句改造為:
SQL> select * from
? 2? (select rownum linenum,id,nick from
? 3? (select * from
? 4? (select id,nick from MV_BMW_USERS_DB1 where nick ='test'
? 5? union all
? 6? select id,nick from MV_BMW_USERS_DB1 where nick ='test')
? 7? order by id)
? 8? where rownum < 50)
? 9? where linenum >=1;
Execution Plan
----------------------------------------------------------
?? 0????? SELECT STATEMENT Optimizer=CHOOSE (Cost=17 Card=2 Bytes=98)
?? 1??? 0?? VIEW (Cost=17 Card=2 Bytes=98)
?? 2??? 1???? COUNT (STOPKEY)
?? 3??? 2?????? VIEW (Cost=17 Card=2 Bytes=72)
?? 4??? 3???????? SORT (ORDER BY STOPKEY) (Cost=17 Card=2 Bytes=72)
?? 5??? 4?????????? VIEW (Cost=8 Card=2 Bytes=72)
?? 6??? 5???????????? UNION-ALL
?? 7??? 6?????????????? TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
?? 8??? 7???????????????? INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)
?? 9??? 6?????????????? TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
? 10??? 9???????????????? INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)
Statistics
----------------------------------------------------------
????????? 0? recursive calls
????????? 0? db block gets
????????? 8? consistent gets
????????? 0? physical reads
????????? 0? redo size
??????? 553? bytes sent via SQL*Net to client
??????? 504? bytes received via SQL*Net from client
????????? 2? SQL*Net roundtrips to/from client
????????? 1? sorts (memory)
????????? 0? sorts (disk)
????????? 2? rows processed
語句基本上是一樣的,只是這次查詢了基表,而不是視圖,執行計劃馬上發生了改變,這次能使用了索引,而且成本有了很大的減少,可以看到cr讀減少到只有8個塊,而且磁盤讀為0。
我們采用第二種方法,分析函數的辦法,把語句改寫為
SQL>select * from
?1 (select row_number() over(order by id) rn,id,nick from bmw_users where nick ='test')
?2 where rn <50 and rn >=1;
Execution Plan
----------------------------------------------------------
?? 0????? SELECT STATEMENT Optimizer=CHOOSE (Cost=13 Card=1 Bytes=49)
?? 1??? 0?? VIEW (Cost=13 Card=1 Bytes=49)
?? 2??? 1???? WINDOW (SORT PUSHED RANK) (Cost=13 Card=1 Bytes=45)
?? 3??? 2?????? VIEW OF 'BMW_USERS' (Cost=4 Card=1 Bytes=45)
?? 4??? 3???????? UNION-ALL (PARTITION)
?? 5??? 4?????????? TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
?? 6??? 5???????????? INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)
?? 7??? 4?????????? TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB2' (Cost=4 Card=1 Bytes=45)
?? 8??? 7???????????? INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK2' (NON-UNIQUE) (Cost=3 Card=1)
Statistics
----------------------------------------------------------
????????? 0? recursive calls
????????? 0? db block gets
????????? 7? consistent gets
????????? 0? physical reads
????????? 0? redo size
??????? 513? bytes sent via SQL*Net to client
??????? 504? bytes received via SQL*Net from client
????????? 2? SQL*Net roundtrips to/from client
????????? 1? sorts (memory)
????????? 0? sorts (disk)
????????? 1? rows processed
可以看到,同樣的功能,分析函數的方法是最簡單的,同樣也能正確的使用索引。
以上是一個簡單的例子,我們再分析一個復雜的實際語句。
原始語句為:
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM
(SELECT * FROM
(SELECT T1.*, rownum as linenum
FROM
(SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,t.topic_type,t.topic_distillate,
t.topic_vote,t.topic_status, t.topic_moved_id,TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')? topic_time,
t.topic_last_post_id, t.topic_views,t.topic_title, t.topic_replies, t.topic_poster
FROM forum_topics t
WHERE t.forum_id = ?
AND t.topic_type < 2
AND t.topic_status <> 3
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC ) T1
WHERE rownum < ?)
WHERE linenum >=?) T2,
forum_posts p2,
bmw_users u1,bmw_users u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id?
AND u2.user_id = p2.poster_id
因為其中bmw_users是union all的視圖,所以,該查詢也使用了基表的全表掃描。如果把它改寫為union all的語句,也將是異常的復雜,如,該寫成union all將是這個樣子
select * from (
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id,
u1.id as userid, u2.nick as user2, u2.user_id as id2,
u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS')
post_time FROM (
SELECT *FROM (
SELECT T1.*, rownum as linenum FROM(
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,t.topic_type,
t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')? topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t WHERE t.forum_id = :bind0?
AND t.topic_type < 2 AND t.topic_status <> 3? ORDER BY t.topic_type DESC,
t.topic_last_post_id DESC) T1
WHERE rownum < :bind1)
WHERE linenum >=:bind2
) T2,
forum_posts p2,
MV_BMW_USERS_DB1 u1,
MV_BMW_USERS_DB1 u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id?
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick,
u1.user_id, u1.id as userid, u2.nick as user2, u2.user_id as id2,
u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownumas linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')? topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title,
t.topic_replies, t.topic_poster FROM forum_topics t
WHERE t.forum_id = :bind3?
AND t.topic_type < 2 AND t.topic_status <> 3 ORDER BY t.topic_type DESC,
t.topic_last_post_id DESC) T1
WHERE rownum < :bind4)
WHERE linenum >=:bind5
) T2,
forum_posts p2,
MV_BMW_USERS_DB1 u1,
MV_BMW_USERS_DB2 u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id?
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id,
u1.id as userid, u2.nick as user2, u2.user_id as id2, u2.id as userid2,
p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownum as linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/ t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')? topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t
WHERE t.forum_id = :bind6? AND t.topic_type < 2 AND t.topic_status <> 3?
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC) T1
WHERE rownum < :bind7)
WHERE linenum >=:bind8
) T2,
forum_posts p2,
MV_BMW_USERS_DB2 u1,
MV_BMW_USERS_DB1 u2
WHERE T2.topic_poster = u1.user_id
AND?? T2.topic_last_post_id = p2.post_id?
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownum as linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')? topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topicst WHERE t.forum_id = :bind9?
AND t.topic_type < 2 AND t.topic_status <> 3?
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC) T1
WHERE rownum < :bind10)
WHERE linenum >=:bind11
) T2, forum_posts p2,
MV_BMW_USERS_DB2 u1,MV_BMW_USERS_DB2 u2 WHERE T2.topic_poster =
u1.user_id AND p2.post_id = T2.topic_last_post_id? AND u2.user_id = p2.poster_id
)
order by topic_type DESC,topic_last_post_id desc
?
但是,我們利用分析函數,將顯得非常簡單,而且正確的使用索引
SELECT /*+ ordered use_nl(u1,p2,u2)*/ T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/
row_number() over(order by t.topic_type DESC, t.topic_last_post_id DESC) rn,
t.topic_id,t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')? topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t
WHERE t.forum_id = ?? AND t.topic_type < 2 AND t.topic_status <> 3?
) T1
WHERE rn < ? and rn >= ?
) T2,
forum_posts p2,
bmw_users u1,
bmw_users u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id?
AND u2.user_id = p2.poster_id
總結
以上是生活随笔為你收集整理的大数据下union all的视图分页解决方案的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: HONGJIN4 2013
- 下一篇: Python爬虫实践: 获取百度贴吧内容