當前位置：首頁 > 编程资源 > 编程问答 >内容正文

编程问答

大数据下union all的视图分页解决方案

發布時間：2025/3/15 编程问答 18 豆豆

生活随笔收集整理的這篇文章主要介紹了大数据下union all的视图分页解决方案小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

最近的一個項目需要用到大數據分頁，問題還是比較復雜的，下面分享一些自己的解決方法，與大家一起學習。

在對于union all的視圖中，如果采用傳統的采用rownum分頁方式的時候，如
WHERE rownum < M)
WHERE linenum >=N
因為，Oracle執行到linenum >=N的時候，將不知所措，導致執行計劃亂掉。如，假定bwm_users就是一個union all的視圖。
代碼如下：
select *
from mv_bmw_users_db1
union all
select? *
from mv_bmw_users_db2

如果我們在該視圖上執行如下操作，可以看到
SQL> select * from
? 2? (select rownum linenum,id,nick from
? 3? (select id,nick from bmw_users? where nick ='test' order by id)
? 4? where rownum < 50)
? 5? where linenum >=1;

Execution Plan
----------------------------------------------------------
?? 0????? SELECT STATEMENT Optimizer=CHOOSE (Cost=20385 Card=49 Bytes=2401)
?? 1??? 0?? VIEW (Cost=20385 Card=49 Bytes=2401)
?? 2??? 1???? COUNT (STOPKEY)
?? 3??? 2?????? VIEW (Cost=20385 Card=1728633 Bytes=62230788)
?? 4??? 3???????? SORT (ORDER BY STOPKEY) (Cost=20385 Card=1728633 Bytes=62230788)
?? 5??? 4?????????? VIEW OF 'BMW_USERS' (Cost=9278 Card=1728633 Bytes=62230788)
?? 6??? 5???????????? UNION-ALL
?? 7??? 6?????????????? TABLE ACCESS (FULL) OF 'MV_BMW_USERS_DB1' (Cost=4639 Card=864090 Bytes=38884050)
?? 8??? 6?????????????? TABLE ACCESS (FULL) OF 'MV_BMW_USERS_DB2' (Cost=4639 Card=864543 Bytes=38904435)

Statistics
----------------------------------------------------------
????????? 0? recursive calls
????????? 0? db block gets
????? 97298? consistent gets
????? 20770? physical reads
????????? 0? redo size
??????? 518? bytes sent via SQL*Net to client
??????? 504? bytes received via SQL*Net from client
????????? 2? SQL*Net roundtrips to/from client
????????? 1? sorts (memory)
????????? 0? sorts (disk)
????????? 1? rows processed
?????????
一個非常簡單的查詢，在nick上是有索引的，而且表與索引都是分析過的，居然是全表掃描，耗費非常大的資源，這個時候，Oracle已經不能正確的判斷使用索引了，所以錯誤的使用了全表，從統計信息也可以看到，該查詢產生了大量的cr讀與磁盤讀。這些在前出塞項目里用到，這個時候，就是強行指定hint也不能改變oracle的執行計劃，當然，這樣是行不通的，我們必須找到一個行之有效的辦法。

這樣的問題怎么解決呢？有兩個辦法，一個是仍然使用union all語句在查詢中，直接查詢基表而不是視圖。如以上語句改造為：
SQL> select * from
? 2? (select rownum linenum,id,nick from
? 3? (select * from
? 4? (select id,nick from MV_BMW_USERS_DB1 where nick ='test'
? 5? union all
? 6? select id,nick from MV_BMW_USERS_DB1 where nick ='test')
? 7? order by id)
? 8? where rownum < 50)
? 9? where linenum >=1;

Execution Plan
----------------------------------------------------------
?? 0????? SELECT STATEMENT Optimizer=CHOOSE (Cost=17 Card=2 Bytes=98)
?? 1??? 0?? VIEW (Cost=17 Card=2 Bytes=98)
?? 2??? 1???? COUNT (STOPKEY)
?? 3??? 2?????? VIEW (Cost=17 Card=2 Bytes=72)
?? 4??? 3???????? SORT (ORDER BY STOPKEY) (Cost=17 Card=2 Bytes=72)
?? 5??? 4?????????? VIEW (Cost=8 Card=2 Bytes=72)
?? 6??? 5???????????? UNION-ALL
?? 7??? 6?????????????? TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
?? 8??? 7???????????????? INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)
?? 9??? 6?????????????? TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
? 10??? 9???????????????? INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)

Statistics
----------------------------------------------------------
????????? 0? recursive calls
????????? 0? db block gets
????????? 8? consistent gets
????????? 0? physical reads
????????? 0? redo size
??????? 553? bytes sent via SQL*Net to client
??????? 504? bytes received via SQL*Net from client
????????? 2? SQL*Net roundtrips to/from client
????????? 1? sorts (memory)
????????? 0? sorts (disk)
????????? 2? rows processed

語句基本上是一樣的，只是這次查詢了基表，而不是視圖，執行計劃馬上發生了改變，這次能使用了索引，而且成本有了很大的減少，可以看到cr讀減少到只有8個塊，而且磁盤讀為0。

我們采用第二種方法，分析函數的辦法，把語句改寫為
SQL>select * from
?1 (select row_number() over(order by id) rn,id,nick from bmw_users where nick ='test')
?2 where rn <50 and rn >=1;

Execution Plan
----------------------------------------------------------
?? 0????? SELECT STATEMENT Optimizer=CHOOSE (Cost=13 Card=1 Bytes=49)
?? 1??? 0?? VIEW (Cost=13 Card=1 Bytes=49)
?? 2??? 1???? WINDOW (SORT PUSHED RANK) (Cost=13 Card=1 Bytes=45)
?? 3??? 2?????? VIEW OF 'BMW_USERS' (Cost=4 Card=1 Bytes=45)
?? 4??? 3???????? UNION-ALL (PARTITION)
?? 5??? 4?????????? TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
?? 6??? 5???????????? INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)
?? 7??? 4?????????? TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB2' (Cost=4 Card=1 Bytes=45)
?? 8??? 7???????????? INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK2' (NON-UNIQUE) (Cost=3 Card=1)

Statistics
----------------------------------------------------------
????????? 0? recursive calls
????????? 0? db block gets
????????? 7? consistent gets
????????? 0? physical reads
????????? 0? redo size
??????? 513? bytes sent via SQL*Net to client
??????? 504? bytes received via SQL*Net from client
????????? 2? SQL*Net roundtrips to/from client
????????? 1? sorts (memory)
????????? 0? sorts (disk)
????????? 1? rows processed
可以看到，同樣的功能，分析函數的方法是最簡單的，同樣也能正確的使用索引。

以上是一個簡單的例子，我們再分析一個復雜的實際語句。
原始語句為：
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM
(SELECT * FROM
(SELECT T1.*, rownum as linenum
FROM
(SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,t.topic_type,t.topic_distillate,
t.topic_vote,t.topic_status, t.topic_moved_id,TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')? topic_time,
t.topic_last_post_id, t.topic_views,t.topic_title, t.topic_replies, t.topic_poster
FROM forum_topics t
WHERE t.forum_id = ?
AND t.topic_type < 2
AND t.topic_status <> 3
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC ) T1
WHERE rownum < ?)
WHERE linenum >=?) T2,
forum_posts p2,
bmw_users u1,bmw_users u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id?
AND u2.user_id = p2.poster_id

因為其中bmw_users是union all的視圖，所以，該查詢也使用了基表的全表掃描。如果把它改寫為union all的語句，也將是異常的復雜，如，該寫成union all將是這個樣子
select * from (
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id,
u1.id as userid, u2.nick as user2, u2.user_id as id2,
u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS')
post_time FROM (
SELECT *FROM (
SELECT T1.*, rownum as linenum FROM(
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,t.topic_type,
t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')? topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t WHERE t.forum_id = :bind0?
AND t.topic_type < 2 AND t.topic_status <> 3? ORDER BY t.topic_type DESC,
t.topic_last_post_id DESC) T1
WHERE rownum < :bind1)
WHERE linenum >=:bind2
) T2,
forum_posts p2,
MV_BMW_USERS_DB1 u1,
MV_BMW_USERS_DB1 u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id?
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick,
u1.user_id, u1.id as userid, u2.nick as user2, u2.user_id as id2,
u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownumas linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')? topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title,
t.topic_replies, t.topic_poster FROM forum_topics t
WHERE t.forum_id = :bind3?
AND t.topic_type < 2 AND t.topic_status <> 3 ORDER BY t.topic_type DESC,
t.topic_last_post_id DESC) T1
WHERE rownum < :bind4)
WHERE linenum >=:bind5
) T2,
forum_posts p2,
MV_BMW_USERS_DB1 u1,
MV_BMW_USERS_DB2 u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id?
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id,
u1.id as userid, u2.nick as user2, u2.user_id as id2, u2.id as userid2,
p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownum as linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/ t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')? topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t
WHERE t.forum_id = :bind6? AND t.topic_type < 2 AND t.topic_status <> 3?
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC) T1
WHERE rownum < :bind7)
WHERE linenum >=:bind8
) T2,
forum_posts p2,
MV_BMW_USERS_DB2 u1,
MV_BMW_USERS_DB1 u2
WHERE T2.topic_poster = u1.user_id
AND?? T2.topic_last_post_id = p2.post_id?
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownum as linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')? topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topicst WHERE t.forum_id = :bind9?
AND t.topic_type < 2 AND t.topic_status <> 3?
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC) T1
WHERE rownum < :bind10)
WHERE linenum >=:bind11
) T2, forum_posts p2,
MV_BMW_USERS_DB2 u1,MV_BMW_USERS_DB2 u2 WHERE T2.topic_poster =
u1.user_id AND p2.post_id = T2.topic_last_post_id? AND u2.user_id = p2.poster_id
)
order by topic_type DESC,topic_last_post_id desc

但是，我們利用分析函數，將顯得非常簡單，而且正確的使用索引
SELECT /*+ ordered use_nl(u1,p2,u2)*/ T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/
row_number() over(order by t.topic_type DESC, t.topic_last_post_id DESC) rn,
t.topic_id,t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')? topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t
WHERE t.forum_id = ?? AND t.topic_type < 2 AND t.topic_status <> 3?
) T1
WHERE rn < ? and rn >= ?
) T2,
forum_posts p2,
bmw_users u1,
bmw_users u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id?
AND u2.user_id = p2.poster_id

總結

以上是生活随笔為你收集整理的大数据下union all的视图分页解决方案的全部內容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網站內容還不錯，歡迎將生活随笔推薦給好友。

上一篇： HONGJIN4 2013
下一篇： Python爬虫实践：获取百度贴吧内容