導(dǎo)讀 今天給大家分享一個(gè)通過(guò)SQL改寫(xiě)而獨(dú)辟蹊徑的SQL優(yōu)化案例
待優(yōu)化場(chǎng)景 發(fā)現(xiàn)SLOW QUERY LOG中有下面這樣一條記錄:
...
# Query_time: 59.503827 ?Lock_time: 0.000198 ?Rows_sent: 641227 ?Rows_examined: 13442472 ?Rows_affected: 0
...
select uid,sum(power) powerup from t1 where?
date>='2017-03-31' and?
UNIX_TIMESTAMP(STR_TO_DATE(concat(date,' ',hour),'%Y-%m-%d %H'))>=1490965200 and?
UNIX_TIMESTAMP(STR_TO_DATE(concat(date,' ',hour),'%Y-%m-%d %H'))<1492174801 ?and?
aType in (1,6,9) group by uid;
實(shí)話說(shuō),看到這個(gè)SQL我也忍不住想罵人啊,究竟是哪個(gè)腦殘的XX狗設(shè)計(jì)的?
竟然把日期時(shí)間中的 date 和 hour 給獨(dú)立出來(lái)成兩列,查詢時(shí)再合并成一個(gè)新的條件,簡(jiǎn)直無(wú)力吐槽。
吐槽歸吐槽,該干活還得干活,誰(shuí)讓咱是DBA呢,SQL優(yōu)化是咱的拿手好戲不是嘛~
SQL優(yōu)化之路 SQL優(yōu)化思路 不厭其煩地再說(shuō)一遍SQL優(yōu)化思路。
想要優(yōu)化一個(gè)SQL,一般來(lái)說(shuō)就是先看執(zhí)行計(jì)劃,觀察是否盡可能用到索引,
同時(shí)要關(guān)注預(yù)計(jì)掃描的行數(shù),
以及是否產(chǎn)生了臨時(shí)表(Using temporary) 或者?
是否需要進(jìn)行排序(Using filesort),
想辦法消除這些情況。
SQL性能瓶頸定位 毫無(wú)疑問(wèn),想要優(yōu)化,先看表DDL以及執(zhí)行計(jì)劃:
CREATE TABLE `t1` (`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,`date` date NOT NULL DEFAULT '0000-00-00',`hour` char(2) NOT NULL DEFAULT '00',`kid` int(4) NOT NULL DEFAULT '0',`uid` int(11) NOT NULL DEFAULT '0',`aType` tinyint(2) NOT NULL DEFAULT '0',`src` tinyint(2) NOT NULL DEFAULT '1',`aid` int(11) NOT NULL DEFAULT '1',`acount` int(11) NOT NULL DEFAULT '1',`power` decimal(20,2) DEFAULT '0.00',PRIMARY KEY (`id`,`date`),UNIQUE KEY `did` (`date`,`hour`,`kid`,`uid`,`aType`,`src`,`aid`)
) ENGINE=InnoDB AUTO_INCREMENT=50486620 DEFAULT CHARSET=utf8mb4
/*!50500 PARTITION BY RANGE ?COLUMNS(`date`)
(PARTITION p20170316 VALUES LESS THAN ('2017-03-17') ENGINE = InnoDB,PARTITION p20170317 VALUES LESS THAN ('2017-03-18') ENGINE = InnoDB
...
yejr@imysql.com[myDB]> EXPLAIN select uid,sum(power) powerup from t1 where?
date>='2017-03-31' and?
UNIX_TIMESTAMP(STR_TO_DATE(concat(date,' ',hour),'%Y-%m-%d %H'))>=1490965200 and?
UNIX_TIMESTAMP(STR_TO_DATE(concat(date,' ',hour),'%Y-%m-%d %H'))<1492174801 ?and?
aType in (1,6,9) group by uid\G
*************************** 1. row ***************************id: 1select_type: SIMPLEtable: t1partitions: p20170324,p20170325,....all partitiontype: ALL
possible_keys: didkey: NULLkey_len: NULLref: NULLrows: 25005577filtered: 15.00Extra: Using where; Using temporary; Using filesort
明顯的,這個(gè)SQL效率非常低,全表掃描、沒(méi)有索引、有臨時(shí)表、需要額外排序,什么倒霉催的全趕上了。
優(yōu)化思考 這個(gè)SQL是想統(tǒng)計(jì)符合條件的power列總和,雖然 date 列已有索引,但WHERE子句中卻對(duì) date 列加了函數(shù),而且還是 date 和 hour 兩列的組合條件,那就無(wú)法用到這個(gè)索引了。
還好,有個(gè)聰明伶俐的妹子,突發(fā)起想(事實(shí)上這位妹子本來(lái)就擅長(zhǎng)做SQL優(yōu)化的~),可以用 CASE WHEN 方法來(lái)改造下SQL,改成像下面這樣的:
select uid,sum(powerup+powerup1) from
(select uid,case when concat(date,' ',hour) >='2017-03-24 13:00' then power else '0' end as powerup,case when concat(date,' ',hour) < '2017-03-25 13:00' then power else '0' end as powerup1from t1where date>='2017-03-24'?and ??date <'2017-03-25'and ?aType in (1,6,9)
) a ?group by uid;
是不是很有才,直接把這個(gè)沒(méi)辦法用到索引的條件給用CASE WHEN來(lái)改造了。看看新的SQL執(zhí)行計(jì)劃:
*************************** 1. row ***************************id: 1select_type: SIMPLEtable: t1partitions: p20170324type: range
possible_keys: didkey: idx2_date_addRedTypekey_len: 4ref: NULLrows: 876375filtered: 30.00Extra: Using index condition; Using temporary; Using filesort
看看這個(gè)SQL的執(zhí)行代價(jià):
+----------------------------+---------+
| Variable_name ?????????????| Value ??|
+----------------------------+---------+
| Handler_read_first ????????| 1 ??????|
| Handler_read_key ??????????| 1834590 |
| Handler_read_last ?????????| 0 ??????|
| Handler_read_next ?????????| 1834589 |
| Handler_read_prev ?????????| 0 ??????|
| Handler_read_rnd ??????????| 232276 ?|
| Handler_read_rnd_next ?????| 232277 ?|
+----------------------------+---------+
及其SLOW QUERY LOG記錄的信息:
# Query_time: 6.381254 ?Lock_time: 0.000166 ?Rows_sent: 232276 ?Rows_examined: 2299141 ?Rows_affected: 0
# Bytes_sent: 4237347 ?Tmp_tables: 1 ?Tmp_disk_tables: 0 ?Tmp_table_sizes: 4187168
# InnoDB_trx_id: 0
# QC_Hit: No ?Full_scan: No ?Full_join: No ?Tmp_table: Yes ?Tmp_table_on_disk: No
# Filesort: Yes ?Filesort_on_disk: No ?Merge_passes: 0
# ??InnoDB_IO_r_ops: 0 ?InnoDB_IO_r_bytes: 0 ?InnoDB_IO_r_wait: 0.000000
# ??InnoDB_rec_lock_wait: 0.000000 ?InnoDB_queue_wait: 0.000000
# ??InnoDB_pages_distinct: 9311
看起來(lái)還不是太理想啊,雖然不再掃描全表了,但畢竟還是?有臨時(shí)表?和?額外排序,想辦法消除后再對(duì)比看下。
有個(gè)變化不知道大家注意到?jīng)],新的SLOW QUERY LOG記錄多了不少信息,這是因?yàn)橛昧薖ercona分支版本的插件才支持,這個(gè)功能確實(shí)不錯(cuò),甚至還能記錄Profiling的詳細(xì)信息,強(qiáng)烈推薦。
我們新建個(gè) uid 列上的索引,看看能除臨時(shí)表及排序后的代價(jià)如何,看看這個(gè)的開(kāi)銷會(huì)不會(huì)更低。
yejr@imysql.com[myDB]> ALTER TABLE t1 ADD INDEX idx_uid(uid);
yejr@imysql.com[myDB]> EXPLAIN select uid,sum(powerup+powerup1) from
(select uid,case when concat(date,' ',hour) >='2017-03-24 13:00' then power else '0' end as powerup,case when concat(date,' ',hour) < '2017-03-25 13:00' then power else '0' end as powerup1from t1where date>='2017-03-24'?and ??date <'2017-03-25'and ?aType in (1,6,9)
) a ?group by uid\G*************************** 1. row ***************************id: 1select_type: SIMPLEtable: if_date_hour_army_countpartitions: p20170331,p20170401...type: index
possible_keys: did,idx_uidkey: idx_uidkey_len: 4ref: NULLrows: 12701520filtered: 15.00Extra: Using where
看看添加索引后SQL的執(zhí)行代價(jià):
+----------------------------+---------+
| Variable_name ?????????????| Value ??|
+----------------------------+---------+
| Handler_read_first ????????| 1 ??????|
| Handler_read_key ??????????| 1 ??????|
| Handler_read_last ?????????| 0 ??????|
| Handler_read_next ?????????| 1834589 |
| Handler_read_prev ?????????| 0 ??????|
| Handler_read_rnd ??????????| 0 ??????|
| Handler_read_rnd_next ?????| 0 ??????|
+----------------------------+---------+
及其SLOW QUERY LOG記錄的信息:
# Query_time: 5.772286 ?Lock_time: 0.000330 ?Rows_sent: 232276 ?Rows_examined: 1834589 ?Rows_affected: 0
# Bytes_sent: 4215071 ?Tmp_tables: 0 ?Tmp_disk_tables: 0 ?Tmp_table_sizes: 0
# InnoDB_trx_id: 0
# QC_Hit: No ?Full_scan: Yes ?Full_join: No ?Tmp_table: No ?Tmp_table_on_disk: No
# Filesort: No ?Filesort_on_disk: No ?Merge_passes: 0
# ??InnoDB_IO_r_ops: 0 ?InnoDB_IO_r_bytes: 0 ?InnoDB_IO_r_wait: 0.000000
# ??InnoDB_rec_lock_wait: 0.000000 ?InnoDB_queue_wait: 0.000000
# ??InnoDB_pages_distinct: 11470
我們注意到,雖然加了 uid 列索引后的SQL掃描的data page更多了,但執(zhí)行效率其實(shí)是更高的,因?yàn)橄?臨時(shí)表 和 額外排序,這從 Handlerread% 的結(jié)果中也能看出來(lái),很顯然它的順序I/O更多,隨機(jī)I/O更少,所以雖然需要掃描的 data page 更多,實(shí)際上效率卻是更快的。
后記 再想想這個(gè)SQL還有優(yōu)化空間嗎,顯然是有的,那就是把數(shù)據(jù)表重新設(shè)計(jì),將 date 和 hour 列整合到一起,這樣就不用費(fèi)勁的拼湊條件并且也能用到索引了。 ?
?
總結(jié)
以上是生活随笔 為你收集整理的优化案例 | CASE WHEN进行SQL改写优化 的全部?jī)?nèi)容,希望文章能夠幫你解決所遇到的問(wèn)題。
如果覺(jué)得生活随笔 網(wǎng)站內(nèi)容還不錯(cuò),歡迎將生活随笔 推薦給好友。