mysql 亿级_mysql 亿级数据量 ( sum ,group by )的优化
今天開(kāi)發(fā)提出需求,讓統(tǒng)計(jì)數(shù)據(jù),一詢問(wèn)才得知表中的數(shù)據(jù)量已達(dá)億級(jí)以上。具體的sql如下:
SELECT id_province_code,gender,age,COUNT(1),SUM(zy_days),SUM(zf),SUM(ybnje)FROM medicare2017 WHERE zy_enter_date BETWEEN '2017-01-01 00:00:00' AND '2017-12-31 12:59:59' GROUP BY id_province_code,age,gender;
然后查看該sql的執(zhí)行計(jì)劃
mysql> explain SELECT id_province_code,gender,age,COUNT(1),SUM(zy_days),SUM(zf),SUM(ybnje) FROM medicare2017 WHERE zy_enter_date BETWEEN '2017-01-01 00:00:00' AND '2017-12-31 12:59:59' GROUP BYid_province_code,age,gender;+----+-------------+--------------+------------+-------+-------------------------------------------+-------------------+---------+------+---------+----------+--------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------+------------+-------+-------------------------------------------+-------------------+---------+------+---------+----------+--------------------------------------------------------+
| 1 | SIMPLE | medicare2017 | NULL | range | idx_zy_enter_date,idx_province_age_gender | idx_zy_enter_date | 4 | NULL | 4836248 | 100.00 | Using index condition; Using temporary; Using filesort |
執(zhí)行計(jì)劃中“Extra”中竟然出現(xiàn)了“Using temporary; Using filesort ”,看到這種情況我們就得進(jìn)行優(yōu)化了,雖然“type”列出現(xiàn)了“range”。出現(xiàn)這種情況是因?yàn)閟ql語(yǔ)句中使用了“group by”或者是“order by ”,然后進(jìn)行了文件排序。
接著,我們就需要給group by后面的字段建立索引了,mysql索引原則是最左匹配前綴原則,我們給“id_province_code,age,gender”三字段添加一個(gè)復(fù)合索引(按照最左匹配原則):
alter table medicare2017 add index idx_ipc_age_gener(id_province_code,age,gender);
但一查看表結(jié)構(gòu)
KEY `idx_province_age_gender` (`id_province_code`,`gender`,`age`)
早期已經(jīng)建立好了,但是,怎么還會(huì)出現(xiàn)“Using temporary; Using filesort”,查看官檔發(fā)現(xiàn),group by 默認(rèn)是要排序的,所以即使我們添加了索引,還是會(huì)引起文件排序。這樣,我們的解決方案是:強(qiáng)制關(guān)閉排序:order by null
最后我們根據(jù)官方文檔的建議,進(jìn)行了sql的最終優(yōu)化:
mysql> mysql> explain SELECT id_province_code,gender,age,COUNT(1),SUM(zy_days),SUMM(ybnje) FROM medicare2017 WHERE zy_enter_date BETWEEN '2017-01-01 00:00:00' AND '2017-12-31 12:59:59' GROUP BY id_province_code,age,gender order by null;+----+-------------+--------------+------------+-------+-------------------------------------------+-------------------+---------+------+---------+----------+----------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------+------------+-------+-------------------------------------------+-------------------+---------+------+---------+----------+----------------------------------------+
| 1 | SIMPLE | medicare2017 | NULL | range | idx_zy_enter_date,idx_province_age_gender | idx_zy_enter_date | 4 | NULL | 4836248 | 100.00 | Using index condition; Using temporary |
+----+-------------+--------------+------------+-------+-------------------------------------------+-------------------+---------+------+---------+----------+----------------------------------------+
查看執(zhí)行計(jì)劃,發(fā)現(xiàn)沒(méi)有文件排序了,但是還是有“using tempory”,別著急,這是不重要的,只要執(zhí)行sql的性能提升了就可以了
最終,運(yùn)行最終優(yōu)化后的sql發(fā)現(xiàn)比沒(méi)有強(qiáng)制關(guān)閉排序的sql,整整快了將近4個(gè)小時(shí)(表中數(shù)據(jù)將近3億,沒(méi)關(guān)閉排序前的sql我運(yùn)行了將近4個(gè)小時(shí),還在跑,但優(yōu)化后只跑了幾十秒)看來(lái)優(yōu)化的綜合評(píng)估是很重要的。
總結(jié)
以上是生活随笔為你收集整理的mysql 亿级_mysql 亿级数据量 ( sum ,group by )的优化的全部?jī)?nèi)容,希望文章能夠幫你解決所遇到的問(wèn)題。
- 上一篇: mysql 5.6压缩安装_MySQL
- 下一篇: mysql游标事例_Mysql 游标示例