當前位置：首頁 > 运维知识 > 数据库 >内容正文

数据库

SQLite | Group By 和 Order By 子句

發布時間：2025/3/15 数据库 42 豆豆

生活随笔收集整理的這篇文章主要介紹了 SQLite | Group By 和 Order By 子句小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

文章目錄

1. Group by and Order by
- 1.1 Group Records
- 1.2 Ordering Records
- 1.3 Aggregate Functions
- 1.4 The Having Statement
- 1.5 Getting Distinct Records
- 參考資料

1. Group by and Order by

我們在上一篇中介紹了 Where 子句，接下來我們將使用 Group by 和 Order by 子句，對數據進行聚合和排序。

使用Jupyter Notebook 運行 SQL 語句需安裝 ipython-sql
%sql 以及 %%sql 為在 Notebook 中運行 SQL 語句，在 SQLite 命令行或 SQLite Stiduo 中不需要 %sql 或 %%sql

載入 SQL 以及連接 SQLite：

%load_ext sql %sql sqlite:///DataBase/weather_stations.db 'Connected: @DataBase/weather_stations.db'

本文將使用 weather_stations.db 數據庫，其中包含了 STATION_DATA 表。

首先查看 STATION_DATA 表中的數據：

%sql select * from station_data limit 0,5; -- 篩選前五行 * sqlite:///DataBase/weather_stations.db Done. station_numberreport_codeyearmonthdaydew_pointstation_pressurevisibilitywind_speedtemperatureprecipitationsnow_depthfograinhailthundertornado

143080	34DDA7	2002	12	21	33.8	987.4	3.4	0.2	36	0	None	1	1	1	1	1
766440	39537B	1998	10	1	72.7	1014.6	5.9	6.7	83.3	0	None	0	0	0	0	0
176010	C3C6D5	2001	5	18	55.7	None	7.3	4.3	69.1	0	None	0	0	0	0	0
125600	145150	2007	10	14	33	None	6.9	2.5	39.7	0	None	0	0	0	0	0
470160	EF616A	1967	7	29	65.6	None	9.2	1.2	72.4	0.04	None	0	0	0	0	0

1.1 Group Records

首先從最簡單的聚合方法開始：計數：

%%sql select count(*) as record_cound from station_data; * sqlite:///DataBase/weather_stations.db Done. record_cound

28000

count(*) 意味著計算記錄的長度，你也可以和其他 SQL 操作符結合起來使用，比如 where，我們可以這樣計算 tornado 出現的次數：

%%sql select count(*) as record_count from station_data where tornado == 1; * sqlite:///DataBase/weather_stations.db Done. record_count

3000

我們找到了 3000 條包含 tornado 的記錄，但如果我們想要按年計數呢？我們可以這樣寫：

%%sql select year, count(*) as record_count from station_data where tornado == 1 group by year limit 0,3; -- 只展示前三條 * sqlite:///DataBase/weather_stations.db Done. yearrecord_count

1937	3
1941	3
1942	3

我們現在可以看到每年的計數，讓我們拆分下這個查詢來看看怎么執行的：

select year, -- 1. 首先，我們選擇了 year（select year） count(*) as record_count -- 2. 然后我們用 **count(\*)** 對篩選的記錄進行了計數 from station_data where tornado == 1 -- 3. 我們篩選了 tornado 為 true 的數據 group by year -- 4. 最后，按年進行分類

我們也可以在多個 field 上進行聚合：

%%sql select year, month, count(*) as record_count from station_data where tornado == 1 group by year, month limit 0,3; * sqlite:///DataBase/weather_stations.db Done. yearmonthrecord_count

1937	7	3
1941	8	3
1942	10	3

此外，在使用 group by 時，我們可以也用 序數位置（ordinal positions）：

%%sql select year, month, count(*) as record_count from station_data where tornado == 1 group by 1, 2 -- ordinal positions limit 0,5; * sqlite:///DataBase/weather_stations.db Done. yearmonthrecord_count

1937	7	3
1941	8	3
1942	10	3
1943	1	3
1943	4	3

不是所有的平臺都支持 ordinal positions，例如 Oracle 和 SQL Server，就只能寫全稱

1.2 Ordering Records

需要注意到，我們通過 group 得到的數據中 month 并不是按自然月份排序的，所以字哦好就是同時使用 oreder by 操作符來進行排序，如果你想要先按年份排序，再按月份排序，你只需要添加：

%%sql select year, month, count(*) as record_count from station_data where tornado == 1 group by 1, 2 -- ordinal positions order by 1, 2 -- order by 同樣支持 ordinal positions limit 0,5; * sqlite:///DataBase/weather_stations.db Done. yearmonthrecord_count

1937	7	3
1941	8	3
1942	10	3
1943	1	3
1943	4	3

order by 默認是按升序（ASC）排列的，然而你可能更對近期的數據感興趣，你可以通過添加 DESC 來指定排序方式：

%%sql select year, month, count(*) as record_count from station_data where tornado == 1 group by year, month order by year DESC, month limit 0,5; * sqlite:///DataBase/weather_stations.db Done. yearmonthrecord_count

2010	3	6
2009	1	3
2009	2	3
2009	4	2
2009	5	6

1.3 Aggregate Functions

我們已經使用 count(*) 來對記錄進行計數了，但還有其他的一些聚合函數（AggregateyFunctions），
如 sum()、min()、max() 和 avg()。我們可以在特定的列上使用聚合函數來進行計算。

圖1 SQLite 內置聚合函數

但首先讓我們來看看 count() 的另一種使用方式， count() 可以用于除了計數以外的其他用途。如果你不使用 * ，
而是指定某一列，那么它將會計算所有非缺失值（non-null）的個數。舉個例子，我們可以計算 snow_depth 中非缺失值的個數：

%%sql select count(snow_depth) as recorded_snow_depth_count from station_data * sqlite:///DataBase/weather_stations.db Done. recorded_snow_depth_count

1552

讓我們進一步看看聚合函數，如果你想要看看你從 2000 年開始每個月的平均溫度，你可以先篩選 2000 年的記錄，
然后按月份分組，最后計算平均溫度：

%%sql select month, avg(temperature) as avg_temp from station_data where year >= 2000 group by month limit 0,3; * sqlite:///DataBase/weather_stations.db Done. monthavg_temp

1	41.55585443037976
2	38.98063127690104
3	48.975062656641576

sum() 是另一個常見的聚合操作符，為了得到 2000 年至今每年的下雪深度，你可以這樣查詢：

%%sql select year, sum(snow_depth) as total_snow from station_data where year >= 2000 group by year limit 0,3; * sqlite:///DataBase/weather_stations.db Done. yeartotal_snow

2000	685.8999999999999
2001	391.90000000000003
2002	437.69999999999993

你可以在一次查詢中多次使用聚合操作，我們將 2000 年以來的下雪總量、下雨總量和最大降雨量分別統計出來，并保留兩位小數：

%%sql select year, round(sum(snow_depth), 2) as total_snow, round(sum(precipitation), 2) as total_precipitation, round(max(precipitation), 2) as max_precipitation from station_data where year >= 2000 group by year limit 0,3; * sqlite:///DataBase/weather_stations.db Done. yeartotal_snowtotal_precipitationmax_precipitation

2000	685.9	27.57	0.87
2001	391.9	38.15	2.95
2002	437.7	43.06	5.0

1.4 The Having Statement

假設你想要基于一個聚合值來篩選記錄，你的第一反應應該是使用 where 子句。確實， where 子句可以
用來篩選記錄，但是卻無法用于聚合值上。舉個例子，如果你想使用 where 子句篩選出總下雨量大于 30 的記錄，
就會出現以下錯誤：

%%sql select year, sum(precipitation) as total_precipitation from station_data where total_precipitation > 30 group by year limit 0,3; * sqlite:///DataBase/weather_stations.db (sqlite3.OperationalError) misuse of aggregate: sum() [SQL: select year, sum(precipitation) as total_precipitation from station_data where total_precipitation > 30 group by year limit 0,3;] (Background on this error at: http://sqlalche.me/e/e3q8)

為什么不起作用呢？首先我們來看下聚合的原理，首先程序一行一行的掃描，找出那些在 where 子句
上成立的數據，然后再進行聚合。然而在聚合前并沒有 total_precipitation 這一列數據，因此出錯。

當你想在聚合值上執行 where 這個方法時，只能使用 having 這個關鍵詞：

%%sql select year, sum(precipitation) as total_precipitation from station_data group by year having total_precipitation > 30 limit 0,3 * sqlite:///DataBase/weather_stations.db Done. yeartotal_precipitation

1973	35.07999999999996
1974	42.209999999999994
1975	48.25999999999997

having 相當于聚合版的 where，但并不是所有平臺都支持在 aliases 上使用 having ，
如 Oracle（group by 也不行），這意味著當你使用 having 時需要再輸入一次聚合函數，像這樣：

%%sql select year, sum(precipitation) as total_preicipitation from station_data group by year having sum(precipitation) > 30 limit 0,3 * sqlite:///DataBase/weather_stations.db Done. yeartotal_preicipitation

1973	35.07999999999996
1974	42.209999999999994
1975	48.25999999999997

1.5 Getting Distinct Records

當我們使用 **select from** 時，記錄中可能會包含重復值，如果你只想要返回**唯一值（distinct records）**，你可以使用 **select distinct from**，比如我們的 station_data，表中 station_number 一列包含了 28000 個值，但你通過 **select distinct from** 后會發現其中是 6368 個值不斷重復出現組成的 %%sql select count(station_number) as duplicate_num from station_data; * sqlite:///DataBase/weather_stations.db Done. duplicate_num

28000

%%sql select count(distinct station_number) as distinct_num from station_data; * sqlite:///DataBase/weather_stations.db Done. distinct_num

6368

參考資料

[1] Thomas Nield.Getting Started with SQL[M].US: O’Reilly, 2016: 29-37

相關文章：

總結

以上是生活随笔為你收集整理的SQLite | Group By 和 Order By 子句的全部內容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網站內容還不錯，歡迎將生活随笔推薦給好友。

上一篇： Opencv 图像入门一之基本操作
下一篇：网线水晶头插法