Learning MongoDB 11: MongoDB Aggregation (Aggregation Pipeline Basics, Part 1) (3)

Published: 2024/8/26

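I. Introduction to Aggregate

db.collection.aggregate() is a data-processing aggregation pipeline. Documents pass through a pipeline made up of multiple stages. Each stage can perform an operation such as grouping or filtering, and once the whole series of stages has run, the pipeline outputs the result.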
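The idea of documents flowing through stages can be sketched in plain JavaScript. This is only an illustration of the concept, not how MongoDB implements it; `runPipeline`, `matchAtLeast`, and `countAll` are hypothetical names made up for this sketch.

```javascript
// A pipeline is an ordered list of stages. Each stage receives the documents
// produced by the previous stage and returns a new list of documents.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Two hypothetical stages written as plain functions (assumptions for this sketch).
const matchAtLeast = (min) => (docs) => docs.filter((d) => d.quantity >= min);
const countAll = (docs) => [{ _id: null, count: docs.length }];

const docs = [{ quantity: 2 }, { quantity: 10 }, { quantity: 5 }];
const result = runPipeline(docs, [matchAtLeast(5), countAll]);
console.log(result); // [ { _id: null, count: 2 } ]
```

The filtering stage runs first, so the counting stage only sees the two documents that passed it.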

[Figure: aggregation pipeline diagram, from the official MongoDB documentation at https://docs.mongodb.com/manual/aggregation/]

The diagram gives a clear picture of how aggregate processes documents:

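1. db.collection.aggregate() can chain multiple stages, which makes data processing convenient.
2. db.collection.aggregate() uses MongoDB's built-in native operations, so aggregation is efficient. It supports operations similar to SQL GROUP BY without requiring users to write custom JavaScript routines.
3. Each pipeline stage is limited to 100MB of memory. If a stage exceeds this limit, MongoDB raises an error. To process large data sets, set allowDiskUse to true so that pipeline stages can write data to temporary files, which works around the 100MB memory limit.
4. db.collection.aggregate() can run on a sharded collection, but its result cannot be written to a sharded collection. MapReduce can run on a sharded collection and can also write its result to one.
5. db.collection.aggregate() can return a cursor, so the results can be worked with directly, using the same cursor operations as elsewhere in the mongo shell.
6. When db.collection.aggregate() returns its result as a single document, the result is subject to the 16MB BSON document size limit. Returning a cursor avoids this: from version 2.6 onward, db.collection.aggregate() returns a cursor and can return a result set of any size.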

II. aggregate syntax

    db.collection.aggregate(pipeline, options)

【pipeline $group parameters】

pipeline is of type Array. Syntax: db.collection.aggregate( [ { <stage> }, ... ] )

$group: groups the documents in a collection so that statistics can be computed per group. $group first groups the data by a key.

$group syntax: { $group: { _id: <expression>, <field1>: { <accumulator1> : <expression1> }, ... } }

_id is the key to group by.

$group can apply the following accumulator expressions to the grouped data:

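  $sum: computes the sum.
  $avg: computes the average.
  $min: returns the smallest value of the expression among the documents in each group.
  $max: returns the largest value of the expression among the documents in each group.
  $push: appends the value of the given expression to an array.
  $addToSet: adds the value of the expression to a set (an array with no duplicate values).
  $first: returns the value from the first document in each group. If the documents have been sorted, the sort order is used; otherwise they are taken in the order they are stored.
  $last: returns the value from the last document in each group. If the documents have been sorted, the sort order is used; otherwise they are taken in the order they are stored.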
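To make the accumulator list concrete, here is a rough plain JavaScript sketch of grouping by a key and then reducing each group. It mimics the behaviour described above but is not MongoDB code: `groupBy` and the `accumulators` table are hypothetical helpers, field names are passed without the `$` prefix, and constant expressions such as `{ $sum: 1 }` are not handled.

```javascript
// Each accumulator reduces the list of values collected for one group.
const accumulators = {
  $sum: (vals) => vals.reduce((a, b) => a + b, 0),
  $avg: (vals) => vals.reduce((a, b) => a + b, 0) / vals.length,
  $min: (vals) => Math.min(...vals),
  $max: (vals) => Math.max(...vals),
  $push: (vals) => vals.slice(),
  $addToSet: (vals) => [...new Set(vals)],
  $first: (vals) => vals[0],
  $last: (vals) => vals[vals.length - 1],
};

// Hypothetical helper: group docs by keyField (null means one single group),
// then apply every accumulator in spec to the members of each group.
function groupBy(docs, keyField, spec) {
  const groups = new Map();
  for (const doc of docs) {
    const key = keyField === null ? null : doc[keyField];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(doc);
  }
  return [...groups].map(([key, members]) => {
    const out = { _id: key };
    for (const [name, acc] of Object.entries(spec)) {
      const [op, field] = Object.entries(acc)[0];
      out[name] = accumulators[op](members.map((d) => d[field]));
    }
    return out;
  });
}

const orders = [
  { pnumber: "p001", quantity: 2 },
  { pnumber: "p002", quantity: 1 },
  { pnumber: "p001", quantity: 10 },
];
const grouped = groupBy(orders, "pnumber", {
  total: { $sum: "quantity" },
  largest: { $max: "quantity" },
  all: { $push: "quantity" },
});
console.log(grouped);
```

The sketch returns groups in first-seen order, which is a property of this helper only.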

Several aggregation pipeline operators correspond directly to SQL, which makes their usage easy to follow:

  pipeline                      SQL
  $avg                          avg
  $min                          min
  $max                          max
  $group                        group by
  $sort                         order by
  $limit                        limit
  $sum                          sum()
  $sum (with the constant 1)    count()
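As an illustration of the mapping, the sketch below takes one SQL query, writes the pipeline that would correspond to it, and then reproduces the same steps with plain array operations so the result can be checked without a database. The query itself is an example chosen for this sketch, and the `items` array copies the pnumber and quantity fields of the sample data used in section III.

```javascript
// SQL: select pnumber, sum(quantity) as total from items
//      group by pnumber order by total desc limit 2
// The corresponding pipeline, as it would be passed to db.items.aggregate():
const pipeline = [
  { $group: { _id: "$pnumber", total: { $sum: "$quantity" } } },
  { $sort: { total: -1 } },
  { $limit: 2 },
];

const items = [
  { pnumber: "p003", quantity: 2 },
  { pnumber: "p002", quantity: 2 },
  { pnumber: "p002", quantity: 1 },
  { pnumber: "p001", quantity: 2 },
  { pnumber: "p003", quantity: 4 },
  { pnumber: "p001", quantity: 10 },
  { pnumber: "p003", quantity: 10 },
  { pnumber: "p002", quantity: 5 },
];

// GROUP BY pnumber with SUM(quantity)
const totals = {};
for (const { pnumber, quantity } of items) {
  totals[pnumber] = (totals[pnumber] || 0) + quantity;
}

// ORDER BY total DESC, then LIMIT 2
const top2 = Object.entries(totals)
  .map(([_id, total]) => ({ _id, total }))
  .sort((a, b) => b.total - a.total)
  .slice(0, 2);

console.log(top2); // [ { _id: 'p003', total: 16 }, { _id: 'p001', total: 12 } ]
```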

III. Simple pipeline $group examples

【Data】

    db.items.insert([
      { "quantity" : 2,  "price" : 5.0,  "pnumber" : "p003" },
      { "quantity" : 2,  "price" : 8.0,  "pnumber" : "p002" },
      { "quantity" : 1,  "price" : 4.0,  "pnumber" : "p002" },
      { "quantity" : 2,  "price" : 4.0,  "pnumber" : "p001" },
      { "quantity" : 4,  "price" : 10.0, "pnumber" : "p003" },
      { "quantity" : 10, "price" : 20.0, "pnumber" : "p001" },
      { "quantity" : 10, "price" : 20.0, "pnumber" : "p003" },
      { "quantity" : 5,  "price" : 10.0, "pnumber" : "p002" }
    ])

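【$group】

1. $group groups the documents in a collection so that statistics can be computed per group. It first groups the data by a key.

_id is the key to group by. If _id is null, all documents fall into a single group, which is equivalent to an SQL aggregate without GROUP BY, for example select count(*) from table.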

【$sum】

1. Count the documents in items, equivalent to the SQL: select count(1) as count from items

    > db.items.count()
    8
    > db.items.aggregate([{$group:{_id:null,count:{$sum:1}}}])
    { "_id" : null, "count" : 8 }

2. Sum the quantities, equivalent to the SQL: select sum(quantity) as total from items

    > db.items.aggregate([{$group:{_id:null,total:{$sum:"$quantity"}}}])
    { "_id" : null, "total" : 36 }

3. Group by product number (pnumber) and total the quantity sold for each product, equivalent to the SQL: select sum(quantity) as total from items group by pnumber

    > db.items.aggregate([{$group:{_id:"$pnumber",total:{$sum:"$quantity"}}}])
    { "_id" : "p001", "total" : 12 }
    { "_id" : "p002", "total" : 8 }
    { "_id" : "p003", "total" : 16 }
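The difference between `{$sum: 1}` and `{$sum: "$quantity"}` is that the first adds a constant for every document (a count) while the second adds a field value. The sketch below imitates that in plain JavaScript. `sumExpr` is a hypothetical helper for this illustration, and the array holds the quantity values of the sample data.

```javascript
// Hypothetical helper: evaluate a $sum-style expression over a list of documents.
// A string starting with "$" is read as a field reference; anything else is a constant.
function sumExpr(docs, expr) {
  return docs.reduce((acc, doc) => {
    const value =
      typeof expr === "string" && expr.startsWith("$") ? doc[expr.slice(1)] : expr;
    // Non-numeric values contribute nothing to the sum.
    return acc + (typeof value === "number" ? value : 0);
  }, 0);
}

const quantities = [2, 2, 1, 2, 4, 10, 10, 5].map((quantity) => ({ quantity }));

const count = sumExpr(quantities, 1);           // adds 1 per document
const total = sumExpr(quantities, "$quantity"); // adds each quantity
console.log(count, total); // 8 36
```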

【$min, $max】

1. Group by product number and find the largest quantity sold in a single order line for each product, equivalent to the SQL: select max(quantity) as quantity from items group by pnumber

    > db.items.aggregate([{$group:{_id:"$pnumber",max:{$max:"$quantity"}}}])
    { "_id" : "p001", "max" : 10 }
    { "_id" : "p002", "max" : 5 }
    { "_id" : "p003", "max" : 10 }

2. Group by product number and find the smallest quantity sold in a single order line for each product, equivalent to the SQL: select min(quantity) as quantity from items group by pnumber

    > db.items.aggregate([{$group:{_id:"$pnumber",min:{$min:"$quantity"}}}])
    { "_id" : "p001", "min" : 2 }
    { "_id" : "p002", "min" : 1 }
    { "_id" : "p003", "min" : 2 }

3. Group by product number, total the quantity for each product, and then take the largest total, equivalent to the SQL: select max(t.total) from (select sum(quantity) as total from items group by pnumber) t

    > db.items.aggregate([{$group:{_id:"$pnumber",total:{$sum:"$quantity"}}}])
    { "_id" : "p001", "total" : 12 }
    { "_id" : "p002", "total" : 8 }
    { "_id" : "p003", "total" : 16 }
    > db.items.aggregate([{$group:{_id:"$pnumber",total:{$sum:"$quantity"}}},{$group:{_id:null,max:{$max:"$total"}}}])
    { "_id" : null, "max" : 16 }
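The two-stage example works because the second $group sees only the output documents of the first, not the original items. A small plain JavaScript sketch of that hand-off follows. It is an illustration under that assumption rather than MongoDB code, and `stage1` and `stage2` are names made up here.

```javascript
// pnumber and quantity of the eight sample documents.
const items = [
  ["p003", 2], ["p002", 2], ["p002", 1], ["p001", 2],
  ["p003", 4], ["p001", 10], ["p003", 10], ["p002", 5],
].map(([pnumber, quantity]) => ({ pnumber, quantity }));

// Stage 1: { $group: { _id: "$pnumber", total: { $sum: "$quantity" } } }
const stage1 = [];
for (const { pnumber, quantity } of items) {
  let group = stage1.find((g) => g._id === pnumber);
  if (!group) {
    group = { _id: pnumber, total: 0 };
    stage1.push(group);
  }
  group.total += quantity;
}

// Stage 2: { $group: { _id: null, max: { $max: "$total" } } }
// Its input is stage1, so "$total" refers to the field created by stage 1.
const stage2 = [{ _id: null, max: Math.max(...stage1.map((g) => g.total)) }];

console.log(stage2); // [ { _id: null, max: 16 } ]
```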

【$avg】

$avg computes the average within each group produced by $group. Only numeric values are included; non-numeric values such as strings are ignored.

1. Group by product number and compute the average price across the order lines of each product, equivalent to the SQL: select avg(price) as price from items group by pnumber

    > db.items.aggregate([{$group:{_id:"$pnumber",price:{$avg:"$price"}}}])
    { "_id" : "p001", "price" : 12 }
    { "_id" : "p002", "price" : 7.333333333333333 }
    { "_id" : "p003", "price" : 11.666666666666666 }
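The rule that $avg skips non-numeric values can be illustrated with a small plain JavaScript function. `avg` is a hypothetical helper for this sketch, and the string "n/a" is an invented value that does not appear in the sample data.

```javascript
// Hypothetical helper: average only the numeric values, ignoring everything else.
// Returns null when there is nothing numeric to average.
function avg(values) {
  const nums = values.filter((v) => typeof v === "number");
  if (nums.length === 0) return null;
  return nums.reduce((a, b) => a + b, 0) / nums.length;
}

const p002Prices = avg([8, 4, 10]);         // prices of p002 in the sample data
const withString = avg([8, "n/a", 4, 10]);  // the string is skipped
const noNumbers = avg(["n/a"]);             // nothing numeric to average

console.log(p002Prices, withString, noNumbers);
```

The second call gives the same average as the first because the string neither adds to the sum nor counts toward the divisor.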

【$push】

$push appends the value of the given expression to an array. The resulting document must not exceed the 16MB BSON document size limit, or an error occurs.

1. Group by product number and collect the quantities sold for each product into an array

    > db.items.aggregate([{$group:{_id:"$pnumber",quantitys:{$push:"$quantity"}}}])
    { "_id" : "p001", "quantitys" : [ 2, 10 ] }
    { "_id" : "p002", "quantitys" : [ 2, 1, 5 ] }
    { "_id" : "p003", "quantitys" : [ 2, 4, 10 ] }

2. The pushed value can also be an embedded document, here holding both quantity and price

    > db.items.aggregate([{$group:{_id:"$pnumber",quantitys:{$push:{quantity:"$quantity",price:"$price"}}}}])
    { "_id" : "p001", "quantitys" : [ { "quantity" : 2, "price" : 4 }, { "quantity" : 10, "price" : 20 } ] }
    { "_id" : "p002", "quantitys" : [ { "quantity" : 2, "price" : 8 }, { "quantity" : 1, "price" : 4 }, { "quantity" : 5, "price" : 10 } ] }
    { "_id" : "p003", "quantitys" : [ { "quantity" : 2, "price" : 5 }, { "quantity" : 4, "price" : 10 }, { "quantity" : 10, "price" : 20 } ] }

【$addToSet】

$addToSet adds the value of the expression to an array with no duplicate values. The resulting document must not exceed the 16MB BSON document size limit, or an error occurs. The order of the elements in the array is not specified.

    > db.items.aggregate([{$group:{_id:"$pnumber",quantitys:{$addToSet:"$quantity"}}}])
    { "_id" : "p001", "quantitys" : [ 10, 2 ] }
    { "_id" : "p002", "quantitys" : [ 5, 1, 2 ] }
    { "_id" : "p003", "quantitys" : [ 10, 4, 2 ] }
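In the sample data no product has the same quantity twice, so $push and $addToSet return the same values. The difference only shows up with duplicates, as in this plain JavaScript sketch. The input array is invented for the illustration, and a `Set` stands in for $addToSet.

```javascript
// Invented quantities for one group, with repeated values.
const quantities = [2, 4, 2, 10, 4];

// $push keeps every value, duplicates included, in input order.
const pushed = quantities.slice();

// $addToSet keeps each distinct value once. The element order in MongoDB's
// result is unspecified, so this sketch sorts before comparing.
const unique = [...new Set(quantities)];
const uniqueSorted = [...unique].sort((a, b) => a - b);

console.log(pushed);       // [ 2, 4, 2, 10, 4 ]
console.log(uniqueSorted); // [ 2, 4, 10 ]
```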

【$first, $last】

$first: returns the value from the first document in each group. If the documents have been sorted, the sort order is used; otherwise they are taken in the order they are stored.

$last: returns the value from the last document in each group. If the documents have been sorted, the sort order is used; otherwise they are taken in the order they are stored.

    > db.items.aggregate([{$group:{_id:"$pnumber",quantityFirst:{$first:"$quantity"}}}])
    { "_id" : "p001", "quantityFirst" : 2 }
    { "_id" : "p002", "quantityFirst" : 2 }
    { "_id" : "p003", "quantityFirst" : 2 }
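The example above shows only $first. The sketch below, in plain JavaScript, adds $last and shows how sorting the input before grouping changes both values. `firstAndLast` is a hypothetical helper, and sorting by quantity in descending order is an assumption chosen for the illustration.

```javascript
// pnumber and quantity of the eight sample documents, in insertion order.
const items = [
  ["p003", 2], ["p002", 2], ["p002", 1], ["p001", 2],
  ["p003", 4], ["p001", 10], ["p003", 10], ["p002", 5],
].map(([pnumber, quantity]) => ({ pnumber, quantity }));

// Hypothetical helper: remember the first and last quantity seen for each pnumber.
function firstAndLast(docs) {
  const groups = new Map();
  for (const { pnumber, quantity } of docs) {
    const group = groups.get(pnumber);
    if (group) group.last = quantity;
    else groups.set(pnumber, { _id: pnumber, first: quantity, last: quantity });
  }
  return [...groups.values()];
}

// Without sorting, "first" and "last" follow the input order.
const unsorted = firstAndLast(items);

// With the input sorted by quantity descending, "first" is the largest
// quantity of each product and "last" is the smallest.
const sorted = firstAndLast([...items].sort((a, b) => b.quantity - a.quantity));

console.log(unsorted);
console.log(sorted);
```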

This article covered the basic $group operations of the aggregate pipeline. Later articles cover the other pipeline stages and the options parameter.
