Understanding Go Internals: GC Principles and Source Code Analysis
Go's runtime plays a role similar to the Java virtual machine: it manages memory allocation, garbage collection, stack handling, goroutines, channels, slices, maps, reflection, and more. Go executables are much larger than their corresponding source files because the runtime is embedded into every executable.
Common GC algorithms:
Reference counting: each object carries a reference count; when an object that references it is destroyed, the count is decremented, and when the count reaches 0 the object is reclaimed.
Pros: objects are reclaimed promptly; collection does not have to wait until memory is exhausted or some threshold is reached.
Cons: cycles of references cannot be handled well, and maintaining the counts in real time has a cost of its own.
Representative languages: Python, PHP, Swift (a toy sketch of the idea follows below)
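Go itself does not use reference counting, but here is a hedged toy sketch of the idea (the Resource type and its Retain/Release methods are invented purely for illustration):

package main

import "fmt"

// Resource carries an explicit reference count; when it drops to zero the
// payload is "freed" immediately, without waiting for a collector.
type Resource struct {
	refs    int
	payload []byte
}

func (r *Resource) Retain() { r.refs++ }

func (r *Resource) Release() {
	r.refs--
	if r.refs == 0 {
		r.payload = nil // reclaim right away
		fmt.Println("freed")
	}
}

func main() {
	r := &Resource{refs: 1, payload: make([]byte, 1<<20)}
	r.Retain()  // a second owner appears
	r.Release() // first owner done
	r.Release() // last owner done -> "freed" printed here
	// Note: if two Resources pointed at each other, their counts would never
	// reach zero -- the cyclic-reference problem mentioned above.
}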
Mark-sweep: starting from the root variables, traverse all reachable objects and mark them as "referenced"; objects left unmarked are reclaimed.
Pros: avoids the drawbacks of reference counting.
Cons: requires STW, i.e. the program has to be paused temporarily.
Representative language: Go (which uses tri-color marking)
Generational collection: the heap is divided into generations by object lifetime; long-lived objects go into the old generation and short-lived ones into the young generation, and different generations use different collection algorithms and frequencies.
Pros: good collection performance.
Cons: complex algorithm.
Representative language: Java
No algorithm is perfect; each is a trade-off.
GC flow:
Stack scan: collect the root objects (global variables and the G stacks) and enable the write barrier. Scanning globals and enabling the write barrier require STW; scanning a G's stack only requires stopping that G, so the pause is short.
Mark: scan all root objects and every object reachable from them, marking them so they will not be collected.
Mark Termination: finish the marking work and rescan some of the root objects (requires STW).
Sweep: sweep the spans according to the mark results.
As the flow above shows, a full GC cycle performs two STW (Stop The World) pauses: the first at the beginning of the Mark phase and the second in the Mark Termination phase.
The first STW prepares the root-object scan and enables the write barrier and mutator assist (assist GC).
The second STW rescans some of the root objects and disables the write barrier and mutator assist.
Note that not all root scanning requires STW; for example, scanning the objects on a stack only requires stopping the G that owns that stack.
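The effect of these pauses can be observed from ordinary application code. As a hedged illustration (standard-library APIs, not code from this article), runtime.ReadMemStats exposes the number of completed cycles and the recent STW pause times; running a program with GODEBUG=gctrace=1 additionally prints a per-cycle summary that breaks out the STW phases.

package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Allocate something so at least one GC cycle has work to do.
	data := make([][]byte, 0, 1024)
	for i := 0; i < 1024; i++ {
		data = append(data, make([]byte, 1<<16))
	}
	runtime.GC() // force a cycle so the stats below are populated

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Println("completed GC cycles:", m.NumGC)
	// PauseNs is a circular buffer of recent STW pause times; the most
	// recent entry is at index (NumGC+255)%256.
	fmt.Println("last STW pause (ns):", m.PauseNs[(m.NumGC+255)%256])
}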
Tri-color marking
There are three sets, black, grey, and white, with the following meanings:
White: the object has not been marked; its bit in gcmarkBits is 0.
Grey: the object has been marked, but the objects it references have not been; its bit in gcmarkBits is 1.
Black: the object has been marked and so have all the objects it references; its bit in gcmarkBits is 1.
Both grey and black objects have their gcmarkBits set to 1, so how are the two distinguished?
Marking uses a mark queue: marked objects sitting in the queue are grey, marked objects no longer in the queue are black. The marking process is shown below:
(figure: tri-color marking process)
In the figure above, root object A is allocated on the stack and H is a global variable allocated on the heap. Roots A and H each reference other objects, which may in turn reference further objects; the relationships between the objects are as shown.
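As a rough, self-contained sketch of the idea (not the runtime's actual data structures; the Object type, the mark function, and the slice used as a work queue below are made up for illustration), tri-color marking boils down to a mark bit plus a queue of grey objects:

package main

import "fmt"

// Toy tri-color marking: the mark bit separates white (unmarked) from marked
// objects, and membership in the work queue separates grey (marked, children
// not yet scanned) from black (marked and scanned).
type Object struct {
	marked bool      // corresponds to the object's gcmarkBits bit
	refs   []*Object // objects this object points to
}

func mark(roots []*Object) {
	grey := make([]*Object, 0) // the mark/work queue

	shade := func(o *Object) {
		if o != nil && !o.marked {
			o.marked = true        // white -> grey
			grey = append(grey, o) // enqueue for scanning
		}
	}
	for _, r := range roots {
		shade(r)
	}
	for len(grey) > 0 {
		obj := grey[len(grey)-1]
		grey = grey[:len(grey)-1] // dequeue: obj becomes black
		for _, child := range obj.refs {
			shade(child)
		}
	}
	// After marking, any object with marked == false is garbage.
}

func main() {
	a := &Object{}
	b := &Object{refs: []*Object{a}}
	a.refs = []*Object{b} // a cycle is collected correctly once unreachable
	root := &Object{refs: []*Object{b}}
	mark([]*Object{root})
	fmt.Println(a.marked, b.marked) // true true
}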
Write barriers
(figure: a black object B acquiring a reference to a white object G)
In the figure above, suppose that after object B has turned black, B is made to point to object G. B has already been scanned, so G will never be visited again; it stays white and would be wrongly reclaimed. How do we solve this?
The simplest answer is STW (stop the world): stop all goroutines. That is brute force and causes noticeable pauses, so it is not friendly. Instead, if the collector guarantees that at least one of the following two conditions always holds, no live object can be lost. This leads to the strong and weak tri-color invariants:
Strong tri-color invariant: a black object must never reference a white object.
Weak tri-color invariant: every white object referenced by a black object is protected by (reachable from) some grey object.
How are these two invariants enforced? Through the barrier mechanism.
Go 1.5 used the insertion write barrier; since Go 1.8 the runtime uses a hybrid barrier that combines the insertion and deletion barriers. A black object's memory slots live in two kinds of places, the stack and the heap. Stacks are small but must be very fast because function calls push and pop them constantly, so the insertion barrier is not applied to writes into stack objects; it is only applied to writes into heap objects.
Insertion barrier: applies only to heap writes. The stacks are scanned once concurrently and then rescanned under STW at the end of marking. Objects allocated on the heap while the insertion barrier is active are automatically shaded grey, so they never need to be rescanned.
Deletion barrier: applies to both the stack and the heap. When a reference to an object is deleted, the object is shaded grey, and its children will still be scanned later. Its weakness is lower precision: objects may be kept alive one cycle longer than necessary. A pseudocode sketch of both barriers follows.
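A hedged pseudocode sketch in Go syntax, reusing the hypothetical Object type and shade helper from the tri-color sketch above (the runtime's real barriers are emitted by the compiler and implemented inside the runtime, not written like this):

// Dijkstra-style insertion barrier: shade the new pointee before the write,
// so a black object can never end up pointing at an unprotected white object
// (enforces the strong tri-color invariant).
func writeWithInsertionBarrier(slot **Object, ptr *Object) {
	shade(ptr) // the object being installed becomes grey
	*slot = ptr
}

// Yuasa-style deletion barrier: shade the pointee being overwritten or
// removed, so everything reachable at the start of the cycle stays under grey
// protection (enforces the weak tri-color invariant, snapshot-at-the-beginning).
func writeWithDeletionBarrier(slot **Object, ptr *Object) {
	shade(*slot) // the object being dropped becomes grey
	*slot = ptr
}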
Hybrid barrier:
Shortcomings of the insertion and deletion write barriers:
Insertion write barrier: at the end of marking, an STW pause is needed to rescan the stacks and mark the white objects they reference as live.
Deletion write barrier: lower collection precision; at the start of GC an STW pause scans the stacks to record an initial snapshot, which keeps every object live at that moment alive.
Hybrid write barrier rules
Concretely:
1. When GC starts, all objects on the stacks are scanned and marked black (no second rescan is needed later, so no STW for it).
2. Any new object created on a stack during GC is black.
3. An object whose reference is deleted is shaded grey.
4. An object whose reference is added is shaded grey.
This satisfies a variant of the weak tri-color invariant.
Pseudocode:
writePointer(slot, ptr):
    shade(*slot)   // 1. shade the object currently in the slot (it is being moved away)
    shade(ptr)     // 2. shade the new object being stored
    *slot = ptr    // 3. perform the actual pointer write

As noted above, a full GC cycle has two STW pauses; with the hybrid barrier the second pause no longer has to rescan the stacks, so it becomes much shorter.
GC pacer
When GC is triggered:
gcTriggerHeap (threshold): by default, a new cycle starts when the heap has doubled since the last one.
gcTriggerTime (periodic): a cycle is forced if none has run for 2 minutes; see forcegcperiod in src/runtime/proc.go.
gcTriggerCycle (manual): runtime.GC().
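A hedged usage sketch with standard-library APIs (not code from the article): the growth threshold can be tuned with the GOGC environment variable or runtime/debug.SetGCPercent, and a cycle can be forced with runtime.GC:

package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

func main() {
	// Equivalent to running with GOGC=50: trigger the next cycle when the
	// heap has grown 50% beyond the live heap of the previous cycle.
	old := debug.SetGCPercent(50)
	fmt.Println("previous GOGC value:", old)

	// gcTriggerCycle: force a collection right now.
	runtime.GC()

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Println("next GC goal (bytes):", m.NextGC)
}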
The threshold, of course, moves dynamically as memory use grows. If the live heap after the previous GC, Hm(n-1), is 1 GB and the default GOGC=100 is in effect, the next cycle will be started close to the goal Hg (2 GB), as shown below:
GC starts at Ht and finishes at Ha, with Ha very close to Hg.
(1) How do we guarantee that all spans have been swept by the time GC starts at Ht?
Besides the background sweeper goroutine, memory allocation also performs assist sweeping, which guarantees that every span is swept before the next cycle begins. Suppose k pages of spans still need sweeping and Ht - Hm(n-1) bytes can still be allocated before the next GC; then on average each allocated byte must sweep k / (Ht - Hm(n-1)) pages of spans (k is adjusted as sweeping progresses).
Assist sweeping is only checked when a new span is requested; the trigger can be seen in the cacheSpan function. When it fires, the G helps sweep a "workload" of pages, computed as:
spanBytes * sweepPagesPerByte, i.e. the allocation size multiplied by the coefficient sweepPagesPerByte. sweepPagesPerByte is computed in gcSetTriggerRatio as follows:
// current heap size
heapLiveBasis := atomic.Load64(&memstats.heap_live)
// distance to the GC trigger = heap size that triggers the next GC - current heap size
heapDistance := int64(trigger) - int64(heapLiveBasis)
heapDistance -= 1024 * 1024
if heapDistance < _PageSize {
    heapDistance = _PageSize
}
// pages already swept
pagesSwept := atomic.Load64(&mheap_.pagesSwept)
// pages not yet swept = pages in use - pages already swept
sweepDistancePages := int64(mheap_.pagesInUse) - int64(pagesSwept)
if sweepDistancePages <= 0 {
    mheap_.sweepPagesPerByte = 0
} else {
    // pages to assist-sweep per allocated byte of span = unswept pages / distance to the GC trigger
    mheap_.sweepPagesPerByte = float64(sweepDistancePages) / float64(heapDistance)
}
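A small hedged worked example with made-up numbers, only to give the formula some scale (none of these values come from the article):

package main

import "fmt"

func main() {
	unsweptPages := 8192.0              // assumed: pages still to be swept
	heapDistance := 768.0 * 1024 * 1024 // assumed: bytes left until the GC trigger
	sweepPagesPerByte := unsweptPages / heapDistance
	// Allocating a 64 KB span then owes roughly two thirds of a page of sweeping.
	fmt.Printf("sweepPagesPerByte=%.2g, assist for a 64KB span=%.2f pages\n",
		sweepPagesPerByte, 64*1024*sweepPagesPerByte)
}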
(2) How do we guarantee that marking is finished by Ha?
GC starts at Ht and tries to finish marking all objects before the heap reaches Hg. Besides the background mark workers, allocation also performs assist marking. Between Ht and Hg there are Hg - Ht bytes left to allocate and scanWorkExpected units of scan work remaining, so on average each allocated byte owes scanWorkExpected / (Hg - Ht) units of assist marking (scanWorkExpected is adjusted as marking progresses).
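The same kind of hedged back-of-the-envelope calculation for mark assist (again with assumed numbers, not values from the article):

package main

import "fmt"

func main() {
	scanWorkExpected := 400.0 * 1024 * 1024 // assumed: scan work (bytes) still to do
	heapDistance := 1024.0 * 1024 * 1024    // assumed: bytes between Ht and Hg
	assistWorkPerByte := scanWorkExpected / heapDistance
	// A goroutine that has allocated 1 MB since its last assist owes about 0.39 MB of scan work.
	fmt.Printf("assistWorkPerByte=%.2f, debt for a 1MB allocation=%.2f MB\n",
		assistWorkPerByte, 1.0*assistWorkPerByte)
}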
The assist-mark trigger can be seen in the mallocgc function (covered in the memory allocation part); when it fires, the G helps scan a "workload" of objects, computed as:
debtBytes * assistWorkPerByte, i.e. the allocation size multiplied by the coefficient assistWorkPerByte. assistWorkPerByte is computed in the revise function as follows:
// remaining scan work = scannable heap size - scan work already done
scanWorkExpected := int64(memstats.heap_scan) - c.scanWork
if scanWorkExpected < 1000 {
    scanWorkExpected = 1000
}
// distance to the heap goal = expected goal heap size - current heap size
// note: next_gc is computed differently from gc_trigger;
// next_gc equals heap_marked * (1 + gcpercent/100)
heapDistance := int64(memstats.next_gc) - int64(atomic.Load64(&memstats.heap_live))
if heapDistance <= 0 {
    heapDistance = 1
}
// scan work owed per allocated byte = remaining scan work / distance to the heap goal
c.assistWorkPerByte = float64(scanWorkExpected) / float64(heapDistance)
c.assistBytesPerWork = float64(heapDistance) / float64(scanWorkExpected)

Root objects
The first things marked in the mark phase are the "root objects"; everything reachable from the roots is considered live.
The roots include the global variables and the variables on each G's stack; GC scans the roots first and then everything reachable from them.
Fixed Roots: special scanning work:
fixedRootFinalizers: scan the finalizer queue
fixedRootFreeGStacks: free the stacks of terminated Gs
Flush Cache Roots: release all spans held in the mcaches (requires STW)
Data Roots: scan the readable-and-writable global variables
BSS Roots: scan the read-only global variables
Span Roots: scan the special objects in each span (finalizer lists)
Stack Roots: scan each G's stack
The mark phase (Mark) performs "Fixed Roots", "Data Roots", "BSS Roots", "Span Roots" and "Stack Roots".
The mark termination phase (Mark Termination) performs "Fixed Roots" and "Flush Cache Roots".
Object scanning
Given a pointer p to an object, how do we find the object's span and its heap bitmap bits? The analysis below is based on Go 1.10.
As introduced in the memory allocation part, the heap bitmap uses 2 bits per word, so one byte describes 4 words: one bit records whether scanning should continue and the other whether the word holds a pointer. From an address p, a fixed-offset computation yields the corresponding heapBits:
func heapBitsForAddr(addr uintptr) heapBits {
    // 2 bits per word, 4 pairs per byte, and a mask is hard coded.
    off := (addr - mheap_.arena_start) / sys.PtrSize
    return heapBits{(*uint8)(unsafe.Pointer(mheap_.bitmap - off/4 - 1)), uint32(off & 3)}
}

Finding the span for p is even simpler: as described earlier, the spans area records the span of every page, so dividing p's offset by the page size gives the page index and thus the span pointer:
mheap_.spans[(p-mheap_.arena_start)>>_PageShift]

The analysis below applies to Go 1.11 and later.
Go 1.11 and later manage the heap through a sparse index instead, which allows more than 512 GB of memory and lets the address space grow non-contiguously. The global mheap struct holds a two-level arenas array; on linux/amd64 the first level has a single slot and the second level has 4M slots, each pointing to a heapArena structure that manages 64 MB of memory. Go can therefore manage 4M * 64 MB = 256 TB, the entire range addressable by the 48-bit address bus of today's 64-bit machines. Adding a fixed offset to a pointer tells you which 64 MB heapArena it belongs to; the remainder within that 64 MB, combined with the arena's spans array, tells you which mspan it belongs to; and the heapArena's bitmap, indexed by the 8-byte-aligned offset within the arena, records whether each 8-byte word of the object is a pointer or plain data.
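A hedged, simplified sketch of that lookup using the linux/amd64 constants (the helper names, and the omission of arenaBaseOffset and the L1/L2 index split, are my own simplifications, not the runtime's actual code):

package main

import "fmt"

// Simplified linux/amd64 constants: each heapArena manages 64 MB, split into 8 KB pages.
const (
	heapArenaBytes = 64 << 20
	pageSize       = 8 << 10
)

// Which heapArena does an address fall into, and which page (and hence which
// mspan, via that arena's spans array) inside it?
func arenaIndex(addr uintptr) uintptr { return addr / heapArenaBytes }
func spanIndex(addr uintptr) uintptr  { return (addr % heapArenaBytes) / pageSize }

func main() {
	addr := uintptr(0xc000123456) // a typical Go heap address
	fmt.Println("arena:", arenaIndex(addr), "page within arena:", spanIndex(addr))
}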
Source code analysis
The source walkthrough below is taken from https://www.cnblogs.com/zkweb/p/7880099.html, which covers it in great detail:
Triggering GC in Go starts from the gcStart function:
// gcStart transitions the GC from _GCoff to _GCmark (if // !mode.stwMark) or _GCmarktermination (if mode.stwMark) by // performing sweep termination and GC initialization. // // This may return without performing this transition in some cases, // such as when called on a system stack or with locks held. func gcStart(mode gcMode, trigger gcTrigger) {// 判斷當前G是否可搶占, 不可搶占時不觸發GC// Since this is called from malloc and malloc is called in// the guts of a number of libraries that might be holding// locks, don't attempt to start GC in non-preemptible or// potentially unstable situations.mp := acquirem()if gp := getg(); gp == mp.g0 || mp.locks > 1 || mp.preemptoff != "" {releasem(mp)return}releasem(mp)mp = nil// 并行清掃上一輪GC未清掃的span// Pick up the remaining unswept/not being swept spans concurrently//// This shouldn't happen if we're being invoked in background// mode since proportional sweep should have just finished// sweeping everything, but rounding errors, etc, may leave a// few spans unswept. In forced mode, this is necessary since// GC can be forced at any point in the sweeping cycle.//// We check the transition condition continuously here in case// this G gets delayed in to the next GC cycle.for trigger.test() && gosweepone() != ^uintptr(0) {sweep.nbgsweep++}// 上鎖, 然后重新檢查gcTrigger的條件是否成立, 不成立時不觸發GC// Perform GC initialization and the sweep termination// transition.semacquire(&work.startSema)// Re-check transition condition under transition lock.if !trigger.test() {semrelease(&work.startSema)return}// 記錄是否強制觸發, gcTriggerCycle是runtime.GC用的// For stats, check if this GC was forced by the user.work.userForced = trigger.kind == gcTriggerAlways || trigger.kind == gcTriggerCycle// 判斷是否指定了禁止并行GC的參數// In gcstoptheworld debug mode, upgrade the mode accordingly.// We do this after re-checking the transition condition so// that multiple goroutines that detect the heap trigger don't// start multiple STW GCs.if mode == gcBackgroundMode {if debug.gcstoptheworld == 1 {mode = gcForceMode} else if debug.gcstoptheworld == 2 {mode = gcForceBlockMode}}// Ok, we're doing it! Stop everybody elsesemacquire(&worldsema)// 跟蹤處理if trace.enabled {traceGCStart()}// 啟動后臺掃描任務(G)if mode == gcBackgroundMode {gcBgMarkStartWorkers()}// 重置標記相關的狀態gcResetMarkState()// 重置參數work.stwprocs, work.maxprocs = gcprocs(), gomaxprocswork.heap0 = atomic.Load64(&memstats.heap_live)work.pauseNS = 0work.mode = mode// 記錄開始時間now := nanotime()work.tSweepTerm = nowwork.pauseStart = now// 停止所有運行中的G, 并禁止它們運行systemstack(stopTheWorldWithSema)// !!!!!!!!!!!!!!!!// 世界已停止(STW)...// !!!!!!!!!!!!!!!!// 清掃上一輪GC未清掃的span, 確保上一輪GC已完成// Finish sweep before we start concurrent scan.systemstack(func() {finishsweep_m()})// 清掃sched.sudogcache和sched.deferpool// clearpools before we start the GC. If we wait they memory will not be// reclaimed until the next GC cycle.clearpools()// 增加GC計數work.cycles++// 判斷是否并行GC模式if mode == gcBackgroundMode { // Do as much work concurrently as possible// 標記新一輪GC已開始gcController.startCycle()work.heapGoal = memstats.next_gc// 設置全局變量中的GC狀態為_GCmark// 然后啟用寫屏障// Enter concurrent mark phase and enable// write barriers.//// Because the world is stopped, all Ps will// observe that write barriers are enabled by// the time we start the world and begin// scanning.//// Write barriers must be enabled before assists are// enabled because they must be enabled before// any non-leaf heap objects are marked. 
Since// allocations are blocked until assists can// happen, we want enable assists as early as// possible.setGCPhase(_GCmark)// 重置后臺標記任務的計數gcBgMarkPrepare() // Must happen before assist enable.// 計算掃描根對象的任務數量gcMarkRootPrepare()// 標記所有tiny alloc等待合并的對象// Mark all active tinyalloc blocks. Since we're// allocating from these, they need to be black like// other allocations. The alternative is to blacken// the tiny block on every allocation from it, which// would slow down the tiny allocator.gcMarkTinyAllocs()// 啟用輔助GC// At this point all Ps have enabled the write// barrier, thus maintaining the no white to// black invariant. Enable mutator assists to// put back-pressure on fast allocating// mutators.atomic.Store(&gcBlackenEnabled, 1)// 記錄標記開始的時間// Assists and workers can start the moment we start// the world.gcController.markStartTime = now// 重新啟動世界// 前面創建的后臺標記任務會開始工作, 所有后臺標記任務都完成工作后, 進入完成標記階段// Concurrent mark.systemstack(startTheWorldWithSema)// !!!!!!!!!!!!!!!// 世界已重新啟動...// !!!!!!!!!!!!!!!// 記錄停止了多久, 和標記階段開始的時間now = nanotime()work.pauseNS += now - work.pauseStartwork.tMark = now} else {// 不是并行GC模式// 記錄完成標記階段開始的時間t := nanotime()work.tMark, work.tMarkTerm = t, twork.heapGoal = work.heap0// 跳過標記階段, 執行完成標記階段// 所有標記工作都會在世界已停止的狀態執行// (標記階段會設置work.markrootDone=true, 如果跳過則它的值是false, 完成標記階段會執行所有工作)// 完成標記階段會重新啟動世界// Perform mark termination. This will restart the world.gcMarkTermination(memstats.triggerRatio)}semrelease(&work.startSema) }接下來一個個分析gcStart調用的函數, 建議配合上面的"回收對象的流程"中的圖理解.
The gcBgMarkStartWorkers function starts the background mark workers, one for each P:
// gcBgMarkStartWorkers prepares background mark worker goroutines. // These goroutines will not run until the mark phase, but they must // be started while the work is not stopped and from a regular G // stack. The caller must hold worldsema. func gcBgMarkStartWorkers() {// Background marking is performed by per-P G's. Ensure that// each P has a background GC G.for _, p := range &allp {if p == nil || p.status == _Pdead {break}// 如果已啟動則不重復啟動if p.gcBgMarkWorker == 0 {go gcBgMarkWorker(p)// 啟動后等待該任務通知信號量bgMarkReady再繼續notetsleepg(&work.bgMarkReady, -1)noteclear(&work.bgMarkReady)}} }這里雖然為每個P啟動了一個后臺標記任務, 但是可以同時工作的只有25%, 這個邏輯在協程M獲取G時調用的findRunnableGCWorker中:
// findRunnableGCWorker returns the background mark worker for _p_ if it // should be run. This must only be called when gcBlackenEnabled != 0. func (c *gcControllerState) findRunnableGCWorker(_p_ *p) *g {if gcBlackenEnabled == 0 {throw("gcControllerState.findRunnable: blackening not enabled")}if _p_.gcBgMarkWorker == 0 {// The mark worker associated with this P is blocked// performing a mark transition. We can't run it// because it may be on some other run or wait queue.return nil}if !gcMarkWorkAvailable(_p_) {// No work to be done right now. This can happen at// the end of the mark phase when there are still// assists tapering off. Don't bother running a worker// now because it'll just return immediately.return nil}// 原子減少對應的值, 如果減少后大于等于0則返回true, 否則返回falsedecIfPositive := func(ptr *int64) bool {if *ptr > 0 {if atomic.Xaddint64(ptr, -1) >= 0 {return true}// We lost a raceatomic.Xaddint64(ptr, +1)}return false}// 減少dedicatedMarkWorkersNeeded, 成功時后臺標記任務的模式是Dedicated// dedicatedMarkWorkersNeeded是當前P的數量的25%去除小數點// 詳見startCycle函數if decIfPositive(&c.dedicatedMarkWorkersNeeded) {// This P is now dedicated to marking until the end of// the concurrent mark phase._p_.gcMarkWorkerMode = gcMarkWorkerDedicatedMode} else {// 減少fractionalMarkWorkersNeeded, 成功是后臺標記任務的模式是Fractional// 上面的計算如果小數點后有數值(不能夠整除)則fractionalMarkWorkersNeeded為1, 否則為0// 詳見startCycle函數// 舉例來說, 4個P時會執行1個Dedicated模式的任務, 5個P時會執行1個Dedicated模式和1個Fractional模式的任務if !decIfPositive(&c.fractionalMarkWorkersNeeded) {// No more workers are need right now.return nil}// 按Dedicated模式的任務的執行時間判斷cpu占用率是否超過預算值, 超過時不啟動// This P has picked the token for the fractional worker.// Is the GC currently under or at the utilization goal?// If so, do more work.//// We used to check whether doing one time slice of work// would remain under the utilization goal, but that has the// effect of delaying work until the mutator has run for// enough time slices to pay for the work. During those time// slices, write barriers are enabled, so the mutator is running slower.// Now instead we do the work whenever we're under or at the// utilization work and pay for it by letting the mutator run later.// This doesn't change the overall utilization averages, but it// front loads the GC work so that the GC finishes earlier and// write barriers can be turned off sooner, effectively giving// the mutator a faster machine.//// The old, slower behavior can be restored by setting// gcForcePreemptNS = forcePreemptNS.const gcForcePreemptNS = 0// TODO(austin): We could fast path this and basically// eliminate contention on c.fractionalMarkWorkersNeeded by// precomputing the minimum time at which it's worth// next scheduling the fractional worker. Then Ps// don't have to fight in the window where we've// passed that deadline and no one has started the// worker yet.//// TODO(austin): Shorter preemption interval for mark// worker to improve fairness and give this// finer-grained control over schedule?now := nanotime() - gcController.markStartTimethen := now + gcForcePreemptNStimeUsed := c.fractionalMarkTime + gcForcePreemptNSif then > 0 && float64(timeUsed)/float64(then) > c.fractionalUtilizationGoal {// Nope, we'd overshoot the utilization goalatomic.Xaddint64(&c.fractionalMarkWorkersNeeded, +1)return nil}_p_.gcMarkWorkerMode = gcMarkWorkerFractionalMode}// 安排后臺標記任務執行// Run the background mark workergp := _p_.gcBgMarkWorker.ptr()casgstatus(gp, _Gwaiting, _Grunnable)if trace.enabled {traceGoUnpark(gp, 0)}return gp }gcResetMarkState函數會重置標記相關的狀態:
// gcResetMarkState resets global state prior to marking (concurrent // or STW) and resets the stack scan state of all Gs. // // This is safe to do without the world stopped because any Gs created // during or after this will start out in the reset state. func gcResetMarkState() {// This may be called during a concurrent phase, so make sure// allgs doesn't change.lock(&allglock)for _, gp := range allgs {gp.gcscandone = false // set to true in gcphaseworkgp.gcscanvalid = false // stack has not been scannedgp.gcAssistBytes = 0}unlock(&allglock)work.bytesMarked = 0work.initialHeapLive = atomic.Load64(&memstats.heap_live)work.markrootDone = false }stopTheWorldWithSema函數會停止整個世界, 這個函數必須在g0中運行:
// stopTheWorldWithSema is the core implementation of stopTheWorld. // The caller is responsible for acquiring worldsema and disabling // preemption first and then should stopTheWorldWithSema on the system // stack: // // semacquire(&worldsema, 0) // m.preemptoff = "reason" // systemstack(stopTheWorldWithSema) // // When finished, the caller must either call startTheWorld or undo // these three operations separately: // // m.preemptoff = "" // systemstack(startTheWorldWithSema) // semrelease(&worldsema) // // It is allowed to acquire worldsema once and then execute multiple // startTheWorldWithSema/stopTheWorldWithSema pairs. // Other P's are able to execute between successive calls to // startTheWorldWithSema and stopTheWorldWithSema. // Holding worldsema causes any other goroutines invoking // stopTheWorld to block. func stopTheWorldWithSema() {_g_ := getg()// If we hold a lock, then we won't be able to stop another M// that is blocked trying to acquire the lock.if _g_.m.locks > 0 {throw("stopTheWorld: holding locks")}lock(&sched.lock)// 需要停止的P數量sched.stopwait = gomaxprocs// 設置gc等待標記, 調度時看見此標記會進入等待atomic.Store(&sched.gcwaiting, 1)// 搶占所有運行中的Gpreemptall()// 停止當前的P// stop current P_g_.m.p.ptr().status = _Pgcstop // Pgcstop is only diagnostic.// 減少需要停止的P數量(當前的P算一個)sched.stopwait--// 搶占所有在Psyscall狀態的P, 防止它們重新參與調度// try to retake all P's in Psyscall statusfor i := 0; i < int(gomaxprocs); i++ {p := allp[i]s := p.statusif s == _Psyscall && atomic.Cas(&p.status, s, _Pgcstop) {if trace.enabled {traceGoSysBlock(p)traceProcStop(p)}p.syscalltick++sched.stopwait--}}// 防止所有空閑的P重新參與調度// stop idle P'sfor {p := pidleget()if p == nil {break}p.status = _Pgcstopsched.stopwait--}wait := sched.stopwait > 0unlock(&sched.lock)// 如果仍有需要停止的P, 則等待它們停止// wait for remaining P's to stop voluntarilyif wait {for {// 循環等待 + 搶占所有運行中的G// wait for 100us, then try to re-preempt in case of any racesif notetsleep(&sched.stopnote, 100*1000) {noteclear(&sched.stopnote)break}preemptall()}}// 邏輯正確性檢查// sanity checksbad := ""if sched.stopwait != 0 {bad = "stopTheWorld: not stopped (stopwait != 0)"} else {for i := 0; i < int(gomaxprocs); i++ {p := allp[i]if p.status != _Pgcstop {bad = "stopTheWorld: not stopped (status != _Pgcstop)"}}}if atomic.Load(&freezing) != 0 {// Some other thread is panicking. This can cause the// sanity checks above to fail if the panic happens in// the signal handler on a stopped thread. Either way,// we should halt this thread.lock(&deadlock)lock(&deadlock)}if bad != "" {throw(bad)}// 到這里所有運行中的G都會變為待運行, 并且所有的P都不能被M獲取// 也就是說所有的go代碼(除了當前的)都會停止運行, 并且不能運行新的go代碼 }finishsweep_m函數會清掃上一輪GC未清掃的span, 確保上一輪GC已完成:
// finishsweep_m ensures that all spans are swept. // // The world must be stopped. This ensures there are no sweeps in // progress. // //go:nowritebarrier func finishsweep_m() {// sweepone會取出一個未sweep的span然后執行sweep// 詳細將在下面sweep階段時分析// Sweeping must be complete before marking commences, so// sweep any unswept spans. If this is a concurrent GC, there// shouldn't be any spans left to sweep, so this should finish// instantly. If GC was forced before the concurrent sweep// finished, there may be spans to sweep.for sweepone() != ^uintptr(0) {sweep.npausesweep++}// 所有span都sweep完成后, 啟動一個新的markbit時代// 這個函數是實現span的gcmarkBits和allocBits的分配和復用的關鍵, 流程如下// - span分配gcmarkBits和allocBits// - span完成sweep// - 原allocBits不再被使用// - gcmarkBits變為allocBits// - 分配新的gcmarkBits// - 開啟新的markbit時代// - span完成sweep, 同上// - 開啟新的markbit時代// - 2個時代之前的bitmap將不再被使用, 可以復用這些bitmapnextMarkBitArenaEpoch() }clearpools函數會清理sched.sudogcache和sched.deferpool, 讓它們的內存可以被回收:
func clearpools() {// clear sync.Poolsif poolcleanup != nil {poolcleanup()}// Clear central sudog cache.// Leave per-P caches alone, they have strictly bounded size.// Disconnect cached list before dropping it on the floor,// so that a dangling ref to one entry does not pin all of them.lock(&sched.sudoglock)var sg, sgnext *sudogfor sg = sched.sudogcache; sg != nil; sg = sgnext {sgnext = sg.nextsg.next = nil}sched.sudogcache = nilunlock(&sched.sudoglock)// Clear central defer pools.// Leave per-P pools alone, they have strictly bounded size.lock(&sched.deferlock)for i := range sched.deferpool {// disconnect cached list before dropping it on the floor,// so that a dangling ref to one entry does not pin all of them.var d, dlink *_deferfor d = sched.deferpool[i]; d != nil; d = dlink {dlink = d.linkd.link = nil}sched.deferpool[i] = nil}unlock(&sched.deferlock) }startCycle標記開始了新一輪的GC:
// startCycle resets the GC controller's state and computes estimates // for a new GC cycle. The caller must hold worldsema. func (c *gcControllerState) startCycle() {c.scanWork = 0c.bgScanCredit = 0c.assistTime = 0c.dedicatedMarkTime = 0c.fractionalMarkTime = 0c.idleMarkTime = 0// 偽裝heap_marked的值如果gc_trigger的值很小, 防止后面對triggerRatio做出錯誤的調整// If this is the first GC cycle or we're operating on a very// small heap, fake heap_marked so it looks like gc_trigger is// the appropriate growth from heap_marked, even though the// real heap_marked may not have a meaningful value (on the// first cycle) or may be much smaller (resulting in a large// error response).if memstats.gc_trigger <= heapminimum {memstats.heap_marked = uint64(float64(memstats.gc_trigger) / (1 + memstats.triggerRatio))}// 重新計算next_gc, 注意next_gc的計算跟gc_trigger不一樣// Re-compute the heap goal for this cycle in case something// changed. This is the same calculation we use elsewhere.memstats.next_gc = memstats.heap_marked + memstats.heap_marked*uint64(gcpercent)/100if gcpercent < 0 {memstats.next_gc = ^uint64(0)}// 確保next_gc和heap_live之間最少有1MB// Ensure that the heap goal is at least a little larger than// the current live heap size. This may not be the case if GC// start is delayed or if the allocation that pushed heap_live// over gc_trigger is large or if the trigger is really close to// GOGC. Assist is proportional to this distance, so enforce a// minimum distance, even if it means going over the GOGC goal// by a tiny bit.if memstats.next_gc < memstats.heap_live+1024*1024 {memstats.next_gc = memstats.heap_live + 1024*1024}// 計算可以同時執行的后臺標記任務的數量// dedicatedMarkWorkersNeeded等于P的數量的25%去除小數點// 如果可以整除則fractionalMarkWorkersNeeded等于0否則等于1// totalUtilizationGoal是GC所占的P的目標值(例如P一共有5個時目標是1.25個P)// fractionalUtilizationGoal是Fractiona模式的任務所占的P的目標值(例如P一共有5個時目標是0.25個P)// Compute the total mark utilization goal and divide it among// dedicated and fractional workers.totalUtilizationGoal := float64(gomaxprocs) * gcGoalUtilizationc.dedicatedMarkWorkersNeeded = int64(totalUtilizationGoal)c.fractionalUtilizationGoal = totalUtilizationGoal - float64(c.dedicatedMarkWorkersNeeded)if c.fractionalUtilizationGoal > 0 {c.fractionalMarkWorkersNeeded = 1} else {c.fractionalMarkWorkersNeeded = 0}// 重置P中的輔助GC所用的時間統計// Clear per-P statefor _, p := range &allp {if p == nil {break}p.gcAssistTime = 0}// 計算輔助GC的參數// 參考上面對計算assistWorkPerByte的公式的分析// Compute initial values for controls that are updated// throughout the cycle.c.revise()if debug.gcpacertrace > 0 {print("pacer: assist ratio=", c.assistWorkPerByte," (scan ", memstats.heap_scan>>20, " MB in ",work.initialHeapLive>>20, "->",memstats.next_gc>>20, " MB)"," workers=", c.dedicatedMarkWorkersNeeded,"+", c.fractionalMarkWorkersNeeded, "\n")} }setGCPhase函數會修改表示當前GC階段的全局變量和是否開啟寫屏障的全局變量:
//go:nosplit func setGCPhase(x uint32) {atomic.Store(&gcphase, x)writeBarrier.needed = gcphase == _GCmark || gcphase == _GCmarkterminationwriteBarrier.enabled = writeBarrier.needed || writeBarrier.cgo }gcBgMarkPrepare函數會重置后臺標記任務的計數:
// gcBgMarkPrepare sets up state for background marking. // Mutator assists must not yet be enabled. func gcBgMarkPrepare() {// Background marking will stop when the work queues are empty// and there are no more workers (note that, since this is// concurrent, this may be a transient state, but mark// termination will clean it up). Between background workers// and assists, we don't really know how many workers there// will be, so we pretend to have an arbitrarily large number// of workers, almost all of which are "waiting". While a// worker is working it decrements nwait. If nproc == nwait,// there are no workers.work.nproc = ^uint32(0)work.nwait = ^uint32(0) }gcMarkRootPrepare函數會計算掃描根對象的任務數量:
// gcMarkRootPrepare queues root scanning jobs (stacks, globals, and // some miscellany) and initializes scanning-related state. // // The caller must have call gcCopySpans(). // // The world must be stopped. // //go:nowritebarrier func gcMarkRootPrepare() {// 釋放mcache中的所有span的任務, 只在完成標記階段(mark termination)中執行if gcphase == _GCmarktermination {work.nFlushCacheRoots = int(gomaxprocs)} else {work.nFlushCacheRoots = 0}// 計算block數量的函數, rootBlockBytes是256KB// Compute how many data and BSS root blocks there are.nBlocks := func(bytes uintptr) int {return int((bytes + rootBlockBytes - 1) / rootBlockBytes)}work.nDataRoots = 0work.nBSSRoots = 0// data和bss每一輪GC只掃描一次// 并行GC中會在后臺標記任務中掃描, 完成標記階段(mark termination)中不掃描// 非并行GC會在完成標記階段(mark termination)中掃描// Only scan globals once per cycle; preferably concurrently.if !work.markrootDone {// 計算掃描可讀寫的全局變量的任務數量for _, datap := range activeModules() {nDataRoots := nBlocks(datap.edata - datap.data)if nDataRoots > work.nDataRoots {work.nDataRoots = nDataRoots}}// 計算掃描只讀的全局變量的任務數量for _, datap := range activeModules() {nBSSRoots := nBlocks(datap.ebss - datap.bss)if nBSSRoots > work.nBSSRoots {work.nBSSRoots = nBSSRoots}}}// span中的finalizer和各個G的棧每一輪GC只掃描一次// 同上if !work.markrootDone {// 計算掃描span中的finalizer的任務數量// On the first markroot, we need to scan span roots.// In concurrent GC, this happens during concurrent// mark and we depend on addfinalizer to ensure the// above invariants for objects that get finalizers// after concurrent mark. In STW GC, this will happen// during mark termination.//// We're only interested in scanning the in-use spans,// which will all be swept at this point. More spans// may be added to this list during concurrent GC, but// we only care about spans that were allocated before// this mark phase.work.nSpanRoots = mheap_.sweepSpans[mheap_.sweepgen/2%2].numBlocks()// 計算掃描各個G的棧的任務數量// On the first markroot, we need to scan all Gs. Gs// may be created after this point, but it's okay that// we ignore them because they begin life without any// roots, so there's nothing to scan, and any roots// they create during the concurrent phase will be// scanned during mark termination. During mark// termination, allglen isn't changing, so we'll scan// all Gs.work.nStackRoots = int(atomic.Loaduintptr(&allglen))} else {// We've already scanned span roots and kept the scan// up-to-date during concurrent mark.work.nSpanRoots = 0// The hybrid barrier ensures that stacks can't// contain pointers to unmarked objects, so on the// second markroot, there's no need to scan stacks.work.nStackRoots = 0if debug.gcrescanstacks > 0 {// Scan stacks anyway for debugging.work.nStackRoots = int(atomic.Loaduintptr(&allglen))}}// 計算總任務數量// 后臺標記任務會對markrootNext進行原子遞增, 來決定做哪個任務// 這種用數值來實現鎖自由隊列的辦法挺聰明的, 盡管google工程師覺得不好(看后面markroot函數的分析)work.markrootNext = 0work.markrootJobs = uint32(fixedRootCount + work.nFlushCacheRoots + work.nDataRoots + work.nBSSRoots + work.nSpanRoots + work.nStackRoots) }gcMarkTinyAllocs函數會標記所有tiny alloc等待合并的對象:
// gcMarkTinyAllocs greys all active tiny alloc blocks. // // The world must be stopped. func gcMarkTinyAllocs() {for _, p := range &allp {if p == nil || p.status == _Pdead {break}c := p.mcacheif c == nil || c.tiny == 0 {continue}// 標記各個P中的mcache中的tiny// 在上面的mallocgc函數中可以看到tiny是當前等待合并的對象_, hbits, span, objIndex := heapBitsForObject(c.tiny, 0, 0)gcw := &p.gcw// 標記一個對象存活, 并把它加到標記隊列(該對象變為灰色)greyobject(c.tiny, 0, 0, hbits, span, gcw, objIndex)// gcBlackenPromptly變量表示當前是否禁止本地隊列, 如果已禁止則把標記任務flush到全局隊列if gcBlackenPromptly {gcw.dispose()}} }startTheWorldWithSema函數會重新啟動世界:
func startTheWorldWithSema() {_g_ := getg()// 禁止G被搶占_g_.m.locks++ // disable preemption because it can be holding p in a local var// 判斷收到的網絡事件(fd可讀可寫或錯誤)并添加對應的G到待運行隊列gp := netpoll(false) // non-blockinginjectglist(gp)// 判斷是否要啟動gc helperadd := needaddgcproc()lock(&sched.lock)// 如果要求改變gomaxprocs則調整P的數量// procresize會返回有可運行任務的P的鏈表procs := gomaxprocsif newprocs != 0 {procs = newprocsnewprocs = 0}p1 := procresize(procs)// 取消GC等待標記sched.gcwaiting = 0// 如果sysmon在等待則喚醒它if sched.sysmonwait != 0 {sched.sysmonwait = 0notewakeup(&sched.sysmonnote)}unlock(&sched.lock)// 喚醒有可運行任務的Pfor p1 != nil {p := p1p1 = p1.link.ptr()if p.m != 0 {mp := p.m.ptr()p.m = 0if mp.nextp != 0 {throw("startTheWorld: inconsistent mp->nextp")}mp.nextp.set(p)notewakeup(&mp.park)} else {// Start M to run P. Do not start another M below.newm(nil, p)add = false}}// 如果有空閑的P,并且沒有自旋中的M則喚醒或者創建一個M// Wakeup an additional proc in case we have excessive runnable goroutines// in local queues or in the global queue. If we don't, the proc will park itself.// If we have lots of excessive work, resetspinning will unpark additional procs as necessary.if atomic.Load(&sched.npidle) != 0 && atomic.Load(&sched.nmspinning) == 0 {wakep()}// 啟動gc helperif add {// If GC could have used another helper proc, start one now,// in the hope that it will be available next time.// It would have been even better to start it before the collection,// but doing so requires allocating memory, so it's tricky to// coordinate. This lazy approach works out in practice:// we don't mind if the first couple gc rounds don't have quite// the maximum number of procs.newm(mhelpgc, nil)}// 允許G被搶占_g_.m.locks--// 如果當前G要求被搶占則重新嘗試if _g_.m.locks == 0 && _g_.preempt { // restore the preemption request in case we've cleared it in newstack_g_.stackguard0 = stackPreempt} }重啟世界后各個M會重新開始調度, 調度時會優先使用上面提到的findRunnableGCWorker函數查找任務, 之后就有大約25%的P運行后臺標記任務.
后臺標記任務的函數是gcBgMarkWorker:
gcDrain函數用于執行標記:
// gcDrain scans roots and objects in work buffers, blackening grey // objects until all roots and work buffers have been drained. // // If flags&gcDrainUntilPreempt != 0, gcDrain returns when g.preempt // is set. This implies gcDrainNoBlock. // // If flags&gcDrainIdle != 0, gcDrain returns when there is other work // to do. This implies gcDrainNoBlock. // // If flags&gcDrainNoBlock != 0, gcDrain returns as soon as it is // unable to get more work. Otherwise, it will block until all // blocking calls are blocked in gcDrain. // // If flags&gcDrainFlushBgCredit != 0, gcDrain flushes scan work // credit to gcController.bgScanCredit every gcCreditSlack units of // scan work. // //go:nowritebarrier func gcDrain(gcw *gcWork, flags gcDrainFlags) {if !writeBarrier.needed {throw("gcDrain phase incorrect")}gp := getg().m.curg// 看到搶占標志時是否要返回preemptible := flags&gcDrainUntilPreempt != 0// 沒有任務時是否要等待任務blocking := flags&(gcDrainUntilPreempt|gcDrainIdle|gcDrainNoBlock) == 0// 是否計算后臺的掃描量來減少輔助GC和喚醒等待中的GflushBgCredit := flags&gcDrainFlushBgCredit != 0// 是否只執行一定量的工作idle := flags&gcDrainIdle != 0// 記錄初始的已掃描數量initScanWork := gcw.scanWork// 掃描idleCheckThreshold(100000)個對象以后檢查是否要返回// idleCheck is the scan work at which to perform the next// idle check with the scheduler.idleCheck := initScanWork + idleCheckThreshold// 如果根對象未掃描完, 則先掃描根對象// Drain root marking jobs.if work.markrootNext < work.markrootJobs {// 如果標記了preemptible, 循環直到被搶占for !(preemptible && gp.preempt) {// 從根對象掃描隊列取出一個值(原子遞增)job := atomic.Xadd(&work.markrootNext, +1) - 1if job >= work.markrootJobs {break}// 執行根對象掃描工作markroot(gcw, job)// 如果是idle模式并且有其他工作, 則返回if idle && pollWork() {goto done}}}// 根對象已經在標記隊列中, 消費標記隊列// 如果標記了preemptible, 循環直到被搶占// Drain heap marking jobs.for !(preemptible && gp.preempt) {// 如果全局標記隊列為空, 把本地標記隊列的一部分工作分過去// (如果wbuf2不為空則移動wbuf2過去, 否則移動wbuf1的一半過去)// Try to keep work available on the global queue. We used to// check if there were waiting workers, but it's better to// just keep work available than to make workers wait. In the// worst case, we'll do O(log(_WorkbufSize)) unnecessary// balances.if work.full == 0 {gcw.balance()}// 從本地標記隊列中獲取對象, 獲取不到則從全局標記隊列獲取var b uintptrif blocking {// 阻塞獲取b = gcw.get()} else {// 非阻塞獲取b = gcw.tryGetFast()if b == 0 {b = gcw.tryGet()}}// 獲取不到對象, 標記隊列已為空, 跳出循環if b == 0 {// work barrier reached or tryGet failed.break}// 掃描獲取到的對象scanobject(b, gcw)// 如果已經掃描了一定數量的對象(gcCreditSlack的值是2000)// Flush background scan work credit to the global// account if we've accumulated enough locally so// mutator assists can draw on it.if gcw.scanWork >= gcCreditSlack {// 把掃描的對象數量添加到全局atomic.Xaddint64(&gcController.scanWork, gcw.scanWork)// 減少輔助GC的工作量和喚醒等待中的Gif flushBgCredit {gcFlushBgCredit(gcw.scanWork - initScanWork)initScanWork = 0}idleCheck -= gcw.scanWorkgcw.scanWork = 0// 如果是idle模式且達到了檢查的掃描量, 則檢查是否有其他任務(G), 如果有則跳出循環if idle && idleCheck <= 0 {idleCheck += idleCheckThresholdif pollWork() {break}}}}// In blocking mode, write barriers are not allowed after this// point because we must preserve the condition that the work// buffers are empty.done:// 把掃描的對象數量添加到全局// Flush remaining scan work credit.if gcw.scanWork > 0 {atomic.Xaddint64(&gcController.scanWork, gcw.scanWork)// 減少輔助GC的工作量和喚醒等待中的Gif flushBgCredit {gcFlushBgCredit(gcw.scanWork - initScanWork)}gcw.scanWork = 0} }markroot函數用于執行根對象掃描工作:
// markroot scans the i'th root. // // Preemption must be disabled (because this uses a gcWork). // // nowritebarrier is only advisory here. // //go:nowritebarrier func markroot(gcw *gcWork, i uint32) {// 判斷取出的數值對應哪種任務// (google的工程師覺得這種辦法可笑)// TODO(austin): This is a bit ridiculous. Compute and store// the bases in gcMarkRootPrepare instead of the counts.baseFlushCache := uint32(fixedRootCount)baseData := baseFlushCache + uint32(work.nFlushCacheRoots)baseBSS := baseData + uint32(work.nDataRoots)baseSpans := baseBSS + uint32(work.nBSSRoots)baseStacks := baseSpans + uint32(work.nSpanRoots)end := baseStacks + uint32(work.nStackRoots)// Note: if you add a case here, please also update heapdump.go:dumproots.switch {// 釋放mcache中的所有span, 要求STWcase baseFlushCache <= i && i < baseData:flushmcache(int(i - baseFlushCache))// 掃描可讀寫的全局變量// 這里只會掃描i對應的block, 掃描時傳入包含哪里有指針的bitmap數據case baseData <= i && i < baseBSS:for _, datap := range activeModules() {markrootBlock(datap.data, datap.edata-datap.data, datap.gcdatamask.bytedata, gcw, int(i-baseData))}// 掃描只讀的全局變量// 這里只會掃描i對應的block, 掃描時傳入包含哪里有指針的bitmap數據case baseBSS <= i && i < baseSpans:for _, datap := range activeModules() {markrootBlock(datap.bss, datap.ebss-datap.bss, datap.gcbssmask.bytedata, gcw, int(i-baseBSS))}// 掃描析構器隊列case i == fixedRootFinalizers:// Only do this once per GC cycle since we don't call// queuefinalizer during marking.if work.markrootDone {break}for fb := allfin; fb != nil; fb = fb.alllink {cnt := uintptr(atomic.Load(&fb.cnt))scanblock(uintptr(unsafe.Pointer(&fb.fin[0])), cnt*unsafe.Sizeof(fb.fin[0]), &finptrmask[0], gcw)}// 釋放已中止的G的棧case i == fixedRootFreeGStacks:// Only do this once per GC cycle; preferably// concurrently.if !work.markrootDone {// Switch to the system stack so we can call// stackfree.systemstack(markrootFreeGStacks)}// 掃描各個span中特殊對象(析構器列表)case baseSpans <= i && i < baseStacks:// mark MSpan.specialsmarkrootSpans(gcw, int(i-baseSpans))// 掃描各個G的棧default:// 獲取需要掃描的G// the rest is scanning goroutine stacksvar gp *gif baseStacks <= i && i < end {gp = allgs[i-baseStacks]} else {throw("markroot: bad index")}// 記錄等待開始的時間// remember when we've first observed the G blocked// needed only to output in tracebackstatus := readgstatus(gp) // We are not in a scan stateif (status == _Gwaiting || status == _Gsyscall) && gp.waitsince == 0 {gp.waitsince = work.tstart}// 切換到g0運行(有可能會掃到自己的棧)// scang must be done on the system stack in case// we're trying to scan our own stack.systemstack(func() {// 判斷掃描的棧是否自己的// If this is a self-scan, put the user G in// _Gwaiting to prevent self-deadlock. It may// already be in _Gwaiting if this is a mark// worker or we're in mark termination.userG := getg().m.curgselfScan := gp == userG && readgstatus(userG) == _Grunning// 如果正在掃描自己的棧則切換狀態到等待中防止死鎖if selfScan {casgstatus(userG, _Grunning, _Gwaiting)userG.waitreason = "garbage collection scan"}// 掃描G的棧// TODO: scang blocks until gp's stack has// been scanned, which may take a while for// running goroutines. Consider doing this in// two phases where the first is non-blocking:// we scan the stacks we can and ask running// goroutines to scan themselves; and the// second blocks.scang(gp, gcw)// 如果正在掃描自己的棧則把狀態切換回運行中if selfScan {casgstatus(userG, _Gwaiting, _Grunning)}})} }scang函數負責掃描G的棧:
// scang blocks until gp's stack has been scanned. // It might be scanned by scang or it might be scanned by the goroutine itself. // Either way, the stack scan has completed when scang returns. func scang(gp *g, gcw *gcWork) {// Invariant; we (the caller, markroot for a specific goroutine) own gp.gcscandone.// Nothing is racing with us now, but gcscandone might be set to true left over// from an earlier round of stack scanning (we scan twice per GC).// We use gcscandone to record whether the scan has been done during this round.// 標記掃描未完成gp.gcscandone = false// See http://golang.org/cl/21503 for justification of the yield delay.const yieldDelay = 10 * 1000var nextYield int64// 循環直到掃描完成// Endeavor to get gcscandone set to true,// either by doing the stack scan ourselves or by coercing gp to scan itself.// gp.gcscandone can transition from false to true when we're not looking// (if we asked for preemption), so any time we lock the status using// castogscanstatus we have to double-check that the scan is still not done. loop:for i := 0; !gp.gcscandone; i++ {// 判斷G的當前狀態switch s := readgstatus(gp); s {default:dumpgstatus(gp)throw("stopg: invalid status")// G已中止, 不需要掃描它case _Gdead:// No stack.gp.gcscandone = truebreak loop// G的棧正在擴展, 下一輪重試case _Gcopystack:// Stack being switched. Go around again.// G不是運行中, 首先需要防止它運行case _Grunnable, _Gsyscall, _Gwaiting:// Claim goroutine by setting scan bit.// Racing with execution or readying of gp.// The scan bit keeps them from running// the goroutine until we're done.if castogscanstatus(gp, s, s|_Gscan) {// 原子切換狀態成功時掃描它的棧if !gp.gcscandone {scanstack(gp, gcw)gp.gcscandone = true}// 恢復G的狀態, 并跳出循環restartg(gp)break loop}// G正在掃描它自己, 等待掃描完畢case _Gscanwaiting:// newstack is doing a scan for us right now. Wait.// G正在運行case _Grunning:// Goroutine running. Try to preempt execution so it can scan itself.// The preemption handler (in newstack) does the actual scan.// 如果已經有搶占請求, 則搶占成功時會幫我們處理// Optimization: if there is already a pending preemption request// (from the previous loop iteration), don't bother with the atomics.if gp.preemptscan && gp.preempt && gp.stackguard0 == stackPreempt {break}// 搶占G, 搶占成功時G會掃描它自己// Ask for preemption and self scan.if castogscanstatus(gp, _Grunning, _Gscanrunning) {if !gp.gcscandone {gp.preemptscan = truegp.preempt = truegp.stackguard0 = stackPreempt}casfrom_Gscanstatus(gp, _Gscanrunning, _Grunning)}}// 第一輪休眠10毫秒, 第二輪休眠5毫秒if i == 0 {nextYield = nanotime() + yieldDelay}if nanotime() < nextYield {procyield(10)} else {osyield()nextYield = nanotime() + yieldDelay/2}}// 掃描完成, 取消搶占掃描的請求gp.preemptscan = false // cancel scan request if no longer needed }設置preemptscan后, 在搶占G成功時會調用scanstack掃描它自己的棧, 具體代碼在這里.
Stacks are scanned with the scanstack function:
scanblock is a general-purpose scanning function used for both global variables and stack space; unlike scanobject, the pointer bitmap has to be passed in explicitly:
// scanblock scans b as scanobject would, but using an explicit // pointer bitmap instead of the heap bitmap. // // This is used to scan non-heap roots, so it does not update // gcw.bytesMarked or gcw.scanWork. // //go:nowritebarrier func scanblock(b0, n0 uintptr, ptrmask *uint8, gcw *gcWork) {// Use local copies of original parameters, so that a stack trace// due to one of the throws below shows the original block// base and extent.b := b0n := n0arena_start := mheap_.arena_startarena_used := mheap_.arena_used// 枚舉掃描的地址for i := uintptr(0); i < n; {// 找到bitmap中對應的byte// Find bits for the next word.bits := uint32(*addb(ptrmask, i/(sys.PtrSize*8)))if bits == 0 {i += sys.PtrSize * 8continue}// 枚舉bytefor j := 0; j < 8 && i < n; j++ {// 如果該地址包含指針if bits&1 != 0 {// 標記在該地址的對象存活, 并把它加到標記隊列(該對象變為灰色)// Same work as in scanobject; see comments there.obj := *(*uintptr)(unsafe.Pointer(b + i))if obj != 0 && arena_start <= obj && obj < arena_used {// 找到該對象對應的span和bitmapif obj, hbits, span, objIndex := heapBitsForObject(obj, b, i); obj != 0 {// 標記一個對象存活, 并把它加到標記隊列(該對象變為灰色)greyobject(obj, b, i, hbits, span, gcw, objIndex)}}}// 處理下一個指針下一個bitbits >>= 1i += sys.PtrSize}} }greyobject用于標記一個對象存活, 并把它加到標記隊列(該對象變為灰色):
// obj is the start of an object with mark mbits. // If it isn't already marked, mark it and enqueue into gcw. // base and off are for debugging only and could be removed. //go:nowritebarrierrec func greyobject(obj, base, off uintptr, hbits heapBits, span *mspan, gcw *gcWork, objIndex uintptr) {// obj should be start of allocation, and so must be at least pointer-aligned.if obj&(sys.PtrSize-1) != 0 {throw("greyobject: obj not pointer-aligned")}mbits := span.markBitsForIndex(objIndex)if useCheckmark {// checkmark是用于檢查是否所有可到達的對象都被正確標記的機制, 僅除錯使用if !mbits.isMarked() {printlock()print("runtime:greyobject: checkmarks finds unexpected unmarked object obj=", hex(obj), "\n")print("runtime: found obj at *(", hex(base), "+", hex(off), ")\n")// Dump the source (base) objectgcDumpObject("base", base, off)// Dump the objectgcDumpObject("obj", obj, ^uintptr(0))getg().m.traceback = 2throw("checkmark found unmarked object")}if hbits.isCheckmarked(span.elemsize) {return}hbits.setCheckmarked(span.elemsize)if !hbits.isCheckmarked(span.elemsize) {throw("setCheckmarked and isCheckmarked disagree")}} else {if debug.gccheckmark > 0 && span.isFree(objIndex) {print("runtime: marking free object ", hex(obj), " found at *(", hex(base), "+", hex(off), ")\n")gcDumpObject("base", base, off)gcDumpObject("obj", obj, ^uintptr(0))getg().m.traceback = 2throw("marking free object")}// 如果對象所在的span中的gcmarkBits對應的bit已經設置為1則可以跳過處理// If marked we have nothing to do.if mbits.isMarked() {return}// 設置對象所在的span中的gcmarkBits對應的bit為1// mbits.setMarked() // Avoid extra call overhead with manual inlining.atomic.Or8(mbits.bytep, mbits.mask)// 如果確定對象不包含指針(所在span的類型是noscan), 則不需要把對象放入標記隊列// If this is a noscan object, fast-track it to black// instead of greying it.if span.spanclass.noscan() {gcw.bytesMarked += uint64(span.elemsize)return}}// 把對象放入標記隊列// 先放入本地標記隊列, 失敗時把本地標記隊列中的部分工作轉移到全局標記隊列, 再放入本地標記隊列// Queue the obj for scanning. The PREFETCH(obj) logic has been removed but// seems like a nice optimization that can be added back in.// There needs to be time between the PREFETCH and the use.// Previously we put the obj in an 8 element buffer that is drained at a rate// to give the PREFETCH time to do its work.// Use of PREFETCHNTA might be more appropriate than PREFETCHif !gcw.putFast(obj) {gcw.put(obj)} }gcDrain函數掃描完根對象, 就會開始消費標記隊列, 對從標記隊列中取出的對象調用scanobject函數:
// scanobject scans the object starting at b, adding pointers to gcw. // b must point to the beginning of a heap object or an oblet. // scanobject consults the GC bitmap for the pointer mask and the // spans for the size of the object. // //go:nowritebarrier func scanobject(b uintptr, gcw *gcWork) {// Note that arena_used may change concurrently during// scanobject and hence scanobject may encounter a pointer to// a newly allocated heap object that is *not* in// [start,used). It will not mark this object; however, we// know that it was just installed by a mutator, which means// that mutator will execute a write barrier and take care of// marking it. This is even more pronounced on relaxed memory// architectures since we access arena_used without barriers// or synchronization, but the same logic applies.arena_start := mheap_.arena_startarena_used := mheap_.arena_used// Find the bits for b and the size of the object at b.//// b is either the beginning of an object, in which case this// is the size of the object to scan, or it points to an// oblet, in which case we compute the size to scan below.// 獲取對象對應的bitmaphbits := heapBitsForAddr(b)// 獲取對象所在的spans := spanOfUnchecked(b)// 獲取對象的大小n := s.elemsizeif n == 0 {throw("scanobject n == 0")}// 對象大小過大時(maxObletBytes是128KB)需要分割掃描// 每次最多只掃描128KBif n > maxObletBytes {// Large object. Break into oblets for better// parallelism and lower latency.if b == s.base() {// It's possible this is a noscan object (not// from greyobject, but from other code// paths), in which case we must *not* enqueue// oblets since their bitmaps will be// uninitialized.if s.spanclass.noscan() {// Bypass the whole scan.gcw.bytesMarked += uint64(n)return}// Enqueue the other oblets to scan later.// Some oblets may be in b's scalar tail, but// these will be marked as "no more pointers",// so we'll drop out immediately when we go to// scan those.for oblet := b + maxObletBytes; oblet < s.base()+s.elemsize; oblet += maxObletBytes {if !gcw.putFast(oblet) {gcw.put(oblet)}}}// Compute the size of the oblet. Since this object// must be a large object, s.base() is the beginning// of the object.n = s.base() + s.elemsize - bif n > maxObletBytes {n = maxObletBytes}}// 掃描對象中的指針var i uintptrfor i = 0; i < n; i += sys.PtrSize {// 獲取對應的bit// Find bits for this word.if i != 0 {// Avoid needless hbits.next() on last iteration.hbits = hbits.next()}// Load bits once. See CL 22712 and issue 16973 for discussion.bits := hbits.bits()// 檢查scan bit判斷是否繼續掃描, 注意第二個scan bit是checkmark// During checkmarking, 1-word objects store the checkmark// in the type bit for the one word. 
The only one-word objects// are pointers, or else they'd be merged with other non-pointer// data into larger allocations.if i != 1*sys.PtrSize && bits&bitScan == 0 {break // no more pointers in this object}// 檢查pointer bit, 不是指針則繼續if bits&bitPointer == 0 {continue // not a pointer}// 取出指針的值// Work here is duplicated in scanblock and above.// If you make changes here, make changes there too.obj := *(*uintptr)(unsafe.Pointer(b + i))// 如果指針在arena區域中, 則調用greyobject標記對象并把對象放到標記隊列中// At this point we have extracted the next potential pointer.// Check if it points into heap and not back at the current object.if obj != 0 && arena_start <= obj && obj < arena_used && obj-b >= n {// Mark the object.if obj, hbits, span, objIndex := heapBitsForObject(obj, b, i); obj != 0 {greyobject(obj, b, i, hbits, span, gcw, objIndex)}}}// 統計掃描過的大小和對象數量gcw.bytesMarked += uint64(n)gcw.scanWork += int64(i) }在所有后臺標記任務都把標記隊列消費完畢時, 會執行gcMarkDone函數準備進入完成標記階段(mark termination):
In concurrent GC, gcMarkDone runs twice: the first time it disables the local mark queues and restarts the background mark workers; the second time it enters mark termination.
The gcMarkTermination function carries out the mark termination phase:
func gcMarkTermination(nextTriggerRatio float64) {// World is stopped.// Start marktermination which includes enabling the write barrier.// 禁止輔助GC和后臺標記任務的運行atomic.Store(&gcBlackenEnabled, 0)// 重新允許本地標記隊列(下次GC使用)gcBlackenPromptly = false// 設置當前GC階段到完成標記階段, 并啟用寫屏障setGCPhase(_GCmarktermination)// 記錄開始時間work.heap1 = memstats.heap_livestartTime := nanotime()// 禁止G被搶占mp := acquirem()mp.preemptoff = "gcing"_g_ := getg()_g_.m.traceback = 2// 設置G的狀態為等待中這樣它的棧可以被掃描gp := _g_.m.curgcasgstatus(gp, _Grunning, _Gwaiting)gp.waitreason = "garbage collection"// 切換到g0運行// Run gc on the g0 stack. We do this so that the g stack// we're currently running on will no longer change. Cuts// the root set down a bit (g0 stacks are not scanned, and// we don't need to scan gc's internal state). We also// need to switch to g0 so we can shrink the stack.systemstack(func() {// 開始STW中的標記gcMark(startTime)// 必須立刻返回, 因為外面的G的棧有可能被移動, 不能在這之后訪問外面的變量// Must return immediately.// The outer function's stack may have moved// during gcMark (it shrinks stacks, including the// outer function's stack), so we must not refer// to any of its variables. Return back to the// non-system stack to pick up the new addresses// before continuing.})// 重新切換到g0運行systemstack(func() {work.heap2 = work.bytesMarked// 如果啟用了checkmark則執行檢查, 檢查是否所有可到達的對象都有標記if debug.gccheckmark > 0 {// Run a full stop-the-world mark using checkmark bits,// to check that we didn't forget to mark anything during// the concurrent mark process.gcResetMarkState()initCheckmarks()gcMark(startTime)clearCheckmarks()}// 設置當前GC階段到關閉, 并禁用寫屏障// marking is complete so we can turn the write barrier offsetGCPhase(_GCoff)// 喚醒后臺清掃任務, 將在STW結束后開始運行gcSweep(work.mode)// 除錯用if debug.gctrace > 1 {startTime = nanotime()// The g stacks have been scanned so// they have gcscanvalid==true and gcworkdone==true.// Reset these so that all stacks will be rescanned.gcResetMarkState()finishsweep_m()// Still in STW but gcphase is _GCoff, reset to _GCmarktermination// At this point all objects will be found during the gcMark which// does a complete STW mark and object scan.setGCPhase(_GCmarktermination)gcMark(startTime)setGCPhase(_GCoff) // marking is done, turn off wb.gcSweep(work.mode)}})// 設置G的狀態為運行中_g_.m.traceback = 0casgstatus(gp, _Gwaiting, _Grunning)// 跟蹤處理if trace.enabled {traceGCDone()}// all donemp.preemptoff = ""if gcphase != _GCoff {throw("gc done but gcphase != _GCoff")}// 更新下一次觸發gc需要的heap大小(gc_trigger)// Update GC trigger and pacing for the next cycle.gcSetTriggerRatio(nextTriggerRatio)// 更新用時記錄// Update timing memstatsnow := nanotime()sec, nsec, _ := time_now()unixNow := sec*1e9 + int64(nsec)work.pauseNS += now - work.pauseStartwork.tEnd = nowatomic.Store64(&memstats.last_gc_unix, uint64(unixNow)) // must be Unix time to make sense to useratomic.Store64(&memstats.last_gc_nanotime, uint64(now)) // monotonic time for usmemstats.pause_ns[memstats.numgc%uint32(len(memstats.pause_ns))] = uint64(work.pauseNS)memstats.pause_end[memstats.numgc%uint32(len(memstats.pause_end))] = uint64(unixNow)memstats.pause_total_ns += uint64(work.pauseNS)// 更新所用cpu記錄// Update work.totaltime.sweepTermCpu := int64(work.stwprocs) * (work.tMark - work.tSweepTerm)// We report idle marking time below, but omit it from the// overall utilization here since it's "free".markCpu := gcController.assistTime + gcController.dedicatedMarkTime + gcController.fractionalMarkTimemarkTermCpu := int64(work.stwprocs) * (work.tEnd - work.tMarkTerm)cycleCpu := sweepTermCpu + markCpu + markTermCpuwork.totaltime += cycleCpu// Compute overall GC 
CPU utilization.totalCpu := sched.totaltime + (now-sched.procresizetime)*int64(gomaxprocs)memstats.gc_cpu_fraction = float64(work.totaltime) / float64(totalCpu)// 重置清掃狀態// Reset sweep state.sweep.nbgsweep = 0sweep.npausesweep = 0// 統計強制開始GC的次數if work.userForced {memstats.numforcedgc++}// 統計執行GC的次數然后喚醒等待清掃的G// Bump GC cycle count and wake goroutines waiting on sweep.lock(&work.sweepWaiters.lock)memstats.numgc++injectglist(work.sweepWaiters.head.ptr())work.sweepWaiters.head = 0unlock(&work.sweepWaiters.lock)// 性能統計用// Finish the current heap profiling cycle and start a new// heap profiling cycle. We do this before starting the world// so events don't leak into the wrong cycle.mProf_NextCycle()// 重新啟動世界systemstack(startTheWorldWithSema)// !!!!!!!!!!!!!!!// 世界已重新啟動...// !!!!!!!!!!!!!!!// 性能統計用// Flush the heap profile so we can start a new cycle next GC.// This is relatively expensive, so we don't do it with the// world stopped.mProf_Flush()// 移動標記隊列使用的緩沖區到自由列表, 使得它們可以被回收// Prepare workbufs for freeing by the sweeper. We do this// asynchronously because it can take non-trivial time.prepareFreeWorkbufs()// 釋放未使用的棧// Free stack spans. This must be done between GC cycles.systemstack(freeStackSpans)// 除錯用// Print gctrace before dropping worldsema. As soon as we drop// worldsema another cycle could start and smash the stats// we're trying to print.if debug.gctrace > 0 {util := int(memstats.gc_cpu_fraction * 100)var sbuf [24]byteprintlock()print("gc ", memstats.numgc," @", string(itoaDiv(sbuf[:], uint64(work.tSweepTerm-runtimeInitTime)/1e6, 3)), "s ",util, "%: ")prev := work.tSweepTermfor i, ns := range []int64{work.tMark, work.tMarkTerm, work.tEnd} {if i != 0 {print("+")}print(string(fmtNSAsMS(sbuf[:], uint64(ns-prev))))prev = ns}print(" ms clock, ")for i, ns := range []int64{sweepTermCpu, gcController.assistTime, gcController.dedicatedMarkTime + gcController.fractionalMarkTime, gcController.idleMarkTime, markTermCpu} {if i == 2 || i == 3 {// Separate mark time components with /.print("/")} else if i != 0 {print("+")}print(string(fmtNSAsMS(sbuf[:], uint64(ns))))}print(" ms cpu, ",work.heap0>>20, "->", work.heap1>>20, "->", work.heap2>>20, " MB, ",work.heapGoal>>20, " MB goal, ",work.maxprocs, " P")if work.userForced {print(" (forced)")}print("\n")printunlock()}semrelease(&worldsema)// Careful: another GC cycle may start now.// 重新允許當前的G被搶占releasem(mp)mp = nil// 如果是并行GC, 讓當前M繼續運行(會回到gcBgMarkWorker然后休眠)// 如果不是并行GC, 則讓當前M開始調度// now that gc is done, kick off finalizer thread if neededif !concurrentSweep {// give the queued finalizers, if any, a chance to runGosched()} }gcSweep函數會喚醒后臺清掃任務:
后臺清掃任務會在程序啟動時調用的gcenable函數中啟動.
func gcSweep(mode gcMode) {if gcphase != _GCoff {throw("gcSweep being done but phase is not GCoff")}// 增加sweepgen, 這樣sweepSpans中兩個隊列角色會交換, 所有span都會變為"待清掃"的spanlock(&mheap_.lock)mheap_.sweepgen += 2mheap_.sweepdone = 0if mheap_.sweepSpans[mheap_.sweepgen/2%2].index != 0 {// We should have drained this list during the last// sweep phase. We certainly need to start this phase// with an empty swept list.throw("non-empty swept list")}mheap_.pagesSwept = 0unlock(&mheap_.lock)// 如果非并行GC則在這里完成所有工作(STW中)if !_ConcurrentSweep || mode == gcForceBlockMode {// Special case synchronous sweep.// Record that no proportional sweeping has to happen.lock(&mheap_.lock)mheap_.sweepPagesPerByte = 0unlock(&mheap_.lock)// Sweep all spans eagerly.for sweepone() != ^uintptr(0) {sweep.npausesweep++}// Free workbufs eagerly.prepareFreeWorkbufs()for freeSomeWbufs(false) {}// All "free" events for this mark/sweep cycle have// now happened, so we can make this profile cycle// available immediately.mProf_NextCycle()mProf_Flush()return}// 喚醒后臺清掃任務// Background sweep.lock(&sweep.lock)if sweep.parked {sweep.parked = falseready(sweep.g, 0, true)}unlock(&sweep.lock) }后臺清掃任務的函數是bgsweep:
func bgsweep(c chan int) {sweep.g = getg()// 等待喚醒lock(&sweep.lock)sweep.parked = truec <- 1goparkunlock(&sweep.lock, "GC sweep wait", traceEvGoBlock, 1)// 循環清掃for {// 清掃一個span, 然后進入調度(一次只做少量工作)for gosweepone() != ^uintptr(0) {sweep.nbgsweep++Gosched()}// 釋放一些未使用的標記隊列緩沖區到heapfor freeSomeWbufs(true) {Gosched()}// 如果清掃未完成則繼續循環lock(&sweep.lock)if !gosweepdone() {// This can happen if a GC runs between// gosweepone returning ^0 above// and the lock being acquired.unlock(&sweep.lock)continue}// 否則讓后臺清掃任務進入休眠, 當前M繼續調度sweep.parked = truegoparkunlock(&sweep.lock, "GC sweep wait", traceEvGoBlock, 1)} }gosweepone函數會從sweepSpans中取出單個span清掃:
//go:nowritebarrier
func gosweepone() uintptr {
    var ret uintptr
    // switch to g0 to run
    systemstack(func() {
        ret = sweepone()
    })
    return ret
}

The sweepone function:
// sweeps one span // returns number of pages returned to heap, or ^uintptr(0) if there is nothing to sweep //go:nowritebarrier func sweepone() uintptr {_g_ := getg()sweepRatio := mheap_.sweepPagesPerByte // For debugging// 禁止G被搶占// increment locks to ensure that the goroutine is not preempted// in the middle of sweep thus leaving the span in an inconsistent state for next GC_g_.m.locks++// 檢查是否已完成清掃if atomic.Load(&mheap_.sweepdone) != 0 {_g_.m.locks--return ^uintptr(0)}// 更新同時執行sweep的任務數量atomic.Xadd(&mheap_.sweepers, +1)npages := ^uintptr(0)sg := mheap_.sweepgenfor {// 從sweepSpans中取出一個spans := mheap_.sweepSpans[1-sg/2%2].pop()// 全部清掃完畢時跳出循環if s == nil {atomic.Store(&mheap_.sweepdone, 1)break}// 其他M已經在清掃這個span時跳過if s.state != mSpanInUse {// This can happen if direct sweeping already// swept this span, but in that case the sweep// generation should always be up-to-date.if s.sweepgen != sg {print("runtime: bad span s.state=", s.state, " s.sweepgen=", s.sweepgen, " sweepgen=", sg, "\n")throw("non in-use span in unswept list")}continue}// 原子增加span的sweepgen, 失敗表示其他M已經開始清掃這個span, 跳過if s.sweepgen != sg-2 || !atomic.Cas(&s.sweepgen, sg-2, sg-1) {continue}// 清掃這個span, 然后跳出循環npages = s.npagesif !s.sweep(false) {// Span is still in-use, so this returned no// pages to the heap and the span needs to// move to the swept in-use list.npages = 0}break}// 更新同時執行sweep的任務數量// Decrement the number of active sweepers and if this is the// last one print trace information.if atomic.Xadd(&mheap_.sweepers, -1) == 0 && atomic.Load(&mheap_.sweepdone) != 0 {if debug.gcpacertrace > 0 {print("pacer: sweep done at heap size ", memstats.heap_live>>20, "MB; allocated ", (memstats.heap_live-mheap_.sweepHeapLiveBasis)>>20, "MB during sweep; swept ", mheap_.pagesSwept, " pages at ", sweepRatio, " pages/byte\n")}}// 允許G被搶占_g_.m.locks--// 返回清掃的頁數return npages }span的sweep函數用于清掃單個span:
The span's sweep method is used to sweep a single span:

// Sweep frees or collects finalizers for blocks not marked in the mark phase.
// It clears the mark bits in preparation for the next GC round.
// Returns true if the span was returned to heap.
// If preserve=true, don't return it to heap nor relink in MCentral lists;
// caller takes care of it.
//TODO go:nowritebarrier
func (s *mspan) sweep(preserve bool) bool {
	// It's critical that we enter this function with preemption disabled,
	// GC must not start while we are in the middle of this function.
	_g_ := getg()
	if _g_.m.locks == 0 && _g_.m.mallocing == 0 && _g_ != _g_.m.g0 {
		throw("MSpan_Sweep: m is not locked")
	}
	sweepgen := mheap_.sweepgen
	if s.state != mSpanInUse || s.sweepgen != sweepgen-1 {
		print("MSpan_Sweep: state=", s.state, " sweepgen=", s.sweepgen, " mheap.sweepgen=", sweepgen, "\n")
		throw("MSpan_Sweep: bad span state")
	}

	if trace.enabled {
		traceGCSweepSpan(s.npages * _PageSize)
	}

	// Account for the pages swept
	atomic.Xadd64(&mheap_.pagesSwept, int64(s.npages))

	spc := s.spanclass
	size := s.elemsize
	res := false

	c := _g_.m.mcache
	freeToHeap := false

	// The allocBits indicate which unmarked objects don't need to be
	// processed since they were free at the end of the last GC cycle
	// and were not allocated since then.
	// If the allocBits index is >= s.freeindex and the bit
	// is not marked then the object remains unallocated
	// since the last GC.
	// This situation is analogous to being on a freelist.

	// Check the finalizers recorded in specials: if the corresponding object is
	// no longer live, mark it live to keep it from being freed, then move the
	// finalizer to the run queue.
	// Unlink & free special records for any objects we're about to free.
	// Two complications here:
	// 1. An object can have both finalizer and profile special records.
	//    In such case we need to queue finalizer for execution,
	//    mark the object as live and preserve the profile special.
	// 2. A tiny object can have several finalizers setup for different offsets.
	//    If such object is not marked, we need to queue all finalizers at once.
	// Both 1 and 2 are possible at the same time.
	specialp := &s.specials
	special := *specialp
	for special != nil {
		// A finalizer can be set for an inner byte of an object, find object beginning.
		objIndex := uintptr(special.offset) / size
		p := s.base() + objIndex*size
		mbits := s.markBitsForIndex(objIndex)
		if !mbits.isMarked() {
			// This object is not marked and has at least one special record.
			// Pass 1: see if it has at least one finalizer.
			hasFin := false
			endOffset := p - s.base() + size
			for tmp := special; tmp != nil && uintptr(tmp.offset) < endOffset; tmp = tmp.next {
				if tmp.kind == _KindSpecialFinalizer {
					// Stop freeing of object if it has a finalizer.
					mbits.setMarkedNonAtomic()
					hasFin = true
					break
				}
			}
			// Pass 2: queue all finalizers _or_ handle profile record.
			for special != nil && uintptr(special.offset) < endOffset {
				// Find the exact byte for which the special was setup
				// (as opposed to object beginning).
				p := s.base() + uintptr(special.offset)
				if special.kind == _KindSpecialFinalizer || !hasFin {
					// Splice out special record.
					y := special
					special = special.next
					*specialp = special
					freespecial(y, unsafe.Pointer(p), size)
				} else {
					// This is profile record, but the object has finalizers (so kept alive).
					// Keep special record.
					specialp = &special.next
					special = *specialp
				}
			}
		} else {
			// object is still live: keep special record
			specialp = &special.next
			special = *specialp
		}
	}

	// For debugging only
	if debug.allocfreetrace != 0 || raceenabled || msanenabled {
		// Find all newly freed objects. This doesn't have to be
		// efficient; allocfreetrace has massive overhead.
		mbits := s.markBitsForBase()
		abits := s.allocBitsForIndex(0)
		for i := uintptr(0); i < s.nelems; i++ {
			if !mbits.isMarked() && (abits.index < s.freeindex || abits.isMarked()) {
				x := s.base() + i*s.elemsize
				if debug.allocfreetrace != 0 {
					tracefree(unsafe.Pointer(x), size)
				}
				if raceenabled {
					racefree(unsafe.Pointer(x), size)
				}
				if msanenabled {
					msanfree(unsafe.Pointer(x), size)
				}
			}
			mbits.advance()
			abits.advance()
		}
	}

	// Count the number of free objects in this span.
	nalloc := uint16(s.countAlloc())
	if spc.sizeclass() == 0 && nalloc == 0 {
		// If the span's size class is 0 (large object) and the object in it is
		// no longer live, release the span back to the heap
		s.needzero = 1
		freeToHeap = true
	}
	nfreed := s.allocCount - nalloc
	if nalloc > s.allocCount {
		print("runtime: nelems=", s.nelems, " nalloc=", nalloc, " previous allocCount=", s.allocCount, " nfreed=", nfreed, "\n")
		throw("sweep increased allocation count")
	}

	// Record the new allocCount
	s.allocCount = nalloc

	// Check whether the span has no unallocated objects left
	wasempty := s.nextFreeIndex() == s.nelems

	// Reset freeindex so the next allocation searches from 0
	s.freeindex = 0 // reset allocation index to start of span.
	if trace.enabled {
		getg().m.p.ptr().traceReclaimed += uintptr(nfreed) * s.elemsize
	}

	// gcmarkBits becomes the new allocBits, and a fresh, all-zero gcmarkBits is
	// allocated; the next allocation can consult allocBits to find free slots.
	// gcmarkBits becomes the allocBits.
	// get a fresh cleared gcmarkBits in preparation for next GC
	s.allocBits = s.gcmarkBits
	s.gcmarkBits = newMarkBits(s.nelems)

	// Refill the allocCache starting at freeindex
	// Initialize alloc bits cache.
	s.refillAllocCache(0)

	// If the span has no live objects left, advance its sweepgen to the latest
	// value; below, the span is handed to mcentral or returned to mheap.
	// We need to set s.sweepgen = h.sweepgen only when all blocks are swept,
	// because of the potential for a concurrent free/SetFinalizer.
	// But we need to set it before we make the span available for allocation
	// (return it to heap or mcentral), because allocation code assumes that a
	// span is already swept if available for allocation.
	if freeToHeap || nfreed == 0 {
		// The span must be in our exclusive ownership until we update sweepgen,
		// check for potential races.
		if s.state != mSpanInUse || s.sweepgen != sweepgen-1 {
			print("MSpan_Sweep: state=", s.state, " sweepgen=", s.sweepgen, " mheap.sweepgen=", sweepgen, "\n")
			throw("MSpan_Sweep: bad span state after sweep")
		}
		// Serialization point.
		// At this point the mark bits are cleared and allocation ready
		// to go so release the span.
		atomic.Store(&s.sweepgen, sweepgen)
	}

	if nfreed > 0 && spc.sizeclass() != 0 {
		// Hand the span to mcentral; res records whether it was accepted
		c.local_nsmallfree[spc.sizeclass()] += uintptr(nfreed)
		res = mheap_.central[spc].mcentral.freeSpan(s, preserve, wasempty)
		// freeSpan updates sweepgen
		// MCentral_FreeSpan updates sweepgen
	} else if freeToHeap {
		// Release the span back to mheap
		// Free large span to heap
		// NOTE(rsc,dvyukov): The original implementation of efence
		// in CL 22060046 used SysFree instead of SysFault, so that
		// the operating system would eventually give the memory
		// back to us again, so that an efence program could run
		// longer without running out of memory. Unfortunately,
		// calling SysFree here without any kind of adjustment of the
		// heap data structures means that when the memory does
		// come back to us, we have the wrong metadata for it, either in
		// the MSpan structures or in the garbage collection bitmap.
		// Using SysFault here means that the program will run out of
		// memory fairly quickly in efence mode, but at least it won't
		// have mysterious crashes due to confused memory reuse.
		// It should be possible to switch back to SysFree if we also
		// implement and then call some kind of MHeap_DeleteSpan.
		if debug.efence > 0 {
			s.limit = 0 // prevent mlookup from finding this span
			sysFault(unsafe.Pointer(s.base()), size)
		} else {
			mheap_.freeSpan(s, 1)
		}
		c.local_nlargefree++
		c.local_largefree += size
		res = true
	}

	// If the span was neither handed to mcentral nor released to mheap,
	// it is still in use
	if !res {
		// Put the still-in-use span on the "swept" side of sweepSpans.
		// The span has been swept and is still in-use, so put
		// it on the swept in-use list.
		mheap_.sweepSpans[sweepgen/2%2].push(s)
	}
	return res
}
From bgsweep and the allocator code shown earlier, it is clear that the sweep phase works quite lazily: in practice the previous cycle's sweeping may still be unfinished when a new GC has to start, so before each GC cycle begins, the previous cycle's sweep work must be completed first (the Sweep Termination phase).
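To make that laziness concrete, the background sweeper is essentially a loop that drains one span at a time and yields between spans. The following simplified sketch is not the actual bgsweep source; sweepOne and backgroundSweep are illustrative stand-ins for runtime.sweepone and the background sweeper goroutine:

package main

import (
	"fmt"
	"runtime"
)

// sweepOne stands in for runtime.sweepone: it sweeps at most one span and
// reports whether there was anything left to sweep. Here it just counts down.
func sweepOne(remaining *int) bool {
	if *remaining == 0 {
		return false
	}
	*remaining--
	return true
}

// backgroundSweep mimics the shape of the runtime's background sweeper:
// sweep one span, yield the processor, repeat until the heap is fully swept.
func backgroundSweep(remaining *int) {
	for sweepOne(remaining) {
		runtime.Gosched() // one span at a time, giving other goroutines a turn
	}
}

func main() {
	spans := 5
	backgroundSweep(&spans)
	fmt.Println("unswept spans left:", spans) // 0
}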
With that, the whole GC flow has been analysed. Finally, here is the implementation of the write barrier function writebarrierptr:
// NOTE: Really dst *unsafe.Pointer, src unsafe.Pointer,
// but if we do that, Go inserts a write barrier on *dst = src.
//go:nosplit
func writebarrierptr(dst *uintptr, src uintptr) {
	if writeBarrier.cgo {
		cgoCheckWriteBarrier(dst, src)
	}
	if !writeBarrier.needed {
		*dst = src
		return
	}
	if src != 0 && src < minPhysPageSize {
		systemstack(func() {
			print("runtime: writebarrierptr *", dst, " = ", hex(src), "\n")
			throw("bad pointer in write barrier")
		})
	}
	// Mark the pointer
	writebarrierptr_prewrite1(dst, src)
	// Store the pointer into the destination
	*dst = src
}
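For context on when this function is reached: the compiler only instruments pointer stores, and the barrier path is only taken while writeBarrier.needed is set during a GC cycle. The snippet below is a hypothetical example of such a store; the node type and link function are made up for illustration, and the claim that the pointer store goes through the write barrier when a cycle is active reflects the Go version analysed here:

package main

type node struct {
	next *node // pointer field: stores to it are instrumented by the compiler
	val  int   // scalar field: plain store, never needs a barrier
}

//go:noinline
func link(parent, child *node) {
	// While a GC cycle is active, this pointer store is compiled into a
	// write-barrier-aware store instead of a plain memory move.
	parent.next = child
	// A scalar store like this one is never instrumented.
	parent.val = 42
}

func main() {
	p, c := &node{}, &node{}
	link(p, c)
	_ = p.next
}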
The writebarrierptr_prewrite1 function is as follows:

// writebarrierptr_prewrite1 invokes a write barrier for *dst = src
// prior to the write happening.
//
// Write barrier calls must not happen during critical GC and scheduler
// related operations. In particular there are times when the GC assumes
// that the world is stopped but scheduler related code is still being
// executed, dealing with syscalls, dealing with putting gs on runnable
// queues and so forth. This code cannot execute write barriers because
// the GC might drop them on the floor. Stopping the world involves removing
// the p associated with an m. We use the fact that m.p == nil to indicate
// that we are in one these critical section and throw if the write is of
// a pointer to a heap object.
//go:nosplit
func writebarrierptr_prewrite1(dst *uintptr, src uintptr) {
	mp := acquirem()
	if mp.inwb || mp.dying > 0 {
		releasem(mp)
		return
	}
	systemstack(func() {
		if mp.p == 0 && memstats.enablegc && !mp.inwb && inheap(src) {
			throw("writebarrierptr_prewrite1 called with mp.p == nil")
		}
		mp.inwb = true
		gcmarkwb_m(dst, src)
	})
	mp.inwb = false
	releasem(mp)
}
The gcmarkwb_m function is as follows:

func gcmarkwb_m(slot *uintptr, ptr uintptr) {
	if writeBarrier.needed {
		// Note: This turns bad pointer writes into bad
		// pointer reads, which could be confusing. We avoid
		// reading from obviously bad pointers, which should
		// take care of the vast majority of these. We could
		// patch this up in the signal handler, or use XCHG to
		// combine the read and the write. Checking inheap is
		// insufficient since we need to track changes to
		// roots outside the heap.
		//
		// Note: profbuf.go omits a barrier during signal handler
		// profile logging; that's safe only because this deletion barrier exists.
		// If we remove the deletion barrier, we'll have to work out
		// a new way to handle the profile logging.
		if slot1 := uintptr(unsafe.Pointer(slot)); slot1 >= minPhysPageSize {
			if optr := *slot; optr != 0 {
				// Shade the old pointer (the value being overwritten)
				shade(optr)
			}
		}

		// TODO: Make this conditional on the caller's stack color.
		if ptr != 0 && inheap(ptr) {
			// Shade the new pointer
			shade(ptr)
		}
	}
}
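The two shade calls above are the hybrid barrier in miniature: shading the overwritten value optr is the deletion-barrier half, and shading the incoming ptr is the insertion-barrier half. A condensed, purely illustrative version of that logic follows; shadeStub and hybridBarrierWrite are made-up names, not runtime code:

package main

import "fmt"

// shadeStub stands in for runtime.shade: it would mark the object grey.
func shadeStub(p uintptr) {
	if p != 0 {
		fmt.Printf("shade %#x (now grey)\n", p)
	}
}

// hybridBarrierWrite is an illustrative condensation of gcmarkwb_m plus the
// actual store: shade the old value (deletion barrier) and the new value
// (insertion barrier), then perform the write.
func hybridBarrierWrite(slot *uintptr, ptr uintptr) {
	shadeStub(*slot) // old pointer: protects what was reachable before the write
	shadeStub(ptr)   // new pointer: protects the object being installed
	*slot = ptr
}

func main() {
	var slot uintptr = 0x1000
	hybridBarrierWrite(&slot, 0x2000)
	fmt.Printf("slot = %#x\n", slot)
}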
The shade function is as follows:

// Shade the object if it isn't already.
// The object is not nil and known to be in the heap.
// Preemption must be disabled.
//go:nowritebarrier
func shade(b uintptr) {
	if obj, hbits, span, objIndex := heapBitsForObject(b, 0, 0); obj != 0 {
		gcw := &getg().m.p.ptr().gcw
		// Mark the object as live and add it to the mark queue (it becomes grey)
		greyobject(obj, 0, 0, hbits, span, gcw, objIndex)
		// If local mark queues are disabled, flush to the global mark queue
		if gcphase == _GCmarktermination || gcBlackenPromptly {
			// Ps aren't allowed to cache work during mark
			// termination.
			gcw.dispose()
		}
	}
}
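greyobject, which shade calls, does two things: it sets the object's bit in gcmarkBits and pushes the object onto the per-P work buffer so that its children get scanned later. The toy sketch below uses hypothetical types and is not the runtime implementation; it only captures that two-step idea:

package main

import "fmt"

// markQueue is a toy stand-in for the per-P gcWork buffer.
type markQueue struct {
	work []uintptr
}

func (q *markQueue) put(obj uintptr) { q.work = append(q.work, obj) }

// greySketch mirrors the two essential steps of greyobject: flip the object's
// mark bit (so it will not be swept) and enqueue it for scanning (so its
// children get marked later). The marked map is a toy replacement for gcmarkBits.
func greySketch(obj uintptr, marked map[uintptr]bool, q *markQueue) {
	if marked[obj] {
		return // already grey or black, nothing to do
	}
	marked[obj] = true // step 1: set the mark bit
	q.put(obj)         // step 2: queue it; it stays grey until it is scanned
}

func main() {
	marked := map[uintptr]bool{}
	q := &markQueue{}
	greySketch(0x1000, marked, q)
	greySketch(0x1000, marked, q) // second call is a no-op
	fmt.Println(len(q.work), marked[0x1000]) // 1 true
}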
Summary

This completes the walk-through of the sweep path (sweepone and mspan.sweep) and of the write barrier call chain (writebarrierptr → writebarrierptr_prewrite1 → gcmarkwb_m → shade).