
DL: SSD. An Introduction to the SSD Algorithm (Paper Overview), Architecture Details, and Example Applications, with Figures

Published: 2025/3/21


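Introduction to the SSD algorithm (paper overview)

0. SSD experimental results

1. Architecture figures

2. SSD vs. YOLO

Architecture details of the SSD algorithm

Example applications of the SSD algorithm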


Introduction to the SSD algorithm (paper overview)

SSD stands for Single Shot MultiBox Detector.

Abstract
We present a method for detecting objects in images using a single deep neural network. Our approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape. Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes. SSD is simple relative to methods that require object proposals because it completely eliminates proposal generation and subsequent pixel or feature resampling stages and encapsulates all computation in a single network. This makes SSD easy to train and straightforward to integrate into systems that require a detection component.
Experimental results on the PASCAL VOC, COCO, and ILSVRC datasets confirm that SSD has competitive accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference. For 300 × 300 input, SSD achieves 74.3% mAP on VOC2007 test at 59 FPS on a Nvidia Titan X and for 512 × 512 input, SSD achieves 76.9% mAP, outperforming a comparable state-of-the-art Faster R-CNN model. Compared to other single stage methods, SSD has much better accuracy even with a smaller input image size. Code is available at: https://github.com/weiliu89/caffe/tree/ssd .
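The abstract's idea of "default boxes over different aspect ratios and scales per feature map location" can be illustrated with a minimal sketch. The function name `default_boxes`, the toy feature map size, the scale, and the aspect ratios below are all hypothetical choices made for illustration; they are not the configuration used by the paper's released code. The sketch assumes a square feature map and keeps each box's area fixed at `scale**2` while the aspect ratio changes its shape.

```python
import math


def default_boxes(fmap_size, scale, aspect_ratios):
    """Return (cx, cy, w, h) default boxes in normalized [0, 1] image
    coordinates for every cell of a square fmap_size x fmap_size map.

    Hypothetical helper for illustration only.
    """
    boxes = []
    for row in range(fmap_size):
        for col in range(fmap_size):
            # Center of the cell, normalized by the feature map size.
            cx = (col + 0.5) / fmap_size
            cy = (row + 0.5) / fmap_size
            for ar in aspect_ratios:
                # Width grows and height shrinks with the aspect ratio,
                # so the box area stays at scale**2.
                w = scale * math.sqrt(ar)
                h = scale / math.sqrt(ar)
                boxes.append((cx, cy, w, h))
    return boxes


# Toy example: a 3x3 feature map with three box shapes per location.
boxes = default_boxes(fmap_size=3, scale=0.5, aspect_ratios=[1.0, 2.0, 0.5])
print(len(boxes))   # 3 * 3 locations * 3 aspect ratios
print(boxes[:3])    # the three boxes centered on the top-left cell
```

At prediction time the network would output, for each of these boxes, one score per object category plus four offsets that adjust the box; that part is not shown here.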
Conclusions
This paper introduces SSD, a fast single-shot object detector for multiple categories. A key feature of our model is the use of multi-scale convolutional bounding box outputs attached to multiple feature maps at the top of the network. This representation allows us to efficiently model the space of possible box shapes. We experimentally validate that given appropriate training strategies, a larger number of carefully chosen default bounding boxes results in improved performance. We build SSD models with at least an order of magnitude more box predictions sampling location, scale, and aspect ratio, than existing methods [5,7]. We demonstrate that given the same VGG-16 base architecture, SSD compares favorably to its state-of-the-art object detector counterparts in terms of both accuracy and speed. Our SSD512 model significantly outperforms the state-of-the-art Faster R-CNN [2] in terms of accuracy on PASCAL VOC and COCO, while being 3× faster. Our real time SSD300 model runs at 59 FPS, which is faster than the current real time YOLO [5] alternative, while producing markedly superior detection accuracy.
Apart from its standalone utility, we believe that our monolithic and relatively simple SSD model provides a useful building block for larger systems that employ an object detection component. A promising future direction is to explore its use as part of a system using recurrent neural networks to detect and track objects in video simultaneously.
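To make the "order of magnitude more box predictions" claim concrete, the following sketch counts default boxes. The per-layer layout for SSD300 (feature map side and boxes per location) and the 7 × 7 grid with 2 boxes per cell for YOLO are taken from the respective papers, not from this article, so treat them as assumptions here. The helper names are hypothetical.

```python
# Assumed SSD300 layout: (feature map side, default boxes per location).
# These values come from the SSD paper, not from this article.
SSD300_LAYERS = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]


def count_default_boxes(layers):
    """Total default boxes over all square feature maps."""
    return sum(side * side * per_location for side, per_location in layers)


def count_outputs(layers, num_classes):
    """Each box predicts num_classes scores plus 4 box offsets."""
    return count_default_boxes(layers) * (num_classes + 4)


ssd300_boxes = count_default_boxes(SSD300_LAYERS)
yolo_v1_boxes = 7 * 7 * 2  # assumed YOLO v1 grid: 7x7 cells, 2 boxes each

print(ssd300_boxes)                    # 8732
print(yolo_v1_boxes)                   # 98
print(ssd300_boxes / yolo_v1_boxes)    # ratio of box predictions
# 21 classes = 20 PASCAL VOC categories + background (assumed setup).
print(count_outputs(SSD300_LAYERS, num_classes=21))
```

Under these assumptions SSD300 evaluates nearly two orders of magnitude more candidate boxes per image than YOLO v1, which is consistent with the statement in the conclusions.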

Paper
Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg. SSD: Single Shot MultiBox Detector. ECCV 2016
https://arxiv.org/abs/1512.02325

Paper PDF: https://arxiv.org/pdf/1512.02325v5.pdf

0. SSD experimental results

Training: VOC2007 trainval and VOC2012 trainval (16551 images)
Testing: VOC2007 test (4952 images)

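1. Single-stage vs. two-stage detectors on the VOC2007 dataset

The two models, SSD300 and SSD512, reach 77% mAP at 46 FPS and 80% mAP at 19 FPS, respectively.
Compared with YOLOv1, SSD is better in both speed and accuracy. It also outperforms two-stage models such as Faster R-CNN.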

2. SSD512 model: PASCAL VOC2007 test detection results

Here is the accuracy comparison for different methods. SSD uses an input image size of 300 × 300 or 512 × 512.

The model is trained using SGD with initial learning rate 0.001, momentum 0.9, weight decay 0.0005, and batch size 32.
Using a Nvidia Titan X on VOC2007 test, SSD achieves 59 FPS with 74.3% mAP, vs. Faster R-CNN at 7 FPS with 73.2% mAP or YOLO at 45 FPS with 63.4% mAP.
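The training hyperparameters quoted above can be written down as a small configuration, together with one common formulation of an SGD step with momentum and L2 weight decay. This is a minimal sketch on plain Python lists, not the paper's Caffe solver; the names `SolverConfig` and `sgd_momentum_step` are hypothetical, and the exact update rule used by the original implementation may differ in detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SolverConfig:
    """Hyperparameters as quoted in the text (hypothetical container)."""
    learning_rate: float = 0.001
    momentum: float = 0.9
    weight_decay: float = 0.0005
    batch_size: int = 32  # recorded for completeness; unused in the step below


def sgd_momentum_step(weights, grads, velocity, cfg):
    """One SGD step with momentum and L2 weight decay.

    Assumed update rule (one common variant):
        g <- g + weight_decay * w
        v <- momentum * v - learning_rate * g
        w <- w + v
    """
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + cfg.weight_decay * w
        v = cfg.momentum * v - cfg.learning_rate * g
        new_velocity.append(v)
        new_weights.append(w + v)
    return new_weights, new_velocity


cfg = SolverConfig()
weights, velocity = [1.0, -2.0], [0.0, 0.0]
grads = [0.5, -0.5]  # made-up gradients, for illustration only
weights, velocity = sgd_momentum_step(weights, grads, velocity, cfg)
print(weights, velocity)
```

With a learning rate this small, a single step moves each weight only slightly; the momentum term accumulates these small moves over many mini-batches.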

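Fast R-CNN and Faster R-CNN both use input images whose minimum dimension is 600. The two SSD models have exactly the same settings except for their input sizes (300 × 300 vs. 512 × 512). A larger input size clearly gives better results, and more data always helps.
According to the table, training on the combined 07+12 dataset yields 76.8 mAP, while the 07+12+COCO combination performs best at 81.6 mAP.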

Notes:
Data: "07": VOC2007 trainval
"07+12": union of VOC2007 and VOC2012 trainval
"07+12+COCO": first train on COCO trainval35k, then fine-tune on 07+12

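3. Detection speed (in frames per second)

This is a recap of speed performance in frames per second.
Results on the Pascal VOC2007 test set. SSD300 is the only real-time detection method that achieves above 70% mAP. By using a larger input image, SSD512 outperforms all other methods on accuracy while maintaining close to real-time speed.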
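As a rough check of the speed claim, the FPS and mAP figures quoted earlier in this article for the Titan X comparison (SSD at 59 FPS and 74.3% mAP, Faster R-CNN at 7 FPS and 73.2% mAP, YOLO at 45 FPS and 63.4% mAP) can be converted to per-image latency and filtered. The 30 FPS real-time threshold is an assumption made for this sketch; the article does not state one.

```python
# (FPS, mAP %) on VOC2007 test, as quoted earlier in this article.
RESULTS = {
    "SSD300": (59, 74.3),
    "Faster R-CNN": (7, 73.2),
    "YOLO": (45, 63.4),
}
REAL_TIME_FPS = 30  # assumed threshold, not stated in the article


def latency_ms(fps):
    """Average time per image in milliseconds for a given frame rate."""
    return 1000.0 / fps


for name, (fps, mean_ap) in RESULTS.items():
    is_real_time = fps >= REAL_TIME_FPS
    print(f"{name}: {latency_ms(fps):.1f} ms/image, "
          f"{mean_ap}% mAP, real-time={is_real_time}")

# Methods that are both real-time (under the assumed threshold)
# and above 70% mAP.
fast_and_accurate = [
    name for name, (fps, mean_ap) in RESULTS.items()
    if fps >= REAL_TIME_FPS and mean_ap > 70
]
print(fast_and_accurate)
```

Among these three entries, only SSD300 passes both filters, which matches the statement above.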

4. SSD512 model: detection examples on COCO test-dev

Detection examples on COCO test-dev with SSD512 model

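1. Architecture figures

2. SSD vs. YOLO

Architecture details of the SSD algorithm

To be updated.

See: DL: SSD. Architecture Details of the SSD Algorithm

Example applications of the SSD algorithm

To be updated.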
