【Neural Networks】(11) The lightweight network MobileNetV1: code reproduction and walkthrough, with complete TensorFlow code
Hello everyone. Today I will share how to reproduce the lightweight neural network MobileNetV1 with TensorFlow. To run neural networks in real time on mobile devices (phones) and edge devices (security cameras, autonomous driving), which typically have limited compute, we need models with fewer parameters, less computation, fewer memory accesses, and lower energy consumption.
The figure below compares the forward-pass time breakdown on GPU and CPU: the convolutional layers take up most of the time, and the larger the batch size, the longer they take. Since fully connected layers have largely disappeared from modern networks, lightweight network design is mostly about optimizing the convolutional layers.
The next figure is a scatter plot of computational cost versus accuracy for various networks. We want a network that keeps accuracy high while requiring little computation, i.e., the closer to the upper-left corner the better. The MobileNet family discussed here has roughly 3-4 million parameters, with moderate accuracy.
1. The MobileNetV1 Network
The core of MobileNetV1 is the depthwise separable convolution; let's walk through this idea in detail.
1.1 Standard Convolution
First, in a standard convolution, a multi-channel kernel slides over a multi-channel input image; at each position, the kernel weights are multiplied element-wise with the corresponding input pixels and summed, and the result is written to the corresponding position of the output feature map. The kernel has as many channels as the input. The region the kernel covers as it slides is its receptive field, as shown in the figure below.
For example, with a 5x5x3 input image and one 3x3x3 kernel (no padding, stride 1), the output feature map has shape 3x3x1.
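The arithmetic of this example can be checked with a short pure-Python sketch (no TensorFlow required; the helper name `conv_output_size` is my own, not from the article's code):

```python
def conv_output_size(n, k, stride=1, pad=0):
    # output size along one spatial dimension of a standard convolution
    return (n + 2 * pad - k) // stride + 1

# 5x5x3 input, one 3x3x3 kernel, no padding, stride 1
out_h = conv_output_size(5, 3)   # -> 3, so the output map is 3x3x1
weights = 3 * 3 * 3              # one kernel spans all 3 input channels
print(out_h, weights)            # 3 27
```

With `padding='same'` (as used later in the code) the formula becomes `n // stride` rounded up, which is why stride-1 layers preserve the feature map size.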
1.2 Depthwise Separable Convolution
A depthwise separable convolution consists of a depthwise convolution followed by a pointwise convolution.
(1) Depthwise convolution
Taking the figure below as an example: for a three-channel input, a depthwise convolution uses three kernels, each with a single channel, to convolve the three input channels separately. Each channel produces one feature map with its own kernel, so a three-channel input yields a three-channel output.
For example: a 5x5x3 input image is split along the channel dimension into three 5x5x1 slices, and three 3x3x1 kernels are used, one per channel. Stacking the three results along the channel dimension gives an output of shape 3x3x3.
(2) Pointwise convolution
The pointwise convolution handles cross-channel information (fusing information across channels). It is just a standard convolution whose kernel size is 1x1. As shown in the figure below, for an input feature map of shape 5x5x3 and one 1x1x3 kernel, the kernel weights are multiplied with the corresponding input values and summed at each position, giving an output feature map of shape 5x5x1.
Each depthwise kernel only looks at its own channel and ignores cross-channel information; the 1x1 convolution supplies that. In the figure below, one 1x1 kernel produces one feature map, so n 1x1 kernels produce n feature maps.
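At a single spatial position, the pointwise output is just a dot product across channels, which is how the 1x1 convolution mixes channel information. A tiny sketch with made-up weight values:

```python
# three channel values at one spatial position of a 5x5x3 input
pixel = [0.5, -1.0, 2.0]
# one 1x1x3 kernel: one weight per input channel (illustrative values)
kernel = [0.2, 0.3, 0.1]
# the pointwise output at that position is a cross-channel dot product
out = sum(p * w for p, w in zip(pixel, kernel))
```

Since the kernel is 1x1, this is done independently at every spatial position, so a 5x5x3 input stays 5x5 spatially; n such kernels give a 5x5xn output.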
(3) The overall depthwise separable pipeline
Depthwise convolution first, then pointwise convolution. The depthwise convolution handles spatial information along height and width and ignores cross-channel information; the pointwise convolution handles only cross-channel information and, because its kernel is 1x1, ignores spatial information.
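The point of this factorization is cost. A rough multiply-add count (a sketch; layer sizes below are the 14x14x512 stage of MobileNetV1, and the helper names are my own) shows the saving:

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    # each output pixel needs k*k*c_in multiply-adds per output channel
    return h * w * c_out * k * k * c_in

def separable_conv_macs(h, w, c_in, c_out, k):
    depthwise = h * w * c_in * k * k   # one k x k kernel per input channel
    pointwise = h * w * c_in * c_out   # 1x1 conv mixing the channels
    return depthwise + pointwise

std = standard_conv_macs(14, 14, 512, 512, 3)
sep = separable_conv_macs(14, 14, 512, 512, 3)
print(std / sep)   # roughly 8.8x fewer multiply-adds
```

The ratio works out to 1/c_out + 1/k^2, so with 3x3 kernels the separable version costs a bit more than 1/9 of the standard convolution for wide layers.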
2. Code Reproduction
The network is built from two modules, a standard convolution block and a depthwise separable convolution block. We define these two blocks first so they can be called directly below.
The model involves two hyperparameters. alpha: the width multiplier, which scales the number of filters in every layer. depth_multiplier: passed to DepthwiseConv2D, it sets how many depthwise output channels each input channel produces. (Note: this is not the resolution multiplier ρ from the MobileNet paper, which shrinks the input image instead; the Keras-style signature reuses the name for the depthwise channel multiplier.)
Multiplying every layer's channel count by alpha (rounded) shrinks both the model size and the computation to roughly alpha^2 of the original; it is used to thin the network.
The paper's resolution multiplier, by contrast, multiplies the input resolution (rounded), which is equivalent to scaling every layer's feature map resolution: the model size is unchanged while the computation drops to roughly ρ^2 of the original.
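The alpha^2 claim follows directly from the fact that a convolution's weight count is k*k*c_in*c_out: scaling both c_in and c_out by alpha scales the product by alpha^2. A quick sketch with a hypothetical alpha = 0.5:

```python
alpha = 0.5
channels = [32, 64, 128, 256, 512, 1024]     # MobileNetV1 channel widths
scaled = [int(c * alpha) for c in channels]  # every layer is narrowed
print(scaled)                                # [16, 32, 64, 128, 256, 512]

# a conv layer holds k*k*c_in*c_out weights, so scaling both c_in and
# c_out by alpha shrinks the layer by roughly alpha**2
full = 3 * 3 * 512 * 1024
small = 3 * 3 * int(512 * alpha) * int(1024 * alpha)
print(small / full)                          # 0.25
```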
2.1 Standard Convolution Block
The standard convolution block consists of convolution + batch normalization + activation.
ReLU6 is used as the activation function. The cap matters on mobile, where inference may run in low-precision float16: an unbounded ReLU outputs values from 0 to infinity, which float16 cannot represent precisely, causing accuracy loss. Clipping the output at 6 keeps the values in a range where float16 still has good numerical resolution.
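The function itself is tiny; a pure-Python sketch of what layers.ReLU(6.0) computes element-wise:

```python
def relu6(x):
    # ReLU clipped at 6: zero below 0, identity in [0, 6], capped above 6
    return min(max(0.0, x), 6.0)

print(relu6(-3.0), relu6(4.2), relu6(10.0))   # 0.0 4.2 6.0
```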
#(1) Standard convolution block
def conv_block(input_tensor, filters, alpha, kernel_size=(3,3), strides=(1,1)):
    # the width multiplier alpha scales the number of filters
    filters = int(filters * alpha)
    # convolution + batch normalization + activation
    x = layers.Conv2D(filters, kernel_size,
                      strides=strides,               # stride
                      padding='same',                # zero padding; with stride 1 the feature map size is unchanged
                      use_bias=False)(input_tensor)  # no bias needed before a BN layer
    x = layers.BatchNormalization()(x)               # batch normalization
    x = layers.ReLU(6.0)(x)                          # ReLU6 activation
    return x                                         # result of one standard convolution
The relu6 and relu function curves are shown below.
2.2 Depthwise Separable Convolution Block
If a convolutional layer is followed by a BatchNormalization layer, the bias can be dropped with use_bias=False: BN subtracts the batch mean, so a constant bias would be cancelled anyway and would only waste memory.
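Why the bias is redundant can be seen numerically: adding a constant before batch normalization shifts the batch mean by the same constant, so the mean subtraction removes it. A minimal sketch (per-channel BN over a toy batch, helper name my own):

```python
import statistics

def batchnorm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    # normalize a batch of activations for one channel
    mu = statistics.fmean(xs)
    var = statistics.pvariance(xs)
    return [gamma * (x - mu) / (var + eps) ** 0.5 + beta for x in xs]

plain  = batchnorm([1.0, 2.0, 3.0])
biased = batchnorm([1.0 + 5.0, 2.0 + 5.0, 3.0 + 5.0])  # same data plus a constant bias
# the bias is absorbed by the mean subtraction: the outputs match
```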
#(2) Depthwise separable convolution block
def depthwise_conv_block(input_tensor, point_filters, alpha, depth_multiplier, strides=(1,1)):
    # the width multiplier alpha scales the number of pointwise filters
    point_filters = int(point_filters * alpha)
    # (1) depthwise convolution -- one kernel per input channel, so the
    #     output has as many channels as the input
    x = layers.DepthwiseConv2D(kernel_size=(3,3),                  # 3x3 kernels
                               strides=strides,                    # stride
                               padding='same',                     # with stride 1 the feature map size is unchanged
                               depth_multiplier=depth_multiplier,  # depthwise output channels per input channel
                               use_bias=False)(input_tensor)       # no bias needed before a BN layer
    x = layers.BatchNormalization()(x)  # batch normalization
    x = layers.ReLU(6.0)(x)             # ReLU6 activation
    # (2) pointwise convolution -- a standard 1x1 convolution
    x = layers.Conv2D(point_filters,
                      kernel_size=(1,1),   # 1x1 kernels
                      padding='same',      # feature map size unchanged
                      strides=(1,1),       # stride 1: every pixel is convolved
                      use_bias=False)(x)   # no bias needed before a BN layer
    x = layers.BatchNormalization()(x)  # batch normalization
    x = layers.ReLU(6.0)(x)             # ReLU6 activation
    return x                            # result of the depthwise separable convolution
2.3 Complete Network Architecture Code
Following the network architecture in the paper, we stack the layers one by one. In the paper's table, Conv dw denotes a depthwise convolution and the Conv / s1 after it is the pointwise convolution.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, Model

#(1) Standard convolution block
def conv_block(input_tensor, filters, alpha, kernel_size=(3,3), strides=(1,1)):
    filters = int(filters * alpha)                   # alpha scales the number of filters
    x = layers.Conv2D(filters, kernel_size,
                      strides=strides,
                      padding='same',
                      use_bias=False)(input_tensor)  # no bias before BN
    x = layers.BatchNormalization()(x)
    x = layers.ReLU(6.0)(x)
    return x

#(2) Depthwise separable convolution block
def depthwise_conv_block(input_tensor, point_filters, alpha, depth_multiplier, strides=(1,1)):
    point_filters = int(point_filters * alpha)       # alpha scales the pointwise filters
    # depthwise convolution
    x = layers.DepthwiseConv2D(kernel_size=(3,3),
                               strides=strides,
                               padding='same',
                               depth_multiplier=depth_multiplier,  # channels per input channel
                               use_bias=False)(input_tensor)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU(6.0)(x)
    # pointwise convolution (1x1)
    x = layers.Conv2D(point_filters, kernel_size=(1,1),
                      padding='same', strides=(1,1),
                      use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU(6.0)(x)
    return x

#(3) Backbone network
def MobileNet(classes, input_shape, alpha, depth_multiplier, dropout_rate):
    inputs = layers.Input(shape=input_shape)  # [224,224,3]
    # [224,224,3] ==> [112,112,32]: stride 2 halves height/width, raises channels
    x = conv_block(inputs, 32, alpha, strides=(2,2))
    # [112,112,32] ==> [112,112,64]
    x = depthwise_conv_block(x, 64, alpha, depth_multiplier)
    # [112,112,64] ==> [56,56,128]: stride 2 shrinks the feature map
    x = depthwise_conv_block(x, 128, alpha, depth_multiplier, strides=(2,2))
    # [56,56,128] ==> [56,56,128]
    x = depthwise_conv_block(x, 128, alpha, depth_multiplier)
    # [56,56,128] ==> [28,28,256]
    x = depthwise_conv_block(x, 256, alpha, depth_multiplier, strides=(2,2))
    # [28,28,256] ==> [28,28,256]
    x = depthwise_conv_block(x, 256, alpha, depth_multiplier)
    # [28,28,256] ==> [14,14,512]
    x = depthwise_conv_block(x, 512, alpha, depth_multiplier, strides=(2,2))
    # [14,14,512] ==> [14,14,512], repeated five times (as in the paper)
    x = depthwise_conv_block(x, 512, alpha, depth_multiplier)
    x = depthwise_conv_block(x, 512, alpha, depth_multiplier)
    x = depthwise_conv_block(x, 512, alpha, depth_multiplier)
    x = depthwise_conv_block(x, 512, alpha, depth_multiplier)
    x = depthwise_conv_block(x, 512, alpha, depth_multiplier)
    # [14,14,512] ==> [7,7,1024]
    x = depthwise_conv_block(x, 1024, alpha, depth_multiplier, strides=(2,2))
    # [7,7,1024] ==> [7,7,1024]
    x = depthwise_conv_block(x, 1024, alpha, depth_multiplier)
    # [7,7,1024] ==> [1024]: average over the spatial dimensions
    x = layers.GlobalAveragePooling2D()(x)
    # reshape to [1,1,1024*alpha] so a 1x1 conv can act as the classifier
    shape = (1, 1, int(1024 * alpha))
    x = layers.Reshape(target_shape=shape)(x)
    # dropout randomly zeroes activations to reduce overfitting
    x = layers.Dropout(rate=dropout_rate)(x)
    # 1x1 convolution maps the features to the class scores
    x = layers.Conv2D(classes, kernel_size=(1,1), padding='same')(x)
    # softmax turns the scores into class probabilities
    x = layers.Activation('softmax')(x)
    # flatten to [classes]
    x = layers.Reshape(target_shape=(classes,))(x)
    # build and return the model
    model = Model(inputs, x)
    return model

if __name__ == '__main__':
    model = MobileNet(classes=1000,              # number of classes
                      input_shape=[224,224,3],   # input image shape
                      alpha=1.0,                 # width multiplier
                      depth_multiplier=1,        # depthwise channel multiplier
                      dropout_rate=1e-3)         # dropout probability
    # inspect the network structure
    model.summary()
The printed model summary is shown below. The model has about 4.25 million parameters, which is very lightweight compared with the roughly 138 million of VGG16.
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 224, 224, 3)] 0
_________________________________________________________________
conv2d (Conv2D) (None, 112, 112, 32) 864
_________________________________________________________________
batch_normalization (BatchNo (None, 112, 112, 32) 128
_________________________________________________________________
re_lu (ReLU) (None, 112, 112, 32) 0
_________________________________________________________________
depthwise_conv2d (DepthwiseC (None, 112, 112, 32) 288
_________________________________________________________________
batch_normalization_1 (Batch (None, 112, 112, 32) 128
_________________________________________________________________
re_lu_1 (ReLU) (None, 112, 112, 32) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 112, 112, 64) 2048
_________________________________________________________________
batch_normalization_2 (Batch (None, 112, 112, 64) 256
_________________________________________________________________
re_lu_2 (ReLU) (None, 112, 112, 64) 0
_________________________________________________________________
depthwise_conv2d_1 (Depthwis (None, 56, 56, 64) 576
_________________________________________________________________
batch_normalization_3 (Batch (None, 56, 56, 64) 256
_________________________________________________________________
re_lu_3 (ReLU) (None, 56, 56, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 56, 56, 128) 8192
_________________________________________________________________
batch_normalization_4 (Batch (None, 56, 56, 128) 512
_________________________________________________________________
re_lu_4 (ReLU) (None, 56, 56, 128) 0
_________________________________________________________________
depthwise_conv2d_2 (Depthwis (None, 56, 56, 128) 1152
_________________________________________________________________
batch_normalization_5 (Batch (None, 56, 56, 128) 512
_________________________________________________________________
re_lu_5 (ReLU) (None, 56, 56, 128) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 56, 56, 128) 16384
_________________________________________________________________
batch_normalization_6 (Batch (None, 56, 56, 128) 512
_________________________________________________________________
re_lu_6 (ReLU) (None, 56, 56, 128) 0
_________________________________________________________________
depthwise_conv2d_3 (Depthwis (None, 28, 28, 128) 1152
_________________________________________________________________
batch_normalization_7 (Batch (None, 28, 28, 128) 512
_________________________________________________________________
re_lu_7 (ReLU) (None, 28, 28, 128) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 28, 28, 256) 32768
_________________________________________________________________
batch_normalization_8 (Batch (None, 28, 28, 256) 1024
_________________________________________________________________
re_lu_8 (ReLU) (None, 28, 28, 256) 0
_________________________________________________________________
depthwise_conv2d_4 (Depthwis (None, 28, 28, 256) 2304
_________________________________________________________________
batch_normalization_9 (Batch (None, 28, 28, 256) 1024
_________________________________________________________________
re_lu_9 (ReLU) (None, 28, 28, 256) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 28, 28, 256) 65536
_________________________________________________________________
batch_normalization_10 (Batc (None, 28, 28, 256) 1024
_________________________________________________________________
re_lu_10 (ReLU) (None, 28, 28, 256) 0
_________________________________________________________________
depthwise_conv2d_5 (Depthwis (None, 14, 14, 256) 2304
_________________________________________________________________
batch_normalization_11 (Batc (None, 14, 14, 256) 1024
_________________________________________________________________
re_lu_11 (ReLU) (None, 14, 14, 256) 0
_________________________________________________________________
conv2d_6 (Conv2D) (None, 14, 14, 512) 131072
_________________________________________________________________
batch_normalization_12 (Batc (None, 14, 14, 512) 2048
_________________________________________________________________
re_lu_12 (ReLU) (None, 14, 14, 512) 0
_________________________________________________________________
depthwise_conv2d_6 (Depthwis (None, 14, 14, 512) 4608
_________________________________________________________________
batch_normalization_13 (Batc (None, 14, 14, 512) 2048
_________________________________________________________________
re_lu_13 (ReLU) (None, 14, 14, 512) 0
_________________________________________________________________
conv2d_7 (Conv2D) (None, 14, 14, 512) 262144
_________________________________________________________________
batch_normalization_14 (Batc (None, 14, 14, 512) 2048
_________________________________________________________________
re_lu_14 (ReLU) (None, 14, 14, 512) 0
_________________________________________________________________
depthwise_conv2d_7 (Depthwis (None, 14, 14, 512) 4608
_________________________________________________________________
batch_normalization_15 (Batc (None, 14, 14, 512) 2048
_________________________________________________________________
re_lu_15 (ReLU) (None, 14, 14, 512) 0
_________________________________________________________________
conv2d_8 (Conv2D) (None, 14, 14, 512) 262144
_________________________________________________________________
batch_normalization_16 (Batc (None, 14, 14, 512) 2048
_________________________________________________________________
re_lu_16 (ReLU) (None, 14, 14, 512) 0
_________________________________________________________________
depthwise_conv2d_8 (Depthwis (None, 14, 14, 512) 4608
_________________________________________________________________
batch_normalization_17 (Batc (None, 14, 14, 512) 2048
_________________________________________________________________
re_lu_17 (ReLU) (None, 14, 14, 512) 0
_________________________________________________________________
conv2d_9 (Conv2D) (None, 14, 14, 512) 262144
_________________________________________________________________
batch_normalization_18 (Batc (None, 14, 14, 512) 2048
_________________________________________________________________
re_lu_18 (ReLU) (None, 14, 14, 512) 0
_________________________________________________________________
depthwise_conv2d_9 (Depthwis (None, 14, 14, 512) 4608
_________________________________________________________________
batch_normalization_19 (Batc (None, 14, 14, 512) 2048
_________________________________________________________________
re_lu_19 (ReLU) (None, 14, 14, 512) 0
_________________________________________________________________
conv2d_10 (Conv2D) (None, 14, 14, 512) 262144
_________________________________________________________________
batch_normalization_20 (Batc (None, 14, 14, 512) 2048
_________________________________________________________________
re_lu_20 (ReLU) (None, 14, 14, 512) 0
_________________________________________________________________
depthwise_conv2d_10 (Depthwi (None, 14, 14, 512) 4608
_________________________________________________________________
batch_normalization_21 (Batc (None, 14, 14, 512) 2048
_________________________________________________________________
re_lu_21 (ReLU) (None, 14, 14, 512) 0
_________________________________________________________________
conv2d_11 (Conv2D) (None, 14, 14, 512) 262144
_________________________________________________________________
batch_normalization_22 (Batc (None, 14, 14, 512) 2048
_________________________________________________________________
re_lu_22 (ReLU) (None, 14, 14, 512) 0
_________________________________________________________________
depthwise_conv2d_11 (Depthwi (None, 7, 7, 512) 4608
_________________________________________________________________
batch_normalization_23 (Batc (None, 7, 7, 512) 2048
_________________________________________________________________
re_lu_23 (ReLU) (None, 7, 7, 512) 0
_________________________________________________________________
conv2d_12 (Conv2D) (None, 7, 7, 1024) 524288
_________________________________________________________________
batch_normalization_24 (Batc (None, 7, 7, 1024) 4096
_________________________________________________________________
re_lu_24 (ReLU) (None, 7, 7, 1024) 0
_________________________________________________________________
depthwise_conv2d_12 (Depthwi (None, 7, 7, 1024) 9216
_________________________________________________________________
batch_normalization_25 (Batc (None, 7, 7, 1024) 4096
_________________________________________________________________
re_lu_25 (ReLU) (None, 7, 7, 1024) 0
_________________________________________________________________
conv2d_13 (Conv2D) (None, 7, 7, 1024) 1048576
_________________________________________________________________
batch_normalization_26 (Batc (None, 7, 7, 1024) 4096
_________________________________________________________________
re_lu_26 (ReLU) (None, 7, 7, 1024) 0
_________________________________________________________________
global_average_pooling2d (Gl (None, 1024) 0
_________________________________________________________________
reshape (Reshape) (None, 1, 1, 1024) 0
_________________________________________________________________
dropout (Dropout) (None, 1, 1, 1024) 0
_________________________________________________________________
conv2d_14 (Conv2D) (None, 1, 1, 1000) 1025000
_________________________________________________________________
activation (Activation) (None, 1, 1, 1000) 0
_________________________________________________________________
reshape_1 (Reshape) (None, 1000) 0
=================================================================
Total params: 4,253,864
Trainable params: 4,231,976
Non-trainable params: 21,888
_________________________________________________________________
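The per-layer parameter counts in the summary can be reproduced by hand; for example, for the first few layers and the classifier:

```python
# first standard conv: 3x3 kernels, 3 input channels, 32 filters, no bias
print(3 * 3 * 3 * 32)             # 864
# its BatchNormalization: gamma, beta, moving mean, moving variance per channel
print(4 * 32)                     # 128
# first depthwise conv: one 3x3 kernel per input channel (32 channels)
print(3 * 3 * 32)                 # 288
# first pointwise conv: 1x1 kernels, 32 -> 64 channels, no bias
print(1 * 1 * 32 * 64)            # 2048
# final classifier conv: 1x1, 1024 -> 1000, with bias
print(1 * 1 * 1024 * 1000 + 1000) # 1025000
```

Note also that the non-trainable 21,888 parameters are exactly the moving means and variances of the BN layers (half of each BN layer's 4-per-channel count).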