Caffe MNIST Handwritten Digit Recognition (Complete Workflow)
Contents
1. Download the MNIST dataset
2. Generate MNIST training, validation, and test image sets
3. Create the LMDB database files
4. Prepare the LeNet-5 network definition file (.prototxt)
5. Prepare the solver configuration file (_solver.prototxt)
6. Start training and generate log files
7. Plot training logs (visualize training data) with plot_training_log.py
7.2 Parse logs into text files (fields for plotting) with parse_log.py
8. Model testing and evaluation (for selecting a better model)
8.1 Test model accuracy
8.2 Evaluate model performance
9. Handwritten digit recognition (model deployment)
10. Retraining with data augmentation
10.1 How the training data is augmented
10.2 Resuming training from the previous training state
10.3 Augmenting data on the fly during training: third-party real-time augmentation Caffe layers
11. caffe-augmentation
Realtime data augmentation
How to use
Environment:
OS: Ubuntu 18.04 LTS
A working Caffe installation
1. Download the MNIST dataset
Here we use the MNIST dataset packaged by Bengio's group.
(Aside: Yoshua Bengio — the AI researcher who fought alone, and his utopian ideals.)
In a terminal, run:
wget http://deeplearning.net/data/mnist/mnist.pkl.gz
This downloads the compressed file mnist.pkl.gz into the current directory.
2. Generate MNIST training, validation, and test image sets
The mnist.pkl.gz archive contains the MNIST training set (train), validation set (validate), and test set (test), exported with pickle and compressed with gzip, so it can be read as a regular file with Python's gzip module. Each dataset is a tuple. The first element holds the handwritten digit images: each image is a flattened 28*28 = 784 one-dimensional float numpy array, a single-channel grayscale image normalized so that the maximum value 1 is white and the minimum value 0 is black. The second element holds the corresponding digit labels: a one-dimensional integer numpy array whose entries match the images by index.
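Before converting, a quick way to confirm this layout (a minimal sketch of mine, not from the original tutorial; note the archive was pickled under Python 2, so Python 3 needs encoding='latin1'):

import gzip
import pickle

# Each of the three datasets is an (images, labels) tuple
with gzip.open('mnist.pkl.gz', 'rb') as f:
    train_set, valid_set, test_set = pickle.load(f, encoding='latin1')

images, labels = train_set
print(images.shape, images.dtype)  # expected: (50000, 784) float32
print(images.min(), images.max())  # expected: 0.0 1.0 (black .. white)
print(labels.shape, labels[:10])   # labels align with images by index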
Knowing this data layout, we can convert the data to image files with a Python script.
Run the convert_mnist.py script below. It creates an mnist folder in the current directory, with three subfolders train, val, and test holding the images of the corresponding datasets.
train contains 50,000 images; val and test contain 10,000 images each.
import os
import gzip
import pickle
from matplotlib import pyplot

# Load the dataset from the compressed archive
print('Loading data from mnist.pkl.gz ...')
with gzip.open('mnist.pkl.gz', 'rb') as f:
    train_set, valid_set, test_set = pickle.load(f, encoding='latin1')  # latin1: Python 2 pickle

# Create the mnist folder under the current path
imgs_dir = 'mnist'
os.system('mkdir -p {}'.format(imgs_dir))

# datasets is a dict of dataname -> dataset pairs
datasets = {'train': train_set, 'val': valid_set, 'test': test_set}

for dataname, dataset in datasets.items():
    print('Converting {} dataset ...'.format(dataname))
    data_dir = os.sep.join([imgs_dir, dataname])  # build the subfolder path
    os.system('mkdir -p {}'.format(data_dir))     # create the subfolder

    # i is the sample index; zip() pairs each image with its label
    for i, (img, label) in enumerate(zip(*dataset)):
        filename = '{:0>6d}_{}.jpg'.format(i, label)
        filepath = os.sep.join([data_dir, filename])
        img = img.reshape((28, 28))  # restore the 1-D vector to a 2-D image
        # pyplot.imsave automatically rescales to a [0, 255] grayscale image
        pyplot.imsave(filepath, img, cmap='gray')
        if (i + 1) % 10000 == 0:
            print('{} images converted!'.format(i + 1))

File naming convention: the first field is a 6-digit image index, followed by an underscore and the image's label, then .jpg.
3. Create the LMDB database files
We use the convert_imageset tool provided by Caffe.
First look at the command's help output by running the following in a terminal:
/home/yang/caffe/build/tools/convert_imageset -help
This prints the usage; the key parts are:
convert_imageset: Convert a set of images to the leveldb/lmdb format used as input for Caffe.
Usage:
    convert_imageset [FLAGS] ROOTFOLDER/ LISTFILE DB_NAME
The ImageNet dataset for the training demo is at
    http://www.image-net.org/download-images
...
Flags from tools/convert_imageset.cpp:
    -backend (The backend {lmdb, leveldb} for storing the result) type: string default: "lmdb"
    -check_size (When this option is on, check that all the datum have the same size) type: bool default: false
    -encode_type (Optional: What type should we encode the image as ('png','jpg',...).) type: string default: ""
    -encoded (When this option is on, the encoded image will be save in datum) type: bool default: false
    -gray (When this option is on, treat images as grayscale ones) type: bool default: false
    -resize_height (Height images are resized to) type: int32 default: 0
    -resize_width (Width images are resized to) type: int32 default: 0
    -shuffle (Randomly shuffle the order of images and their labels) type: bool default: false

The command needs a .txt list file of image paths and labels; each line contains one image's full path, a space, and its label.
For example, part of train.txt looks like this:
mnist/train/033247_5.jpg 5
mnist/train/025404_9.jpg 9
mnist/train/026385_8.jpg 8
mnist/train/013058_5.jpg 5
mnist/train/006524_5.jpg 5
...
We need to generate three list files — train.txt, val.txt, and test.txt — containing the paths and labels of all images in the train, val, and test folders produced in step 2.
Run the following three commands to generate them:
python gen_caffe_imglist.py mnist/train train.txt
python gen_caffe_imglist.py mnist/val val.txt
python gen_caffe_imglist.py mnist/test test.txt

The gen_caffe_imglist.py script is shown below. Its first argument is the (relative) path to an image folder; the second is the output .txt file name (path).
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Mon Dec 3 18:35:55 2018

@author: yang
"""
import os
import sys

input_path = sys.argv[1].rstrip(os.sep)
output_path = sys.argv[2]

filenames = os.listdir(input_path)
with open(output_path, 'w') as f:
    for filename in filenames:
        filepath = os.sep.join([input_path, filename])
        # The label is the part of the filename between '_' and the extension
        label = filename[:filename.rfind('.')].split('_')[1]
        line = '{} {}\n'.format(filepath, label)
        f.write(line)

This produces the image-list-with-labels files for all three datasets. Now Caffe's convert_imageset command can do the conversion:
/home/yang/caffe/build/tools/convert_imageset ./ train.txt train_lmdb --gray --shuffle
/home/yang/caffe/build/tools/convert_imageset ./ val.txt val_lmdb --gray --shuffle
/home/yang/caffe/build/tools/convert_imageset ./ test.txt test_lmdb --gray --shuffle

This creates three LMDB folders in the current directory: train_lmdb, val_lmdb, and test_lmdb.
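To sanity-check the conversion, you can read one record back from the new database (a small sketch of mine, assuming pycaffe and the lmdb Python package are installed):

import lmdb
import caffe
from caffe.proto import caffe_pb2

env = lmdb.open('train_lmdb', readonly=True)
with env.begin() as txn:
    for key, value in txn.cursor():
        datum = caffe_pb2.Datum()
        datum.ParseFromString(value)
        img = caffe.io.datum_to_array(datum)  # shape: (channels, height, width)
        print(key, 'label:', datum.label, 'shape:', img.shape)
        break  # one record is enough for a sanity check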
4. Prepare the LeNet-5 network definition file (.prototxt)
lenet_train_val.prototxt is the file used for training; its content differs slightly from the deployment file lenet.prototxt.
lenet_train_val.prototxt is shown below (remember to adjust the input lmdb paths in the data layers at the top):
name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_value: 128
    scale: 0.00390625
  }
  data_param {
    source: "train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_value: 128
    scale: 0.00390625
  }
  data_param {
    source: "val_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 500
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 10
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

5. Prepare the solver configuration file (_solver.prototxt)
lenet_solver.prototxt is shown below.
Adjust the path to the net definition file (here a relative path, just the file name): net: "lenet_train_val.prototxt"
Also adjust the prefix for the intermediate training snapshots: snapshot_prefix: "snapshot". First create a snapshot folder in the current directory.
# The train/validate net protocol buffer definition
net: "lenet_train_val.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 36000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "snapshot"
# solver mode: CPU or GPU
solver_mode: GPU
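With lr_policy "inv", Caffe decays the learning rate as base_lr * (1 + gamma * iter)^(-power). A small sketch of mine (not part of the original post) to preview the schedule defined above; compare its output with the LearningRate column of the parsed logs in section 7.2:

base_lr, gamma, power = 0.01, 0.0001, 0.75

def inv_lr(it):
    # Caffe's "inv" policy: base_lr * (1 + gamma * iter) ^ (-power)
    return base_lr * (1 + gamma * it) ** (-power)

for it in (0, 500, 5000, 20000, 36000):
    print('iter {:>6d}: lr = {:.6f}'.format(it, inv_lr(it)))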
6. Start training and generate log files
First create a folder named trainLog in the current directory to store the logs.
Then run the command below.
The train command takes the solver configuration file, lenet_solver.prototxt;
other options can be listed with -help.
/home/yang/caffe/build/tools/caffe train -solver lenet_solver.prototxt -gpu 0 -log_dir ./trainLog

When training finishes, the snapshot folder contains solver state files (*.solverstate) and network weight files (*.caffemodel) saved at different iteration counts.
7. Plot training logs (visualize training data) with plot_training_log.py
Caffe ships with a tool for visualizing logs:
python /home/yang/caffe/tools/extra/plot_training_log.py
Run the following command to see its help:
python /home/yang/caffe/tools/extra/plot_training_log.py -help

This script mainly serves as the basis of your customizations.
Customization is a must.
You can copy, paste, edit them in whatever way you want.
Be warned that the fields in the training log may change in the future.
You had better check the data files and change the mapping from field name to
field index in create_field_index before designing your own plots.

Usage:
    ./plot_training_log.py chart_type[0-7] /where/to/save.png /path/to/first.log ...
Notes:
    1. Supporting multiple logs.
    2. Log file name must end with the lower-cased ".log".
Supported chart types:
    0: Test accuracy vs. Iters
    1: Test accuracy vs. Seconds
    2: Test loss vs. Iters
    3: Test loss vs. Seconds
    4: Train learning rate vs. Iters
    5: Train learning rate vs. Seconds
    6: Train loss vs. Iters
    7: Train loss vs. Seconds

The tool ships as the file plot_training_log.py.example under /home/yang/caffe/tools/extra.
Copy that file, rename it plot_training_log.py, and use it for plotting.
Its arguments are: the chart type to draw, the output image path/file name, and the path(s) to the training log file(s).
Eight chart types are supported:
0: test accuracy vs. iterations
1: test accuracy vs. training time (seconds)
2: test loss vs. iterations
3: test loss vs. training time
4: learning rate vs. iterations
5: learning rate vs. training time
6: training loss vs. iterations
7: training loss vs. training time
Run the following commands in a terminal to generate all eight charts:
python /home/yang/caffe/tools/extra/plot_training_log.py 0 test_acc_vs_iters.png caffeLeNetTrain20181203.log
python /home/yang/caffe/tools/extra/plot_training_log.py 1 test_acc_vs_time.png caffeLeNetTrain20181203.log
python /home/yang/caffe/tools/extra/plot_training_log.py 2 test_loss_vs_iters.png caffeLeNetTrain20181203.log
python /home/yang/caffe/tools/extra/plot_training_log.py 3 test_loss_vs_time.png caffeLeNetTrain20181203.log
python /home/yang/caffe/tools/extra/plot_training_log.py 4 lr_vs_iters.png caffeLeNetTrain20181203.log
python /home/yang/caffe/tools/extra/plot_training_log.py 5 lr_vs_time.png caffeLeNetTrain20181203.log
python /home/yang/caffe/tools/extra/plot_training_log.py 6 train_loss_vs_iters.png caffeLeNetTrain20181203.log
python /home/yang/caffe/tools/extra/plot_training_log.py 7 train_loss_vs_time.png caffeLeNetTrain20181203.log

Below is the accuracy vs. iterations chart produced for this training run:
7.2 Parse logs into text files (fields for plotting) with parse_log.py
python /home/yang/caffe/tools/extra/parse_log.py -h
usage: parse_log.py [-h] [--verbose] [--delimiter DELIMITER]
                    logfile_path output_dir

Parse a Caffe training log into two CSV files containing training and testing
information

positional arguments:
    logfile_path          Path to log file
    output_dir            Directory in which to place output CSV files

optional arguments:
    -h, --help            show this help message and exit
    --verbose             Print some extra info (e.g., output filenames)
    --delimiter DELIMITER
                          Column delimiter in output files (default: ',')

parse_log.py thus splits a log file into two text (CSV) files.
For example, entering the following in a terminal:
python ./tools/extra/parse_log.py ./examples/myfile/a.log ./examples/myfile/
produces a.log.train and a.log.test under myfile/; from these two files you can plot whatever you need with matplotlib.
Parsing caffeLeNetTrain20181203.log the same way produces two files in the current folder ./ : caffeLeNetTrain20181203.log.train and caffeLeNetTrain20181203.log.test.
These files contain the fields NumIters, Seconds, LearningRate, accuracy, and loss, from which you can plot any curve you need with matplotlib.
Part of the parsed training file caffeLeNetTrain20181203.log.train:
NumIters,Seconds,LearningRate,loss
0.0,0.155351,0.01,2.33102
100.0,0.482773,0.01,0.167176
200.0,0.806851,0.00992565,0.1556
300.0,1.1295,0.00985258,0.0575197
400.0,1.460222,0.00978075,0.0952922
500.0,1.897946,0.00971013,0.0684174
600.0,2.216532,0.00964069,0.0514046

Part of the parsed test file caffeLeNetTrain20181203.log.test:
NumIters,Seconds,LearningRate,accuracy,loss
0.0,0.129343,0.00971013,0.0919,2.33742
500.0,1.895023,0.00971013,0.976,0.0833776
1000.0,3.602925,0.00937411,0.9794,0.0671232
1500.0,5.299409,0.00906403,0.9853,0.0522081
2000.0,6.99157,0.00877687,0.9856,0.0475213
2500.0,8.691082,0.00851008,0.9859,0.0473052

With Python's pandas and matplotlib you can plot curves for any of these fields:
import pandas as pd
import matplotlib.pyplot as plt
The script below plots the training and validation (test) loss vs. NumIters curves:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Dec 4 10:53:28 2018

@author: yang
"""
import pandas as pd
import matplotlib.pyplot as plt

train_log = pd.read_csv("caffeLeNetTrain20181203.log.train")
test_log = pd.read_csv("caffeLeNetTrain20181203.log.test")

_, ax1 = plt.subplots()
ax1.set_title("train loss and test loss")
ax1.plot(train_log["NumIters"], train_log["loss"], alpha=0.5)
ax1.plot(test_log["NumIters"], test_log["loss"], 'g')
ax1.set_xlabel('iteration')
ax1.set_ylabel('train loss')
plt.legend(loc='upper left')
ax2 = ax1.twinx()
#ax2.plot(test_log["NumIters"], test_log["LearningRate"], 'r')
#ax2.plot(test_log["NumIters"], test_log["LearningRate"], 'm')
#ax2.set_ylabel('test LearningRate')
#plt.legend(loc='upper right')
plt.show()
print('Done.')
8. Model testing and evaluation (for selecting a better model)
8.1 Test model accuracy
After training, the model needs to be tested and evaluated.
In fact, during training the model's accuracy was already evaluated on val_lmdb every 500 iterations;
but MNIST also has a test set besides the validation set, and model selection should be judged on the test set (generalization ability).
Slightly modify the data layers at the head of lenet_train_val.prototxt: delete the TRAIN layer and change the TEST layer's data source to the test_lmdb path:
lenet_test.prototxt:
name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_value: 128
    scale: 0.00390625
  }
  data_param {
    source: "test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
...

Now run Caffe's test command, generating a log file:
/home/yang/caffe/build/tools/caffe test -model lenet_test.prototxt -weights ./snapshot/lenet_solver_iter_5000.caffemodel -gpu 0 -iterations 100 -log_dir ./testLog

Notes on the test command's arguments:
-model specifies the test network definition file, lenet_test.prototxt.
-weights specifies the weights file generated at some iteration, lenet_solver_iter_5000.caffemodel (the best of all saved weight files still has to be picked).
To sweep all 10,000 test images,
we need -iterations * batch_size = 10,000,
where batch_size is the batch size specified in the test data layer of lenet_test.prototxt.
Afterwards, the testLog folder contains two files: caffe.INFO, a record of the terminal output, and
caffe.yang-System-Product-Name.yang.log.INFO.20181204-095911.3660, the test log file.
caffe.INFO mirrors what was printed on screen.
Part of the terminal output (note the final Loss and accuracy):
...
I1204 09:59:16.654608 3660 caffe.cpp:281] Running for 100 iterations.
I1204 09:59:16.670982 3660 caffe.cpp:304] Batch 0, accuracy = 0.98
I1204 09:59:16.671051 3660 caffe.cpp:304] Batch 0, loss = 0.0443168
I1204 09:59:16.672643 3660 caffe.cpp:304] Batch 1, accuracy = 1
I1204 09:59:16.672709 3660 caffe.cpp:304] Batch 1, loss = 0.0175841
I1204 09:59:16.674376 3660 caffe.cpp:304] Batch 2, accuracy = 0.99
I1204 09:59:16.674437 3660 caffe.cpp:304] Batch 2, loss = 0.0308315
...
I1204 09:59:16.795164 3671 data_layer.cpp:73] Restarting data prefetching from start.
I1204 09:59:16.795873 3660 caffe.cpp:304] Batch 97, accuracy = 0.98
I1204 09:59:16.795882 3660 caffe.cpp:304] Batch 97, loss = 0.0427303
I1204 09:59:16.797765 3660 caffe.cpp:304] Batch 98, accuracy = 0.97
I1204 09:59:16.797775 3660 caffe.cpp:304] Batch 98, loss = 0.107767
I1204 09:59:16.798722 3660 caffe.cpp:304] Batch 99, accuracy = 0.99
I1204 09:59:16.798730 3660 caffe.cpp:304] Batch 99, loss = 0.0540964
I1204 09:59:16.798734 3660 caffe.cpp:309] Loss: 0.0391683
I1204 09:59:16.798739 3660 caffe.cpp:321] accuracy = 0.9879
I1204 09:59:16.798746 3660 caffe.cpp:321] loss = 0.0391683 (* 1 = 0.0391683 loss)

The program computes the accuracy of every batch and finally reports an overall accuracy.
When only a few snapshots were saved during training, you can pick a good model by hand, looking for the region where validation loss is low and accuracy is high.
When there are many snapshots, use the test set to pick one: write a script that sweeps all saved models and keeps the one with low loss and high accuracy (a sketch of such a sweep follows below).
Generally, with more data, the lowest-loss model and the highest-accuracy model are more likely to be the same one; when they differ, the lowest-loss model usually generalizes a bit better.
After all, the training, validation, and test sets are all just samples of the true data distribution;
choosing a model on more data simply gives more confidence than choosing on less.
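A minimal sketch of such a sweep (my own illustration, not from the original post; the snapshot path and the patterns used to pull the final metrics out of Caffe's log output are assumptions):

#!/usr/bin/env python3
import glob
import re
import subprocess

CAFFE = '/home/yang/caffe/build/tools/caffe'  # assumed install path

for weights in sorted(glob.glob('snapshot/*.caffemodel')):
    # the caffe tool prints its log, including the final metrics, to stderr
    result = subprocess.run(
        [CAFFE, 'test', '-model', 'lenet_test.prototxt',
         '-weights', weights, '-gpu', '0', '-iterations', '100'],
        stderr=subprocess.PIPE, universal_newlines=True)
    accs = re.findall(r'accuracy = ([0-9.]+)', result.stderr)
    losses = re.findall(r'Loss: ([0-9.]+)', result.stderr)
    # the last match is the overall metric printed after the per-batch lines
    print('{}: accuracy={} loss={}'.format(
        weights,
        accs[-1] if accs else '?',
        losses[-1] if losses else '?'))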
Next, in the training-log folder, use parse_log.py from section 7.2 to parse the training log caffe.yang-System-Product-Name.yang.log.INFO.20181203-184414.8457 into a CSV file for the validation set (TEST phase).
The Python script below plots val_loss vs. iterations and val_accuracy vs. iterations in a single figure.
val_loss_val_accuracy.py:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Dec 4 10:53:28 2018

@author: yang
"""
import pandas as pd
import matplotlib.pyplot as plt

val_log = pd.read_csv("caffeLeNetTrain20181203.log.test")  # validation set

_, ax1 = plt.subplots()
ax1.set_title("val loss and val accuracy")
ax1.plot(val_log["NumIters"], val_log["loss"], 'g')
ax1.set_xlabel('iterations')
ax1.set_ylabel('val loss')
plt.legend(loc='center left')
ax2 = ax1.twinx()
ax2.plot(val_log["NumIters"], val_log["accuracy"], 'm')
ax2.set_ylabel('val accuracy')
plt.legend(loc='center right')
plt.show()
print('Done.')

This produces val_loss_val_accuracy_vs_iterNums1.png.
As iterations increase, validation loss falls and accuracy rises, so the 35,000-iteration model can be chosen for deployment.
Let's test the 35,000-iteration snapshot on the test set and compare it with the 5,000-iteration snapshot above.
In a terminal, enter:
/home/yang/caffe/build/tools/caffe test -model lenet_test.prototxt -weights ./snapshot/lenet_solver_iter_35000.caffemodel -gpu 0 -iterations 100 -log_dir ./testLog

The last three lines of output:
I1204 14:58:12.377961 6560 caffe.cpp:309] Loss: 0.0267361
I1204 14:58:12.377966 6560 caffe.cpp:321] accuracy = 0.9904
I1204 14:58:12.377972 6560 caffe.cpp:321] loss = 0.0267361 (* 1 = 0.0267361 loss)

Both loss and accuracy are indeed better than at 5,000 iterations!
8.2 Evaluate model performance
Evaluating model performance here means time and memory, i.e. the runtime of one forward pass and the memory footprint.
Caffe supports this with its bundled tool:
/home/yang/caffe/build/tools/caffe time
Only the network definition .prototxt is needed.
Copy lenet.prototxt from examples/mnist/ under the Caffe root directory, then run:
/home/yang/caffe/build/tools/caffe time -model lenet.prototxt -gpu 0

Partial output:
yang@yang-System-Product-Name:~/caffe/data/mnist_Bengio$ /home/yang/caffe/build/tools/caffe time -model lenet.prototxt -gpu 0
/home/yang/caffe/build/tools/caffe: /home/yang/anaconda2/lib/libtiff.so.5: no version information available (required by /usr/local/lib/libopencv_imgcodecs.so.3.4)
I1204 15:12:33.080821 6687 caffe.cpp:339] Use GPU with device ID 0
I1204 15:12:33.266084 6687 net.cpp:53] Initializing net from parameters:
...
I1204 15:12:33.266204 6687 layer_factory.hpp:77] Creating layer data
I1204 15:12:33.266217 6687 net.cpp:86] Creating Layer data
I1204 15:12:33.266227 6687 net.cpp:382] data -> data
I1204 15:12:33.275761 6687 net.cpp:124] Setting up data
I1204 15:12:33.275779 6687 net.cpp:131] Top shape: 64 1 28 28 (50176)
I1204 15:12:33.275794 6687 net.cpp:139] Memory required for data: 200704
I1204 15:12:33.275801 6687 layer_factory.hpp:77] Creating layer conv1
I1204 15:12:33.275822 6687 net.cpp:86] Creating Layer conv1
I1204 15:12:33.275828 6687 net.cpp:408] conv1 <- data
I1204 15:12:33.275837 6687 net.cpp:382] conv1 -> conv1
I1204 15:12:33.680294 6687 net.cpp:124] Setting up conv1
I1204 15:12:33.680315 6687 net.cpp:131] Top shape: 64 20 24 24 (737280)
I1204 15:12:33.680322 6687 net.cpp:139] Memory required for data: 3149824
...
I1204 15:12:33.685878 6687 net.cpp:244] This network produces output prob
I1204 15:12:33.685887 6687 net.cpp:257] Network initialization done.
I1204 15:12:33.685910 6687 caffe.cpp:351] Performing Forward
I1204 15:12:33.703292 6687 caffe.cpp:356] Initial loss: 0
I1204 15:12:33.703311 6687 caffe.cpp:357] Performing Backward
I1204 15:12:33.703316 6687 caffe.cpp:365] *** Benchmark begins ***
I1204 15:12:33.703320 6687 caffe.cpp:366] Testing for 50 iterations.
I1204 15:12:33.705480 6687 caffe.cpp:394] Iteration: 1 forward-backward time: 2.14998 ms.
I1204 15:12:33.707129 6687 caffe.cpp:394] Iteration: 2 forward-backward time: 1.63258 ms.
I1204 15:12:33.709730 6687 caffe.cpp:394] Iteration: 3 forward-backward time: 2.58979 ms.
...
I1204 15:12:33.783918 6687 caffe.cpp:397] Average time per layer:
I1204 15:12:33.783921 6687 caffe.cpp:400] data forward: 0.0011584 ms.
I1204 15:12:33.783926 6687 caffe.cpp:403] data backward: 0.00117824 ms.
I1204 15:12:33.783929 6687 caffe.cpp:400] conv1 forward: 0.449037 ms.
I1204 15:12:33.783933 6687 caffe.cpp:403] conv1 backward: 0.251798 ms.
I1204 15:12:33.783936 6687 caffe.cpp:400] pool1 forward: 0.0626419 ms.
I1204 15:12:33.783941 6687 caffe.cpp:403] pool1 backward: 0.00116608 ms.
I1204 15:12:33.783943 6687 caffe.cpp:400] conv2 forward: 0.194311 ms.
I1204 15:12:33.783947 6687 caffe.cpp:403] conv2 backward: 0.190176 ms.
I1204 15:12:33.783965 6687 caffe.cpp:400] pool2 forward: 0.0201024 ms.
I1204 15:12:33.783969 6687 caffe.cpp:403] pool2 backward: 0.00117952 ms.
I1204 15:12:33.783972 6687 caffe.cpp:400] ip1 forward: 0.0706387 ms.
I1204 15:12:33.783977 6687 caffe.cpp:403] ip1 backward: 0.0717856 ms.
I1204 15:12:33.783980 6687 caffe.cpp:400] relu1 forward: 0.00906752 ms.
I1204 15:12:33.783984 6687 caffe.cpp:403] relu1 backward: 0.0011584 ms.
I1204 15:12:33.783988 6687 caffe.cpp:400] ip2 forward: 0.0247597 ms.
I1204 15:12:33.783993 6687 caffe.cpp:403] ip2 backward: 0.0221478 ms.
I1204 15:12:33.783996 6687 caffe.cpp:400] prob forward: 0.0119437 ms.
I1204 15:12:33.784000 6687 caffe.cpp:403] prob backward: 0.00113536 ms.
I1204 15:12:33.784006 6687 caffe.cpp:408] Average Forward pass: 0.938644 ms.
I1204 15:12:33.784010 6687 caffe.cpp:410] Average Backward pass: 0.637078 ms.
I1204 15:12:33.784014 6687 caffe.cpp:412] Average Forward-Backward: 1.61356 ms.
I1204 15:12:33.784021 6687 caffe.cpp:414] Total Time: 80.678 ms.
I1204 15:12:33.784029 6687 caffe.cpp:415] *** Benchmark ends ***

My GPU is an NVIDIA GeForce GTX 960 with 2 GB of memory; one LeNet forward pass averages under 1 ms:
Average Forward pass: 0.938644 ms.
Now measure the CPU runtime by dropping the -gpu 0 argument.
Input:
/home/yang/caffe/build/tools/caffe time -model lenet.prototxt

The tail of the output:
I1204 15:18:10.153908 6768 caffe.cpp:397] Average time per layer:
I1204 15:18:10.153916 6768 caffe.cpp:400] data forward: 0.00064 ms.
I1204 15:18:10.153939 6768 caffe.cpp:403] data backward: 0.0009 ms.
I1204 15:18:10.153951 6768 caffe.cpp:400] conv1 forward: 2.21126 ms.
I1204 15:18:10.153965 6768 caffe.cpp:403] conv1 backward: 3.18376 ms.
I1204 15:18:10.153981 6768 caffe.cpp:400] pool1 forward: 2.59676 ms.
I1204 15:18:10.153996 6768 caffe.cpp:403] pool1 backward: 0.0006 ms.
I1204 15:18:10.154012 6768 caffe.cpp:400] conv2 forward: 6.02428 ms.
I1204 15:18:10.154027 6768 caffe.cpp:403] conv2 backward: 4.72778 ms.
I1204 15:18:10.154043 6768 caffe.cpp:400] pool2 forward: 1.6211 ms.
I1204 15:18:10.154058 6768 caffe.cpp:403] pool2 backward: 0.00072 ms.
I1204 15:18:10.154073 6768 caffe.cpp:400] ip1 forward: 0.3852 ms.
I1204 15:18:10.154086 6768 caffe.cpp:403] ip1 backward: 0.2337 ms.
I1204 15:18:10.154150 6768 caffe.cpp:400] relu1 forward: 0.04076 ms.
I1204 15:18:10.154165 6768 caffe.cpp:403] relu1 backward: 0.0005 ms.
I1204 15:18:10.154181 6768 caffe.cpp:400] ip2 forward: 0.03236 ms.
I1204 15:18:10.154196 6768 caffe.cpp:403] ip2 backward: 0.01712 ms.
I1204 15:18:10.154213 6768 caffe.cpp:400] prob forward: 0.04284 ms.
I1204 15:18:10.154230 6768 caffe.cpp:403] prob backward: 0.02084 ms.
I1204 15:18:10.154249 6768 caffe.cpp:408] Average Forward pass: 12.9634 ms.
I1204 15:18:10.154259 6768 caffe.cpp:410] Average Backward pass: 8.19254 ms.
I1204 15:18:10.154268 6768 caffe.cpp:412] Average Forward-Backward: 21.2 ms.
I1204 15:18:10.154278 6768 caffe.cpp:414] Total Time: 1060 ms.
I1204 15:18:10.154287 6768 caffe.cpp:415] *** Benchmark ends ***

This machine's CPU is an Intel Core i7-6700K @ 4.00 GHz × 8, and Caffe uses OpenBLAS as its basic linear algebra library.
One forward pass averages 12.96 ms — far slower than on the GPU.
9. Handwritten digit recognition (model deployment)
With a trained model we can now recognize handwritten digits. The test here uses the images of the test dataset and the test.txt list generated earlier.
The script recognize_digit.py below performs this task.
recognize_digit.py:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Dec 4 15:29:44 2018

@author: yang
"""
import sys
sys.path.append('/home/yang/caffe/python')
import numpy as np
import cv2
import caffe

MEAN = 128
SCALE = 0.00390625

imglist = sys.argv[1]  # first argument: path to test.txt

caffe.set_mode_gpu()
caffe.set_device(0)
net = caffe.Net('lenet.prototxt',
                './snapshot/lenet_solver_iter_36000.caffemodel',
                caffe.TEST)
net.blobs['data'].reshape(1, 1, 28, 28)

with open(imglist, 'r') as f:
    line = f.readline()
    while line:
        imgpath, label = line.split()
        line = f.readline()
        # Apply the same preprocessing as transform_param used during training
        image = cv2.imread(imgpath, cv2.IMREAD_GRAYSCALE).astype(np.float) - MEAN
        image *= SCALE
        net.blobs['data'].data[...] = image
        output = net.forward()
        pred_label = np.argmax(output['prob'][0])
        print('Predicted digit for {} is {}'.format(imgpath, pred_label))
Run it in a terminal. Because 10,000 images produce a lot of output, redirect the standard output to lenet_model_test.txt.
The test.txt argument is the list of test image paths and labels:
python recognize_digit.py test.txt >& lenet_model_test.txt

Part of the prediction output:
Predicted digit for mnist/test/005120_2.jpg is 2
Predicted digit for mnist/test/006110_1.jpg is 1
Predicted digit for mnist/test/004019_6.jpg is 6
Predicted digit for mnist/test/009045_7.jpg is 7
Predicted digit for mnist/test/004194_4.jpg is 4
Predicted digit for mnist/test/006253_7.jpg is 7
Predicted digit for mnist/test/000188_0.jpg is 0
Predicted digit for mnist/test/001068_8.jpg is 8
Predicted digit for mnist/test/007297_8.jpg is 8
Predicted digit for mnist/test/000003_0.jpg is 0
Predicted digit for mnist/test/009837_7.jpg is 7
Predicted digit for mnist/test/000093_3.jpg is 3
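Since each test filename encodes its ground-truth digit (e.g. 005120_2.jpg is a 2), the overall accuracy can be computed from the redirected output (a small sketch of mine, not part of the original post):

correct = total = 0
with open('lenet_model_test.txt') as f:
    for line in f:
        if not line.startswith('Predicted digit'):
            continue
        # e.g. "Predicted digit for mnist/test/005120_2.jpg is 2"
        parts = line.split()
        path, pred = parts[3], parts[5]
        truth = path.rsplit('_', 1)[1].split('.')[0]  # digit between '_' and '.jpg'
        total += 1
        correct += (pred == truth)
print('accuracy = {:.4f}'.format(correct / total))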
10. Retraining with data augmentation
For ways to augment training data, see: https://github.com/frombeijingwithlove/dlcv_for_beginners/tree/master/chap6/data_augmentation
Because MNIST images are grayscale, we augment the data using only translation and rotation.
10.1 How the training data is augmented
Download run_augmentation.py and image_augmentation.py from the link above into the working directory.
run_augmentation.py:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Dec 4 16:00:27 2018

@author: yang
"""
import os
import argparse
import random
import math
from multiprocessing import Process, cpu_count

import cv2

import image_augmentation as ia

def parse_args():
    parser = argparse.ArgumentParser(
        description='A Simple Image Data Augmentation Tool',
        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('input_dir', help='Directory containing images')
    parser.add_argument('output_dir', help='Directory for augmented images')
    parser.add_argument('num', help='Number of images to be augmented', type=int)
    parser.add_argument('--num_procs', help='Number of processes for paralleled augmentation',
                        type=int, default=cpu_count())
    parser.add_argument('--p_mirror', help='Ratio to mirror an image',
                        type=float, default=0.5)
    parser.add_argument('--p_crop', help='Ratio to randomly crop an image',
                        type=float, default=1.0)
    parser.add_argument('--crop_size', help='The ratio of cropped image size to original image size, in area',
                        type=float, default=0.8)
    parser.add_argument('--crop_hw_vari', help='Variation of h/w ratio',
                        type=float, default=0.1)
    parser.add_argument('--p_rotate', help='Ratio to randomly rotate an image',
                        type=float, default=1.0)
    parser.add_argument('--p_rotate_crop', help='Ratio to crop out the empty part in a rotated image',
                        type=float, default=1.0)
    parser.add_argument('--rotate_angle_vari', help='Variation range of rotate angle',
                        type=float, default=10.0)
    parser.add_argument('--p_hsv', help='Ratio to randomly change gamma of an image',
                        type=float, default=1.0)
    parser.add_argument('--hue_vari', help='Variation of hue', type=int, default=10)
    parser.add_argument('--sat_vari', help='Variation of saturation', type=float, default=0.1)
    parser.add_argument('--val_vari', help='Variation of value', type=float, default=0.1)
    parser.add_argument('--p_gamma', help='Ratio to randomly change gamma of an image',
                        type=float, default=1.0)
    parser.add_argument('--gamma_vari', help='Variation of gamma', type=float, default=2.0)
    args = parser.parse_args()
    args.input_dir = args.input_dir.rstrip('/')
    args.output_dir = args.output_dir.rstrip('/')
    return args

def generate_image_list(args):
    filenames = os.listdir(args.input_dir)
    num_imgs = len(filenames)
    num_ave_aug = int(math.floor(args.num / num_imgs))
    rem = args.num - num_ave_aug * num_imgs
    lucky_seq = [True] * rem + [False] * (num_imgs - rem)
    random.shuffle(lucky_seq)
    img_list = [(os.sep.join([args.input_dir, filename]),
                 num_ave_aug + 1 if lucky else num_ave_aug)
                for filename, lucky in zip(filenames, lucky_seq)]
    random.shuffle(img_list)  # in case the file sizes are not uniformly distributed
    length = float(num_imgs) / float(args.num_procs)
    indices = [int(round(i * length)) for i in range(args.num_procs + 1)]
    return [img_list[indices[i]:indices[i + 1]] for i in range(args.num_procs)]

def augment_images(filelist, args):
    for filepath, n in filelist:
        img = cv2.imread(filepath)
        filename = filepath.split(os.sep)[-1]
        dot_pos = filename.rfind('.')
        imgname = filename[:dot_pos]
        ext = filename[dot_pos:]
        print('Augmenting {} ...'.format(filename))
        for i in range(n):
            img_varied = img.copy()
            varied_imgname = '{}_{:0>3d}_'.format(imgname, i)
            if random.random() < args.p_mirror:
                img_varied = cv2.flip(img_varied, 1)
                varied_imgname += 'm'
            if random.random() < args.p_crop:
                img_varied = ia.random_crop(img_varied, args.crop_size, args.crop_hw_vari)
                varied_imgname += 'c'
            if random.random() < args.p_rotate:
                img_varied = ia.random_rotate(img_varied, args.rotate_angle_vari, args.p_rotate_crop)
                varied_imgname += 'r'
            if random.random() < args.p_hsv:
                img_varied = ia.random_hsv_transform(img_varied, args.hue_vari, args.sat_vari, args.val_vari)
                varied_imgname += 'h'
            if random.random() < args.p_gamma:
                img_varied = ia.random_gamma_transform(img_varied, args.gamma_vari)
                varied_imgname += 'g'
            output_filepath = os.sep.join([args.output_dir, '{}{}'.format(varied_imgname, ext)])
            cv2.imwrite(output_filepath, img_varied)

def main():
    args = parse_args()
    params_str = str(args)[10:-1]
    if not os.path.exists(args.output_dir):
        os.mkdir(args.output_dir)
    print('Starting image data augmentation for {}\n'
          'with\n{}\n'.format(args.input_dir, params_str))
    sublists = generate_image_list(args)
    processes = [Process(target=augment_images, args=(x, args, )) for x in sublists]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print('\nDone!')

if __name__ == '__main__':
    main()

image_augmentation.py:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Dec 4 15:59:29 2018

@author: yang
"""
import numpy as np
import cv2

crop_image = lambda img, x0, y0, w, h: img[y0:y0+h, x0:x0+w]

def random_crop(img, area_ratio, hw_vari):
    h, w = img.shape[:2]
    hw_delta = np.random.uniform(-hw_vari, hw_vari)
    hw_mult = 1 + hw_delta
    w_crop = int(round(w * np.sqrt(area_ratio * hw_mult)))
    if w_crop > w - 2:
        w_crop = w - 2
    h_crop = int(round(h * np.sqrt(area_ratio / hw_mult)))
    if h_crop > h - 2:
        h_crop = h - 2
    x0 = np.random.randint(0, w - w_crop - 1)
    y0 = np.random.randint(0, h - h_crop - 1)
    return crop_image(img, x0, y0, w_crop, h_crop)

def rotate_image(img, angle, crop):
    h, w = img.shape[:2]
    angle %= 360
    M_rotate = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1)
    img_rotated = cv2.warpAffine(img, M_rotate, (w, h))
    if crop:
        angle_crop = angle % 180
        if angle_crop > 90:
            angle_crop = 180 - angle_crop
        theta = angle_crop * np.pi / 180.0
        hw_ratio = float(h) / float(w)
        tan_theta = np.tan(theta)
        numerator = np.cos(theta) + np.sin(theta) * tan_theta
        r = hw_ratio if h > w else 1 / hw_ratio
        denominator = r * tan_theta + 1
        crop_mult = numerator / denominator
        w_crop = int(round(crop_mult * w))
        h_crop = int(round(crop_mult * h))
        x0 = int((w - w_crop) / 2)
        y0 = int((h - h_crop) / 2)
        img_rotated = crop_image(img_rotated, x0, y0, w_crop, h_crop)
    return img_rotated

def random_rotate(img, angle_vari, p_crop):
    angle = np.random.uniform(-angle_vari, angle_vari)
    crop = False if np.random.random() > p_crop else True
    return rotate_image(img, angle, crop)

def hsv_transform(img, hue_delta, sat_mult, val_mult):
    img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.float)
    img_hsv[:, :, 0] = (img_hsv[:, :, 0] + hue_delta) % 180
    img_hsv[:, :, 1] *= sat_mult
    img_hsv[:, :, 2] *= val_mult
    img_hsv[img_hsv > 255] = 255
    return cv2.cvtColor(np.round(img_hsv).astype(np.uint8), cv2.COLOR_HSV2BGR)

def random_hsv_transform(img, hue_vari, sat_vari, val_vari):
    hue_delta = np.random.randint(-hue_vari, hue_vari)
    sat_mult = 1 + np.random.uniform(-sat_vari, sat_vari)
    val_mult = 1 + np.random.uniform(-val_vari, val_vari)
    return hsv_transform(img, hue_delta, sat_mult, val_mult)

def gamma_transform(img, gamma):
    gamma_table = [np.power(x / 255.0, gamma) * 255.0 for x in range(256)]
    gamma_table = np.round(np.array(gamma_table)).astype(np.uint8)
    return cv2.LUT(img, gamma_table)

def random_gamma_transform(img, gamma_vari):
    log_gamma_vari = np.log(gamma_vari)
    alpha = np.random.uniform(-log_gamma_vari, log_gamma_vari)
    gamma = np.exp(alpha)
    return gamma_transform(img, gamma)

With these two scripts we augment the 50,000 images under mnist/train into 250,000 new images, which merged with the original 50,000 gives 300,000 — six times the original dataset size!
Disable every option other than rotation and cropping (which provides the translation), and set the rotation range to ±15 degrees.
Run the following in a terminal:
python run_augmentation.py mnist/train/ mnist/augmented 250000 --rotate_angle=15 --p_mirror=0 --p_hsv=0 --p_gamma=0

This generates 250,000 translated and rotated images under mnist/augmented/, named so that the parsing rule of gen_caffe_imglist.py still applies. Next, generate the path-and-label list for these images:
python gen_caffe_imglist.py mnist/augmented augmented.txt

Then merge the original training list and the new list into train_aug.txt:
cat train.txt augmented.txt > train_aug.txt

Then build a separate lmdb for train_aug.txt.
Because the perturbed images are no longer necessarily 28*28, you must pass --resize_width=28 and --resize_height=28 to fix the lmdb input size at 28*28; also use --shuffle to randomize the input order.
/home/yang/caffe/build/tools/convert_imageset ./ train_aug.txt train_aug_lmdb --resize_width=28 --resize_height=28 --gray --shuffle

Then copy lenet_train_val.prototxt to lenet_train_val_aug.prototxt
and change the training data layer's lmdb source path in lenet_train_val_aug.prototxt to:
    source: "train_aug_lmdb"
Next create two more folders in the working directory: snapshot_aug for snapshots and train_aug_Log for the log files.
Then copy lenet_solver.prototxt to lenet_aug_solver.prototxt, and in lenet_aug_solver.prototxt change
the net parameter to: net: "lenet_train_val_aug.prototxt"
and the snapshot parameter to: snapshot_prefix: "snapshot_aug"
Now training can start, with the logs written to the train_aug_Log folder:
/home/yang/caffe/build/tools/caffe train -solver lenet_aug_solver.prototxt -gpu 0 -log_dir ./train_aug_Log

The last lines of the output:
I1204 17:12:28.571137 5109 solver.cpp:414] Test net output #0: accuracy = 0.9911
I1204 17:12:28.571157 5109 solver.cpp:414] Test net output #1: loss = 0.0319057 (* 1 = 0.0319057 loss)

Then rename the log file in train_aug_Log to mnist_train_with_augmentation.log.
The pre-augmentation training log is mnist_train.log in the trainLog folder.
Now use Caffe's plot_training_log.py tool to draw comparison charts for the two training runs:
python /home/yang/caffe/tools/extra/plot_training_log.py
The command below plots validation accuracy vs. iterations for both runs:
python /home/yang/caffe/tools/extra/plot_training_log.py 0 test_acc_vs_iters.png trainLog/mnist_train.log train_aug_Log/mnist_train_with_augmentation.log

The command below plots validation loss vs. iterations for both runs:
python /home/yang/caffe/tools/extra/plot_training_log.py 2 test_loss_vs_iters.png trainLog/mnist_train.log train_aug_Log/mnist_train_with_augmentation.log

10.2 Resuming training from the previous training state
The original training set had only 50,000 images; with a batch size of 50, 1,000 iterations make one epoch.
The augmented set has 300,000 images (50 * 6,000 = 300,000), so one epoch now takes 6,000 iterations, and the 36,000 iterations run so far is only 6 epochs. (Note: the train data layer in lenet_train_val.prototxt shown earlier used batch_size 64; the arithmetic here assumes a training batch size of 50.)
To continue from the 36,000-iteration state and train for 20 epochs, the maximum iteration count must be 120,000:
120,000 / 6,000 = 20 (epochs)
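As a quick check of this arithmetic (a trivial sketch; the batch size of 50 is the one assumed above):

num_images = 300000                         # augmented training set size
batch_size = 50                             # assumed training batch size
iters_per_epoch = num_images // batch_size  # 6000
print(iters_per_epoch * 20)                 # 120000 iterations for 20 epochs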
In lenet_aug_solver.prototxt, change the maximum number of iterations
from: max_iter: 36000
to: max_iter: 120000
Then run the following command to resume training from the 36,000-iteration state:
/home/yang/caffe/build/tools/caffe train -solver lenet_aug_solver.prototxt -snapshot snapshot_aug/lenet_aug_solver_iter_36000.solverstate -gpu 0 -log_dir ./train_aug_Log

Partial training output:
I1204 18:44:26.388453 6406 solver.cpp:414] Test net output #0: accuracy = 0.9911
I1204 18:44:26.388473 6406 solver.cpp:414] Test net output #1: loss = 0.0305995 (* 1 = 0.0305995 loss)
I1204 18:44:26.388478 6406 solver.cpp:332] Optimization Done.
I1204 18:44:26.388481 6406 caffe.cpp:250] Optimization Done.

Plot accuracy vs. iterations:
python /home/yang/caffe/tools/extra/plot_training_log.py 0 test_acc_vs_iters_120000.png train_aug_Log/mnist_train_augmentation_iter_120000.log

And loss vs. iterations:
python /home/yang/caffe/tools/extra/plot_training_log.py 2 test_loss_vs_iters_120000.png train_aug_Log/mnist_train_augmentation_iter_120000.log

10.3 Augmenting data on the fly during training: third-party real-time augmentation Caffe layers
Note: perturbing the original samples offline like this is only one way to augment data, and not the best one: the amount of extra data is limited, and the copies take extra disk space.
The better approach is to perturb the data in real time during training, which is equivalent to an unlimited supply of random perturbations.
Caffe's data layer already ships with the most basic perturbations: random cropping and mirroring.
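If you want richer on-the-fly perturbations without third-party code, one option is a Python layer placed right after the data layer (a rough sketch of mine, not from the original post; it assumes Caffe was built with WITH_PYTHON_LAYER=1 and that scipy is available):

import caffe
import numpy as np
from scipy.ndimage import rotate

class RandomRotateLayer(caffe.Layer):
    # Randomly rotates each image in the batch. Referenced from a prototxt as
    # type: "Python" with python_param { module: "rotate_layer" layer: "RandomRotateLayer" }

    def setup(self, bottom, top):
        self.max_angle = 15.0  # perturbation range, an assumed value

    def reshape(self, bottom, top):
        top[0].reshape(*bottom[0].data.shape)  # same (N, C, H, W) shape as input

    def forward(self, bottom, top):
        for i in range(bottom[0].data.shape[0]):
            angle = np.random.uniform(-self.max_angle, self.max_angle)
            # rotate in the (H, W) plane, keeping the original size
            top[0].data[i] = rotate(bottom[0].data[i], angle,
                                    axes=(1, 2), reshape=False, mode='nearest')

    def backward(self, top, propagate_down, bottom):
        pass  # sits on the data path, so no gradient is needed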
There are also open-source third-party Caffe layers on GitHub that implement real-time perturbation with many common augmentation modes; search GitHub for "caffe augmentation".
For example:
https://github.com/kevinlin311tw/caffe-augmentation
11. caffe-augmentation
Caffe with real-time data augmentation
Data augmentation is a simple yet effective way to enrich training data. However, we don't want to re-create a dataset (such as ImageNet) with millions of images every time we change our augmentation strategy. To address this problem, this project provides real-time training data augmentation. During training, Caffe will augment training data with random combinations of different geometric transformations (scaling, rotation, cropping), image variations (blur, sharpening, JPEG compression), and lighting adjustments.
Realtime data augmentation
Realtime data augmentation is implemented within the ImageData layer. We provide several augmentations as below:
- Geometric transform: random flipping, cropping, resizing, rotation
- Smooth filtering
- JPEG compression
- Contrast & brightness adjustment
How to use
You could specify your network prototxt as:
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "/home/your/imagenet_mean.binaryproto"
    contrast_adjustment: true
    smooth_filtering: true
    jpeg_compression: true
    rotation_angle_interval: 30
    display: true
  }
  image_data_param {
    source: "/home/your/image/list.txt"
    batch_size: 32
    shuffle: true
    new_height: 256
    new_width: 256
  }
}

You could also find a toy example at /examples/SSDH/train_val.prototxt
Note: ImageData Layer is currently not supported in TEST mode
Reference:
https://github.com/frombeijingwithlove/dlcv_for_beginners