
A Complete Example: Using a CNN Model Trained on MNIST to Recognize Handwritten Digit Images (Images Taken from the Web)

Published: 2023/12/20


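1 How to put a trained model to use

How can an MNIST model trained with a CNN be used to recognize handwritten digit images taken from the web?

This question puzzled me for two days. Much of the code I found online puts model training and model inference in the same .py file, so the model has to be retrained every time it is used, which is clearly inefficient.

I thought of separating the .py file that trains the model from the .py file that loads it for prediction. But how should the loading file be written? Many answers look like this: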

    saver = tf.train.Saver()  # define the saver
    with tf.Session() as sess:
        sess.run(intt)
        # load the model
        saver.restore(sess, "./save/model.ckpt")

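That was not the answer I was looking for. I figured that for a loaded model to be useful it must at least have input and output parameters, so I tried passing parameters between the two .py files. The advice I found was:

    from xxx import parameter  # get a parameter from xxx.py

But when I wrote it this way, the training file simply ran again from start to finish, which is not what I wanted. The final image recognition step also raised an error and the program stopped.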
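The retraining on import is standard Python behavior: importing a module executes all of its top-level code. The sketch below illustrates this with a throwaway module written to a temporary directory. The module name `train_demo`, its `learning_rate` value and its `train()` function are invented for illustration and are not the author's training script.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Hypothetical stand-in for a training script; all names here are invented.
MODULE_SOURCE = textwrap.dedent('''
    log = []
    learning_rate = 0.001              # a value another file might import

    def train():
        log.append("training ran")

    log.append("top-level code ran")   # executes on every import

    if __name__ == "__main__":
        train()                        # executes only when run as a script
''')

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "train_demo.py"), "w") as f:
        f.write(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        train_demo = importlib.import_module("train_demo")
    finally:
        sys.path.remove(tmp)

print(train_demo.learning_rate)   # 0.001
print(train_demo.log)             # ['top-level code ran']
```

Only the code under the `if __name__ == "__main__":` guard is skipped on import. A training loop written at the top level of the training file therefore runs again whenever another file imports from it.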

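Eventually I stumbled upon the following article:

Multi-layer neural network modeling and saving/restoring models
https://www.cnblogs.com/HuangYJ/p/11681357.html

Put simply, saver.restore() loads the model's parameters. You first define a model with the same structure as the one that was saved. Only when the structures match do the variables line up, so that the values read from the checkpoint can be assigned to the variables waiting to be overwritten.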
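The idea of "same structure, then overwrite" can be illustrated with a toy model in plain Python. This is only an analogy, not TensorFlow's implementation: dictionaries stand in for the graph's variables and for the checkpoint, and the variable names and values are made up.

```python
# Toy model of name-based restoring. Plain dicts stand in for TensorFlow
# variables and for the checkpoint file; names and values are invented.
checkpoint = {
    "conv_layer1/Variable": [0.12, -0.40],
    "conv_layer1/Variable_1": [0.05],
}

def restore(graph_variables, saved):
    """Overwrite each variable with the saved value that has the same name."""
    missing = [name for name in graph_variables if name not in saved]
    if missing:
        raise KeyError("not found in checkpoint: " + ", ".join(missing))
    for name in graph_variables:
        graph_variables[name] = saved[name]
    return graph_variables

# Same structure as the saved model: every name matches, values are overwritten.
same_structure = {"conv_layer1/Variable": [0.0, 0.0],
                  "conv_layer1/Variable_1": [0.0]}
restored = restore(same_structure, checkpoint)

# Different structure: one name is unknown to the checkpoint, so restoring fails.
different_structure = {"conv_layer1/Variable": [0.0, 0.0],
                       "extra_layer/Variable": [0.0]}
try:
    restore(different_structure, checkpoint)
    restore_failed = False
except KeyError:
    restore_failed = True

print(restored["conv_layer1/Variable"])   # [0.12, -0.4]
print(restore_failed)                     # True
```

This is why the prediction script has to repeat the whole model definition before calling saver.restore(): the placeholder values created by the definition are what the checkpoint values replace.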

2 The main graph of the trained model

From the main graph figure we can see directly that there are 10 model structures to define:

input, image, conv_layer1, pooling_layer1, conv_layer2, pooling_layer2, fc_layer3, dropout, output_fc_layer4, softmax

Code for the 10 structures (the helper function definitions are omitted here):

    with tf.name_scope('input'):
        x = tf.placeholder(tf.float32, [None, 784])
        y_ = tf.placeholder('float', [None, 10])

    with tf.name_scope('image'):
        x_image = tf.reshape(x, [-1, 28, 28, 1])
        tf.summary.image('input_image', x_image, 8)

    with tf.name_scope('conv_layer1'):
        W_conv1 = weight_variable([5, 5, 1, 32])
        b_conv1 = bias_variable([32])
        h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)

    with tf.name_scope('pooling_layer1'):
        h_pool1 = max_pool_2x2(h_conv1)

    with tf.name_scope('conv_layer2'):
        W_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2 = bias_variable([64])
        h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)

    with tf.name_scope('pooling_layer2'):
        h_pool2 = max_pool_2x2(h_conv2)

    with tf.name_scope('fc_layer3'):
        W_fc1 = weight_variable([7*7*64, 1024])
        b_fc1 = bias_variable([1024])
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    with tf.name_scope('dropout'):
        keep_prob = tf.placeholder(tf.float32)
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    with tf.name_scope('output_fc_layer4'):
        W_fc2 = weight_variable([1024, 10])
        b_fc2 = bias_variable([10])

    with tf.name_scope('softmax'):
        y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
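The 7*7*64 in fc_layer3 follows from the layer shapes. The sketch below traces the spatial size through the network, assuming the helper functions are the ones listed in section 4 (stride-1 convolutions and 2x2, stride-2 max pooling, both with padding='SAME'). The function `same_padding_output` is a hypothetical helper written for this illustration.

```python
import math

def same_padding_output(size, stride):
    """Spatial output size of a conv/pool layer that uses padding='SAME'."""
    return math.ceil(size / stride)

size = 28                                   # MNIST images are 28x28
size = same_padding_output(size, stride=1)  # conv_layer1: stride 1 keeps 28
size = same_padding_output(size, stride=2)  # pooling_layer1: 28 -> 14
size = same_padding_output(size, stride=1)  # conv_layer2: stride 1 keeps 14
size = same_padding_output(size, stride=2)  # pooling_layer2: 14 -> 7

channels = 64                               # output channels of conv_layer2
flat_features = size * size * channels
print(size, flat_features)                  # 7 3136
```

So each image reaches the fully connected layer as a 7x7 map with 64 channels, which is flattened into 3136 values.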

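3 Constructing the model's input and output

Put another way, what we want from the trained model is this: we give it an image of a handwritten digit, and it recognizes the digit and prints it. That is the goal, but a trained model is still just a set of mathematical operations, so both the image input and the digit output have to be shaped to fit the model.

The model expects a 1-D tensor (a vector) as input. The image must be 28*28, 784 pixels in total, and has to be flattened from a 2-D tensor (a matrix) into a 1-D tensor. The following code does this: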

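    from PIL import Image

    # Load the image; convert('L') gives one 0-255 grayscale value per pixel.
    # The image itself must already be 28x28.
    text = Image.open('./images/text3.png').convert('L')
    data = list(text.getdata())  # 784 values, row by row
    picture = [(255 - x) * 1.0 / 255.0 for x in data]  # picture is the model's input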
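The `(255 - x) / 255.0` step both scales the pixels to [0, 1] and inverts them. MNIST digits are bright strokes on a dark background, while images found online are usually dark ink on white paper, so inversion brings them in line with the training data. The following sketch shows the mapping on a synthetic pixel list without needing an image file. The function `to_model_input` is a hypothetical name, and the input is assumed to be 8-bit grayscale.

```python
def to_model_input(pixels):
    """Map 8-bit grayscale pixels (0 = black, 255 = white) to inverted floats in [0, 1]."""
    if len(pixels) != 28 * 28:
        raise ValueError("expected 784 pixels, got %d" % len(pixels))
    return [(255 - p) / 255.0 for p in pixels]

# Hypothetical image: white background with one black pixel in the top-left corner.
pixels = [255] * 784
pixels[0] = 0
picture = to_model_input(pixels)
print(picture[0], picture[1], len(picture))   # 1.0 0.0 784
```

The black pixel becomes 1.0 (ink) and the white background becomes 0.0. If your image already has a dark background, the inversion should be skipped.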

The model's output is the result of the softmax function: an array of probabilities, one per digit. We look for the largest probability, and the digit it corresponds to is the loaded model's prediction. The following code does this:

    # Make the prediction
    prediction = tf.argmax(y_conv, 1)  # index of the largest probability
    predict_result = prediction.eval(feed_dict={x: [picture], keep_prob: 1.0}, session=sess)
    print("The image you imported is:", predict_result[0])
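What softmax followed by argmax computes can be shown in a few lines of plain Python. This is a minimal sketch with made-up scores, not the model's real output. The helper names `softmax` and `argmax` are defined here for illustration.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]   # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])

# Made-up scores for the digits 0..9; index 3 has the largest score.
logits = [0.1, 0.3, 0.2, 4.0, 0.0, 0.5, 0.1, 0.2, 0.9, 0.3]
probs = softmax(logits)
predicted_digit = argmax(probs)
print(predicted_digit)   # 3
```

Because the output has one entry per digit in order 0 to 9, the index of the largest probability is the predicted digit itself. Softmax preserves the ordering of the scores, so it changes the values but not which index wins.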

4 Complete .py code for applying the model

    from PIL import Image
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

    # --- Model definition (must match the trained model) ---
    def weight_variable(shape):  # weights
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)

    def bias_variable(shape):  # biases
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)

    def conv2d(x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

    def max_pool_2x2(x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    with tf.name_scope('input'):
        x = tf.placeholder(tf.float32, [None, 784])
        y_ = tf.placeholder('float', [None, 10])

    with tf.name_scope('image'):
        x_image = tf.reshape(x, [-1, 28, 28, 1])
        tf.summary.image('input_image', x_image, 8)

    with tf.name_scope('conv_layer1'):
        W_conv1 = weight_variable([5, 5, 1, 32])
        b_conv1 = bias_variable([32])
        h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)

    with tf.name_scope('pooling_layer1'):
        h_pool1 = max_pool_2x2(h_conv1)

    with tf.name_scope('conv_layer2'):
        W_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2 = bias_variable([64])
        h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)

    with tf.name_scope('pooling_layer2'):
        h_pool2 = max_pool_2x2(h_conv2)

    with tf.name_scope('fc_layer3'):
        W_fc1 = weight_variable([7*7*64, 1024])
        b_fc1 = bias_variable([1024])
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    with tf.name_scope('dropout'):
        keep_prob = tf.placeholder(tf.float32)
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    with tf.name_scope('output_fc_layer4'):
        W_fc2 = weight_variable([1024, 10])
        b_fc2 = bias_variable([10])

    with tf.name_scope('softmax'):
        y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

    # --- Load the model and test it with the imported image ---
    # convert('L') gives one 0-255 grayscale value per pixel; the image must be 28x28
    text = Image.open('./images/text2.png').convert('L')
    data = list(text.getdata())
    picture = [(255 - x) * 1.0 / 255.0 for x in data]

    intt = tf.global_variables_initializer()
    saver = tf.train.Saver()  # define the saver

    with tf.Session() as sess:
        sess.run(intt)
        # Load the model parameters
        saver.restore(sess, "./save/model.ckpt")
        # Make the prediction
        prediction = tf.argmax(y_conv, 1)
        predict_result = prediction.eval(feed_dict={x: [picture], keep_prob: 1.0}, session=sess)
        print("The image you imported is:", predict_result[0])

Figure: text2.png

Figure: recognition result (run in Spyder)

Download link for the model and images:
https://download.csdn.net/download/weixin_42899627/12672965

5 A tip for running the code

After each run you need to restart the kernel before running the code again, otherwise an error is raised. I have not looked into the exact cause.
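A likely explanation, offered as an assumption rather than something verified on the author's setup: within one kernel the default TensorFlow graph persists between runs, so a second run adds a second copy of every variable under a new, uniquified name that does not exist in the checkpoint. The toy class below mimics that naming behavior in plain Python. `ToyGraph` and `build_model` are invented for illustration and are not TensorFlow APIs.

```python
class ToyGraph:
    """Mimics how a graph makes variable names unique; not TensorFlow itself."""

    def __init__(self):
        self.names = []

    def new_variable(self, base="Variable"):
        name, n = base, 0
        while name in self.names:
            n += 1
            name = "%s_%d" % (base, n)
        self.names.append(name)
        return name

def build_model(graph):
    """Create two variables, as a tiny stand-in for the CNN definition."""
    return [graph.new_variable(), graph.new_variable()]

checkpoint_names = {"Variable", "Variable_1"}   # names saved by the training run

graph = ToyGraph()
first_run = build_model(graph)      # ['Variable', 'Variable_1']
second_run = build_model(graph)     # same kernel: ['Variable_2', 'Variable_3']
graph = ToyGraph()                  # fresh graph, like restarting the kernel
after_reset = build_model(graph)    # ['Variable', 'Variable_1'] again

print(set(second_run) <= checkpoint_names)    # False
print(set(after_reset) <= checkpoint_names)   # True
```

If this is indeed the cause, calling `tf.reset_default_graph()` before the model definition should have the same effect as restarting the kernel, since it clears the default graph in the TF1-style API.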

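References:
1 [Python] CNN-based MNIST handwritten digit recognition - 東聃 - cnblogs
2 Recognizing handwritten digits with a model trained on MNIST in TensorFlow - qiuhlee - cnblogs
3 Multi-layer neural network modeling and saving/restoring models
4 TensorFlow in practice (3), introduction to classification: MNIST handwritten digit recognition

The above reflects my own understanding; corrections are welcome.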
