
TensorFlow Example 3: Training a recognizer for captcha images, four letters per image

Published: 2024/9/20


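Learning objectives
Goals
Explain how captcha recognition works
Explain how the output of the fully connected layer is set up
Explain how the loss and accuracy of the output are computed
Explain how captcha label values are converted to numbers
Use tf.one_hot to one-hot encode the captcha target values
Application
Use a neural network to recognize captcha images
1. Recognition result

2. Captcha recognition in practice

Process the raw data
so that feature values and target values are easy to read for training
Design the network structure
Handle the network output
Train the model and predict
Principle analysis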

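1. Analyzing the target labels

Consider the possibilities at each position, drawn from "ABCDEFGHIJKLMNOPQRSTUVWXYZ":

Position 1: 26 possibilities

Position 2: 26 possibilities

Position 3: 26 possibilities

Position 4: 26 possibilities

How do we compare the network output with the ground truth? One-hot encode each position.

2. Analyzing the network output
Using the ordering "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
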
Ground truth:
Position 1: [0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0]
Position 2: [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1]
Position 3: [0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0]
Position 4: [0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0]
So the target of each captcha is an array of shape [4, 26].
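
To make the [4, 26] target concrete, here is a minimal plain-Python sketch (no TensorFlow involved) that one-hot encodes a four-letter label. The helper name one_hot_label and the sample label "NZPP" are chosen for illustration only; the article's own code relies on tf.one_hot for this step.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def one_hot_label(text, alphabet=ALPHABET):
    """Return a len(text) x len(alphabet) nested list with a single 1 per row."""
    rows = []
    for ch in text:
        row = [0] * len(alphabet)
        row[alphabet.index(ch)] = 1  # position of the letter in the alphabet
        rows.append(row)
    return rows


target = one_hot_label("NZPP")
print(len(target), len(target[0]))        # rows and columns of the target
print([row.index(1) for row in target])   # index of the 1 in each row
```

Each row holds exactly one 1, so the index of that 1 is enough to recover the letter at that position.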

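3. How to measure the loss
We concatenate the target values into a single rank-1 tensor of length 104.

Ground truth (four groups of 26 values):
[0,0,0,0,...0,0,1,0,0][0,0,0,1,...0,0,0,0,0][0,0,0,0,...0,0,0,1,0][1,0,0,0,...0,0,0,0,0]

Predicted probabilities (four groups of 26 values):
[0.001,0.01,...,0.2][0.001,0.01,...,0.2][0.001,0.01,...,0.2][0.02,0.01,...,0.1]

Cross-entropy loss is computed between these two rank-1 tensors of length 104. Minimizing it raises the predicted probability at the four positions whose target value is 1, so that within each group of 26 the probability at the correct position becomes the largest of that group. The position of the largest value in each group then gives the predicted letter.

Softmax is applied across all 104 outputs together, so the 104 probabilities sum to 1.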
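
The calculation above can be sketched in plain Python. This is only an illustration of softmax followed by cross-entropy on a flat vector of 104 values; the logits are made up, and the helper names softmax and cross_entropy are not from the article (which uses tf.nn.softmax_cross_entropy_with_logits). Note that the label vector sums to 4 while the probabilities sum to 1, so the loss is lowest when each of the four correct slots approaches a probability of 0.25.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(labels, logits):
    """-sum(label * log(softmax(logit))) for one sample."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(labels, probs))


# Hypothetical sample: 4 positions x 26 letters, flattened to 104 values.
true_idx = [13, 25, 15, 15]                 # "NZPP"
labels = [0.0] * 104
for pos, idx in enumerate(true_idx):
    labels[pos * 26 + idx] = 1.0

uniform_logits = [0.0] * 104                # no preference for any letter
good_logits = [0.0] * 104
for pos, idx in enumerate(true_idx):
    good_logits[pos * 26 + idx] = 5.0       # larger score at the 4 correct slots

probs = softmax(good_logits)
loss_uniform = cross_entropy(labels, uniform_logits)
loss_good = cross_entropy(labels, good_logits)
print(sum(probs), loss_uniform, loss_good)
```

The loss drops once the four correct slots receive higher scores than the rest, which is the behaviour the training step is meant to encourage.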

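4. How to compute accuracy

Reshape both the predictions and the targets to [None, 4, 26] so that they can be compared.

For each captcha, compare along the third dimension: check whether the position of the target 1 and the position of the largest predicted probability agree for each of the 4 labels. A sample counts as correct only if all 4 agree.

Axis positions:
  0     1  2
[None, 4, 26]

tf.argmax(y_predict, 2)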
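
As a rough plain-Python counterpart of taking tf.argmax along the last axis, the sketch below splits 104 scores into four groups of 26, takes the argmax of each group, and compares with the true indices. The scores and the helper name argmax_per_position are invented for illustration.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def argmax_per_position(flat, positions=4, depth=26):
    """Split positions*depth scores into groups and return each group's argmax."""
    result = []
    for p in range(positions):
        group = flat[p * depth:(p + 1) * depth]
        result.append(group.index(max(group)))
    return result


# Hypothetical scores for one captcha: zeros, with one high score per group.
scores = [0.0] * 104
for pos, idx in enumerate([13, 25, 15, 9]):   # the last position is deliberately wrong
    scores[pos * 26 + idx] = 1.0

predicted = argmax_per_position(scores)
truth = [13, 25, 15, 15]                       # "NZPP"

per_position = [p == t for p, t in zip(predicted, truth)]
sample_correct = all(per_position)             # correct only if all 4 positions match
print("".join(ALPHABET[i] for i in predicted), per_position, sample_correct)
```

Three of the four letters match here, yet the captcha as a whole is counted as wrong.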
3.1 Converting the raw images and labels to TFRecords

3.1.1 Raw captcha data

3.1.2 Processing analysis

Processing the feature values
To avoid reading the files in a scrambled order, we build the list of captcha image file names 0 to 5999 ourselves.

import os
import tensorflow as tf  # TensorFlow 1.x queue-based input API

# FLAGS (captcha_dir, letter, ...) is assumed to be defined elsewhere in the script.

def get_captcha_image():
    """
    Read the captcha image data.
    :return: image_batch, a tensor of shape [6000, 20, 80, 3]
    """
    # Build the file names 0.jpg ... 5999.jpg
    filename = []

    for i in range(6000):
        string = str(i) + ".jpg"
        filename.append(string)

    # Build path + file name
    file_list = [os.path.join(FLAGS.captcha_dir, file) for file in filename]

    # Build the file queue (shuffle=False keeps the files in list order)
    file_queue = tf.train.string_input_producer(file_list, shuffle=False)

    # Build the reader
    reader = tf.WholeFileReader()

    # Read the image contents
    key, value = reader.read(file_queue)

    # Decode the JPEG data
    image = tf.image.decode_jpeg(value)

    image.set_shape([20, 80, 3])

    # Batch the data: [6000, 20, 80, 3]
    image_batch = tf.train.batch([image], batch_size=6000, num_threads=1, capacity=6000)

    return image_batch
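
One plausible reason for building the names from the integer index (an assumption; the article only says it avoids a scrambled order) is that file names sorted as strings do not follow numeric order, so images would no longer line up with their labels. A small standard-library sketch, with a hypothetical directory name:

```python
import os

# Names sorted as plain strings, as a sorted directory listing would give them.
names = [str(i) + ".jpg" for i in range(12)]
as_strings = sorted(names)

# Names built from the integer index keep the numeric order.
captcha_dir = "captcha_images"   # hypothetical directory name
file_list = [os.path.join(captcha_dir, str(i) + ".jpg") for i in range(12)]

print(as_strings[:4])
print([os.path.basename(p) for p in file_list[:4]])
```

With string sorting, 10.jpg and 11.jpg come before 2.jpg, which is why the index-based list is the safer choice when the label file is ordered by index.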
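Processing the target values
How should the targets be handled? The target of each image is a string, so we treat it character by character. The target of one captcha image then consists of 4 numbers. Build the following correspondence:

"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
0,1,2,................,24,25

Result:
"NZPP"----> [[13, 25, 15, 15]]
Then turn every target into four numbers and store them in an Example together with the matching feature values.

[[13, 25, 15, 15], [22, 10, 7, 10], [22, 15, 18, 9], [16, 6, 13, 10], [1, 0, 8, 17], [0, 9, 24, 14].....]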
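
The letter-to-number correspondence, together with its inverse (useful later for turning predicted indices back into text), can be sketched in a few lines of plain Python. The function names encode and decode are illustrative and do not appear in the article.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LETTER_TO_NUM = {letter: i for i, letter in enumerate(ALPHABET)}


def encode(text):
    """Map each letter of a label to its index in ALPHABET."""
    return [LETTER_TO_NUM[ch] for ch in text]


def decode(indices):
    """Map a list of indices back to the label string."""
    return "".join(ALPHABET[i] for i in indices)


codes = [encode(s) for s in ["NZPP", "WKHK"]]
print(codes)
print([decode(c) for c in codes])
```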
Code:

Reading the label file

def get_captcha_label():
    """
    Read the captcha label data.
    :return: label_batch
    """
    file_queue = tf.train.string_input_producer(["../data/Genpics/labels.csv"], shuffle=False)

    reader = tf.TextLineReader()

    key, value = reader.read(file_queue)

    # Column defaults: an integer index and a string label
    records = [[1], ["None"]]

    number, label = tf.decode_csv(value, record_defaults=records)

    # [["NZPP"], ["WKHK"], ["ASDY"]]
    label_batch = tf.train.batch([label], batch_size=6000, num_threads=1, capacity=6000)

    return label_batch
Processing the target values

# [b'NZPP' b'WKHK' b'WPSJ' ..., b'FVQJ' b'BQYA' b'BCHR']
label_str = sess.run(label)

print(label_str)

# Convert the string labels to a numeric tensor
label_batch = dealwithlabel(label_str)
Converting to the corresponding numbers

def dealwithlabel(label_str):

    # Build the index-to-letter map {0: 'A', 1: 'B', ...}
    num_letter = dict(enumerate(list(FLAGS.letter)))

    # Invert it to a letter-to-index map {'A': 0, 'B': 1, ...}
    letter_num = dict(zip(num_letter.values(), num_letter.keys()))

    print(letter_num)

    # List of numeric labels
    array = []

    # Process the label data [b"NZPP", ...]
    for string in label_str:

        letter_list = []  # e.g. [13, 25, 15, 15]

        # Decode bytes such as b'FVQJ' to str, then look up the number for each character
        for letter in string.decode('utf-8'):
            letter_list.append(letter_num[letter])

        array.append(letter_list)

    # [[13, 25, 15, 15], [22, 10, 7, 10], [22, 15, 18, 9], [16, 6, 13, 10], [1, 0, 8, 17], [0, 9, 24, 14].....]
    print(array)

    # Convert array to a tensor
    label = tf.constant(array)

    # Return the tensor so that label_batch = dealwithlabel(label_str) receives it
    return label
Pair each feature value with its target value, build an Example, and write it to the file
Because the feature and the target of an image are both arrays rather than scalars, both are stored as bytes.

def write_to_tfrecords(image_batch, label_batch):
    """
    Write the image contents and labels to a TFRecords file.
    :param image_batch: feature values
    :param label_batch: label values
    :return: None
    """
    # Cast the labels to uint8 (every letter index is below 256)
    label_batch = tf.cast(label_batch, tf.uint8)

    print(label_batch)

    # Create the TFRecords writer
    writer = tf.python_io.TFRecordWriter(FLAGS.tfrecords_dir)

    # For each image, build an Example protocol buffer, serialize it and write it
    for i in range(6000):
        # Take the i-th image and convert its values to raw bytes
        # (tobytes() is the current name of NumPy's deprecated tostring())
        image_string = image_batch[i].eval().tobytes()

        # Convert the label values to raw bytes as well
        label_string = label_batch[i].eval().tobytes()

        # Build the Example
        example = tf.train.Example(features=tf.train.Features(feature={
            "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_string])),
            "label": tf.train.Feature(bytes_list=tf.train.BytesList(value=[label_string]))
        }))

        writer.write(example.SerializeToString())

    # Close the file
    writer.close()

    return None
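
Storing arrays as raw bytes only works if the writer and the reader agree on the element type. The standard-library sketch below illustrates why the labels are cast to uint8 before writing and decoded as uint8 when read back; the int32 comparison is added here purely for illustration and is not part of the article's code.

```python
import struct

label = [13, 25, 15, 15]            # "NZPP" as letter indices

# Every index is below 256, so one unsigned byte per value is enough.
label_bytes = bytes(label)
restored = list(label_bytes)        # reading side: one integer per byte

# The same label packed as four little-endian 32-bit integers takes 16 bytes,
# so reading it back byte by byte would no longer give 4 values.
as_int32 = struct.pack("<4i", *label)

image_bytes_len = 20 * 80 * 3       # height * width * channels, one uint8 each
print(len(label_bytes), restored, len(as_int32), image_bytes_len)
```

A label written as 32-bit integers but decoded as uint8 would produce 16 values instead of 4, and the later reshape to [4] would fail.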
3.2 Reading the data for training

3.2.1 Reading the TFRecords file

def read_captcha_tfrecords():
    """
    Read image feature values and target values from the TFRecords file.
    :return: feature values, target values
    """
    # 1. Build the file queue
    file_queue = tf.train.string_input_producer([FLAGS.captcha_tfrecords])

    # 2. Build the reader; it reads one sample at a time by default
    reader = tf.TFRecordReader()

    key, values = reader.read(file_queue)

    # 3. Parse the Example
    feature = tf.parse_single_example(values, features={
        "image": tf.FixedLenFeature([], tf.string),
        "label": tf.FixedLenFeature([], tf.string),
    })

    # 4. Decode the bytes data
    image = tf.decode_raw(feature['image'], tf.uint8)

    label = tf.decode_raw(feature['label'], tf.uint8)

    print(image, label)

    # Fix the shape of each tensor
    image_reshape = tf.reshape(image, [FLAGS.height, FLAGS.width, FLAGS.channel])

    label_reshape = tf.reshape(label, [FLAGS.label_num])

    print(image_reshape, label_reshape)

    # Convert the data types:
    # features to float32 for the matrix multiplication, labels to int32 for tf.one_hot
    image_reshape = tf.cast(image_reshape, tf.float32)

    label_reshape = tf.cast(label_reshape, tf.int32)

    # 5. Batch the data
    # batch_size is the number of samples in each training batch
    image_batch, label_batch = tf.train.batch([image_reshape, label_reshape], batch_size=100, num_threads=1, capacity=100)

    print(image_batch, label_batch)

    return image_batch, label_batch
3.2.2 Converting the labels to three dimensions

def change_to_onehot(label_batch):
    """
    One-hot encode the four target values of each image.
    :param label_batch: [[13, 25, 15, 15], [22, 10, 7, 10], [22, 15, 18, 9]]
    :return: one_hot
    """

    # [100, 4]---->[100, 4, 26]
    y_true = tf.one_hot(label_batch, depth=FLAGS.depth, on_value=1.0)

    return y_true
3.2.3 Building the fully connected model

Each sample has 4 target values and each target value has 26 possibilities, so the fully connected layer has 4 * 26 = 104 output neurons.

def captcha_model(image_batch):
    """
    Define the neural network model for the captcha and produce its output.
    :param image_batch: input data of the model
    :return: model output (predictions)
    """

    # Predict directly with a single fully connected layer
    # Computation of the fully connected layer:
    # input: [100, 20, 80, 3]    output: [None, 104]    104 = 4 target values * 26 possibilities
    with tf.variable_scope("captcha_model"):

        # [100, 20 * 80 * 3]*[20*80*3, 104]+[104] = [None, 104]
        # Randomly initialize the weights and biases of the fully connected layer
        w = weight_variables([20 * 80 * 3, 104])

        b = bias_variables([104])

        # Flatten the input for the fully connected layer: [100, 20, 80, 3] ----->[100, 20 * 80 * 3]
        image_reshape = tf.reshape(image_batch, [-1, FLAGS.height * FLAGS.width * FLAGS.channel])

        # Matrix multiplication
        # y_predict: [None, 104]
        y_predict = tf.matmul(image_reshape, w) + b

    return y_predict
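
The shapes used by the fully connected layer follow directly from the image size and the label layout. The short sketch below only does the arithmetic, in plain Python; the variable names mirror the FLAGS used in the article, but the values are restated here as assumptions.

```python
height, width, channel = 20, 80, 3
label_num, depth = 4, 26

n_inputs = height * width * channel      # values per flattened image
n_outputs = label_num * depth            # one score per (position, letter) pair

weight_shape = (n_inputs, n_outputs)
bias_shape = (n_outputs,)
n_params = n_inputs * n_outputs + n_outputs

print(weight_shape, bias_shape, n_params)
```

A single layer of this size already holds roughly half a million trainable parameters.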
3.2.4 Computing the cross-entropy loss

Cross-entropy is computed between the 104 predicted values and the 104 true values of each image.

# 3. Compute the cross-entropy loss with softmax
with tf.variable_scope("softmax_crossentropy"):
    # y_true: ground truth [100, 4, 26] one-hot ---->[100, 4 * 26]
    # y_predict: output of the fully connected layer [100, 104]
    # softmax_cross_entropy_with_logits returns one loss per sample; reduce_mean averages them
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=tf.reshape(y_true, [100, FLAGS.label_num * FLAGS.depth]),
                                                logits=y_predict)
    )
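3.2.5 Computing accuracy

Shape [100, 4, 26]: compare the positions of the maximum along the third dimension.

# 5. Compute the accuracy of each training step (compare the positions of the true and predicted values for every sample)
with tf.variable_scope("accuracy"):
    # The accuracy calculation compares three-dimensional data
    # y_true: ground truth [100, 4, 26]
    # y_predict: output of the fully connected layer [100, 104]--->[100, 4, 26]
    equal_list = tf.equal(
        tf.argmax(y_true, 2),
        tf.argmax(tf.reshape(y_predict, [100, FLAGS.label_num, FLAGS.depth]), 2)
    )

    # A sample is correct only if all 4 positions match
    accuracy = tf.reduce_mean(tf.cast(tf.reduce_all(equal_list, 1), tf.float32))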
tf.reduce_all is used to process equal_list; it computes a logical AND along the given axis:

```python
x = tf.constant([[True, True], [False, False]])
tf.reduce_all(x)     # False
tf.reduce_all(x, 0)  # [False, False]
tf.reduce_all(x, 1)  # [True, False]
```
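
The effect of reducing over axis 1 before averaging can be seen in a plain-Python sketch. The equal_list values below are made up; the point is the difference between averaging the booleans directly (per-character accuracy) and first requiring all four positions of a row to match (per-captcha accuracy).

```python
# Hypothetical equal_list for 3 captchas: one row per sample, one bool per position.
equal_list = [
    [True, True, True, True],     # all four letters right
    [True, True, True, False],    # one letter wrong
    [False, False, True, True],   # two letters wrong
]

# Counterpart of tf.reduce_all(equal_list, 1): AND across the 4 positions of each row.
sample_correct = [all(row) for row in equal_list]

captcha_accuracy = sum(sample_correct) / len(sample_correct)
char_accuracy = sum(sum(row) for row in equal_list) / (len(equal_list) * 4)
print(sample_correct, captcha_accuracy, char_accuracy)
```

Per-character accuracy is always at least as high as per-captcha accuracy, so the two numbers should not be compared with each other across runs.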
3.2.6 Wrapping two parameter helper functions

# Wrap two parameter-initialization helpers, each defined as a Variable op
def weight_variables(shape):
    w = tf.Variable(tf.random_normal(shape=shape, mean=0.0, stddev=1.0))
    return w

def bias_variables(shape):
    b = tf.Variable(tf.random_normal(shape=shape, mean=0.0, stddev=1.0))
    return b
3.3 Training the model

def captcha_reco():
    """
    Recognize captcha images with four target values.
    :return:
    """
    # 1. Read image feature values and target values from the TFRecords file
    # image_batch [100, 20, 80, 3]
    # label_batch [100, 4]  [[13, 25, 15, 15], [22, 10, 7, 10], [22, 15, 18, 9]]
    image_batch, label_batch = read_captcha_tfrecords()

    # 2. Build the neural network model that recognizes the captcha
    # y_predict-->[100, 104]
    y_predict = captcha_model(image_batch)

    # One-hot encode the target values
    # y_true has the three-dimensional shape [100, 4, 26]
    y_true = change_to_onehot(label_batch)

    # 3. Compute the cross-entropy loss with softmax
    with tf.variable_scope("softmax_crossentropy"):
        # y_true: ground truth [100, 4, 26] one-hot ---->[100, 4 * 26]
        # y_predict: output of the fully connected layer [100, 104]
        # softmax_cross_entropy_with_logits returns one loss per sample; reduce_mean averages them
        loss = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(labels=tf.reshape(y_true, [100, FLAGS.label_num * FLAGS.depth]),
                                                    logits=y_predict)
        )
    # 4. Minimize the loss with gradient descent
    with tf.variable_scope("optimizer"):
        # Learning rate 0.1
        train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

    # 5. Compute the accuracy of each training step (compare the positions of the true and predicted values for every sample)
    with tf.variable_scope("accuracy"):
        # The accuracy calculation compares three-dimensional data
        # y_true: ground truth [100, 4, 26]
        # y_predict: output of the fully connected layer [100, 104]--->[100, 4, 26]
        equal_list = tf.equal(
            tf.argmax(y_true, 2),
            tf.argmax(tf.reshape(y_predict, [100, FLAGS.label_num, FLAGS.depth]), 2)
        )

        # As in section 3.2.5, a sample is correct only if all 4 positions match
        accuracy = tf.reduce_mean(tf.cast(tf.reduce_all(equal_list, 1), tf.float32))

    # Op that initializes the variables
    init_op = tf.global_variables_initializer()

    # Open a session and run
    with tf.Session() as sess:
        sess.run(init_op)

        # Create the threads that run the reading queues
        coord = tf.train.Coordinator()

        threads = tf.train.start_queue_runners(sess, coord=coord)

        # sess.run([image_batch, label_batch])
        # Training loop
        for i in range(1000):

            sess.run(train_op)

            print("Step %d, captcha training accuracy: %f" % (i,
                                                               accuracy.eval()
                                                               ))

        # Stop and join the threads
        coord.request_stop()

        coord.join(threads)

    return None
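3.4 Saving the model for prediction

# Inside the training loop; assumes saver = tf.train.Saver() was created before the session started
if i % 100 == 0:

    saver.save(sess, "./tmp/model/captcha_model")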
Complete code:
# -*- coding=utf-8 -*-
# View the graphs in TensorBoard from a terminal:
# tensorboard --logdir="./temp/summary/"

import os
# os.environ["TF_CPP_MIN_LOG_LEVEL"]='1' # hide INFO messages (the default, '0', shows everything)
os.environ["TF_CPP_MIN_LOG_LEVEL"]='2' # hide INFO and WARNING messages
# os.environ["TF_CPP_MIN_LOG_LEVEL"]='3' # hide INFO, WARNING and ERROR messages

import tensorflow as tf

class CaptchaIdentification(object):
    """
    Reads the captcha data and trains the network.
    """

    def __init__(self):
        # Properties of the captcha images
        self.height = 20
        self.width = 80
        self.channel = 3
        # Number of target values per captcha (4 characters)
        self.label_num = 4
        # Number of possible classes for each target value
        self.feature_num = 26

        # Weights and biases
        self.weight = []
        self.bias = []

        # Number of samples per training batch
        self.train_batch = 100

    @staticmethod  # static method
    def weight_variables(shape):
        w = tf.Variable(tf.random_normal(shape=shape, mean=0.0, stddev=0.1))
        return w

    @staticmethod  # static method
    def bias_variables(shape):
        b = tf.Variable(tf.random_normal(shape=shape, mean=0.0, stddev=0.1))
        return b

    def read_tfrecords(self):
        """
        Read the captcha feature values and target values.
        :return: image_batch, label_batch
        """
        # 1. Build the file queue
        file_queue = tf.train.string_input_producer(["./tfrecords/captcha.tfrecords"])

        # 2. Use tf.TFRecordReader to read the TFRecords data
        reader = tf.TFRecordReader()

        # One sample at a time
        key, value = reader.read(file_queue)

        # 3. Parse the Example
        feature = tf.parse_single_example(value, features={
            "image": tf.FixedLenFeature(shape=[], dtype=tf.string),
            "label": tf.FixedLenFeature(shape=[], dtype=tf.string),
        })

        # 4. Decode, then set data type and shape
        image = tf.decode_raw(bytes=feature["image"], out_type=tf.uint8)
        label = tf.decode_raw(bytes=feature["label"], out_type=tf.uint8)

        # Fix the type and shape
        # Image shape [20, 80, 3]
        # Target shape [4]
        image_reshape = tf.reshape(image, shape=[self.height, self.width, self.channel])
        label_reshape = tf.reshape(label, shape=[self.label_num])

        # Type conversion
        image_type = tf.cast(image_reshape, dtype=tf.float32)
        label_type = tf.cast(label_reshape, dtype=tf.int32)
        # print(image_type, label_type)

        # 5. Batching
        # Number of samples provided for each training batch
        image_batch, label_batch = tf.train.batch([image_type, label_type],
                                                  batch_size=self.train_batch,
                                                  num_threads=1,
                                                  capacity=self.train_batch)

        print(image_batch, label_batch)
        return image_batch, label_batch

    def captcha_model(self, image_batch, label_batch):
        """
        Build the fully connected neural network.
        :param image_batch: captcha image feature values
        :param label_batch: captcha image target values
        :return: predictions, weights, biases
        """
        # Fully connected layer
        # [self.train_batch, self.height, self.width, self.channel] --> [self.train_batch, self.height * self.width * self.channel]
        # i.e. [100, 20, 80, 3]  --> [100, 20 * 80 * 3]
        # [self.train_batch, self.height * self.width * self.channel] * [self.height * self.width * self.channel, self.label_num * self.feature_num] + [self.label_num * self.feature_num] = [None, self.label_num * self.feature_num]
        # i.e. [100, 20 * 80 * 3] * [20 * 80 * 3, 104] + [104] = [None, 104]   104 = 4*26
        with tf.variable_scope("captcha_fc_model"):
            # Initialize the weight and bias parameters
            self.weight = self.weight_variables(
                shape=[self.height * self.width * self.channel, self.label_num * self.feature_num])
            self.bias = self.bias_variables(shape=[self.label_num * self.feature_num])

            # 4 dimensions --> 2 dimensions for the matrix multiplication
            x_reshape = tf.reshape(tensor=image_batch,
                                   shape=[self.train_batch, self.height * self.width * self.channel])

            # Shape of the predictions: [self.train_batch, self.label_num * self.feature_num]
            y_predict = tf.matmul(x_reshape, self.weight) + self.bias

        return y_predict, self.weight, self.bias

    def turn_to_onehot(self, label_batch):
        """
        Convert the target values to one-hot encoding.
        :param label_batch: target values [None, 4]
        :return: y_true [None, 4, 26]
        """
        with tf.variable_scope("one_hot"):
            # [None, self.label_num] --> [None, self.label_num, self.feature_num]
            # i.e. [None, 4] --> [None, 4, 26]
            y_true = tf.one_hot(indices=label_batch,
                                depth=self.feature_num,
                                on_value=1.0)
            return y_true

    def loss(self, y_true, y_predict):
        """
        Compute the loss over the 4 target values of each captcha.
        :param y_true: ground truth
        :param y_predict: predictions
        :return: mean loss
        """
        with tf.variable_scope("loss"):
            # Apply softmax to the network output first, then compute the cross-entropy loss
            # y_true: [None, 4, 26] -->[None, 104]
            # y_predict: [None, 104]
            y_reshape = tf.reshape(tensor=y_true, shape=[self.train_batch, self.label_num * self.feature_num])
            all_loss = tf.nn.softmax_cross_entropy_with_logits(labels=y_reshape,
                                                               logits=y_predict,
                                                               name="compute_loss")

            # Mean loss over the batch
            loss = tf.reduce_mean(all_loss)

        return loss

    def sgd(self, loss):
        """
        Minimize the loss with gradient descent.
        :param loss: loss tensor
        :return: train_op
        """
        with tf.variable_scope("sgd"):
            train_op = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(loss=loss)

        return train_op

    def accuracy(self, y_true, y_predict):
        """
        Compute the accuracy.
        :param y_true: ground truth
        :param y_predict: predictions
        :return: accuracy
        """
        with tf.variable_scope("accuracy"):
            # y_true: [None, self.label_num, self.feature_num]  i.e. [None, 4, 26]
            # y_predict: [None, self.label_num * self.feature_num]  i.e. [None, 104]
            y_predict_reshape = tf.reshape(tensor=y_predict, shape=[self.train_batch, self.label_num, self.feature_num])

            # Find the position of the maximum first
            t1 = tf.argmax(y_true, 2)  # axis 2 is the last axis (the 26 letters) of [None, 4, 26]
            t2 = tf.argmax(y_predict_reshape, 2)
            equal_list = tf.equal(t1, t2)  # boolean tensor of shape [None, 4]

            # Each sample has to be judged as a whole
            # x = tf.constant([[True, True], [False, False]])
            # tf.reduce_all(x, 1)  # [True, False]
            accuracy = tf.reduce_mean(tf.cast(tf.reduce_all(equal_list, 1), dtype=tf.float32))  # axis 1 is the 4 positions of equal_list [None, 4]

        return accuracy

    def train(self):
        """
        Model training logic.
        :return:
        """
        # 1. Get the feature values and target values
        # image_batch: [100, 20, 80, 3]
        # label_batch: [100, 4]  e.g. [[13, 25, 15, 15], [22, 10, 7, 10], [22, 15, 18, 9], ...]
        image_batch, label_batch = self.read_tfrecords()

        # 2. Build the captcha recognition model
        # Fully connected neural network
        # y_predict: [self.train_batch, self.label_num * self.feature_num]  i.e. [100, 104]
        y_predict, self.weight, self.bias = self.captcha_model(image_batch, label_batch)

        # Convert label_batch to one-hot encoding
        # y_true: [None, 4, 26]
        y_true = self.turn_to_onehot(label_batch)

        # 3. Build the loss from the ground truth and the predictions
        loss = self.loss(y_true, y_predict)

        # 4. Minimize the loss with gradient descent
        train_op = self.sgd(loss)

        # 5. Compute the accuracy
        accuracy = self.accuracy(y_true, y_predict)

        # 6. Data to display in TensorBoard
        # 1) Collect the tensors to observe in TensorBoard
        # Scalars: accuracy and loss
        tf.summary.scalar("loss", loss)
        tf.summary.scalar("accuracy", accuracy)

        # Higher-dimensional tensors
        tf.summary.histogram("w", self.weight)
        tf.summary.histogram("b", self.bias)

        # 2) Merge the summaries
        merged = tf.summary.merge_all()

        # 7. Create the op that saves the model
        saver = tf.train.Saver()

        # Train in a session
        with tf.Session() as sess:
            # Initialize the variables
            sess.run(tf.global_variables_initializer())

            # Create the TensorBoard events file
            file_writer = tf.summary.FileWriter("./temp/summary/", graph=sess.graph)

            # Thread coordinator
            coord = tf.train.Coordinator()

            # Start the worker threads that read the data
            threads = tf.train.start_queue_runners(sess=sess, coord=coord)

            # Training loop, printing the results
            for i in range(1000):
                _, loss_run, accuracy_run, summary = sess.run([train_op, loss, accuracy, merged])

                print("Step {:d}: loss {:.6f}, accuracy {:.6f}".format(i, loss_run, accuracy_run))

                # 3) Write the results of this run to the events file
                file_writer.add_summary(summary, i)

            # Stop and join the threads
            coord.request_stop()
            coord.join(threads=threads)

        return None

if __name__ == '__main__':
    pic_identify = CaptchaIdentification()
    pic_identify.train()

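4. Extensions
What if the captcha labels are not only upper-case letters, but also include lower-case letters and digits?
What if an image has more than 4 target values, for example 5 or 6?
Note: the answer mainly lies in analyzing the size of the network output and the mapping from characters to numbers.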
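
As one possible starting point for these questions, the plain-Python sketch below builds a larger character set and computes the resulting output size. The 62-character set, its ordering, and the helper name output_size are assumptions made for illustration; the article does not prescribe them.

```python
import string

# Hypothetical extended character set: upper case, lower case and digits.
CHARSET = string.ascii_uppercase + string.ascii_lowercase + string.digits
CHAR_TO_NUM = {ch: i for i, ch in enumerate(CHARSET)}


def output_size(label_num, charset=CHARSET):
    """Number of output neurons: one per (position, character) pair."""
    return label_num * len(charset)


print(len(CHARSET), output_size(4), output_size(6))
print([CHAR_TO_NUM[ch] for ch in "aZ9x"])
```

Under these assumptions only the one-hot depth, the number of positions, and the size of the output layer change; the indices still fit in uint8 because the character set has fewer than 256 entries.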
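----------------
Copyright notice: this is an original article by the CSDN blogger "Kungs8", licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/yanpenggong/article/details/84680149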
