Deep Learning: Recurrent Neural Networks (5): RNN Sentiment Classification in Practice

Published: 2023/12/15

1. Dataset
2. Network model
3. Training and testing
Complete code
Run results

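We now use a basic RNN to tackle the sentiment classification problem. The network structure is shown in the figure below. The RNN has two layers that step through the sequence and extract its semantic features. The state vector $\boldsymbol h_s^{(2)}$ of the second RNN layer at the last timestamp serves as the global semantic representation of the sentence. It is fed into a classification network built from fully connected layers, which outputs the probability that sample $\boldsymbol x$ expresses positive sentiment: $P(\boldsymbol x \text{ is positive} \mid \boldsymbol x) \in [0, 1]$.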

Figure: Network structure for the sentiment classification task

1. Dataset

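Here we use the classic IMDB movie review dataset for the sentiment classification task. The dataset contains 50000 user reviews, each labeled as negative or positive: reviews with an IMDB rating $< 5$ are labeled 0 (negative), and reviews with an IMDB rating $\geq 7$ are labeled 1 (positive). 25000 reviews form the training set and the other 25000 form the test set.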
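As a small illustration of the labeling rule, the sketch below maps a rating to a label. The function name `rating_to_label` is hypothetical and the 1 to 10 rating scale is an assumption; under the rule as stated, ratings between the two thresholds receive no label.

```python
def rating_to_label(rating):
    """Map an IMDB rating to a sentiment label (hypothetical helper).

    Assumes ratings on a 1 to 10 scale. Returns 0 for negative,
    1 for positive, and None for ratings the rule leaves unlabeled.
    """
    if rating < 5:
        return 0  # negative
    if rating >= 7:
        return 1  # positive
    return None   # ratings 5 and 6 fall between the two thresholds


for r in (3, 5, 6, 8):
    print(r, rating_to_label(r))
```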

The IMDB dataset can be loaded with the datasets utility that Keras provides:

import os
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, losses, optimizers, Sequential
from tensorflow.keras.datasets import imdb

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

batchsz = 128  # batch size
total_words = 10000  # vocabulary size N_vocab
max_review_len = 80  # maximum sentence length s; longer sentences are truncated, shorter ones padded
embedding_len = 100  # word-vector feature length f
# Load the IMDB dataset; the data is integer-encoded, one number per word
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=total_words)
# Print the shapes of the inputs and the labels
print(x_train.shape, len(x_train[0]), y_train.shape)
print(x_test.shape, len(x_test[0]), y_test.shape)

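The output shows that x_train and x_test are one-dimensional arrays of length 25000. Each element is a variable-length list holding one integer-encoded sentence. For example, the first training sentence has 218 words and the first test sentence has 68 words. Every sentence includes the ID of the sentence-start token.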
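The `num_words=total_words` argument in the loading code caps the vocabulary. A minimal sketch of the effect on one encoded sentence follows; the helper `cap_vocabulary` is hypothetical, and the out-of-vocabulary ID of 2 is assumed from the loader's documented default (`oov_char=2`), not taken from the text.

```python
def cap_vocabulary(seq, num_words, oov_id=2):
    """Replace every ID at or above num_words with the out-of-vocabulary ID.

    Hypothetical helper that mimics what a vocabulary cap does to one
    integer-encoded sentence; oov_id=2 is an assumed default.
    """
    return [w if w < num_words else oov_id for w in seq]


sentence = [1, 14, 22, 12000, 43, 10000]  # made-up IDs for illustration
capped = cap_vocabulary(sentence, num_words=10000)
print(capped)
```

With a cap of 10000, the two IDs 12000 and 10000 are replaced and the rest pass through unchanged, which is why rare words later decode as an unknown token.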

So how is each word encoded as a number? We can find out by inspecting the encoding table:

# Word-to-ID encoding table
word_index = imdb.get_word_index()
# Print each word in the table with its ID
for k, v in word_index.items():
    print(k, v)

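The output lists each word together with its ID. Because the keys of the encoding table are words and its values are IDs, we invert the table and add the IDs of the special tokens: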

# The first 4 IDs are reserved for special tokens
word_index = {k: (v + 3) for k, v in word_index.items()}
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNK>"] = 2  # unknown
word_index["<UNUSED>"] = 3
# Invert the encoding table
reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])

An integer-encoded sentence can be converted back to a string with the following function:

def decode_review(text):
    return ' '.join([reverse_word_index.get(i, '?') for i in text])

# Convert the first sentence to a string
print(decode_review(x_train[0]))

The output is as follows:

<START> this film was just brilliant casting location scenery story direction everyone's really suited the part they played and you could just imagine being there robert <UNK> is an amazing actor and now the same being director <UNK> father came from the same scottish island as myself so i loved the fact there was a real connection with this film the witty remarks throughout the film were great it was just brilliant so much that i bought the film as soon as it was released for <UNK> and would recommend it to everyone to watch and the fly fishing was amazing really cried at the end it was so sad and you know what they say if you cry at a film it must have been good and this definitely was also <UNK> to the two little boy's that played the <UNK> of norman and paul they were just brilliant children are often left out of the <UNK> list i think because the stars that play them all grown up are such a big profile for the whole film but these children are amazing and should be praised for what they have done don't you think the whole story was so lovely because it was true and was someone's life after all that was shared with us all
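The ID offset and the special tokens can be checked on a toy vocabulary without downloading anything. In the sketch below, `toy_word_index` is a made-up stand-in for the table returned by `imdb.get_word_index()`, and `encode_review` is a hypothetical inverse of `decode_review` added only for illustration.

```python
# Made-up stand-in for imdb.get_word_index(); the real table is much larger.
toy_word_index = {"the": 1, "film": 2, "was": 3, "brilliant": 4}

# Shift every ID by 3 to make room for the special tokens 0..3.
word_index = {k: v + 3 for k, v in toy_word_index.items()}
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNK>"] = 2
word_index["<UNUSED>"] = 3

# Invert the table: ID -> word.
reverse_word_index = {v: k for k, v in word_index.items()}


def decode_review(ids):
    """Turn a list of IDs back into a space-separated string."""
    return " ".join(reverse_word_index.get(i, "?") for i in ids)


def encode_review(text):
    """Hypothetical inverse: prepend <START>, map unknown words to <UNK>."""
    ids = [word_index["<START>"]]
    ids += [word_index.get(w, word_index["<UNK>"]) for w in text.split()]
    return ids


encoded = encode_review("the film was brilliant indeed")
decoded = decode_review(encoded)
print(encoded)
print(decoded)
```

The word "indeed" is not in the toy vocabulary, so it is encoded as the unknown ID and decodes as `<UNK>`, the same token that appears in the decoded review above.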

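Sentences vary in length, so we set a threshold by hand. Sentences longer than the threshold are truncated, dropping words either from the start or from the end. Sentences shorter than the threshold are padded, either at the start or at the end. Both truncation and padding are conveniently handled by keras.preprocessing.sequence.pad_sequences(), for example: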

# Truncate and pad the sentences to equal length; long sentences keep their
# trailing part, short sentences are padded at the front
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)
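To make the default behavior concrete (keep the tail of long sentences, pad short ones at the front), here is a pure-Python sketch. The function `pad_pre` is hypothetical and only mimics the behavior described in the comment above; it is not the Keras implementation.

```python
def pad_pre(sequences, maxlen, value=0):
    """Front-pad short sequences and keep the tail of long ones.

    Hypothetical stand-in for the padding and truncation described above.
    """
    padded = []
    for seq in sequences:
        tail = list(seq)[-maxlen:]                 # keep the last maxlen tokens
        padded.append([value] * (maxlen - len(tail)) + tail)
    return padded


result = pad_pre([[1, 2, 3], [1, 2, 3, 4, 5, 6]], maxlen=4)
print(result)
```

The short sentence gains one leading 0, and the long one loses its first two tokens, so the most recent words of each review are the ones the RNN sees last.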

Once the sentences share the same length, we wrap them in a Dataset object and add the usual dataset processing steps:

# Build the datasets: shuffle, batch, and drop the last batch if it is smaller than batchsz
db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True)
db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db_test = db_test.batch(batchsz, drop_remainder=True)
# Report dataset properties
print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train))
print('x_test shape:', x_test.shape)

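The output shows that after truncation and padding every sentence has length 80, the threshold we set. The drop_remainder=True argument drops the last batch, because its actual batch size may be smaller than the preset batch size.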
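The effect of `drop_remainder=True` on the batch count is simple arithmetic, sketched below with the sizes used in this article.

```python
num_samples = 25000  # reviews in the training set (and in the test set)
batchsz = 128        # batch size used above

# Number of full batches kept, and samples in the partial batch that is dropped.
full_batches, leftover = divmod(num_samples, batchsz)
samples_used = full_batches * batchsz

print(full_batches, samples_used, leftover)
```

Each epoch therefore iterates over 195 full batches and skips the 40 samples left over, so every batch the model receives has exactly `batchsz` rows.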

2. Network model

We create a custom model class MyRNN that inherits from the Model base class. It needs an Embedding layer, two RNN layers, and a classification layer:

class MyRNN(keras.Model):
    # Build a multi-layer network from Cells
    def __init__(self, units):
        super(MyRNN, self).__init__()
        # [b, 64]: initial state vectors for the Cells, reused on every call
        self.state0 = [tf.zeros([batchsz, units])]
        self.state1 = [tf.zeros([batchsz, units])]
        # Word-vector encoding: [b, 80] => [b, 80, 100]
        self.embedding = layers.Embedding(total_words, embedding_len,
                                          input_length=max_review_len)
        # Build the 2 Cells
        self.rnn_cell0 = layers.SimpleRNNCell(units, dropout=0.5)
        self.rnn_cell1 = layers.SimpleRNNCell(units, dropout=0.5)
        # Classification network that classifies the Cell output features (2 classes)
        # [b, 80, 100] => [b, 64] => [b, 1]
        self.outlayer = layers.Dense(1)

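Here the word-vector length is $n = 100$ and the RNN state-vector length $h$ is set by the units parameter. The classification network performs binary classification, so it has a single output node.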
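As a rough check on the model size, the sketch below counts the trainable parameters implied by these dimensions. It assumes the standard layouts (an embedding table, an input kernel plus recurrent kernel plus bias per SimpleRNNCell, and a weight matrix plus bias for the Dense layer) and takes units = 64 from the training code; the variable names are illustrative.

```python
total_words = 10000   # vocabulary size
embedding_len = 100   # word-vector length n
units = 64            # RNN state-vector length h

# Embedding: one vector of length n per vocabulary entry.
embedding_params = total_words * embedding_len

# Simple RNN cell: input kernel + recurrent kernel + bias.
cell0_params = embedding_len * units + units * units + units
cell1_params = units * units + units * units + units

# Dense(1) classifier: one weight per state element plus a bias.
dense_params = units * 1 + 1

total_params = embedding_params + cell0_params + cell1_params + dense_params
print(embedding_params, cell0_params, cell1_params, dense_params, total_params)
```

Under these assumptions the embedding table holds almost all of the parameters, and the two recurrent cells together account for fewer than 20000.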

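The forward pass works as follows. The input sequence is encoded into word vectors by the Embedding layer and then passed step by step through the two RNN layers to extract semantic features. The state-vector output of the last layer at the last timestamp is fed into the classification network, and a Sigmoid activation turns the result into the output probability: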

    def call(self, inputs, training=None):
        x = inputs  # [b, 80]
        # Word vectors via the embedding: [b, 80] => [b, 80, 100]
        x = self.embedding(x)
        # Pass through the 2 RNN Cells: [b, 80, 100] => [b, 64]
        state0 = self.state0
        state1 = self.state1
        for word in tf.unstack(x, axis=1):  # word: [b, 100]
            out0, state0 = self.rnn_cell0(word, state0, training=training)
            out1, state1 = self.rnn_cell1(out0, state1, training=training)
        # The last layer's final output feeds the classifier: [b, 64] => [b, 1]
        # Dense has no training-dependent behavior, so it takes the features alone
        x = self.outlayer(out1)
        # Sigmoid activation gives p(y is pos|x)
        prob = tf.sigmoid(x)
        return prob
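The loop above can be traced with plain NumPy to see how the shapes evolve. This is a minimal sketch rather than the Keras implementation: it assumes the usual simple-RNN update h_t = tanh(x_t W_xh + h_{t-1} W_hh + b), uses random weights, replaces the Embedding output with random data, and omits dropout. The helper names `make_cell` and `cell_step` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
b, s, f, h = 4, 80, 100, 64  # batch, sentence length, embedding length, state length


def make_cell(n_in, n_h):
    """Random input kernel, recurrent kernel, and zero bias for one cell."""
    return (rng.normal(0.0, 0.1, (n_in, n_h)),
            rng.normal(0.0, 0.1, (n_h, n_h)),
            np.zeros(n_h))


def cell_step(x_t, h_prev, params):
    """One simple-RNN step: h_t = tanh(x_t W_xh + h_prev W_hh + b)."""
    w_xh, w_hh, bias = params
    return np.tanh(x_t @ w_xh + h_prev @ w_hh + bias)


cell0, cell1 = make_cell(f, h), make_cell(h, h)
w_out, b_out = rng.normal(0.0, 0.1, (h, 1)), np.zeros(1)

x = rng.normal(size=(b, s, f))               # stands in for the Embedding output
h0, h1 = np.zeros((b, h)), np.zeros((b, h))  # zero initial states, as in the model
for t in range(s):                           # one step per word position
    h0 = cell_step(x[:, t, :], h0, cell0)    # layer 1: [b, 100] -> [b, 64]
    h1 = cell_step(h0, h1, cell1)            # layer 2: [b, 64]  -> [b, 64]

logits = h1 @ w_out + b_out                  # classifier: [b, 64] -> [b, 1]
prob = 1.0 / (1.0 + np.exp(-logits))         # sigmoid
print(h1.shape, prob.shape)
```

Only the state of the second layer after the final step reaches the classifier, which matches the description of using the last timestamp of the last layer as the sentence representation.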

3. Training and testing

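For simplicity, we train the network with Keras' Compile&Fit workflow. The optimizer is RMSprop with a learning rate of 0.001 (matching the code below), the loss is the binary cross-entropy loss BinaryCrossentropy, and accuracy serves as the test metric: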

# Training and testing
def main():
    units = 64  # RNN state-vector length h
    epochs = 50  # number of training epochs
    model = MyRNN(units)
    # Compile
    model.compile(optimizer=optimizers.RMSprop(0.001),
                  loss=losses.BinaryCrossentropy(),
                  metrics=['accuracy'])
    # Train and validate
    model.fit(db_train, epochs=epochs, validation_data=db_test)
    # Test
    model.evaluate(db_test)

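With training fixed at 20 epochs, the network reached 80.1% accuracy on the test set. Note that the listing above sets epochs = 50; the result of that longer run is reported under Run results.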
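For readers who want to see what the loss and the metric compute, here is a pure-Python sketch on four made-up predictions. The helper names, the clipping constant `eps`, and the 0.5 decision threshold are assumptions made for illustration; the sketch is not the Keras implementation.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], with p clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    hits = sum(int((p > threshold) == bool(y)) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


labels = [1, 0, 1, 0]          # made-up labels
probs = [0.9, 0.2, 0.4, 0.6]   # made-up sigmoid outputs
loss = binary_cross_entropy(labels, probs)
acc = accuracy(labels, probs)
print(round(loss, 4), acc)
```

The first two predictions are on the correct side of the threshold and the last two are not, so the accuracy is 0.5, while the loss also reflects how confident each prediction was.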

Complete code

import os
import ssl
import tensorflow as tf
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers, losses, optimizers, Sequential
from tensorflow.keras.datasets import imdb

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# Skip HTTPS certificate verification so the dataset download does not fail on certificate errors
ssl._create_default_https_context = ssl._create_unverified_context

batchsz = 128  # batch size
total_words = 10000  # vocabulary size N_vocab
max_review_len = 80  # maximum sentence length s; longer sentences are truncated, shorter ones padded
embedding_len = 100  # word-vector feature length f
# Load the IMDB dataset; the data is integer-encoded, one number per word
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=total_words)
# Print the shapes of the inputs and the labels
print(x_train.shape, len(x_train[0]), y_train.shape)
print(x_test.shape, len(x_test[0]), y_test.shape)

# Word-to-ID encoding table
word_index = imdb.get_word_index()
# Print each word in the table with its ID
# for k, v in word_index.items():
#     print(k, v)

# The first 4 IDs are reserved for special tokens
word_index = {k: (v + 3) for k, v in word_index.items()}
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNK>"] = 2  # unknown
word_index["<UNUSED>"] = 3
# Invert the encoding table
reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])


def decode_review(text):
    return ' '.join([reverse_word_index.get(i, '?') for i in text])


# Convert the first sentence to a string
# print(decode_review(x_train[0]))

# Truncate and pad the sentences to equal length; long sentences keep their
# trailing part, short sentences are padded at the front
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)

# Build the datasets: shuffle, batch, and drop the last batch if it is smaller than batchsz
db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
db_train = db_train.shuffle(1000).batch(batchsz, drop_remainder=True)
db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db_test = db_test.batch(batchsz, drop_remainder=True)
# Report dataset properties
print('x_train shape:', x_train.shape, tf.reduce_max(y_train), tf.reduce_min(y_train))
print('x_test shape:', x_test.shape)


class MyRNN(keras.Model):
    # Build a multi-layer network from Cells
    def __init__(self, units):
        super(MyRNN, self).__init__()
        # [b, 64]: initial state vectors for the Cells, reused on every call
        self.state0 = [tf.zeros([batchsz, units])]
        self.state1 = [tf.zeros([batchsz, units])]
        # Word-vector encoding: [b, 80] => [b, 80, 100]
        self.embedding = layers.Embedding(total_words, embedding_len,
                                          input_length=max_review_len)
        # Build the 2 Cells
        self.rnn_cell0 = layers.SimpleRNNCell(units, dropout=0.5)
        self.rnn_cell1 = layers.SimpleRNNCell(units, dropout=0.5)
        # Classification network that classifies the Cell output features (2 classes)
        # [b, 80, 100] => [b, 64] => [b, 1]
        self.outlayer = Sequential([
            layers.Dense(units),
            layers.Dropout(rate=0.5),
            layers.ReLU(),
            layers.Dense(1)])

    def call(self, inputs, training=None):
        x = inputs  # [b, 80]
        # Word vectors via the embedding: [b, 80] => [b, 80, 100]
        x = self.embedding(x)
        # Pass through the 2 RNN Cells: [b, 80, 100] => [b, 64]
        state0 = self.state0
        state1 = self.state1
        for word in tf.unstack(x, axis=1):  # word: [b, 100]
            out0, state0 = self.rnn_cell0(word, state0, training=training)
            out1, state1 = self.rnn_cell1(out0, state1, training=training)
        # The last layer's final output feeds the classifier: [b, 64] => [b, 1]
        x = self.outlayer(out1, training=training)
        # Sigmoid activation gives p(y is pos|x)
        prob = tf.sigmoid(x)
        return prob


# Training and testing
def main():
    units = 64  # RNN state-vector length h
    epochs = 50  # number of training epochs
    model = MyRNN(units)
    # Compile
    model.compile(optimizer=optimizers.RMSprop(0.001),
                  loss=losses.BinaryCrossentropy(),
                  metrics=['accuracy'])
    # Train and validate
    model.fit(db_train, epochs=epochs, validation_data=db_test)
    # Test
    model.evaluate(db_test)


if __name__ == '__main__':
    main()

Run results

The run shows that accuracy peaked at 80.08%, reached after 45 epochs of training.
