(PyTorch deep learning) Implementing a recurrent neural network with nn.RNN in PyTorch
Implementing a recurrent neural network with nn.RNN in PyTorch
First, load the Jay Chou album lyrics dataset.
```python
import time
import math
import zipfile
import numpy as np
import torch
from torch import nn, optim
import torch.nn.functional as F
import sys
sys.path.append("..")

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

def load_data_jay_lyrics():
    """Load the Jay Chou lyrics dataset."""
    with zipfile.ZipFile('../../data/jaychou_lyrics.txt.zip') as zin:
        with zin.open('jaychou_lyrics.txt') as f:
            corpus_chars = f.read().decode('utf-8')
    corpus_chars = corpus_chars.replace('\n', ' ').replace('\r', ' ')
    corpus_chars = corpus_chars[0:10000]
    idx_to_char = list(set(corpus_chars))
    char_to_idx = dict([(char, i) for i, char in enumerate(idx_to_char)])
    vocab_size = len(char_to_idx)
    corpus_indices = [char_to_idx[char] for char in corpus_chars]
    return corpus_indices, char_to_idx, idx_to_char, vocab_size

(corpus_indices, char_to_idx, idx_to_char, vocab_size) = load_data_jay_lyrics()
```

Define the model
PyTorch's nn module provides implementations of recurrent neural networks. Below we construct a recurrent neural network layer rnn_layer with a single hidden layer of 256 hidden units:
```python
num_hiddens = 256
# rnn_layer = nn.LSTM(input_size=vocab_size, hidden_size=num_hiddens)  # also tested
rnn_layer = nn.RNN(input_size=vocab_size, hidden_size=num_hiddens)
```

The input to rnn_layer has shape (num_steps, batch_size, input_size).
- The input size is the length of the one-hot vector, i.e. the vocabulary size (see the encoding sketch after this list).
- As an nn.RNN instance, rnn_layer returns both an output and a hidden state h from its forward computation.
- The output refers to the hidden states that the hidden layer computes at each time step; they usually serve as input to a subsequent output layer. This "output" does not itself involve any output-layer computation, and its shape is (num_steps, batch_size, num_hiddens).
- The hidden state returned by the forward computation of an nn.RNN instance is the hidden state at the final time step; when there are multiple hidden layers, the hidden state of every layer is recorded in this variable.
- For long short-term memory (LSTM), the hidden state is a tuple (h, c), i.e. the hidden state and the cell state.
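As a small illustration of this input format (this sketch is not from the article; the indices and the use of F.one_hot are assumptions made here):

```python
# A hypothetical mini-batch of character indices: 2 sequences, 3 time steps each.
batch = torch.tensor([[0, 2, 1],
                      [3, 1, 0]])                       # shape: (batch_size, num_steps)
# Transpose to time-major order and one-hot encode each index over the vocabulary.
inputs = F.one_hot(batch.T, num_classes=vocab_size).float()
print(inputs.shape)                                     # (num_steps, batch_size, vocab_size)
```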
The recurrent neural network layer produces outputs with the following shapes: the output has shape (num_steps, batch_size, num_hiddens), and the hidden state h has shape (num_layers, batch_size, num_hiddens).
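The shapes can be checked with a quick forward pass on a random input. This snippet does not appear in the article; num_steps = 35 and batch_size = 2 are read off the shapes printed below.

```python
num_steps, batch_size = 35, 2
X = torch.rand(num_steps, batch_size, vocab_size)  # dummy input in (num_steps, batch_size, input_size) layout
state = None
Y, state_new = rnn_layer(X, state)
print(Y.shape, len(state_new), state_new[0].shape)
```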
Output:
torch.Size([35, 2, 256]) 1 torch.Size([2, 256])

If rnn_layer were an nn.LSTM instance, the returned state would instead be the tuple (h, c). Next, we inherit from the Module class to define a complete recurrent neural network (a minimal sketch of this class follows the list below).
- It first represents the input data as one-hot vectors and feeds them into rnn_layer,
- then uses a fully connected output layer to produce the output.
- The number of outputs equals the vocabulary size vocab_size.
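The RNNModel class used in the following snippets is not listed in this article. Below is a minimal sketch consistent with how it is called later; the F.one_hot encoding and the reshape are assumptions, not necessarily the author's exact implementation.

```python
class RNNModel(nn.Module):
    """Wraps rnn_layer with a fully connected output layer over the vocabulary."""
    def __init__(self, rnn_layer, vocab_size):
        super(RNNModel, self).__init__()
        self.rnn = rnn_layer
        self.hidden_size = rnn_layer.hidden_size * (2 if rnn_layer.bidirectional else 1)
        self.vocab_size = vocab_size
        self.dense = nn.Linear(self.hidden_size, vocab_size)

    def forward(self, inputs, state):
        # inputs: (batch_size, num_steps) of character indices.
        # Transpose to time-major order and one-hot encode, giving the
        # (num_steps, batch_size, vocab_size) layout that rnn_layer expects.
        X = F.one_hot(inputs.T.long(), self.vocab_size).float()
        Y, state = self.rnn(X, state)
        # Flatten the time and batch dimensions, then map each hidden state to
        # vocabulary scores: output has shape (num_steps * batch_size, vocab_size).
        output = self.dense(Y.reshape(-1, Y.shape[-1]))
        return output, state
```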
Training the model
Define a prediction function:
```python
def predict_rnn_pytorch(prefix, num_chars, model, vocab_size, device, idx_to_char,
                        char_to_idx):
    state = None
    output = [char_to_idx[prefix[0]]]  # output records the prefix plus the generated characters
    for t in range(num_chars + len(prefix) - 1):
        X = torch.tensor([output[-1]], device=device).view(1, 1)
        if state is not None:
            if isinstance(state, tuple):  # LSTM, state: (h, c)
                state = (state[0].to(device), state[1].to(device))
            else:
                state = state.to(device)
        (Y, state) = model(X, state)
        if t < len(prefix) - 1:
            output.append(char_to_idx[prefix[t + 1]])
        else:
            output.append(int(Y.argmax(dim=1).item()))
    return ''.join([idx_to_char[i] for i in output])
```

Use this model, whose weights are still random, to make one prediction.
```python
model = RNNModel(rnn_layer, vocab_size).to(device)
predict_rnn_pytorch('分开', 10, model, vocab_size, device, idx_to_char, char_to_idx)
```

Implement the training function:
```python
def data_iter_consecutive(corpus_indices, batch_size, num_steps, device=None):
    if device is None:
        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    corpus_indices = torch.tensor(corpus_indices, dtype=torch.float32, device=device)
    data_len = len(corpus_indices)
    batch_len = data_len // batch_size
    indices = corpus_indices[0: batch_size*batch_len].view(batch_size, batch_len)
    epoch_size = (batch_len - 1) // num_steps
    for i in range(epoch_size):
        i = i * num_steps
        X = indices[:, i: i + num_steps]
        Y = indices[:, i + 1: i + num_steps + 1]
        yield X, Y

def grad_clipping(params, theta, device):
    params = list(params)  # materialize first, in case a generator such as model.parameters() is passed
    norm = torch.tensor([0.0], device=device)
    for param in params:
        norm += (param.grad.data ** 2).sum()
    norm = norm.sqrt().item()
    if norm > theta:
        for param in params:
            param.grad.data *= (theta / norm)

def train_and_predict_rnn_pytorch(model, num_hiddens, vocab_size, device,
                                  corpus_indices, idx_to_char, char_to_idx,
                                  num_epochs, num_steps, lr, clipping_theta,
                                  batch_size, pred_period, pred_len, prefixes):
    loss = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.to(device)
    state = None
    for epoch in range(num_epochs):
        l_sum, n, start = 0.0, 0, time.time()
        data_iter = data_iter_consecutive(corpus_indices, batch_size, num_steps, device)  # adjacent (consecutive) sampling
        for X, Y in data_iter:
            if state is not None:
                # Detach the hidden state from the computation graph so that the
                # gradient computation only depends on the current mini-batch
                # (otherwise backpropagation would reach back through all earlier batches)
                if isinstance(state, tuple):  # LSTM, state: (h, c)
                    state = (state[0].detach(), state[1].detach())
                else:
                    state = state.detach()
            (output, state) = model(X, state)  # output shape: (num_steps * batch_size, vocab_size)
            # Y has shape (batch_size, num_steps); transpose it and flatten it into a
            # vector of length batch_size * num_steps so it lines up row by row with the output
            y = torch.transpose(Y, 0, 1).contiguous().view(-1)
            l = loss(output, y.long())
            optimizer.zero_grad()
            l.backward()
            # gradient clipping
            grad_clipping(model.parameters(), clipping_theta, device)
            optimizer.step()
            l_sum += l.item() * y.shape[0]
            n += y.shape[0]
        try:
            perplexity = math.exp(l_sum / n)
        except OverflowError:
            perplexity = float('inf')
        if (epoch + 1) % pred_period == 0:
            print('epoch %d, perplexity %f, time %.2f sec' % (
                epoch + 1, perplexity, time.time() - start))
            for prefix in prefixes:
                print(' -', predict_rnn_pytorch(prefix, pred_len, model, vocab_size, device,
                                                idx_to_char, char_to_idx))

num_epochs, batch_size, lr, clipping_theta = 250, 32, 1e-3, 1e-2  # note the learning rate setting here
pred_period, pred_len, prefixes = 50, 50, ['分开', '不分开']
train_and_predict_rnn_pytorch(model, num_hiddens, vocab_size, device,
                              corpus_indices, idx_to_char, char_to_idx,
                              num_epochs, num_steps, lr, clipping_theta,
                              batch_size, pred_period, pred_len, prefixes)
```

Summary
PyTorch's nn module provides ready-made recurrent neural network layers such as nn.RNN. After its forward computation, an nn.RNN instance returns both the hidden states of every time step and the hidden state of the final time step; this forward computation does not involve the output layer, which is why the model above adds a fully connected layer mapping the hidden states to vocab_size scores.