[Deep Learning] I Reproduced the LeNet-5 Neural Network in PyTorch (CIFAR10 Edition)!

Published: 2025/3/12 · pytorch · 豆豆


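Today we continue with PyTorch and implement the LeNet-5 model again, this time using it to classify the CIFAR10 dataset.

Let's get started!

II. Building a CIFAR-10 Classifier with the LeNet-5 Architecture

LeNet-5 was originally used to recognize the MNIST dataset. Below we apply it to a more complex example: recognizing the CIFAR-10 dataset.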

CIFAR-10 is a small dataset for recognizing everyday objects, collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. It contains RGB color images in 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. Each image is 32×32 pixels, and the dataset has 50,000 training images and 10,000 test images.

Sample images from CIFAR-10 are shown in the figure.

2.1 Download and load the data, and apply some preprocessing

import torch
from torchvision import datasets, transforms

# Preprocessing for the training set
pipline_train = transforms.Compose([
    # Randomly flip the image horizontally (data augmentation)
    transforms.RandomHorizontalFlip(),
    # Resize the image to 32x32
    transforms.Resize((32, 32)),
    # Convert the image to a Tensor
    transforms.ToTensor(),
    # Normalize each channel with mean 0.5 and std 0.5, mapping [0, 1] to [-1, 1]
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

# Preprocessing for the test set (no augmentation)
pipline_test = transforms.Compose([
    # Resize the image to 32x32
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

# Download the dataset
train_set = datasets.CIFAR10(root="./data/CIFAR10", train=True, download=True, transform=pipline_train)
test_set = datasets.CIFAR10(root="./data/CIFAR10", train=False, download=True, transform=pipline_test)

# Load the dataset
trainloader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)
testloader = torch.utils.data.DataLoader(test_set, batch_size=32, shuffle=False)

# The class names have to be supplied by us
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
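To see what the Normalize step with mean 0.5 and std 0.5 does to a pixel, here is a small plain-Python sketch that applies the same per-channel formula, (x - mean) / std, to values that ToTensor has already scaled to [0, 1]. The helper name `normalize` is made up for this illustration and is not part of torchvision.

```python
def normalize(value, mean=0.5, std=0.5):
    """Apply the per-channel formula used by Normalize: (x - mean) / std."""
    return (value - mean) / std

# ToTensor scales 8-bit pixels to [0, 1]; these stand for a black,
# a mid-gray and a white pixel in one channel.
pixels = [0.0, 0.5, 1.0]
normalized = [normalize(p) for p in pixels]
print(normalized)  # [-1.0, 0.0, 1.0]
```

So the network sees inputs in the range [-1, 1], centered on zero.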

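2.2 Build the LeNet-5 network and define the forward pass

We already built LeNet-5 in the previous article. Because CIFAR10 images are RGB with three channels, the filters in the C1 convolution layer need 3 input channels. The rest of the network is the same as before.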

import torch.nn as nn
import torch.nn.functional as F

class LeNetRGB(nn.Module):
    def __init__(self):
        super(LeNetRGB, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)   # 3 means the input has 3 channels (RGB)
        self.relu = nn.ReLU()
        self.maxpool1 = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.maxpool2 = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu(x)
        x = self.maxpool1(x)
        x = self.conv2(x)        # note: no ReLU is applied after conv2 in this version
        x = self.maxpool2(x)
        x = x.view(-1, 16*5*5)   # flatten to [batch_size, 400]
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        output = F.log_softmax(x, dim=1)   # log-probabilities over the 10 classes
        return output
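The `16*5*5` in `fc1` is not arbitrary: it follows from how each layer shrinks a 32×32 input. As a rough sketch, the standard output-size formula for convolution and pooling layers (no padding, no dilation) can be traced by hand. The helper `conv_out` below is a hypothetical name used only for this walk-through.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32                         # CIFAR-10 input is 32x32
size = conv_out(size, 5)          # conv1, 5x5 kernel      -> 28
size = conv_out(size, 2, 2)       # maxpool1, 2x2 stride 2 -> 14
size = conv_out(size, 5)          # conv2, 5x5 kernel      -> 10
size = conv_out(size, 2, 2)       # maxpool2, 2x2 stride 2 -> 5
flat_features = 16 * size * size  # 16 channels come out of conv2
print(size, flat_features)        # 5 400
```

This is also why the input has to be 32×32: a different input size would change the flattened feature count and no longer match `fc1`.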

2.3 Move the network to the GPU/CPU and define the optimizer

We optimize with SGD (stochastic gradient descent), using a learning rate of 0.01 and a momentum of 0.9.

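import torch.optim as optim

# Create the model and move it to the GPU if one is available, otherwise the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = LeNetRGB().to(device)

# Define the optimizer
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)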
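For readers curious about what momentum does, here is a minimal single-parameter sketch of the update that SGD with momentum performs (in the form PyTorch uses with its default settings: the velocity accumulates gradients, and the parameter moves by the learning rate times the velocity). The function name `sgd_momentum_step` and the constant gradient are assumptions for illustration; the `lr=0.01` and `momentum=0.9` values are taken from the optimizer call above.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.01, momentum=0.9):
    """One update: v = momentum * v + grad, then p = p - lr * v."""
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity

param, velocity = 0.0, 0.0
for _ in range(2):
    # A constant gradient of 1.0 stands in for a real backward pass.
    param, velocity = sgd_momentum_step(param, 1.0, velocity)
print(round(param, 6), round(velocity, 6))  # -0.029 1.9
```

With a steady gradient, the second step is already 1.9 times larger than the first, which is how momentum speeds up movement along a consistent direction.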

2.4 Define the training procedure

def train_runner(model, device, trainloader, optimizer, epoch):
    # Switch to training mode (this is what enables BatchNorm updates and Dropout, if present)
    model.train()
    total = 0
    correct = 0.0

    # enumerate yields the batch index together with each batch
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        # Move the batch to the device
        inputs, labels = inputs.to(device), labels.to(device)
        # Reset the gradients
        optimizer.zero_grad()
        # Forward pass
        outputs = model(inputs)
        # Compute the loss. Cross-entropy is the usual choice for multi-class
        # classification; binary problems typically use a sigmoid-based loss.
        loss = F.cross_entropy(outputs, labels)
        # Predicted class: dim=1 returns the column index of the largest value in each row
        predict = outputs.argmax(dim=1)
        total += labels.size(0)
        correct += (predict == labels).sum().item()
        # Backward pass
        loss.backward()
        # Update the parameters
        optimizer.step()
        if i % 1000 == 0:
            # loss.item() is the loss of the current batch as a Python number
            print("Train Epoch{} \t Loss: {:.6f}, accuracy: {:.6f}%".format(epoch, loss.item(), 100*(correct/total)))
    # The caller records the returned values, so nothing is appended to the
    # global Loss/Accuracy lists here (doing both would add duplicate entries).
    return loss.item(), correct/total
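One detail worth checking: the model's forward pass already ends in `log_softmax`, and `F.cross_entropy` applies log-softmax again internally. This turns out to be harmless, because log-softmax of values that are already log-probabilities returns them unchanged. The plain-Python sketch below illustrates this with made-up scores; the helpers `log_softmax` and `cross_entropy` are simplified stand-ins written for this example, not the PyTorch functions.

```python
import math

def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    peak = max(scores)
    log_sum = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_sum for s in scores]

def cross_entropy(scores, label):
    """Cross-entropy for one sample: negative log-probability of the true class."""
    return -log_softmax(scores)[label]

logits = [2.0, 0.5, -1.0]     # made-up raw scores for three classes
once = log_softmax(logits)    # what forward() returns
twice = log_softmax(once)     # what the loss computes internally

same_values = all(abs(a - b) < 1e-12 for a, b in zip(once, twice))
same_loss = abs(cross_entropy(logits, 0) - cross_entropy(once, 0)) < 1e-12
print(same_values, same_loss)
```

Both checks come out true, so the loss is the same whether the network returns raw scores or log-probabilities.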

2.5 Define the testing procedure

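def test_runner(model, device, testloader):
    # Switch to evaluation mode. Without this, layers such as BatchNorm and Dropout
    # would keep their training behavior during testing (BatchNorm would even
    # update its running statistics from the test data).
    model.eval()
    # Counters for the accuracy statistics
    correct = 0.0
    test_loss = 0.0
    total = 0
    # torch.no_grad() turns off gradient tracking; no backward pass is needed here
    with torch.no_grad():
        for data, label in testloader:
            data, label = data.to(device), label.to(device)
            output = model(data)
            # F.cross_entropy returns the mean loss of the batch
            test_loss += F.cross_entropy(output, label).item()
            predict = output.argmax(dim=1)
            # Count the correct predictions
            total += label.size(0)
            correct += (predict == label).sum().item()
        # Note: test_loss is a sum of per-batch means, so test_loss/total is smaller
        # than the per-sample mean loss by roughly a factor of the batch size
        print("test_avarage_loss: {:.6f}, accuracy: {:.6f}%".format(test_loss/total, 100*(correct/total)))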
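The reported test loss deserves a word of caution. `F.cross_entropy` returns the mean loss of each batch, and `test_runner` adds up those means and divides by the number of samples, so the printed value is scaled down by about the batch size. A sketch of the difference, with made-up batch losses and a hypothetical helper `per_sample_mean`:

```python
def per_sample_mean(batch_means, batch_sizes):
    """Weight each batch's mean loss by its size to get the mean loss per sample."""
    total_loss = sum(m * n for m, n in zip(batch_means, batch_sizes))
    return total_loss / sum(batch_sizes)

# Made-up mean losses for three batches; the last batch is smaller.
batch_means = [1.0, 0.8, 1.2]
batch_sizes = [32, 32, 16]

# What test_runner prints: sum of batch means divided by the sample count.
as_in_test_runner = sum(batch_means) / sum(batch_sizes)
# Mean loss per sample.
weighted = per_sample_mean(batch_means, batch_sizes)
print(round(as_in_test_runner, 4), round(weighted, 4))  # 0.0375 0.96
```

If a per-sample figure is wanted, one option is to multiply each batch loss by `label.size(0)` before accumulating it.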

2.6 Run

import time
import matplotlib.pyplot as plt

# Run training and testing
epochs = 20
Loss = []
Accuracy = []
for epoch in range(1, epochs+1):
    print("start_time", time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(time.time())))
    loss, acc = train_runner(model, device, trainloader, optimizer, epoch)
    Loss.append(loss)
    Accuracy.append(acc)
    test_runner(model, device, testloader)
    print("end_time: ", time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(time.time())), '\n')

print('Finished Training')

# Plot the loss and accuracy curves in one figure
plt.subplot(2, 1, 1)
plt.plot(Loss)
plt.title('Loss')
plt.subplot(2, 1, 2)
plt.plot(Accuracy)
plt.title('Accuracy')
plt.tight_layout()
plt.show()

After 20 epochs of training:

start_time 2021-11-27 22:29:09
Train Epoch20 Loss: 0.659028, accuracy: 68.750000%
test_avarage_loss: 0.030969, accuracy: 67.760000%
end_time:  2021-11-27 22:29:44
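When reading the training line in this log, keep in mind which batches actually get printed. Assuming the logging condition `i % 1000 == 0` from section 2.4 and the batch size of 64 from section 2.1, a quick calculation shows that only the first batch of each epoch is logged, so the printed training accuracy covers just 64 images:

```python
import math

train_images = 50000   # CIFAR-10 training set size
batch_size = 64        # trainloader batch size from section 2.1

batches_per_epoch = math.ceil(train_images / batch_size)
logged_batches = [i for i in range(batches_per_epoch) if i % 1000 == 0]
print(batches_per_epoch, logged_batches)  # 782 [0]

# 68.75% of a 64-image batch corresponds to 44 correct predictions.
correct_in_batch = 0.6875 * batch_size
print(correct_in_batch)  # 44.0
```

The test accuracy, by contrast, is computed over all 10,000 test images, which makes it the more reliable of the two numbers.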

The loss and accuracy curves on the training set look like this:

2.7 Save the model

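import os

print(model)
os.makedirs('./models', exist_ok=True)   # torch.save does not create missing directories
torch.save(model, './models/model-cifar10.pth')   # save the whole model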

The LeNet-5 model is printed, and the model is saved under the name model-cifar10.pth in the ./models directory.
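Saving the whole model object works, but it ties the file to the exact class definition and pickling behavior. A commonly used alternative is to save only the parameters through `state_dict()` and load them into a freshly built model. The sketch below is not what this article's code does; it uses a tiny `nn.Linear` as a stand-in for `LeNetRGB` and a temporary file with a made-up name so that it runs on its own.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-in model; in this article it would be LeNetRGB().
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "weights.pth")  # made-up file name
    # Save only the parameters instead of pickling the whole object.
    torch.save(model.state_dict(), path)

    # Rebuild the architecture, then load the saved parameters into it.
    restored = nn.Linear(4, 2)
    restored.load_state_dict(torch.load(path, map_location="cpu"))

same = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(same)
```

The trade-off is that the loading side must construct the model class itself before calling `load_state_dict`.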

LeNetRGB(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (relu): ReLU()
  (maxpool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (maxpool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
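From the layer shapes in this printout we can work out how small the network is. The following sketch counts weights and biases per layer by hand; the helper names `conv_params` and `linear_params` are made up for this example.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weights plus biases of a square-kernel Conv2d layer."""
    return in_ch * out_ch * kernel * kernel + out_ch

def linear_params(in_features, out_features):
    """Weights plus biases of a Linear layer."""
    return in_features * out_features + out_features

layers = {
    "conv1": conv_params(3, 6, 5),      # 456
    "conv2": conv_params(6, 16, 5),     # 2416
    "fc1": linear_params(400, 120),     # 48120
    "fc2": linear_params(120, 84),      # 10164
    "fc3": linear_params(84, 10),       # 850
}
total = sum(layers.values())
print(total)  # 62006
```

Roughly 62 thousand parameters in total, most of them in `fc1`. The ReLU and pooling layers have no parameters.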

2.8 Test the model

Now we use the model we just trained to classify a CIFAR10-style image.

import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt
from PIL import Image
from torchvision import transforms

if __name__ == '__main__':
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    # Load the saved model. On PyTorch 2.6 and later, torch.load needs
    # weights_only=False to unpickle a whole model object.
    model = torch.load('./models/model-cifar10.pth')
    model = model.to(device)
    model.eval()   # switch the model to evaluation mode

    # Read the image to classify
    img = Image.open("./images/test_cifar10.png").convert('RGB')
    plt.imshow(img)   # show the image
    plt.axis('off')   # hide the axes
    plt.show()

    # Preprocess the image the same way as the test set
    trans = transforms.Compose([
        # Resize the image to 32x32
        transforms.Resize((32, 32)),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ])
    img = trans(img)
    img = img.to(device)
    # Add a batch dimension: the model expects [batch_size, channels, height, width],
    # while a single image is only [channels, height, width].
    # After this step the shape is [1, 3, 32, 32].
    img = img.unsqueeze(0)

    # Predict
    classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
    output = model(img)
    prob = F.softmax(output, dim=1)   # probabilities of the 10 classes
    print("Probabilities:", prob)
    predict = output.argmax(dim=1)    # index of the most likely class
    print(predict.item())
    pred_class = classes[predict.item()]
    print("Predicted class:", pred_class)

Output:

Probabilities: tensor([[7.6907e-01, 3.3997e-03, 4.8003e-03, 4.2978e-05, 1.2168e-02, 6.8751e-06,
         3.2019e-06, 1.6024e-04, 1.2705e-01, 8.3300e-02]], grad_fn=<SoftmaxBackward>)
0
Predicted class: plane
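To make the link between the probability vector and the predicted label explicit, we can redo the last step by hand. This sketch copies the ten probabilities from the output above into a plain list, picks the index of the largest one, and looks it up in `classes`.

```python
classes = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

# Probabilities copied from the printed tensor above.
probs = [7.6907e-01, 3.3997e-03, 4.8003e-03, 4.2978e-05, 1.2168e-02,
         6.8751e-06, 3.2019e-06, 1.6024e-04, 1.2705e-01, 8.3300e-02]

# Index of the largest probability, i.e. what argmax(dim=1) returns for this row.
best_index = max(range(len(probs)), key=lambda i: probs[i])
print(best_index, classes[best_index], round(sum(probs), 3))  # 0 plane 1.0
```

The largest value, about 0.77, sits at index 0, which maps to 'plane'. The runner-up is 'ship' at about 0.13.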

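The model's prediction is correct!

That is the full example of building the LeNet-5 convolutional neural network in PyTorch and using it to recognize the CIFAR10 dataset. All of the code in this article runs, and I recommend running it yourself.

It is worth noting that the biggest difference between the MNIST and CIFAR10 datasets is that MNIST is single-channel while CIFAR10 has three channels, so the C1 layer needs a different setting when building the LeNet-5 network. As for input images of a different size, we can use transforms.Resize to scale them all to 32x32.

I have put all of the complete code on GitHub, at the following address: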

https://github.com/RedstoneWill/ObjectDetectionLearner/tree/main/LeNet-5
