【Pytorch实战6】一个完整的分类案例:迁移学习分类蚂蚁和蜜蜂(Res18,VGG16)
References:
- *Deep Learning with PyTorch: Computer Vision in Practice*
- The official PyTorch tutorials
- The official PyTorch documentation
This post is a hands-on transfer-learning exercise in PyTorch, written to further practice and get comfortable with PyTorch programming. It covers:
- the concept of transfer learning
- introducing, loading, preprocessing, and previewing the dataset
- building models and optimizing their parameters (ResNet-18 and VGG16)
- training the network on a GPU
- visualizing training with tensorboardX
- adjusting the learning rate
- saving and loading models
- testing the model
I. The Concept of Transfer Learning

Two common ways to do transfer learning are:
1. Fine-tuning: initialize the model with weights trained on another dataset and then train all of its layers.
2. Freezing: fix the weights of the earlier layers and train only the later ones.
The benefits of transfer learning are that it saves training time and helps with small-sample problems.
In my experience transfer learning is applied mostly to natural images; in particular, initializing from weights trained on ImageNet is very common practice. On medical images, by contrast, I have rarely seen it used.
II. Loading and Preprocessing the Dataset

The dataset is a small subset of ImageNet containing two classes, ants and bees, with about 120 training and 75 validation images per class. That is far too little data to train a network from scratch, which is exactly why this post uses transfer learning. (The dataset comes from the official PyTorch transfer-learning tutorial, where a download link is provided.)

Let's look at the code that loads and previews the data.
```python
# -*- coding: utf-8 -*-
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import time
import os
import copy

# Data augmentation and normalization for training
# Just normalization for validation
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor()]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor()]),
}

# Build the datasets and data loaders, stored as dicts
data_dir = 'data'
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x])
                  for x in ['train', 'val']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4,
                                              shuffle=True, num_workers=4)
               for x in ['train', 'val']}

# Visualize one batch of training data
def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().transpose((1, 2, 0))
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)

dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}
class_names = image_datasets['train'].classes

# Get a batch of training data
inputs, classes = next(iter(dataloaders['train']))
# Make a grid from the batch
out = torchvision.utils.make_grid(inputs)
imshow(out, title=[class_names[x] for x in classes])
plt.show()
```
The image preview appears in a matplotlib window.
If anything here is unclear, see the previous post, [PyTorch in Practice 5] Loading and Processing Data (Using Facial Keypoint Detection as an Example).
Check the dataset sizes and labels:

```python
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}
print(dataset_sizes)
index_classes = image_datasets['train'].class_to_idx
print(index_classes)
```

III. Building, Training, and Testing the Models
This section builds two convolutional neural networks for the classification task, ResNet-18 and VGG16, and compares their accuracy and generalization. It also covers:
- training the network on a GPU
- visualizing training with tensorboardX
- scheduling the learning rate
- saving the best model and loading it for testing

1. ResNet-18

First, the complete training code.
```python
# -*- coding: utf-8 -*-
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import time
import os
import copy
from tensorboardX import SummaryWriter

# Build the datasets and data loaders, stored as dicts
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),
}

data_dir = 'data'
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x])
                  for x in ['train', 'val']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4,
                                              shuffle=True, num_workers=4)
               for x in ['train', 'val']}
# Dataset sizes
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}
# Class names
class_names = image_datasets['train'].classes
# Train on the GPU if one is available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Model training and parameter optimization
def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    since = time.time()
    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0
    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)
        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                scheduler.step()
                model.train()   # Set model to training mode
            else:
                model.eval()    # Set model to evaluate mode
            running_loss = 0.0
            running_corrects = 0
            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)
                # zero the parameter gradients
                optimizer.zero_grad()
                # forward
                # track history only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)
                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()
                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]
            writer.add_scalar('loss_%s' % phase, epoch_loss, epoch)
            writer.add_scalar('acc_%s' % phase, epoch_acc, epoch)
            print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))
            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
        print()
    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))
    # load best model weights
    model.load_state_dict(best_model_wts)
    return model

model_ft = models.resnet18(pretrained=True)
writer = SummaryWriter()
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, 2)

model_ft = model_ft.to(device)
criterion = nn.CrossEntropyLoss()
# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)
# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)
model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler,
                       num_epochs=25)
writer.close()
torch.save(model_ft.state_dict(), 'models/res18.pt')
```

Running this code may raise:
UnboundLocalError: local variable 'photoshop' referenced before assignment
Downgrading Pillow to 5.4.1 (`pip install Pillow==5.4.1`) fixes it. The Pillow maintainers said a new release on July 1 would resolve the bug, so updating Pillow to the latest version after that date should also work.
Here is a walkthrough of the code.

- Visualizing training with tensorboardX
See: https://github.com/lanpa/tensorboardX
Installation (assuming TensorFlow and TensorBoard are already installed):

```shell
pip install tensorboardX
```

The visualization-related lines:

```python
# Import the writer
from tensorboardX import SummaryWriter
# Instantiate it before training starts
writer = SummaryWriter()
# Inside the training loop, log the loss and accuracy of each phase
writer.add_scalar('loss_%s' % phase, epoch_loss, epoch)
writer.add_scalar('acc_%s' % phase, epoch_acc, epoch)
# Close the writer when training finishes
writer.close()
```

To view the curves, run `tensorboard --logdir runs` and open the URL it prints.
- The `torchvision.models` package
Much like `keras.applications`, this package ships ready-made implementations of many common network architectures. We only need to instantiate them and tweak a few parameters instead of building the networks from scratch. The available models, their arguments, and performance comparisons are documented at
https://pytorch.org/docs/stable/torchvision/models.html
This post uses ResNet-18 and VGG16. The code above loads ResNet-18; printing the model produces:
```
ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer2): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer3): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer4): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
  (fc): Linear(in_features=512, out_features=1000, bias=True)
)
```

You can see that the last layer is named `fc`. The code above does transfer learning by fine-tuning: every earlier layer keeps its pretrained weights, while the `fc` layer is replaced with a fresh `nn.Linear(num_ftrs, 2)` that starts from the default random initialization.
- `model.train()` and `model.eval()`
These calls set the model's mode. In evaluation mode, layers such as batch normalization and dropout behave deterministically; for example, BN uses the running statistics learned during training rather than the current batch statistics. The official documentation describes this in more detail.
Online discussions show some disagreement over when `model.eval()` is required; searching for "model.eval" will turn up the details.
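The effect is easy to see with a bare `Dropout` layer (a small illustrative sketch, not from the original scripts): in `train()` mode it randomly zeroes inputs and rescales the survivors, while in `eval()` mode it becomes the identity.

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()                 # training mode: dropout is active
out_train = drop(x)          # roughly half the entries zeroed, survivors scaled to 2.0

drop.eval()                  # evaluation mode: dropout is disabled
out_eval = drop(x)           # identical to the input
print(torch.equal(out_eval, x))  # True
```

Forgetting `model.eval()` at test time therefore means dropout keeps randomly masking activations and BN keeps using batch statistics, which explains the accuracy drop reported later in this post.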
- Training on the GPU
GPU training mainly relies on the `to()` function, which moves the model and the input tensors onto `device("cuda")`. (There are other spellings, e.g. `tensor.cuda()` instead of `to(device)`.) The relevant code:

```python
# Define the device; where this line lives in the script doesn't matter
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Move the input tensors onto the GPU
inputs = inputs.to(device)
labels = labels.to(device)
# Move the whole network onto the GPU
model_ft = model_ft.to(device)
```

In short: to compute on the GPU, move every tensor and module involved in the computation onto the GPU.
- Tensors vs. Variables
`Variable` was a wrapper PyTorch introduced specifically for its automatic differentiation machinery; it has since been merged into `Tensor`, which now carries the autograd state itself.
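Since the merge (PyTorch 0.4), a plain `Tensor` supports everything `Variable` used to provide:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)  # no Variable wrapper needed
y = (x * 3).sum()
y.backward()
print(x.grad)        # every entry is 3.0
```

Old code that wraps tensors in `torch.autograd.Variable` still runs, but the wrapper is now a no-op.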
- The `set_grad_enabled` function
The official documentation spells out the exact semantics; here is my own understanding. The function is usually used as a context manager. It controls whether tensors created inside its scope track gradients, i.e. whether their `requires_grad` attribute is true, and so avoids wasting a lot of memory during inference. One question is why `requires_grad=True` wastes memory at all: during validation we only run the forward pass, and the backward-pass code executes only when `phase == 'train'`. The answer became clear to me after reading the passage below: with gradients disabled, autograd no longer has to build the graph it would otherwise need for computing gradients.

When computing the forwards pass, autograd simultaneously performs the requested computations and builds up a graph representing the function that computes the gradient (the .grad_fn attribute of each torch.Tensor is an entry point into this graph). When the forwards pass is completed, we evaluate this graph in the backwards pass to compute the gradients.
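This is visible directly on the tensors: with gradients enabled the forward pass records a `grad_fn`, and inside `set_grad_enabled(False)` it doesn't (a small sketch to illustrate the quoted passage):

```python
import torch

w = torch.randn(3, requires_grad=True)

y = (w * 2).sum()              # grad mode on: autograd builds the graph
print(y.grad_fn)               # e.g. <SumBackward0 object at ...>

with torch.set_grad_enabled(False):
    y2 = (w * 2).sum()         # no graph is built, no intermediates are stored
print(y2.grad_fn)              # None
print(y2.requires_grad)        # False
```

The saved intermediates of that graph are what consume memory during a tracked forward pass, so disabling grad mode during validation frees them.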
- Saving the model (see https://pytorch.org/tutorials/beginner/saving_loading_models.html)
It takes a single line. `state_dict` is a Python dict holding all of the model's learnable parameters; the saved file conventionally uses the `.pt` or `.pth` extension.

```python
torch.save(model.state_dict(), PATH)
```
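A round trip with a toy module shows that `state_dict` restores the exact parameter values into a freshly (and differently) initialized model (the `tmp.pt` filename is just for this sketch):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
torch.save(model.state_dict(), 'tmp.pt')   # saves the learnable parameters only

model2 = nn.Linear(4, 2)                   # new model with random init
model2.load_state_dict(torch.load('tmp.pt'))

same = all(torch.equal(p1, p2)
           for p1, p2 in zip(model.parameters(), model2.parameters()))
print(same)  # True
```

Because only parameters are saved, the loading side must rebuild the model architecture first, which is exactly what the test scripts below do before calling `load_state_dict`.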
- Learning-rate scheduling with `lr_scheduler`
PyTorch ships a ready-made learning-rate scheduling package, `torch.optim.lr_scheduler`, with many schedulers to choose from. See:
https://pytorch.org/docs/stable/optim.html#torch.optim.lr_scheduler
This post uses `StepLR`: in the training code above it is imported at the top, instantiated as `exp_lr_scheduler`, and stepped once per epoch inside `train_model`.
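`StepLR(optimizer, step_size=7, gamma=0.1)` multiplies the learning rate by 0.1 every 7 calls to `scheduler.step()`. With one step per epoch, the rate seen at each epoch looks like this (a standalone sketch with a dummy model):

```python
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler

model = nn.Linear(10, 2)                       # dummy model for illustration
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

lrs = []
for epoch in range(25):
    lrs.append(optimizer.param_groups[0]['lr'])  # lr in effect this epoch
    # ...train and validate here, then step the scheduler...
    scheduler.step()

print(lrs[0], lrs[7], lrs[14])  # 0.001, then ~1e-4, then ~1e-5
```

So over the 25 training epochs the rate decays from 1e-3 down to roughly 1e-6 in three steps.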
That wraps up the training code. Now the test code, which mainly loads the saved model:

```python
# Test
model = models.resnet18()
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, 2)
model = model.to(device)
model.load_state_dict(torch.load('models/res18.pt'))
model.eval()

running_corrects = 0
# Iterate over the validation data
for inputs, labels in dataloaders['val']:
    inputs = inputs.to(device)
    labels = labels.to(device)
    # forward pass with gradient tracking disabled
    with torch.set_grad_enabled(False):
        outputs = model(inputs)
        _, preds = torch.max(outputs, 1)
        running_corrects += torch.sum(preds == labels.data)
epoch_acc = running_corrects.double() / dataset_sizes['val']
print('Acc: {:.4f}'.format(epoch_acc))
```

The accuracy is 95.42%. If I comment out `model.eval()`, the accuracy drops to 85.67%.
Next, transfer learning by freezing layers, followed by a visualization of single-image predictions.

```python
# Freeze the backbone
model_conv = torchvision.models.resnet18(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False
# Parameters of newly constructed modules have requires_grad=True by default
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, 2)

model_conv = model_conv.to(device)
criterion = nn.CrossEntropyLoss()
# Observe that only parameters of the final layer are being optimized,
# as opposed to before
optimizer_conv = optim.SGD(model_conv.fc.parameters(), lr=0.001, momentum=0.9)
# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)

model_conv = train_model(model_conv, criterion, optimizer_conv,
                         exp_lr_scheduler, num_epochs=25)
torch.save(model_conv.state_dict(), 'models/res18_0.pt')
```

The test code:
```python
# -*- coding: utf-8 -*-
import torch
import torch.nn as nn
import numpy as np
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import os

# Build the data loaders (batch_size=1 for single-image preview)
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),
}

data_dir = 'data'
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x])
                  for x in ['train', 'val']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=1,
                                              shuffle=True, num_workers=4)
               for x in ['train', 'val']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}
class_names = image_datasets['train'].classes
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Single-image prediction and visualization
model = models.resnet18()
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, 2)
model = model.to(device)
model.load_state_dict(torch.load('models/res18_0.pt'))
model.eval()

def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().transpose((1, 2, 0))
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    inp = std * inp + mean
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)

with torch.no_grad():
    for i, (inputs, labels) in enumerate(dataloaders['val']):
        inputs = inputs.to(device)
        labels = labels.to(device)
        outputs = model(inputs)
        _, preds = torch.max(outputs, 1)
        imshow(inputs.cpu().data[0],
               'predicted: {}'.format(class_names[preds[0]]))
        plt.show()
```

Each validation image is displayed with its predicted class as the figure title.
2. VGG16

The VGG16 training script is identical to the ResNet-18 one (same data pipeline and same `train_model` function) apart from the model definition and the save path. VGG16's head is a `classifier` Sequential ending in a 1000-way layer, so the whole classifier is rebuilt with a 2-way output:

```python
model_ft = models.vgg16(pretrained=True)
writer = SummaryWriter()
# Rebuild the classifier head with a 2-class output layer
model_ft.classifier = torch.nn.Sequential(
    torch.nn.Linear(25088, 4096),
    torch.nn.ReLU(),
    torch.nn.Dropout(p=0.5),
    torch.nn.Linear(4096, 4096),
    torch.nn.ReLU(),
    torch.nn.Dropout(p=0.5),
    torch.nn.Linear(4096, 2))

model_ft = model_ft.to(device)
criterion = nn.CrossEntropyLoss()
# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)
# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)
model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler,
                       num_epochs=25)
writer.close()
torch.save(model_ft.state_dict(), 'models/vgg16.pt')
```

Validation accuracy: 94%.
Summary

With only about 120 training images per class, transfer learning from ImageNet-pretrained weights brings both ResNet-18 (95.42% validation accuracy with fine-tuning) and VGG16 (94%) to strong results on the ants-vs-bees task.