Kaggle Dog Breed Identification Competition: Transfer Learning with the PyTorch Framework

Published: 2025/1/21

The code in this article is mainly based on:
https://www.kaggle.com/blankitdl/using-pytorch-resnet
https://www.kaggle.com/blankitdl/use-pretrained-pytorch-models

Transfer Learning with PyTorch

1. Overview

The competition
Dog Breed Identification
The official description of the competition:

In this playground competition, you are provided a strictly canine subset of ImageNet in order to practice fine-grained image categorization. How well you can tell your Norfolk Terriers from your Norwich Terriers? With 120 breeds of dogs and a limited number training images per class, you might find the problem more, err, ruff than you anticipated.

In short, the task is fine-grained image classification on the canine subset of ImageNet: 120 dog breeds, with only a limited number of training images per breed, including pairs as similar as Norwich Terriers and Norfolk Terriers.
Data
labels.csv holds the labeled training data (about 10.2k rows x 2 columns). sample_submission.csv describes the test data (about 10.4k rows x 121 columns). The training and test sets are almost the same size.

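2. Code Walkthrough

Reading the data
Look at the contents of labels.csv. id is the image name, and the file id.jpg is stored in the train folder. breed is the dog's breed.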

import pandas as pd

df_train = pd.read_csv('labels.csv')
submission = pd.read_csv('sample_submission.csv')
df_train.head()

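Converting breed names to numeric labels
df_train.breed.unique() returns the distinct values in the breed column of df_train (the values themselves, not their count), in order of first appearance.
The 120 breeds are mapped to the numeric labels 0 to 119.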

# Breed name -> numeric label
class_to_idx = {x: i for i, x in enumerate(df_train.breed.unique())}
# Numeric label -> breed name, so predictions can be reported as a breed at test time
idx_to_class = {i: x for i, x in enumerate(df_train.breed.unique())}
# Add the numeric label as a new column of the original table
df_train['target'] = [class_to_idx[x] for x in df_train.breed]
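The mapping can be tried without the CSV file. The following is a minimal sketch in plain Python, assuming a made-up list of breed names in place of the breed column; `dict.fromkeys` stands in for `unique()` because both keep the order of first appearance.

```python
# Toy stand-in for the `breed` column; the real values come from labels.csv.
breeds = ["boston_bull", "dingo", "pekinese", "dingo", "boston_bull"]

# dict.fromkeys drops duplicates while keeping first-appearance order.
unique_breeds = list(dict.fromkeys(breeds))

class_to_idx = {name: i for i, name in enumerate(unique_breeds)}
idx_to_class = {i: name for name, i in class_to_idx.items()}

# Numeric label for every row, as stored in the `target` column.
targets = [class_to_idx[name] for name in breeds]

print(class_to_idx)  # {'boston_bull': 0, 'dingo': 1, 'pekinese': 2}
print(targets)       # [0, 1, 2, 1, 0]
```

Because the two dictionaries are built from the same ordering, mapping a name to an index and back always returns the original name.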

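Splitting the training set
This uses train_test_split from sklearn.model_selection.
Part of the training set is held out as a validation set, which is used to select the model and to detect overfitting.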

from sklearn.model_selection import train_test_split

# Hold out 40% of the labeled data as a validation set
train, val = train_test_split(df_train, test_size=0.4, random_state=0)
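To make the effect of the split concrete, here is a rough plain-Python stand-in. It is not scikit-learn's implementation; the helper name `split_indices` and the 10-row example are assumptions made for illustration. It shuffles the row indices with a fixed seed and cuts off 40% of them for validation.

```python
import math
import random


def split_indices(n_rows, val_fraction, seed):
    """Shuffle row indices and cut them into (train, val) index lists."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # fixed seed gives a repeatable split
    n_val = math.ceil(n_rows * val_fraction)
    return indices[n_val:], indices[:n_val]


train_idx, val_idx = split_indices(n_rows=10, val_fraction=0.4, seed=0)
print(len(train_idx), len(val_idx))  # 6 4
```

Every row lands in exactly one of the two parts, so the validation images are never seen during training.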

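Building the dataset

import os

from PIL import Image
from torch.utils.data import Dataset


class DogsDataset(Dataset):
    '''
    df: df_train, with the id and breed columns plus the added target column
    root_dir: directory that holds the images
    transform: image preprocessing to apply
    '''
    def __init__(self, df, root_dir, transform=None):
        self.df = df
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.df)  # number of samples

    def __getitem__(self, idx):
        img_name = '{}.jpg'.format(self.df.iloc[idx, 0])  # image file name
        fullname = os.path.join(self.root_dir, img_name)  # image path
        # Force 3 channels so Normalize always sees an RGB image
        image = Image.open(fullname).convert('RGB')
        cls = self.df.iloc[idx, 2]  # column 2 is the target label
        if self.transform:
            image = self.transform(image)
        # Returns the (transformed) image and its numeric label
        return [image, cls]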

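Defining the image preprocessing
This uses the torchvision.transforms module.

from torchvision import transforms

# Per-channel ImageNet mean and standard deviation
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
ds_trans = transforms.Compose([transforms.Resize(224),
                               transforms.CenterCrop(224),
                               transforms.ToTensor(),
                               normalize])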
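The arithmetic behind the normalization step is simple: ToTensor scales each channel to [0, 1], and Normalize then computes (value - mean) / std per channel. The sketch below reproduces that in plain Python for a single made-up pixel; the helper names `normalize_pixel` and `denormalize_pixel` are hypothetical.

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(pixel):
    """pixel: (r, g, b) already scaled to [0, 1], as ToTensor produces."""
    return [(v - m) / s for v, m, s in zip(pixel, MEAN, STD)]


def denormalize_pixel(pixel):
    """Inverse of normalize_pixel: value * std + mean per channel."""
    return [v * s + m for v, m, s in zip(pixel, MEAN, STD)]


rgb_uint8 = (128, 64, 255)              # made-up 8-bit pixel
scaled = [v / 255 for v in rgb_uint8]   # what ToTensor does to the range
normed = normalize_pixel(scaled)
restored = denormalize_pixel(normed)
print(normed)
print(restored)
```

The inverse is what the display code near the end of the article relies on when it computes std * inp + mean before plotting.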

Defining the DataLoaders
This uses DataLoader from torch.utils.data.

from torch.utils.data import DataLoader

BATCH_SIZE = 128
data_dir = '/train/'  # adjust this path to your own setup
# Build the Dataset objects
train_ds = DogsDataset(train, data_dir + 'train/', transform=ds_trans)
val_ds = DogsDataset(val, data_dir + 'train/', transform=ds_trans)
# Build the DataLoaders
train_dl = DataLoader(train_ds, batch_size=BATCH_SIZE, shuffle=True, num_workers=1)
val_dl = DataLoader(val_ds, batch_size=4, shuffle=True, num_workers=1)
dataloaders = {'train': train_dl, 'val': val_dl}
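What a DataLoader does with batch_size can be pictured as slicing the sample indices into consecutive groups, with a smaller last group when the sizes do not divide evenly. This is a simplified plain-Python picture, not PyTorch's implementation; `iter_batches` and the 300-sample figure are assumptions, and with shuffle=True the indices would be permuted first.

```python
import math


def iter_batches(n_samples, batch_size):
    """Yield lists of sample indices, one list per batch."""
    for start in range(0, n_samples, batch_size):
        yield list(range(start, min(start + batch_size, n_samples)))


batches = list(iter_batches(n_samples=300, batch_size=128))
sizes = [len(b) for b in batches]
print(sizes)         # [128, 128, 44]
print(len(batches))  # 3
```

The number of batches is ceil(n_samples / batch_size), which is also what len() of a DataLoader reports with its default settings. This differs from the number of samples, a distinction that matters when computing accuracy.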

Checking that the DataLoader works

# Fetch one batch and inspect its shapes and labels
for data in train_dl:
    x, y = data
    print(x.shape, y.shape)
    print(y)
    break

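Defining the model
The dataset is a subset of ImageNet, so a model pretrained on ImageNet is a natural starting point. Here resnet18 is used and fine-tuned.
NUM_CLASS is the number of dog breeds, which is the dimension of the final prediction.
model.fc.in_features is the number of input features of resnet18's last layer.
in_fc_nums and NUM_CLASS are used as the input and output sizes of a new fully connected layer that replaces the one in resnet18.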

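import torch.nn as nn
from torchvision import models

# Load ImageNet-pretrained weights (newer torchvision releases prefer the
# weights= argument over pretrained=True)
model = models.resnet18(pretrained=True)
NUM_CLASS = 120  # number of dog breeds, i.e. the dimension of the prediction
in_fc_nums = model.fc.in_features  # input features of resnet18's last layer
fc = nn.Linear(in_fc_nums, NUM_CLASS)
model.fc = fc  # replace the 1000-class ImageNet head
model = model.cuda()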
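As a back-of-the-envelope check on what the replacement changes, the sketch below counts the parameters of a fully connected layer in plain Python. It assumes the standard resnet18 head, which takes 512 input features and originally outputs 1000 ImageNet classes; `linear_param_count` is a hypothetical helper.

```python
def linear_param_count(in_features, out_features):
    """Weights (in x out) plus one bias per output unit."""
    return in_features * out_features + out_features


IN_FEATURES = 512   # model.fc.in_features for resnet18
old_head = linear_param_count(IN_FEATURES, 1000)  # original ImageNet head
new_head = linear_param_count(IN_FEATURES, 120)   # head for 120 dog breeds

print(old_head)  # 513000
print(new_head)  # 61560
```

Only the new head starts from random initialization. Every other layer keeps its pretrained weights, which is why fine-tuning can work with a limited number of images per breed.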

Defining the optimizer and the learning rate schedule
The modules used are torch.optim, torch.nn and torch.optim.lr_scheduler.

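import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler

# Adam over all parameters (fine-tunes the whole network)
optimizer = optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))
# Alternative: optimize only the fully connected layer
# optimizer = optim.Adam(model.fc.parameters(), lr=0.001, betas=(0.9, 0.999))
criterion = nn.CrossEntropyLoss()  # loss function: cross entropy
# Multiply the learning rate by 0.1 every 7 scheduler steps
scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)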
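The schedule that StepLR produces with these settings can be written in closed form: after a given number of scheduler steps, the learning rate is the base rate times gamma raised to the number of completed step_size intervals. A small plain-Python sketch of that formula follows; the function name `step_lr` is hypothetical.

```python
def step_lr(base_lr, step_size, gamma, epoch):
    """Learning rate after `epoch` scheduler steps under a StepLR-style decay."""
    return base_lr * gamma ** (epoch // step_size)


schedule = [step_lr(0.001, 7, 0.1, epoch) for epoch in range(25)]

# The rate drops by a factor of 10 at epochs 7, 14 and 21.
for epoch in (0, 6, 7, 14, 21):
    print(epoch, schedule[epoch])
```

Over the 25 epochs used in this article the rate therefore falls from 1e-3 to roughly 1e-6, so the last few epochs change the weights very little.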

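Training the model
The training function does the following:

Trains the model.
Validates after each training epoch and records the loss and accuracy on the training and validation sets.
Keeps the model with the highest accuracy on the validation set.
Records the time taken by each epoch and by the whole training run.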

import copy
import time

import torch


def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    since = time.time()
    model = model.cuda()
    # Deep-copy, so later optimizer steps cannot overwrite the saved weights
    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0
    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch + 1, num_epochs))
        print('-' * 20)
        since_epoch = time.time()
        stats = {}
        # Each epoch has a training and a validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # set model to training mode
            else:
                model.eval()   # set model to evaluation mode
            running_loss = 0.0
            running_corrects = 0
            # Iterate over the data
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.float().cuda()
                labels = labels.cuda()
                optimizer.zero_grad()  # zero the parameter gradients
                # Track gradients only in the training phase
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)
                    # backward + optimize only in the training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()
                # item() converts a one-element tensor to a Python number
                running_loss += loss.item()
                running_corrects += torch.sum(preds == labels).item()
            # Loss: mean over batches; accuracy: correct / number of samples
            epoch_loss = running_loss / len(dataloaders[phase])
            epoch_acc = running_corrects / len(dataloaders[phase].dataset)
            stats[phase] = (epoch_loss, epoch_acc)
            if phase == 'train':
                # Update the learning rate after the optimizer steps
                scheduler.step()
            # Keep a copy of the weights with the best validation accuracy
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
        time_elapsed_epoch = time.time() - since_epoch
        print('Train Loss: {:.4f} Train Acc: {:.4f} Validation Loss: {:.4f} '
              'Validation Acc: {:.4f} in {:.0f}m {:.0f}s'.format(
                  *stats['train'], *stats['val'],
                  time_elapsed_epoch // 60, time_elapsed_epoch % 60))
        print()
    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:.4f}'.format(best_acc))
    # Load the best model weights
    model.load_state_dict(best_model_wts)
    return model
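One detail worth spelling out is how a running count of correct predictions becomes an accuracy. The denominator should be the number of samples, not the number of batches (the len() of a DataLoader). The plain-Python sketch below uses made-up batch sizes and counts to show the difference.

```python
# Made-up numbers: three batches from a 300-sample set with batch size 128.
batch_sizes = [128, 128, 44]
correct_per_batch = [96, 100, 30]

running_corrects = sum(correct_per_batch)   # 226 correct predictions
n_samples = sum(batch_sizes)                # 300 samples
n_batches = len(batch_sizes)                # 3 batches

accuracy = running_corrects / n_samples         # fraction in [0, 1]
per_batch_count = running_corrects / n_batches  # average correct per batch

print(round(accuracy, 4))         # 0.7533
print(round(per_batch_count, 2))  # 75.33
```

Dividing by the batch count gives the average number of correct predictions per batch, which can exceed 1 and is not comparable across batch sizes.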

Running the training function to obtain the best model

model = train_model(model, criterion, optimizer, scheduler, num_epochs=25)
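Returning the best model depends on keeping a true snapshot of the weights. In PyTorch, state_dict() returns a dictionary whose tensors share storage with the live parameters, so copy.deepcopy is the usual way to freeze a snapshot. The sketch below is only an analogy in plain Python, with a dict of lists standing in for a state dict.

```python
import copy

# Stand-in for the live parameters of a model.
weights = {"fc.weight": [0.1, 0.2]}

shallow = dict(weights)             # new dict, but the same underlying list
snapshot = copy.deepcopy(weights)   # independent copy of the values

# Stand-in for an in-place update made by a later optimizer step.
weights["fc.weight"][0] = 9.9

print(shallow["fc.weight"])   # [9.9, 0.2]  (follows the live weights)
print(snapshot["fc.weight"])  # [0.1, 0.2]  (still the saved values)
```

Without the deep copy, the "best" weights would silently track whatever the model looks like at the end of training.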

Testing on a single image

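import matplotlib.pyplot as plt
import numpy as np
import torch
from PIL import Image

image_path = '/train/train/cb7fb54008ef21a8b55da46d5145acb3.jpg'
img = Image.open(image_path).convert('RGB')
img = ds_trans(img)  # preprocess the image

# Display the image: move channels last and undo the normalization
inp = img.numpy().transpose((1, 2, 0))
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])
inp = np.clip(std * inp + mean, 0, 1)  # keep values in imshow's valid range
plt.imshow(inp)

model = model.cpu()
model.eval()  # inference mode for BatchNorm layers
with torch.no_grad():
    # The model expects a batch, so add a leading dimension: (1, 3, 224, 224)
    out = model(img.unsqueeze(0))
idx = torch.argmax(out).item()
cls = idx_to_class[idx]  # breed name of the predicted class
print('The breed of the test dog is:', cls)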
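The model output is a vector of raw scores (logits), one per breed. Taking the argmax gives the predicted class, and a softmax turns the scores into probabilities if a confidence value is wanted. The following plain-Python sketch uses made-up logits and a three-breed toy mapping rather than the real 120-way output.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


idx_to_class = {0: "boston_bull", 1: "dingo", 2: "pekinese"}  # toy mapping
logits = [1.2, 3.4, -0.5]                                     # made-up scores

probs = softmax(logits)
idx = max(range(len(probs)), key=probs.__getitem__)  # argmax
print(idx_to_class[idx], round(probs[idx], 3))
```

Softmax does not change which index is largest, so the predicted breed is the same whether argmax is applied to the logits or to the probabilities.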
