[PyTorch Tutorial for Beginners] 19. Implementing a UNet Image Segmentation Model with torch
@Author:Runsen
In computer vision, CNNs are used today not only for classification but also for more advanced problems such as image segmentation and object detection. Image segmentation is the process of partitioning an image into distinct segments, one for each class present in the image.
In the example image above, one segment represents the cat and the other represents the background.
Image segmentation is useful in many fields, from self-driving cars to satellite imagery. One of its most important applications is medical imaging.
UNet is a convolutional neural network architecture that extends the standard CNN design with only minor changes. It was invented for biomedical image analysis, where the goal is not only to classify whether an infection is present but also to localize the infected region.
UNet
Paper: https://arxiv.org/abs/1505.04597
The UNet architecture is shaped like a "U" and consists of three parts: a contracting path, a bottleneck, and an expanding path. The contracting path is built from repeated contracting blocks. Each block takes an input, applies two 3×3 convolution layers, and then a 2×2 max pooling. The number of kernels (feature maps) doubles after each block, so the architecture can learn increasingly complex structures. The bottleneck sits between the contracting and expanding paths; it applies two 3×3 convolution layers followed by a 2×2 up-convolution (transposed convolution).
In the expanding path, each block passes its input through two 3×3 convolution layers followed by a 2×2 upsampling (transposed convolution) layer. To keep the architecture symmetric, the number of feature maps is halved after each block. Each block's input is also concatenated with the feature maps of the corresponding contracting block, which ensures that the features learned while contracting the image are reused to reconstruct it. The number of expanding blocks equals the number of contracting blocks. Finally, the resulting maps pass through one more 3×3 convolution layer whose number of output feature maps equals the number of desired segments. Because the convolutions are unpadded, every 3×3 convolution shrinks each spatial dimension by 2; the sketch below traces the resulting feature-map sizes.
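To make the shape arithmetic concrete, here is a minimal sketch (not part of the original code) that traces the spatial size of the feature maps through the three-level variant implemented below. The 572×572 input size is borrowed from the original paper and is only an assumption for illustration.

def conv3x3(s):
    # unpadded 3x3 convolution: each spatial side shrinks by 2
    return s - 2

def pool2x2(s):
    # 2x2 max pooling: the spatial size is halved
    return s // 2

def up2x2(s):
    # ConvTranspose2d(kernel_size=3, stride=2, padding=1, output_padding=1):
    # out = (s - 1) * 2 - 2 * 1 + 3 + 1 = 2 * s, i.e. the size doubles
    return (s - 1) * 2 - 2 + 3 + 1

s = 572  # assumed input size, as in the original UNet paper
for name in ("conv_encode1", "conv_encode2", "conv_encode3"):
    s = pool2x2(conv3x3(conv3x3(s)))     # two convs, then max pool
    print(f"after {name} + pool: {s}")   # 284, 140, 68

s = up2x2(conv3x3(conv3x3(s)))           # bottleneck: two convs, then up-conv
print(f"after bottleneck: {s}")          # 128; each skip map is center-cropped to match

for name in ("conv_decode3", "conv_decode2"):
    s = up2x2(conv3x3(conv3x3(s)))
    print(f"after {name}: {s}")          # 248, 488

s = conv3x3(conv3x3(s))                  # final block: two unpadded convs, then a padded 3x3
print(f"output size: {s}")               # 484: a 572x572 input yields a 484x484 map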
Implementation in torch
The dataset used is: https://www.kaggle.com/paultimothymooney/chiu-2015
This dataset contains optical coherence tomography (OCT) scans used to segment diabetic macular edema.
The data is distributed as .mat files, which can be loaded with scipy.io.loadmat.
The UNet model below is implemented with the PyTorch framework. The code comes from the following GitHub repository: https://github.com/Hsankesara/DeepResearch
import torch
from torch import nn
import torch.nn.functional as F
import torch.optim as optim


class UNet(nn.Module):
    def contracting_block(self, in_channels, out_channels, kernel_size=3):
        # Two unpadded 3x3 convolutions, each followed by ReLU and BatchNorm
        block = torch.nn.Sequential(
            torch.nn.Conv2d(kernel_size=kernel_size, in_channels=in_channels, out_channels=out_channels),
            torch.nn.ReLU(),
            torch.nn.BatchNorm2d(out_channels),
            torch.nn.Conv2d(kernel_size=kernel_size, in_channels=out_channels, out_channels=out_channels),
            torch.nn.ReLU(),
            torch.nn.BatchNorm2d(out_channels),
        )
        return block

    def expansive_block(self, in_channels, mid_channel, out_channels, kernel_size=3):
        # Two 3x3 convolutions followed by a stride-2 transposed convolution
        # that doubles the spatial size
        block = torch.nn.Sequential(
            torch.nn.Conv2d(kernel_size=kernel_size, in_channels=in_channels, out_channels=mid_channel),
            torch.nn.ReLU(),
            torch.nn.BatchNorm2d(mid_channel),
            torch.nn.Conv2d(kernel_size=kernel_size, in_channels=mid_channel, out_channels=mid_channel),
            torch.nn.ReLU(),
            torch.nn.BatchNorm2d(mid_channel),
            torch.nn.ConvTranspose2d(in_channels=mid_channel, out_channels=out_channels,
                                     kernel_size=3, stride=2, padding=1, output_padding=1),
        )
        return block

    def final_block(self, in_channels, mid_channel, out_channels, kernel_size=3):
        # Final stage: two unpadded convolutions, then a padded 3x3 convolution
        # mapping to out_channels (one channel per desired segment)
        block = torch.nn.Sequential(
            torch.nn.Conv2d(kernel_size=kernel_size, in_channels=in_channels, out_channels=mid_channel),
            torch.nn.ReLU(),
            torch.nn.BatchNorm2d(mid_channel),
            torch.nn.Conv2d(kernel_size=kernel_size, in_channels=mid_channel, out_channels=mid_channel),
            torch.nn.ReLU(),
            torch.nn.BatchNorm2d(mid_channel),
            torch.nn.Conv2d(kernel_size=kernel_size, in_channels=mid_channel, out_channels=out_channels, padding=1),
            torch.nn.ReLU(),
            torch.nn.BatchNorm2d(out_channels),
        )
        return block

    def __init__(self, in_channel, out_channel):
        super(UNet, self).__init__()
        # Encode
        self.conv_encode1 = self.contracting_block(in_channels=in_channel, out_channels=64)
        self.conv_maxpool1 = torch.nn.MaxPool2d(kernel_size=2)
        self.conv_encode2 = self.contracting_block(64, 128)
        self.conv_maxpool2 = torch.nn.MaxPool2d(kernel_size=2)
        self.conv_encode3 = self.contracting_block(128, 256)
        self.conv_maxpool3 = torch.nn.MaxPool2d(kernel_size=2)
        # Bottleneck
        self.bottleneck = torch.nn.Sequential(
            torch.nn.Conv2d(kernel_size=3, in_channels=256, out_channels=512),
            torch.nn.ReLU(),
            torch.nn.BatchNorm2d(512),
            torch.nn.Conv2d(kernel_size=3, in_channels=512, out_channels=512),
            torch.nn.ReLU(),
            torch.nn.BatchNorm2d(512),
            torch.nn.ConvTranspose2d(in_channels=512, out_channels=256,
                                     kernel_size=3, stride=2, padding=1, output_padding=1),
        )
        # Decode
        self.conv_decode3 = self.expansive_block(512, 256, 128)
        self.conv_decode2 = self.expansive_block(256, 128, 64)
        self.final_layer = self.final_block(128, 64, out_channel)

    def crop_and_concat(self, upsampled, bypass, crop=False):
        # Center-crop the skip connection so its spatial size matches the
        # upsampled tensor, then concatenate along the channel dimension
        if crop:
            c = (bypass.size()[2] - upsampled.size()[2]) // 2
            bypass = F.pad(bypass, (-c, -c, -c, -c))
        return torch.cat((upsampled, bypass), 1)

    def forward(self, x):
        # Encode
        encode_block1 = self.conv_encode1(x)
        encode_pool1 = self.conv_maxpool1(encode_block1)
        encode_block2 = self.conv_encode2(encode_pool1)
        encode_pool2 = self.conv_maxpool2(encode_block2)
        encode_block3 = self.conv_encode3(encode_pool2)
        encode_pool3 = self.conv_maxpool3(encode_block3)
        # Bottleneck
        bottleneck1 = self.bottleneck(encode_pool3)
        # Decode
        decode_block3 = self.crop_and_concat(bottleneck1, encode_block3, crop=True)
        cat_layer2 = self.conv_decode3(decode_block3)
        decode_block2 = self.crop_and_concat(cat_layer2, encode_block2, crop=True)
        cat_layer1 = self.conv_decode2(decode_block2)
        decode_block1 = self.crop_and_concat(cat_layer1, encode_block1, crop=True)
        final_layer = self.final_layer(decode_block1)
        return final_layer

The UNet module above implements the entire architecture. contracting_block and expansive_block build the contracting and expanding sections, respectively, and crop_and_concat concatenates the (center-cropped) output of a contracting layer onto the input of the corresponding expanding layer.
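As a quick sanity check (a usage sketch, not part of the original tutorial), the model can be instantiated and run on a dummy batch; the 572×572 input size matches the shape walk-through above.

# Smoke test: one single-channel 572x572 image, two output segments
model = UNet(in_channel=1, out_channel=2)
model.eval()  # use running BatchNorm statistics for a deterministic pass
x = torch.randn(1, 1, 572, 572)
with torch.no_grad():
    y = model(x)
print(y.shape)  # torch.Size([1, 2, 484, 484]); the unpadded convs shrink the output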
A single training step then looks as follows. Here inputs and labels are assumed to be a batch of images and their pixel-wise label maps, and width_out and height_out are the spatial dimensions of the network output:

unet = UNet(in_channel=1, out_channel=2)  # out_channel is the number of desired segments
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(unet.parameters(), lr=0.01, momentum=0.99)

optimizer.zero_grad()
outputs = unet(inputs)
# Permute so that the segment dimension comes last
outputs = outputs.permute(0, 2, 3, 1)
m = outputs.shape[0]
# Flatten outputs and labels to compute the pixel-wise softmax loss
outputs = outputs.reshape(m * width_out * height_out, 2)
labels = labels.reshape(m * width_out * height_out)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()

A complete tutorial notebook that solves this dataset is available at: https://www.kaggle.com/hsankesara/unet-image-segmentation
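At inference time a segmentation mask is obtained by taking the argmax over the segment channel. Below is a minimal sketch; unet is assumed to be trained, and test_inputs is a hypothetical batch of test images. As a side note, torch.nn.CrossEntropyLoss also accepts (N, C, H, W) logits with (N, H, W) integer targets directly, so the flattening above is a convention of this tutorial rather than a requirement.

unet.eval()
with torch.no_grad():
    logits = unet(test_inputs)            # (N, 2, H_out, W_out) raw per-pixel scores
    probs = torch.softmax(logits, dim=1)  # per-pixel class probabilities
    masks = probs.argmax(dim=1)           # (N, H_out, W_out) predicted segment index per pixel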