Fun with Deep Dream (sharp version, complete PyTorch code)
This post gives complete PyTorch code for Deep Dream, with image-pyramid processing and Gaussian smoothing added to make the generated images sharper and better looking. It also discusses how various factors influence the result.
1, 完整代碼
A Deep Dream image is an input image generated in reverse so as to maximize the output features of some chosen layer. It can be viewed as a visualization of those features, helping us understand what each layer of the network has learned; it is also simply fun, yielding beautiful and curious pictures. The previous post covered the main principle, but the naive method used there produced rather plain results. This time I added image-pyramid and Gaussian-smoothing code found on GitHub, which makes the output much nicer.
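The whole trick is ordinary gradient ascent on the input pixels: forward once, read the hooked layer's activation, backpropagate to the input, and step uphill. A minimal sketch of just that skeleton, assuming vgg16 and illustrative placeholder values for the layer index, image size, and step size:

```python
import torch
import torchvision.models as models

model = models.vgg16(pretrained=True).cuda().eval()

# Grab the chosen layer's output with a forward hook.
acts = {}
handle = model.features[22].register_forward_hook(lambda m, i, o: acts.update(out=o))

img = torch.randn(1, 3, 224, 224, device='cuda', requires_grad=True)  # noise seed
for _ in range(20):
    model(img)
    loss = acts['out'].mean()   # activation we want to maximize
    loss.backward()
    img.data += 0.2 * img.grad  # gradient *ascent* on the pixels
    img.grad.zero_()
handle.remove()
```

The full code adds three refinements on top of this skeleton: an image pyramid, Gaussian smoothing of the gradient, and clamping to the valid normalized pixel range: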
```python
import torch
import torchvision.models as models
import torch.nn.functional as F
import torch.nn as nn
import numpy as np
import numbers
import math
import cv2
from PIL import Image
from torchvision.transforms import Compose, ToTensor, Normalize, Resize, ToPILImage
import time

t0 = time.time()
model = models.vgg16(pretrained=True).cuda()
batch_size = 1
for params in model.parameters():
    params.requires_grad = False
model.eval()

mu = torch.Tensor([0.485, 0.456, 0.406]).unsqueeze(-1).unsqueeze(-1).cuda()
std = torch.Tensor([0.229, 0.224, 0.225]).unsqueeze(-1).unsqueeze(-1).cuda()
unnormalize = lambda x: x * std + mu
normalize = lambda x: (x - mu) / std

transform_test = Compose([
    Resize((500, 600)),
    ToTensor(),
    Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

class CascadeGaussianSmoothing(nn.Module):
    """Apply Gaussian smoothing separately for each channel (depthwise convolution).

    Arguments:
        kernel_size (int, sequence): Size of the Gaussian kernel.
        sigma (float, sequence): Standard deviation of the Gaussian kernel.
    """
    def __init__(self, kernel_size, sigma):
        super().__init__()
        if isinstance(kernel_size, numbers.Number):
            kernel_size = [kernel_size, kernel_size]

        cascade_coefficients = [0.5, 1.0, 2.0]  # std multipliers, hardcoded to use 3 different Gaussian kernels
        sigmas = [[coeff * sigma, coeff * sigma] for coeff in cascade_coefficients]  # isotropic Gaussian

        self.pad = int(kernel_size[0] / 2)  # assures we keep the same spatial resolution

        # The Gaussian kernel is the product of the Gaussian function of each dimension.
        kernels = []
        meshgrids = torch.meshgrid([torch.arange(size, dtype=torch.float32) for size in kernel_size])
        for sigma in sigmas:
            kernel = torch.ones_like(meshgrids[0])
            for size_1d, std_1d, grid in zip(kernel_size, sigma, meshgrids):
                mean = (size_1d - 1) / 2
                kernel *= 1 / (std_1d * math.sqrt(2 * math.pi)) * torch.exp(-((grid - mean) / std_1d) ** 2 / 2)
            kernels.append(kernel)

        gaussian_kernels = []
        for kernel in kernels:
            # Normalize - make sure the values in the Gaussian kernel sum to 1.
            kernel = kernel / torch.sum(kernel)
            # Reshape to a depthwise convolutional weight.
            kernel = kernel.view(1, 1, *kernel.shape)
            kernel = kernel.repeat(3, 1, 1, 1)
            kernel = kernel.cuda()
            gaussian_kernels.append(kernel)

        self.weight1 = gaussian_kernels[0]
        self.weight2 = gaussian_kernels[1]
        self.weight3 = gaussian_kernels[2]
        self.conv = F.conv2d

    def forward(self, input):
        input = F.pad(input, [self.pad, self.pad, self.pad, self.pad], mode='reflect')

        # Apply the Gaussian kernels depthwise over the input (hence groups == number of input channels).
        # shape = (1, 3, H, W) -> (1, 3, H, W)
        num_in_channels = input.shape[1]
        grad1 = self.conv(input, weight=self.weight1, groups=num_in_channels)
        grad2 = self.conv(input, weight=self.weight2, groups=num_in_channels)
        grad3 = self.conv(input, weight=self.weight3, groups=num_in_channels)

        return (grad1 + grad2 + grad3) / 3

#data = torch.ones(batch_size, 3, 500, 600).cuda() * 0.5  # alternative: start from a flat gray canvas
#data = normalize(data)
n = 0  # which channel of the hooked layer's features to maximize
data = Image.open('./feature_visual/gray.jpg')  # start from an initial image
data = transform_test(data).unsqueeze(0).cuda()
H, W = data.shape[2], data.shape[3]
input_tensor = data.clone()

def hook(module, inp, out):
    global features
    features = out

myhook = model.features[22].register_forward_hook(hook)  # relu4_3 of vgg16

levels, ratio = 4, 1.8
lr = 0.2
IMAGENET_MEAN_1 = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD_1 = np.array([0.229, 0.224, 0.225], dtype=np.float32)
LOWER_IMAGE_BOUND = torch.tensor((-IMAGENET_MEAN_1 / IMAGENET_STD_1).reshape(1, -1, 1, 1)).cuda()
UPPER_IMAGE_BOUND = torch.tensor(((1 - IMAGENET_MEAN_1) / IMAGENET_STD_1).reshape(1, -1, 1, 1)).cuda()

for pyramid_level in range(levels):  # image pyramid: progressively increase the resolution
    data = input_tensor.detach()
    h = int(np.round(H * (ratio ** (pyramid_level - levels + 1))))
    w = int(np.round(W * (ratio ** (pyramid_level - levels + 1))))
    input_tensor = F.interpolate(data, (h, w), mode='bilinear')
    input_tensor.requires_grad = True
    for i in range(20):
        _ = model(input_tensor)
        loss = features[:, n, :, :].mean()  # maximize a single channel
        #loss = features.mean()  # maximize the whole layer instead
        loss.backward()
        grad = input_tensor.grad
        sigma = ((i + 1) / 20) * 2.0 + 0.5
        smooth_grad = CascadeGaussianSmoothing(kernel_size=9, sigma=sigma)(grad)  # "magic number" 9 just works well
        g_std = torch.std(smooth_grad)
        g_mean = torch.mean(smooth_grad)
        smooth_grad = smooth_grad - g_mean
        smooth_grad = smooth_grad / g_std
        input_tensor.data += lr * smooth_grad
        input_tensor.grad.zero_()
        input_tensor.data = torch.max(torch.min(input_tensor, UPPER_IMAGE_BOUND), LOWER_IMAGE_BOUND)
    print('data.mean():', input_tensor.mean().item(), input_tensor.std().item())
    print('loss:', loss.item())
    print('time: %.2f' % (time.time() - t0))

myhook.remove()
data_i = input_tensor.clone()
data_i = unnormalize(data_i)
data_i = torch.clamp(data_i, 0, 1)
data_i = data_i[0].permute(1, 2, 0).data.cpu().numpy() * 255
data_i = data_i[..., ::-1].astype('uint8')  # note: cv2 expects BGR channel order
cv2.imwrite('./feature_visual/densenet161block3denselayer36relu2/filter%d.jpg' % n, data_i)
```

2. Results
Let's first look at images generated by inverting a single channel of certain feature layers of the torchvision pretrained vgg16 and densenet161 models (for example, the first image below dreams on channel 2 of vgg16's relu4_3 layer).
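If you want to hook a different layer, you need its index inside `model.features`. A quick way to list them (this is torchvision's vgg16 layout; index 22, used in the code above, is the ReLU right after conv4_3, i.e. relu4_3):

```python
import torchvision.models as models

model = models.vgg16(pretrained=True)
for idx, layer in enumerate(model.features):
    print(idx, layer)
# ...
# 21 Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))  <- conv4_3
# 22 ReLU(inplace=True)                                                   <- relu4_3
# ...
```

Changing the dreamed channel is then just a matter of setting `n` to a different value.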
From these images you can see exactly what kind of feature a particular layer extracts, and they are as pretty as works of art!
The next figure shows Deep Dream images from different depths of the network:
You can see that the shallow layers yield dream images with simple stripe and color patterns, and the images grow richer as the network deepens. The prettiest ones usually come from the middle-to-late layers, like block3 above; the layers after that carry richer but messier information, and the visual effect suffers.
Next are images generated by inverting the network's final output, i.e. the classification scores themselves. Such an image can be called a class impression; the concept is introduced in another post of mine. We take ImageNet class 0, the tench, and look at its class impression on several different pretrained networks.
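Only one change to the optimization loop is needed for this, sketched here under the assumption that everything else stays as in the full code: remove the forward hook and take the loss straight from the logit of the target class.

```python
# Instead of a feature channel, maximize one output logit directly.
logits = model(input_tensor)   # shape (1, 1000) for ImageNet classifiers
loss = logits[0, 0]            # ImageNet class 0: tench
loss.backward()
```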
To my eye AlexNet's class impression is the most recognizable, while the deeper networks' class impressions are, surprisingly, less so.
The code above is not tied to the network's nominal input resolution: it can generate images at any resolution, and they all come out sharp! Here is a 1024 x 1024 example to give you a feel for it.
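Only the `Resize` in the preprocessing transform has to change; for example, for a 1024 x 1024 output:

```python
transform_test = Compose([
    Resize((1024, 1024)),  # the conv layers don't care about the input size
    ToTensor(),
    Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
```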
3. What the Image Pyramid Does
Figure 5. Results using 1 to 4 image-pyramid levels, resnet18.layer3 filter 37

The pyramid's job is to make textures appear at several different scales in the image, which gives it a layered feel and keeps it from looking monotonous.
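To make the pyramid schedule concrete, this is what the resolutions work out to for the default 500 x 600 input with `levels = 4` and `ratio = 1.8`, using the same rounding as the main loop:

```python
import numpy as np

H, W = 500, 600
levels, ratio = 4, 1.8
for pyramid_level in range(levels):
    scale = ratio ** (pyramid_level - levels + 1)
    print(pyramid_level, (int(np.round(H * scale)), int(np.round(W * scale))))
# 0 (86, 103)
# 1 (154, 185)
# 2 (278, 333)
# 3 (500, 600)
```

Each level dreams for 20 iterations at its own scale, so patterns laid down at the coarse levels get enlarged and elaborated at the finer ones.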
4. What Gaussian Smoothing Does
Figure 6. Results without (left) and with (right) Gaussian smoothing

Without Gaussian smoothing the output picks up high-frequency noise and looks less clean; smoothing the gradient removes it.
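For comparison, the no-smoothing variant on the left only needs the inner loop's update changed, roughly like this (assuming the rest of the loop stays as in the full code):

```python
grad = input_tensor.grad
grad = (grad - grad.mean()) / grad.std()  # still normalize, but no Gaussian smoothing
input_tensor.data += lr * grad
input_tensor.grad.zero_()
```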
5. What the Initial Image Does
The initial image acts like a seed: each iteration develops the result further on top of it, while the seed's outlines persist in the generated image. This is an interesting property, because it lets us steer the generation flexibly toward what we want.
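Both kinds of seed already appear in the full code; switching between a flat gray canvas and an image file looks like this:

```python
# Option 1: flat gray canvas (the commented-out alternative in the full code)
data = torch.ones(1, 3, 500, 600).cuda() * 0.5
data = normalize(data)

# Option 2: any image file; its outlines will persist through the iterations
data = Image.open('./feature_visual/gray.jpg')
data = transform_test(data).unsqueeze(0).cuda()
```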
總結(jié)
以上是生活随笔為你收集整理的好玩的deep dream(清晰版,pytorch完整代码)的全部?jī)?nèi)容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: pytorch制作CNN的类印象图 cl
- 下一篇: pytorch几种损失函数CrossEn