Neural Style Transfer: A Keras Implementation
Preface
I previously published a translation of the neural style transfer paper, A Neural Algorithm of Artistic Style. This article presents an implementation based on Keras. There are many related implementations on GitHub as well, in Caffe, TensorFlow, and other frameworks; search GitHub if you are interested. For learning purposes, my implementation likewise follows code that others have open-sourced on GitHub, built on Keras.
The program uses the VGG16 pre-trained weights file vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5. Download it yourself and place it in the ~/.keras/models folder.
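If you are unsure whether the file is in the right place, here is a minimal check, assuming the default Keras cache directory mentioned above:

```python
# Check (assuming the default Keras cache location) that the VGG16 weight
# file is already in place before running the main script
import os

weights_path = os.path.expanduser(
    '~/.keras/models/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5')
print('weights found' if os.path.isfile(weights_path)
      else 'weights missing: ' + weights_path)
```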
All of the models can be downloaded from Baidu Netdisk:
https://pan.baidu.com/s/1geHmOpH#list/path=%2Fkeras%2Fkeras_weights
The Code
Enough preamble; here is the code, with comments inline.
log.py: customizes the format of log messages printed to the terminal.
```python
# *_*coding:utf-8 *_*
# author: 許鴻斌
# email: 2775751197@qq.com

import logging
import sys

# Get a logger instance; an empty argument would return the root logger
logger = logging.getLogger('Test')
# Specify the output format of the logger
formatter = logging.Formatter('%(asctime)s %(levelname)-8s: %(message)s')
# File logging
# file_handler = logging.FileHandler("test.log")
# file_handler.setFormatter(formatter)  # the output format can be set with setFormatter
# Console logging
console_handler = logging.StreamHandler(sys.stdout)
console_handler.formatter = formatter  # the formatter can also be assigned directly
# Attach the handlers to the logger
# logger.addHandler(file_handler)
logger.addHandler(console_handler)
# Set the minimum level that will be output; the default is WARN
logger.setLevel(logging.INFO)
```
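A quick usage sketch of this logger (the messages are hypothetical, just to show the configured format and threshold):

```python
# Hypothetical usage of the logger configured above
from log import logger

logger.info('This line is printed to stdout')  # INFO is at the configured level
logger.debug('This line is suppressed')        # DEBUG is below the INFO threshold
```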
neural_style_transfer.py: the neural style transfer program itself.

```python
# *_*coding:utf-8 *_*
# author: 許鴻斌
# email: 2775751197@qq.com

# Custom terminal log format
from log import logger

import numpy as np
import time
import argparse

from scipy.misc import imsave
from scipy.optimize import fmin_l_bfgs_b
from keras.preprocessing.image import load_img, img_to_array
from keras.applications import vgg16
from keras import backend as K

# Command-line arguments:
# base_image_path: path to the content image
# style_reference_image_path: path to the style image
# result_prefix: filename prefix for the generated images
# --iter: number of iterations, default 10; usually 10 is enough, increase it if not
# --content_weight: weight of the content_loss term in the total loss, default 0.025
# --style_weight: weight of the style_loss term in the total loss, default 1.0
# --tv_weight: weight of the total_variation_loss term in the total loss, default 1.0
parser = argparse.ArgumentParser(description='Neural Style Transfer (Keras)')
parser.add_argument('base_image_path', metavar='base', type=str,
                    help='Path to the image to transform.')
parser.add_argument('style_reference_image_path', metavar='ref', type=str,
                    help='Path to the style reference image.')
parser.add_argument('result_prefix', metavar='res_prefix', type=str,
                    help='Prefix for the saved results.')
parser.add_argument('--iter', type=int, default=10, required=False,
                    help='Number of iterations to run.')
parser.add_argument('--content_weight', type=float, default=0.025, required=False,
                    help='Content weight.')
parser.add_argument('--style_weight', type=float, default=1.0, required=False,
                    help='Style weight.')
parser.add_argument('--tv_weight', type=float, default=1.0, required=False,
                    help='Total variation weight.')

args = parser.parse_args()
base_image_path = args.base_image_path                        # content image path
style_reference_image_path = args.style_reference_image_path  # style image path
result_prefix = args.result_prefix                            # prefix for generated image filenames
iterations = args.iter                                        # number of iterations

# Weights of the individual loss terms
total_variation_weight = args.tv_weight
style_weight = args.style_weight
content_weight = args.content_weight

width, height = load_img(base_image_path).size  # size of the content image
img_nrows = 400                                 # height of the generated image, fixed at 400
img_ncols = int(width * img_nrows / height)     # width chosen to preserve the content image's aspect ratio

# Preprocessing: load an image with Keras and turn it into a tensor of the right format
def preprocess_image(image_path):
    img = load_img(image_path, target_size=(img_nrows, img_ncols))
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = vgg16.preprocess_input(img)
    return img

# Postprocessing: turn a tensor back into an image
def deprocess_image(x):
    if K.image_data_format() == 'channels_first':
        x = x.reshape((3, img_nrows, img_ncols))
        x = x.transpose((1, 2, 0))
    else:
        x = x.reshape((img_nrows, img_ncols, 3))
    # Remove zero-center by mean pixel
    x[:, :, 0] += 103.939
    x[:, :, 1] += 116.779
    x[:, :, 2] += 123.68
    # 'BGR' -> 'RGB'
    x = x[:, :, ::-1]
    x = np.clip(x, 0, 255).astype('uint8')
    return x

# Load the images
logger.info('Loading content image: {}'.format(base_image_path))
logger.info('Loading style image: {}'.format(style_reference_image_path))
base_image = K.variable(preprocess_image(base_image_path))
style_reference_image = K.variable(preprocess_image(style_reference_image_path))

# Create the placeholder for the generated image
logger.info('Creating combination image.')
if K.image_data_format() == 'channels_first':
    combination_image = K.placeholder((1, 3, img_nrows, img_ncols))
else:
    combination_image = K.placeholder((1, img_nrows, img_ncols, 3))

# Concatenate the three images (content, style, generated) along the batch
# dimension, so they become samples 0, 1 and 2 of the batch; later on, the
# features of each image are retrieved by indexing that dimension.
logger.info('Concatenate content/style/combination images.')
input_tensor = K.concatenate([base_image,
                              style_reference_image,
                              combination_image], axis=0)

# Build the VGG16 network with the three images as input, without the fully
# connected layers, using pre-trained ImageNet weights
model = vgg16.VGG16(input_tensor=input_tensor,
                    weights='imagenet', include_top=False)
logger.info('Model VGG16 loaded.')
```
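As a side note, deprocess_image undoes exactly what preprocess_image does: vgg16.preprocess_input converts RGB to BGR and subtracts the ImageNet channel means (103.939, 116.779, 123.68), which deprocess_image adds back before flipping the channels to RGB. A minimal round-trip sketch, assuming the two helpers above are defined in the current session:

```python
# Round-trip sanity check; assumes preprocess_image and deprocess_image from
# the script above are already defined
tensor = preprocess_image(base_image_path)  # (1, img_nrows, img_ncols, 3), BGR, mean-centered
image = deprocess_image(tensor.copy())      # back to a uint8 RGB array
print(image.shape, image.dtype)             # (img_nrows, img_ncols, 3) uint8
```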
The script continues with the loss functions:

```python
# Map each layer's name to the feature map it outputs
outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])

# Compute the neural style loss; first we need to define four utility functions.

# Gram matrix of an image tensor; see the original paper for details
def gram_matrix(x):
    assert K.ndim(x) == 3
    if K.image_data_format() == 'channels_first':
        features = K.batch_flatten(x)
    else:
        features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))
    gram = K.dot(features, K.transpose(features))
    return gram

# Style loss:
# specific VGG16 layers provide feature maps of the style image and of the
# generated image; the Gram matrix of each feature map captures that image's
# style, and, following the formula in the paper, the style loss measures the
# difference between the two styles.
def style_loss(style, combination):
    assert K.ndim(style) == 3
    assert K.ndim(combination) == 3
    S = gram_matrix(style)
    C = gram_matrix(combination)
    channels = 3
    size = img_nrows * img_ncols
    return K.sum(K.square(S - C)) / (4. * (channels ** 2) * (size ** 2))

# Content loss:
# the difference in "content" between the content image and the generated image
def content_loss(base, combination):
    return K.sum(K.square(combination - base))

# Total variation loss:
# the third loss term, which encourages local coherence in the generated image
def total_variation_loss(x):
    assert K.ndim(x) == 4
    if K.image_data_format() == 'channels_first':
        a = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] - x[:, :, 1:, :img_ncols - 1])
        b = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] - x[:, :, :img_nrows - 1, 1:])
    else:
        a = K.square(x[:, :img_nrows - 1, :img_ncols - 1, :] - x[:, 1:, :img_ncols - 1, :])
        b = K.square(x[:, :img_nrows - 1, :img_ncols - 1, :] - x[:, :img_nrows - 1, 1:, :])
    return K.sum(K.pow(a + b, 1.25))

# Content part of the loss
loss = K.variable(0.)  # initialize the loss to 0
# Output of layer 'block4_conv2'; batch indices 0, 1 and 2 hold the feature
# maps of the content, style and generated images respectively
layer_features = outputs_dict['block4_conv2']
# The content loss only needs the features of the content and generated images
base_image_features = layer_features[0, :, :, :]
combination_features = layer_features[2, :, :, :]
# Scale by the content weight (content_weight) and accumulate into the total loss
loss += content_weight * content_loss(base_image_features,
                                      combination_features)

# Style part of the loss
feature_layers = ['block1_conv1', 'block2_conv1',
                  'block3_conv1', 'block4_conv1', 'block5_conv1']
# Walk over these five layers; their style losses are averaged, weighted and
# accumulated into the total loss
for layer_name in feature_layers:
    layer_features = outputs_dict[layer_name]  # feature map of this layer
    # The style loss only needs the features of the style and generated images
    style_reference_features = layer_features[1, :, :, :]
    combination_features = layer_features[2, :, :, :]
    sl = style_loss(style_reference_features, combination_features)
    # Average over the layers, scale by the style weight (style_weight),
    # accumulate into the total loss
    loss += (style_weight / len(feature_layers)) * sl
# Add the weighted total_variation_loss to obtain the final loss
loss += total_variation_weight * total_variation_loss(combination_image)

# Gradients of the loss with respect to the generated image
grads = K.gradients(loss, combination_image)

outputs = [loss]
if isinstance(grads, (list, tuple)):  # grads can be a list/tuple or a single tensor
    outputs += grads
else:
    outputs.append(grads)

# Build a callable function with input [combination_image] and output: outputs
f_outputs = K.function([combination_image], outputs)

# Evaluate the loss and the gradients
def eval_loss_and_grads(x):
    if K.image_data_format() == 'channels_first':
        x = x.reshape((1, 3, img_nrows, img_ncols))
    else:
        x = x.reshape((1, img_nrows, img_ncols, 3))
    outs = f_outputs([x])
    loss_value = outs[0]
    if len(outs[1:]) == 1:
        grad_values = outs[1].flatten().astype('float64')
    else:
        grad_values = np.array(outs[1:]).flatten().astype('float64')
    return loss_value, grad_values
```
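To make the Gram matrix concrete, here is a small numpy-only sketch (with a made-up 2x2x3 feature map, not part of the script) of the same computation gram_matrix performs in the channels_last case:

```python
import numpy as np

# A made-up 2x2 feature map with 3 channels (channels_last layout)
fmap = np.arange(12, dtype=np.float32).reshape(2, 2, 3)

# Same steps as K.batch_flatten(K.permute_dimensions(x, (2, 0, 1))):
# move channels first, then flatten each channel into a row vector
features = fmap.transpose(2, 0, 1).reshape(3, -1)  # shape (3, 4)

gram = features.dot(features.T)  # (3, 3) channel-to-channel correlations
print(gram)
```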
The last part defines the Evaluator class and runs the optimization:

```python
# This Evaluator class makes it possible to compute loss and gradients in one
# pass while retrieving them via two separate functions, "loss" and "grads".
# This is done because scipy.optimize requires separate functions for loss and
# gradients, but computing them separately would be inefficient.
class Evaluator(object):

    def __init__(self):
        self.loss_value = None
        self.grad_values = None

    def loss(self, x):
        assert self.loss_value is None
        loss_value, grad_values = eval_loss_and_grads(x)
        self.loss_value = loss_value
        self.grad_values = grad_values
        return self.loss_value

    def grads(self, x):
        assert self.loss_value is not None
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values

evaluator = Evaluator()

# Random initial image in the proper data format, shifted by the pixel mean
if K.image_data_format() == 'channels_first':
    x = np.random.uniform(0, 255, (1, 3, img_nrows, img_ncols)) - 128.
else:
    x = np.random.uniform(0, 255, (1, img_nrows, img_ncols, 3)) - 128.

# Minimize the total loss with L-BFGS, an algorithm for unconstrained
# optimization that scipy provides as fmin_l_bfgs_b
for i in range(iterations):
    logger.info('Start iteration: {}'.format(i))
    start_time = time.time()
    x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
                                     fprime=evaluator.grads, maxfun=20)
    logger.info('Current loss value: {}'.format(min_val))
    # Save the current generated image
    img = deprocess_image(x.copy())
    fname = result_prefix + '_at_iteration_{}.png'.format(i)
    imsave(fname, img)
    end_time = time.time()
    logger.info('Image saved as {}'.format(fname))
    logger.info('Iteration {} completed in {}s.'.format(i, end_time - start_time))
```
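One caveat on dependencies: scipy.misc.imsave, used above to save the results, was deprecated and removed from scipy as of version 1.2. If the import fails, a drop-in substitute (assuming the imageio package is installed) is:

```python
# Drop-in replacement for scipy.misc.imsave on newer scipy versions
from imageio import imwrite as imsave
```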
Running
As long as the environment for Keras and the other libraries is set up, the code above runs under both Python 2 and Python 3. The example below uses Python 3.
Prepare two images in advance: a content image and a style image. Then invoke the script with three positional arguments:

python3 neural_style_transfer.py './imgs/1-content.jpg' './imgs/1-style.jpg' './imgs/result'

The first argument is the path to the content image, the second is the path to the style image, and the third is the prefix for the generated images.
In this example, 1-content.jpg in the imgs directory is the content image, 1-style.jpg is the style image, and the generated images are those whose names start with the prefix result_at_iteration_.
Adjust these paths to suit your own setup when you run it.
Results
Content image:
Style image:
Generated result (after 9 iterations):
Some other results:
Note: running on a GPU is recommended. On a CPU a run generally takes two to three hours, while a GPU produces a result within a few minutes. You can also change --iter, the number of iterations, to another value and see how the result changes; the other parameters can be varied in the same way to observe their effect.
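For example, a hypothetical run over the same images with 20 iterations and a doubled style weight (both flags are defined by the script's argument parser) would be:

python3 neural_style_transfer.py './imgs/1-content.jpg' './imgs/1-style.jpg' './imgs/result' --iter 20 --style_weight 2.0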