當前位置：首頁 > 编程资源 > 编程问答 >内容正文

编程问答

VAE【keras实现】

發布時間：2025/3/17 编程问答 41 豆豆

生活随笔收集整理的這篇文章主要介紹了 VAE【keras实现】小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

變分自編碼(VAE)的東西，將一些理解記錄在這，不對的地方還請指出。

在論文《Auto-Encoding Variational Bayes》中介紹了VAE。

訓練好的VAE可以用來生成圖像。

在Keras 中提供了一個VAE的Demo:variational_autoencoder.py

'''This script demonstrates how to build a variational autoencoder with Keras. Reference: "Auto-Encoding Variational Bayes" https://arxiv.org/abs/1312.6114 ''' import numpy as np import matplotlib.pyplot as plt from scipy.stats import normfrom keras.layers import Input, Dense, Lambda from keras.models import Model from keras import backend as K from keras import objectives from keras.datasets import mnist from keras.utils.visualize_util import plot import syssaveout = sys.stdout file = open('variational_autoencoder.txt','w') sys.stdout = filebatch_size = 100 original_dim = 784 #28*28 latent_dim = 2 intermediate_dim = 256 nb_epoch = 50 epsilon_std = 1.0#my tips:encoding x = Input(batch_shape=(batch_size, original_dim)) h = Dense(intermediate_dim, activation='relu')(x) z_mean = Dense(latent_dim)(h) z_log_var = Dense(latent_dim)(h)#my tips:Gauss sampling,sample Z def sampling(args): z_mean, z_log_var = argsepsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0.,std=epsilon_std)return z_mean + K.exp(z_log_var / 2) * epsilon# note that "output_shape" isn't necessary with the TensorFlow backend # my tips:get sample z(encoded) z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_var])# we instantiate these layers separately so as to reuse them later decoder_h = Dense(intermediate_dim, activation='relu') decoder_mean = Dense(original_dim, activation='sigmoid') h_decoded = decoder_h(z) x_decoded_mean = decoder_mean(h_decoded)#my tips:loss(restruct X)+KL def vae_loss(x, x_decoded_mean):#my tips:loglossxent_loss = original_dim * objectives.binary_crossentropy(x, x_decoded_mean)#my tips:see paper's appendix Bkl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)return xent_loss + kl_lossvae = Model(x, x_decoded_mean) vae.compile(optimizer='rmsprop', loss=vae_loss)# train the VAE on MNIST digits (x_train, y_train), (x_test, y_test) = mnist.load_data(path='mnist.pkl.gz')x_train = x_train.astype('float32') / 255. x_test = x_test.astype('float32') / 255. x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:]))) x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))vae.fit(x_train, x_train,shuffle=True,nb_epoch=nb_epoch,verbose=2,batch_size=batch_size,validation_data=(x_test, x_test))# build a model to project inputs on the latent space encoder = Model(x, z_mean)# display a 2D plot of the digit classes in the latent space x_test_encoded = encoder.predict(x_test, batch_size=batch_size) plt.figure(figsize=(6, 6)) plt.scatter(x_test_encoded[:, 0], x_test_encoded[:, 1], c=y_test) plt.colorbar() plt.show()# build a digit generator that can sample from the learned distribution decoder_input = Input(shape=(latent_dim,)) _h_decoded = decoder_h(decoder_input) _x_decoded_mean = decoder_mean(_h_decoded) generator = Model(decoder_input, _x_decoded_mean)# display a 2D manifold of the digits n = 15 # figure with 15x15 digits digit_size = 28 figure = np.zeros((digit_size * n, digit_size * n)) # linearly spaced coordinates on the unit square were transformed through the inverse CDF (ppf) of the Gaussian # to produce values of the latent variables z, since the prior of the latent space is Gaussian grid_x = norm.ppf(np.linspace(0.05, 0.95, n)) grid_y = norm.ppf(np.linspace(0.05, 0.95, n))for i, yi in enumerate(grid_x):for j, xi in enumerate(grid_y):z_sample = np.array([[xi, yi]])x_decoded = generator.predict(z_sample)digit = x_decoded[0].reshape(digit_size, digit_size)figure[i * digit_size: (i + 1) * digit_size,j * digit_size: (j + 1) * digit_size] = digitplt.figure(figsize=(10, 10)) plt.imshow(figure, cmap='Greys_r') plt.show()plot(vae,to_file='variational_autoencoder_vae.png',show_shapes=True) plot(encoder,to_file='variational_autoencoder_encoder.png',show_shapes=True) plot(generator,to_file='variational_autoencoder_generator.png',show_shapes=True)sys.stdout.close() sys.stdout = saveout

代碼實驗MNIST手寫數字數據集

這個Demo中，VAE的loss函數最小化了兩項之和：

一是重構數據x_decoded_mean與原始數據X的binary_crossentropy

二是近似后驗與真實后驗的KL散度，至于KL散度為何簡化成代碼中的形式，看論文《Auto-Encoding Variational Bayes》中的附錄B有證明。

Demo中VAE的形狀如下：

實驗中的編碼器形狀：

代碼中將編碼得到的均值U可視化結果：

同一顏色為同一數字，發現編碼后的二維U值聚類效果很好

實驗中的生成器形狀：

可將從二維高斯分布中隨機采樣得到的Z，解碼成手寫數字圖片

代碼中將解碼得到的圖像可視化：