Xavier Initialization in Deep Learning
In TensorFlow there is an initialization function, tf.contrib.layers.variance_scaling_initializer. The TensorFlow documentation describes it as follows:
variance_scaling_initializer(
    factor=2.0,
    mode='FAN_IN',
    uniform=False,
    seed=None,
    dtype=tf.float32
)
Returns an initializer that generates tensors without scaling variance.
When initializing a deep network, it is in principle advantageous to keep the scale of the input variance constant, so it does not explode or diminish by reaching the final layer. This initializer uses the following formula:
if mode == 'FAN_IN':    # Count only number of input connections.
    n = fan_in
elif mode == 'FAN_OUT': # Count only number of output connections.
    n = fan_out
elif mode == 'FAN_AVG': # Average number of input and output connections.
    n = (fan_in + fan_out) / 2.0
truncated_normal(shape, 0.0, stddev=sqrt(factor / n))
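To make the formula concrete, here is a small NumPy sketch (my own illustration, not part of the quoted documentation) that computes the resulting stddev for a hypothetical 784 -> 256 layer under each mode:

import numpy as np

fan_in, fan_out, factor = 784, 256, 2.0      # hypothetical layer shape

for mode, n in [('FAN_IN', fan_in),
                ('FAN_OUT', fan_out),
                ('FAN_AVG', (fan_in + fan_out) / 2.0)]:
    print(mode, np.sqrt(factor / n))
# e.g. FAN_IN gives stddev = sqrt(2/784), about 0.0505; weights are then
# drawn from a mean-0 normal distribution truncated at two stddevs.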
In other words, with this initialization the scale of the variance of the signal stays constant as it propagates through the network, so it neither explodes nor vanishes by the time it reaches the final layer.
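This claim is easy to check empirically. Below is a small NumPy sketch (my own addition, not from the original post; the width, depth, and batch size are arbitrary) that pushes random inputs through a stack of ReLU layers whose weights follow the FAN_IN rule with factor=2.0; the mean squared activation, which is the scale this rule controls, stays roughly constant with depth:

import numpy as np

rng = np.random.default_rng(0)
width, depth = 512, 20                    # arbitrary network size
x = rng.standard_normal((1000, width))    # batch of random inputs
for layer in range(1, depth + 1):
    fan_in = x.shape[1]
    # FAN_IN rule with factor=2.0: stddev = sqrt(2 / fan_in)
    W = rng.standard_normal((fan_in, width)) * np.sqrt(2.0 / fan_in)
    x = np.maximum(x @ W, 0.0)            # ReLU activation
    if layer % 5 == 0:
        # stays roughly 1.0: the signal neither explodes nor vanishes
        print(layer, np.mean(x ** 2))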
This is the Xavier initialization method. You can learn more about it from the following two papers:
- X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In International Conference on Artificial Intelligence and Statistics, pages 249–256, 2010.
- Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv:1408.5093, 2014.
You can also learn about it from these articles (linked in the original post):
- CNN數(shù)值 (CNN numerics)
- 三種權(quán)重的初始化方法 (Three weight initialization methods)
- 深度學(xué)習(xí)——Xavier初始化方法 (Deep learning: the Xavier initialization method)
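Finally, to tie this back to the function quoted at the top, here is a minimal TF 1.x usage sketch (the variable name and layer shape are hypothetical, chosen only for illustration):

import tensorflow as tf  # TF 1.x, where tf.contrib is still available

# Defaults quoted above: factor=2.0, mode='FAN_IN', truncated normal.
init = tf.contrib.layers.variance_scaling_initializer(
    factor=2.0, mode='FAN_IN', uniform=False)

# Weights for a hypothetical 784 -> 256 fully connected layer.
W = tf.get_variable('W', shape=[784, 256], initializer=init)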
---------------------
Author: 路雖遠(yuǎn)在路上
Source: CSDN
Original post: https://blog.csdn.net/u010185894/article/details/71104387