Extreme Learning Machines and Support Vector Machines: Extreme Learning Machines I
Around 2005, a novel machine learning approach was introduced by Guang-Bin Huang and a team of researchers at Nanyang Technological University, Singapore.
This newly proposed learning algorithm tends to reach the smallest training error, obtain the smallest norm of weights and the best generalization performance, and runs extremely fast. To differentiate it from the other popular SLFN (single-hidden-layer feedforward network) learning algorithms, it is called the Extreme Learning Machine (ELM).
This method mainly addresses the problem that training a neural network takes far longer than it should, the main reason being that all the parameters of the network are tuned iteratively by such learning algorithms. These slow, gradient-based learning algorithms are extensively used to train neural networks.
Before going into how ELM works and why it performs so well, let's see how gradient-based neural networks operate.
基于梯度的神經(jīng)網(wǎng)絡(luò)的演示 (Demonstration of Gradient-based Neural networks)
Below are the steps followed in a single-layered feedforward neural network, in brief:

Step 1: Evaluate Wx + B
Step 2: Apply the activation function g(Wx + B) and compute the output
Step 3: Calculate the loss
Step 4: Compute gradients (using the delta rule)
Step 5: Repeat
This method of propagating forward and back involves a hefty number of calculations. Also, if the input size is large, or if there are more layers/nodes, training takes up a significant amount of time.
fig. 1. 3-layered Neural Network

In the above example, we can see that for a 4-node input we require W1 (20 parameters), W2 (53 parameters), and W3 (21 parameters), i.e. 94 parameters in total. And the number of parameters increases rapidly with the number of input nodes.
Let's take a real-life example: image classification of digits with the MNIST dataset:
MNIST Example
This has a 28x28 input size, i.e. 784 input nodes. For its architecture, let's consider two layers with 128 nodes and 64 nodes, which are then classified into 10 classes. The parameters will then be:
- First Layer (784, 128) = 100352 parameters
- Second Layer (128, 64) = 8192 parameters
- Output Layer (64, 10) = 640 parameters
This gives us a total of 109184 parameters. And the repeated adjustment of the weights by backpropagation increases the training time by a lot.
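The parameter arithmetic above can be checked with a short snippet (weight matrices only, biases excluded, matching the counts quoted in the text):

```python
# Parameter counts for the 784-128-64-10 MNIST architecture discussed above.
layers = [784, 128, 64, 10]
counts = [n_in * n_out for n_in, n_out in zip(layers, layers[1:])]
print(counts)       # [100352, 8192, 640]
print(sum(counts))  # 109184
```

Every one of those 109184 values is revisited on each backpropagation pass.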
And this is just for a 28x28 image; consider training on a bigger input size with tens of thousands of features. The training time simply gets out of hand.
Conclusion:
In almost all practical learning algorithms of feedforward neural networks, the conventional backpropagation method requires all these weights to be adjusted at every back-prop step.
在幾乎所有前饋神經(jīng)網(wǎng)絡(luò)的實(shí)用學(xué)習(xí)算法中,常規(guī)的反向傳播方法都需要在每個(gè)反向傳播步驟調(diào)整所有這些權(quán)重。
Most of the time, gradient-descent based strategies have been employed in the various learning algorithms for feedforward neural networks. However, it's clear that gradient descent-based learning is usually very slow due to improper learning steps, or may easily converge to local minima. And such learning algorithms need many iterative learning steps in order to achieve good learning performance.
This makes the training far slower than required, which has been a major bottleneck for various applications.
這使培訓(xùn)速度大大慢于所需的時(shí)間,這已成為各種應(yīng)用程序的主要瓶頸。
Next Article in this series: Part II: Algorithm https://medium.com/@prasad.kumkar/extreme-learning-machines-9c8be01f6f77
Translated from: https://medium.com/datadriveninvestor/extreme-learning-machines-82095ee198ce