DL: Overview of Deep Learning Algorithms (Neural Network Model Collection) - Explanations and Reflections on THE NEURAL NETWORK ZOO (Part 6)
Contents
DRN
DNC
NTM
CN
KN
AN
Related Articles
DL: Overview of Deep Learning Algorithms (Neural Network Model Collection) - Explanations and Reflections on THE NEURAL NETWORK ZOO (Part 1)
DL: Overview of Deep Learning Algorithms (Neural Network Model Collection) - Explanations and Reflections on THE NEURAL NETWORK ZOO (Part 2)
DL: Overview of Deep Learning Algorithms (Neural Network Model Collection) - Explanations and Reflections on THE NEURAL NETWORK ZOO (Part 3)
DL: Overview of Deep Learning Algorithms (Neural Network Model Collection) - Explanations and Reflections on THE NEURAL NETWORK ZOO (Part 4)
DL: Overview of Deep Learning Algorithms (Neural Network Model Collection) - Explanations and Reflections on THE NEURAL NETWORK ZOO (Part 5)
DL: Overview of Deep Learning Algorithms (Neural Network Model Collection) - Explanations and Reflections on THE NEURAL NETWORK ZOO (Part 6)
DRN
Deep residual networks (DRN) are very deep FFNNs with extra connections passing input from one layer to a later layer (often 2 to 5 layers downstream) as well as to the next layer. Instead of trying to find a solution for mapping some input to some output across, say, 5 layers, the network is forced to learn to map some input to some output + that same input. Basically, it adds an identity to the solution, carrying the older input over and serving it fresh to a later layer. It has been shown that these networks are very effective at learning patterns up to 150 layers deep, far more than the regular 2 to 5 layers one could expect to train. However, it has been argued that these networks are in essence just RNNs without the explicit time-based construction, and they are often compared to LSTMs without gates.
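To make the skip connection concrete, here is a minimal NumPy sketch of a two-layer residual block; the function and weight names are illustrative, not taken from the paper. The block only has to learn the residual F(x), while the identity shortcut carries the unchanged input forward.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    """Two-layer residual block: output = F(x) + x."""
    h = relu(w1 @ x)      # first transformation
    f = w2 @ h            # second transformation: the learned residual F(x)
    return relu(f + x)    # add the skip connection back in, then activate

# toy usage: one 4-dimensional input through a single residual block
rng = np.random.default_rng(0)
x = rng.normal(size=4)
w1 = rng.normal(scale=0.1, size=(4, 4))
w2 = rng.normal(scale=0.1, size=(4, 4))
print(residual_block(x, w1, w2))
```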
He, Kaiming, et al. “Deep residual learning for image recognition.” arXiv preprint arXiv:1512.03385 (2015).
Original Paper PDF
DNC
Differentiable Neural Computers (DNC) are enhanced Neural Turing Machines with scalable memory, inspired by how memories are stored by the human hippocampus. The idea is to take the classical Von Neumann computer architecture and replace the CPU with an RNN, which learns when and what to read from the RAM. Besides having a large bank of numbers as memory (which may be resized without retraining the RNN), the DNC also has three attention mechanisms. These mechanisms allow the RNN to query the similarity of a bit of input to the memory's entries, the temporal relationship between any two entries in memory, and whether a memory entry was recently updated, which makes it less likely to be overwritten when there is no empty memory available.
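As a rough illustration of the first of those three mechanisms, the sketch below implements content-based addressing over a small memory matrix; the name `content_read` and the sharpness parameter `beta` are assumptions for this example, and the temporal-link and usage-tracking mechanisms are omitted entirely.

```python
import numpy as np

def content_read(memory, key, beta):
    """Read from memory by similarity of each slot to a query key.

    memory : (N, W) array of N slots of width W
    key    : (W,) query emitted by the controller RNN
    beta   : scalar sharpness; larger values focus the read on fewer slots
    """
    # cosine similarity between the key and every memory slot
    sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    weights = np.exp(beta * sims)
    weights /= weights.sum()      # softmax over slots -> soft attention weights
    return weights @ memory       # weighted sum of slots = the read vector

memory = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
print(content_read(memory, key=np.array([1.0, 0.1]), beta=5.0))
```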
Graves, Alex, et al. “Hybrid computing using a neural network with dynamic external memory.” Nature 538 (2016): 471-476.
Original Paper PDF
NTM
Neural Turing machines (NTM) can be understood as an abstraction of LSTMs and an attempt to un-black-box neural networks (and give us some insight into what is going on in there). Instead of coding a memory cell directly into a neuron, the memory is separated out. It is an attempt to combine the efficiency and permanency of regular digital storage with the efficiency and expressive power of neural networks. The idea is to have a content-addressable memory bank and a neural network that can read from and write to it. The "Turing" in Neural Turing Machines comes from them being Turing complete: the ability to read, write and change state based on what is read means they can represent anything a Universal Turing Machine can represent.
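A minimal sketch of the read and erase-then-add write operations on such an external memory is shown below, assuming the attention weights over memory slots have already been produced by the controller; all names are illustrative. Every operation is smooth in its inputs, which is what keeps the whole machine trainable by gradient descent.

```python
import numpy as np

def ntm_read(memory, w):
    """Read head: a weighted sum of memory rows under attention weights w."""
    return w @ memory

def ntm_write(memory, w, erase, add):
    """Write head: each slot is partially erased, then an add vector is blended in.

    memory : (N, W), w : (N,) weights summing to 1, erase/add : (W,) vectors.
    """
    memory = memory * (1.0 - np.outer(w, erase))   # erase step
    return memory + np.outer(w, add)               # add step

mem = np.zeros((4, 3))
w = np.array([0.7, 0.2, 0.1, 0.0])                 # attention mostly on slot 0
mem = ntm_write(mem, w, erase=np.zeros(3), add=np.array([1.0, 2.0, 3.0]))
print(ntm_read(mem, w))
```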
Graves, Alex, Greg Wayne, and Ivo Danihelka. “Neural turing machines.” arXiv preprint arXiv:1410.5401 (2014).
Original Paper PDF
CN
Capsule Networks (CapsNet) are biology-inspired alternatives to pooling, where neurons are connected with multiple weights (a vector) instead of just one weight (a scalar). This allows neurons to transfer more information than simply which feature was detected, such as where a feature is in the picture or what colour and orientation it has. The learning process involves a local form of Hebbian learning that values correct predictions of output in the next layer.
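The sketch below shows the routing-by-agreement step schematically in NumPy; the shapes and the name `u_hat` (prediction vectors from lower capsules) loosely follow the paper's description, and everything else is an assumption made for the example. Predictions that agree with an output capsule's current vector earn a larger routing weight on the next iteration.

```python
import numpy as np

def squash(v):
    """Keep a capsule vector's direction but squash its length into [0, 1)."""
    norm_sq = np.sum(v ** 2, axis=-1, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * v / np.sqrt(norm_sq + 1e-8)

def route(u_hat, iterations=3):
    """Dynamic routing over prediction vectors u_hat of shape (num_in, num_out, dim)."""
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                          # routing logits
    for _ in range(iterations):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True) # softmax per input capsule
        s = np.einsum('ij,ijk->jk', c, u_hat)                # weighted sum of predictions
        v = squash(s)                                        # output capsule vectors
        b += np.einsum('ijk,jk->ij', u_hat, v)               # agreement updates the logits
    return v

u_hat = np.random.default_rng(1).normal(size=(6, 2, 4))      # 6 input capsules, 2 output capsules
print(route(u_hat).shape)                                    # (2, 4)
```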
Sabour, Sara, Frosst, Nicholas, and Hinton, G. E. “Dynamic Routing Between Capsules.” In Advances in neural information processing systems (2017): 3856-3866.
Original Paper PDF
KN
Kohonen networks (KN, also self-organising (feature) map, SOM, SOFM) utilise competitive learning to classify data without supervision. Input is presented to the network, after which the network assesses which of its neurons most closely match that input. These neurons are then adjusted to match the input even better, dragging along their neighbours in the process. How much the neighbours are moved depends on the distance of the neighbours to the best matching units.
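A compact training loop for a small self-organising map follows; the grid size, decay schedules and random seed are arbitrary choices for the sketch, not values from the paper. Each sample pulls its best matching unit, and more weakly that unit's grid neighbours, towards itself.

```python
import numpy as np

def train_som(data, grid_shape=(5, 5), epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a tiny SOM: weights has shape (rows, cols, input_dim)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    weights = rng.random((rows, cols, data.shape[1]))
    # grid coordinates of every unit, used by the neighbourhood function
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing='ij'), axis=-1)
    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)                    # decaying learning rate
        sigma = sigma0 * (1.0 - epoch / epochs) + 0.5        # shrinking neighbourhood
        for x in data:
            # best matching unit = the neuron whose weights are closest to the input
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # neighbours move too, with influence falling off with grid distance to the BMU
            grid_dist = np.linalg.norm(coords - np.array(bmu), axis=-1)
            influence = np.exp(-(grid_dist ** 2) / (2.0 * sigma ** 2))
            weights += lr * influence[..., None] * (x - weights)
    return weights

data = np.random.default_rng(1).random((200, 2))
print(train_som(data).shape)    # (5, 5, 2)
```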
Kohonen, Teuvo. “Self-organized formation of topologically correct feature maps.” Biological cybernetics 43.1 (1982): 59-69.
Original Paper PDF
AN
Attention networks (AN) can be considered a class of networks that includes the Transformer architecture. They use an attention mechanism to combat information decay by separately storing previous network states and switching attention between those states. The hidden states of each iteration in the encoding layers are stored in memory cells. The decoding layers are connected to the encoding layers, but they also receive data from the memory cells, filtered by an attention context. This filtering step adds context for the decoding layers, stressing the importance of particular features. The attention network producing this context is trained using the error signal from the output of the decoding layer. Moreover, the attention context can be visualised, giving valuable insight into which input features correspond with which output features.
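As a bare-bones illustration of that filtering step, the sketch below computes a dot-product attention context over a set of stored encoder states; the function name and the array shapes are assumptions made for the example. The returned weight vector is exactly what one would visualise to see which inputs the decoder is attending to.

```python
import numpy as np

def attention_context(decoder_state, encoder_states):
    """Score each stored encoder state against the decoder state, softmax the
    scores, and return the weighted sum (the context) plus the weights."""
    scores = encoder_states @ decoder_state / np.sqrt(decoder_state.size)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    context = weights @ encoder_states
    return context, weights

encoder_states = np.random.default_rng(2).normal(size=(6, 8))   # 6 stored hidden states
decoder_state = np.random.default_rng(3).normal(size=8)
context, weights = attention_context(decoder_state, encoder_states)
print(weights.round(2))   # which of the 6 inputs the decoder attends to
```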
Jaderberg, Max, et al. “Spatial Transformer Networks.” In Advances in neural information processing systems (2015): 2017-2025.
Original Paper PDF
Follow us on twitter for future updates and posts. We welcome comments and feedback, and thank you for reading!
[Update 22 April 2019] Added Capsule Networks, Differentiable Neural Computers and Attention Networks to the Neural Network Zoo; removed Support Vector Machines; updated links to the original articles. The previous version of this post can be found here.