Understanding Attention in Recurrent Neural Networks
I recently started a new newsletter focused on AI education. TheSequence is a no-BS (meaning no hype, no news, etc.) AI-focused newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers, and concepts. Please give it a try by subscribing below:
In 2017, the Google Brain team published the now-famous paper “Attention Is All You Need,” which kicked off the transformer and pre-trained model revolution. Before that paper, Google had been exploring attention-based models for a few years. Today, I would like to revisit an earlier Google paper from 2016, which was the first paper I read on the subject of attention.
Attention is a cognitive ability that we rely on all the time. Just trying to read this article is a complicated task from a neuroscientific standpoint. At this very moment you are probably being bombarded with emails, news, notifications on your phone, the usual annoying coworker interrupting, and other distractions that send your brain spinning in many directions. In order to read this short article, or to perform many other cognitive tasks, you need to focus; you need attention.
Attention is a cognitive skill that is pivotal to the formation of knowledge. However, the dynamics of attention remained a mystery to neuroscientists for centuries, and only recently have we had major breakthroughs that help explain how attention works. In the context of deep learning programs, building in attention dynamics seems an obvious step toward improving the knowledge of models and adapting them to different scenarios. Building attention mechanisms into deep learning systems is a nascent and very active area of research. In 2016, researchers from the Google Brain team published a paper that detailed some of the key models that can be used to simulate attention in deep neural networks.
How Does Attention Work?
In order to understand attention in deep learning systems, it might be useful to take a look at how this cognitive phenomenon takes place in the human brain. From the perspective of neuroscience, attention is the ability of the brain to selectively concentrate on one aspect of the environment while ignoring other things. Current research identifies two main types of attention, each related to different areas of the brain. Object-based attention refers to the brain's ability to focus on specific objects, such as a picture in a section of this article. Spatial attention is mostly related to focus on specific locations. Both types of attention are relevant to deep learning models. While object-based attention can be used in systems such as image recognition or machine translation, spatial attention is relevant in deep reinforcement learning scenarios such as self-driving vehicles.
Attentional Interfaces in Deep Neural Networks
When it comes to deep learning systems, different techniques have been created in order to simulate different types of attention. The Google research paper focuses on four fundamental models relevant to recurrent neural networks (RNNs). Why RNNs specifically? Well, RNNs are a type of network mostly used to process sequential data and obtain higher-level knowledge. As a result, RNNs are often used as a second step to refine the work of other neural network models, such as convolutional neural networks (CNNs) or generative interfaces. Building attention mechanisms into RNNs can help improve the knowledge of different deep neural models. The Google Brain team identified the following four techniques for building attention into RNN models:
· Neural Turing Machines: One of the simplest attentional interfaces, Neural Turing Machines (NTMs) add a memory structure to traditional RNNs. Using a memory structure allows an NTM to specify an “attention distribution” that describes the area of memory the model should focus on. Implementations of NTMs can be found in many popular deep learning frameworks such as TensorFlow and PyTorch.
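A minimal sketch of the idea, using NumPy: NTMs read from memory via content-based addressing, where a query key is compared to every memory slot and the resulting similarities form an attention distribution over the slots. This is an illustrative simplification, not a full NTM (which also learns write heads and location-based shifts); the function names here are my own.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def content_based_read(memory, key, beta=1.0):
    """Read from memory with a content-based attention distribution.

    memory: (N, M) array of N memory slots; key: (M,) query vector;
    beta: sharpness of the attention distribution.
    """
    # Cosine similarity between the key and every memory slot.
    sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    weights = softmax(beta * sims)    # attention distribution over slots
    return weights @ memory, weights  # weighted ("blurry") read vector

memory = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
read, w = content_based_read(memory, key=np.array([1.0, 0.0]), beta=5.0)
```

Because the read is a weighted average rather than a hard lookup, the whole operation stays differentiable, which is what lets the attention distribution be trained end to end.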
Source: https://distill.pub/2016/augmented-rnns/

· Adaptive Computation Time: This is a newer technique that allows RNNs to perform a different number of computation steps for each time step. How is this related to attention? Quite simply: standard RNNs perform the same amount of computation at every step. Adaptive computation time uses an attention distribution over the number of steps to run at each time step, allowing the model to put more emphasis on specific parts of the input.
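The halting logic at the core of adaptive computation time can be sketched as follows. In the real model, a learned sigmoid unit emits a halting probability at every inner step; here those probabilities are supplied as a plain list (a hypothetical stand-in for the learned unit), and computation stops once their cumulative sum crosses a threshold.

```python
def adaptive_steps(halt_probs, eps=0.01):
    """Decide how many inner computation steps to run, ACT-style.

    halt_probs: per-step halting probabilities (stand-ins for the
    outputs of a learned sigmoid unit). Computation stops once their
    cumulative sum reaches 1 - eps.
    """
    total, n = 0.0, 0
    for p in halt_probs:
        n += 1
        total += p
        if total >= 1.0 - eps:
            break
    # Probability mass assigned to the final step, so the per-step
    # weights still sum to one.
    remainder = 1.0 - (total - p)
    return n, remainder

n, r = adaptive_steps([0.1, 0.3, 0.7, 0.9])
# cumulative sums: 0.1, 0.4, 1.1 -> halts after 3 steps
```

The step weights (each `p`, plus the remainder for the last step) act as an attention distribution over the inner computation steps, which is what keeps the step count trainable by gradient descent.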
Source: https://distill.pub/2016/augmented-rnns/

· Neural Programmer: A fascinating new area in the deep learning space, neural programmer models focus on learning to create programs in order to solve a specific task. In fact, they learn to generate such programs without needing examples of correct programs; they discover how to produce programs as a means to accomplishing some task. Conceptually, neural programmer techniques try to bridge the gap between neural networks and traditional programming techniques, and they can be used to develop attention mechanisms in deep learning models.
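One trick that makes this learnable without program examples is soft selection: instead of picking a single operation, the controller outputs a score per operation, and the result is an attention-weighted mix of every operation applied to the data. The tiny sketch below illustrates that idea over a toy operation set; the operation set and function names are my own illustration, not the paper's actual architecture.

```python
import numpy as np

# Toy operation set a "neural programmer" might choose among.
OPS = {"sum": np.sum, "max": np.max, "min": np.min, "count": len}

def soft_select(column, op_logits):
    """Attention-weighted mix of every op applied to the data.

    op_logits: one controller score per operation, in OPS order.
    Because the output is a weighted average, it is differentiable
    with respect to the logits, so the selection itself can be learned.
    """
    e = np.exp(op_logits - np.max(op_logits))
    weights = e / e.sum()  # attention distribution over operations
    results = np.array([float(op(column)) for op in OPS.values()])
    return float(weights @ results)

col = np.array([1.0, 2.0, 3.0])
# Logits heavily favoring "sum" yield a result close to sum(col) = 6.
out = soft_select(col, np.array([10.0, 0.0, 0.0, 0.0]))
```

At test time the soft mix can be replaced by a hard argmax over the learned weights, recovering a discrete program.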
Source: https://distill.pub/2016/augmented-rnns/

· Attentional Interfaces: Attentional interfaces use an RNN model to focus on specific sections of another neural network. A classic example of this technique can be found in image recognition models that use a CNN-RNN duplex. In this architecture, the RNN focuses on specific parts of the images processed by the CNN in order to refine the result and improve the quality of the knowledge.
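Mechanically, this focusing step is usually a softmax attention over the other network's activations: the RNN state acts as a query, each position in the other network's output gets a relevance score, and the RNN receives a weighted average as context. A minimal dot-product version (scoring functions vary across papers; this is one common choice, with names of my own):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(features, query):
    """Focus an RNN step on parts of another network's output.

    features: (T, D) activations from, e.g., a CNN over image regions;
    query: (D,) current RNN hidden state.
    Returns a context vector: an attention-weighted average of features.
    """
    scores = features @ query           # one relevance score per position
    weights = softmax(scores)           # attention distribution
    return weights @ features, weights  # context fed back into the RNN

rng = np.random.default_rng(0)
feats = rng.standard_normal((5, 8))    # 5 positions, 8-dim features
context, w = attend(feats, rng.standard_normal(8))
```

Because the context is a differentiable blend of all positions, gradients flow back into both networks, so the RNN learns where to look as a side effect of training on the main task.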
Source: https://distill.pub/2016/augmented-rnns/

Attention is becoming one of the most important elements of modern neural network architectures but, at the same time, we are just getting started in this area. The fascinating feature of attention is that it is not a brand-new neural network architecture but a way to augment existing architectures with new capabilities. Attention-based architectures like transformers have become one of the most important developments of recent years in deep learning, and we can't wait to see what's next.
Translated from: https://medium.com/dataseries/understanding-attention-in-recurrent-neural-networks-d8ae05f558f4