No, Machine Learning is not just glorified Statistics
Study notes, for reference only.
Reposted from: No, Machine Learning is not just glorified Statistics
Contents
- No, Machine Learning is not just glorified Statistics
- Machine Learning = Representation + Evaluation + Optimization
- Regression Over 100 Million Variables – No Problem?
No, Machine Learning is not just glorified Statistics
This meme has been all over social media lately, producing appreciative chuckles across the internet as the hype around deep learning begins to subside. The sentiment that machine learning is really nothing to get excited about, or that it’s just a redressing of age-old statistical techniques, is growing increasingly ubiquitous; the trouble is that it isn’t true.
I get it: it’s not fashionable to be part of the overly enthusiastic, hype-drunk crowd of deep learning evangelists. ML experts who in 2013 preached deep learning from the rooftops now use the term only with a hint of chagrin, preferring instead to downplay the power of modern neural networks lest they be associated with the scores of people who still seem to think that import keras is the leap over every hurdle, and that they, in knowing it, have some tremendous advantage over their competition.
Machine Learning = Representation + Evaluation + Optimization
Machine learning is a class of computational algorithms which iteratively “l(fā)earn” an approximation to some function. Pedro Domingos, a professor of computer science at the University of Washington, laid out three components that make up a machine learning algorithm: representation, evaluation, and optimization.
Representation involves the transformation of inputs from one space to another, more useful space which can be more easily interpreted. Think of this in the context of a Convolutional Neural Network. Raw pixels are not useful for distinguishing a dog from a cat, so we transform them into a more useful representation (e.g., the class probabilities produced by a softmax output) which can be interpreted and evaluated.
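To make this concrete, here is a minimal sketch (assuming TensorFlow’s Keras API; the input shape and layer sizes are illustrative, not from the article) of a small ConvNet read as a representation function that maps raw pixels into a more interpretable space:

```python
# A minimal sketch: a ConvNet as a representation function that maps
# raw pixels to a more interpretable space (class probabilities).
# Assumes TensorFlow/Keras; input shape and layer sizes are illustrative.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(64, 64, 3)),          # raw pixels: hard to interpret directly
    layers.Conv2D(16, 3, activation="relu"),  # learned local feature detectors
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(2, activation="softmax"),    # interpretable space: P(cat), P(dog)
])
model.summary()  # each layer is one step of the representation transform
```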
Evaluation is essentially the loss function. How effectively did your algorithm transform your data to a more useful space? How closely did your softmax output resemble your one-hot encoded labels (classification)? Did you correctly predict the next word in the unrolled text sequence (text RNN)? How far did your latent distribution diverge from a unit Gaussian (VAE)? These questions tell you how well your representation function is working; more importantly, they define what it will learn to do.
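As a toy illustration of evaluation as a loss function (plain NumPy, with made-up numbers), cross-entropy scores how closely a softmax output resembles a one-hot encoded label:

```python
# A sketch of evaluation as a loss function: cross-entropy between a
# softmax output and a one-hot label. Plain NumPy; the numbers are made up.
import numpy as np

softmax_output = np.array([0.7, 0.2, 0.1])  # model's predicted class probabilities
one_hot_label  = np.array([1.0, 0.0, 0.0])  # ground truth: class 0

# Cross-entropy: -sum(y * log(p)). Small when the prediction matches the label.
loss = -np.sum(one_hot_label * np.log(softmax_output))
print(loss)  # ~0.357: a fairly confident, correct prediction
```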
Optimization is the last piece of the puzzle. Once you have the evaluation component, you can optimize the representation function in order to improve your evaluation metric. In neural networks, this usually means using some variant of stochastic gradient descent to update the weights and biases of your network according to some defined loss function. And voila! You have the world’s best image classifier (at least, if you’re Geoffrey Hinton in 2012, you do).
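A bare-bones sketch of that update loop, using plain gradient descent on a single toy parameter rather than the stochastic variant over a full network’s weights and biases:

```python
# A bare-bones sketch of gradient descent: repeatedly nudge a parameter
# against the gradient of the loss. One toy parameter here, not millions.
w = 5.0                 # initial parameter (weight)
learning_rate = 0.1

def loss(w):            # toy loss, minimized at w = 2
    return (w - 2.0) ** 2

def grad(w):            # analytic gradient of the toy loss
    return 2.0 * (w - 2.0)

for step in range(50):
    w -= learning_rate * grad(w)   # the gradient-descent update rule

print(w, loss(w))       # w converges toward 2, loss toward 0
```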
Regression Over 100 Million Variables – No Problem?
Let me also point out the difference in scale between deep nets and traditional statistical models. Deep neural networks are huge. The VGG-16 ConvNet architecture, for example, has approximately 138 million parameters. How do you think your average academic advisor would respond to a student wanting to perform a multiple regression of over 100 million variables? The idea is ludicrous. That’s because training VGG-16 is not multiple regression: it’s machine learning.
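If you want to check that figure yourself, a quick sketch (assuming the VGG16 model shipped with tf.keras.applications) counts the parameters:

```python
# Counting VGG-16's parameters. Assumes tf.keras.applications is available;
# weights=None builds the architecture without downloading pretrained weights.
import tensorflow as tf

vgg16 = tf.keras.applications.VGG16(weights=None)
print(f"{vgg16.count_params():,}")  # roughly 138 million parameters
```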