Deep Learning Side by Side: Julia vs. Python
Julia could possibly be the biggest threat to Python. For a variety of applications, Julia is hands-down faster than Python and is almost as fast as C. Julia also offers features like multiple dispatch and metaprogramming that give it an edge over Python.
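As a quick taste of multiple dispatch, here is a minimal sketch of my own (not from the original article): one generic function whose method is selected by the types of its arguments, including a user-defined type.

# A minimal sketch of multiple dispatch: one generic function,
# with the method chosen by the runtime types of the arguments.
area(r::Real) = π * r^2            # circle of radius r
area(w::Real, h::Real) = w * h     # rectangle

struct Square
    side::Float64
end
area(s::Square) = s.side^2         # dispatch on a user-defined type

println(area(2.0))        # 12.566370614359172
println(area(3.0, 4.0))   # 12.0
println(area(Square(5)))  # 25.0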
At the same time, Python is established, widely used, and has a variety of time-tested packages. Whether to switch to Julia is a hard question to answer. Often the answer is a frustrating, “It depends.”
To help showcase Julia and to address the question of whether to use it, I’ve taken samples of deep learning code from both languages and placed them side by side for easy comparison. I will walk through training a VGG19 model on the CIFAR10 dataset.
Models
Photo by Tom Parkes on Unsplash

Deep learning models can be huge and often take a lot of work to define, especially when they contain specialized layers like ResNet [1]. We will use a medium-sized model (no pun intended), VGG19, for this comparison [2].
VGG19 In Python
I’ve chosen Keras for the Python implementation because its lightweight, flexible design makes for a fair comparison with Julia.
from keras.models import Sequential
from keras.layers import Dense, Conv2D, MaxPool2D, Flatten

vgg19 = Sequential()
vgg19.add(Conv2D(input_shape=(32,32,3),filters=64,kernel_size=(3,3),padding="same", activation="relu"))  # CIFAR10 images are 32x32x3
vgg19.add(Conv2D(filters=64,kernel_size=(3,3),padding="same", activation="relu"))
vgg19.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
vgg19.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
vgg19.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
vgg19.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
vgg19.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
vgg19.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
vgg19.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
vgg19.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
vgg19.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
vgg19.add(Flatten())
vgg19.add(Dense(units=4096, activation="relu"))
vgg19.add(Dense(units=4096,activation="relu"))
vgg19.add(Dense(units=10, activation="softmax"))
# Code from Rohit Thakur on GitHub
The task here is to stack layer after layer of deep learning machinery. Python handles this well. The syntax is simple and easy to understand. While the .add() function might be a little ugly, it is obvious what it is doing, and it is clear in the code what each model layer does (convolution, pooling, flattening, and so on).
VGG19 In Julia
using Flux

vgg19() = Chain(
Conv((3, 3), 3 => 64, relu, pad=(1, 1), stride=(1, 1)),
Conv((3, 3), 64 => 64, relu, pad=(1, 1), stride=(1, 1)),
MaxPool((2,2)),
Conv((3, 3), 64 => 128, relu, pad=(1, 1), stride=(1, 1)),
Conv((3, 3), 128 => 128, relu, pad=(1, 1), stride=(1, 1)),
MaxPool((2,2)),
Conv((3, 3), 128 => 256, relu, pad=(1, 1), stride=(1, 1)),
Conv((3, 3), 256 => 256, relu, pad=(1, 1), stride=(1, 1)),
Conv((3, 3), 256 => 256, relu, pad=(1, 1), stride=(1, 1)),
MaxPool((2,2)),
Conv((3, 3), 256 => 512, relu, pad=(1, 1), stride=(1, 1)),
Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
MaxPool((2,2)),
Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
BatchNorm(512),
MaxPool((2,2)),
flatten,
Dense(512, 4096, relu),
Dropout(0.5),
Dense(4096, 4096, relu),
Dropout(0.5),
Dense(4096, 10),
softmax
)
# Code from the Flux Model Zoo on GitHub
Discussion
At a glance, Julia looks slightly less cluttered than Python. The import statements are a little cleaner and the code is a little easier to read. Like Python, it is clear what each layer does. The Chain type is a little ambiguous, but it is pretty clear that it concatenates the layers together.
Something to notice is that there is no model class. In fact, Julia is not object oriented, so each layer is a type instead of a class. This is worth noting because it emphasizes how lightweight the Julia model is: each of these layers is defined independently and then chained together without any class structure to control how they interact.
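To make this concrete, here is a small sketch of my own (the tiny model below is hypothetical, not part of the article's code) showing that Chain simply composes callables — plain functions and layer types mix freely:

using Flux

# Chain composes callables; an anonymous function can sit
# between built-in layer types with no class machinery.
tiny = Chain(
    x -> 2 .* x,          # a plain function acting as a "layer"
    Dense(4, 3, relu),    # a built-in layer type
    softmax               # another plain function
)

x = rand(Float32, 4)
println(tiny(x))  # a length-3 vector that sums to 1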
However, avoiding a little clutter doesn’t really matter when training giant models. Python’s advantage here is its huge amount of support for troubleshooting and working through bugs. The documentation is excellent and there are hundreds of VGG19 examples online. Contrast this with Julia, where there are maybe five unique VGG19 examples online.
Data Processing
Photo by Sandro Katalina on Unsplash

For data processing we will look at CIFAR10, the dataset commonly associated with VGG19.
Data Processing In Python
from keras.datasets import cifar10
from keras.utils import to_categorical

(X, Y), (tsX, tsY) = cifar10.load_data()

# Use a one-hot encoding
Y = to_categorical(Y)
tsY = to_categorical(tsY)

# Change datatype to float
X = X.astype('float32')
tsX = tsX.astype('float32')
# Scale X and tsX so each entry is between 0 and 1
X = X / 255.0
tsX = tsX / 255.0
In order to train the model on image data, the images must be put into the correct format, which takes only a few lines of code. The images are loaded into variables along with their labels. To make classification easier, the labels are translated into a one-hot encoding. This is relatively straightforward in Python.
Data Processing In Julia
using MLDatasets: CIFAR10
using Flux: onehotbatch

# Data comes pre-normalized in Julia
trainX, trainY = CIFAR10.traindata(Float64)
testX, testY = CIFAR10.testdata(Float64)

# One-hot encode labels
trainY = onehotbatch(trainY, 0:9)
testY = onehotbatch(testY, 0:9)
Julia requires the same kind of image processing as Python to prepare images for the training process. The code looks extremely similar and does not appear to favor either language.
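For intuition about what onehotbatch returns, here is a tiny sketch of my own (the labels below are made up):

using Flux: onehotbatch

labels = [0, 3, 9]
Y = onehotbatch(labels, 0:9)
# Y is a 10×3 one-hot matrix: each column contains a single 1,
# in the row corresponding to that column's label.
println(size(Y))  # (10, 3)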
Training
Photo by Zii Miller on Unsplash

Next we will look at the model training loop.
Training In Python
from keras.optimizers import SGD

optimizer = SGD(lr=0.001, momentum=0.9)
vgg19.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
history = vgg19.fit(X, Y, epochs=100, batch_size=64, validation_data=(tsX, tsY), verbose=0)
Training In Julia
using Flux: crossentropy, @epochs, params, Momentum
using Flux.Data: DataLoader

model = vgg19()
opt = Momentum(.001, .9)
loss(x, y) = crossentropy(model(x), y)
data = DataLoader(trainX, trainY, batchsize=64)
@epochs 100 Flux.train!(loss, params(model), data, opt)
The code here is about equally verbose, but the differences between the languages show. In Python, model.fit returns a history object containing accuracy and loss evaluations, and keyword arguments automate the optimization process for you. Julia is much more bare-bones: the training algorithm requires the user to provide their own loss function, optimizer, and an iterable of data batches, along with the model.
The Python implementation is much more user-friendly. The training process is easy and produces useful output. Julia asks a little more of the user. At the same time, Julia is more abstract: it allows any optimizer and loss function, and the user can define a loss function any way they want without consulting a list of built-in loss functions. This kind of abstraction is typical of Julia developers, who work to make code as abstract and generic as possible.
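As an illustration of that flexibility, here is a sketch of my own (the regularized loss below is hypothetical, not from the article): any ordinary Julia function of (x, y) can serve as the loss passed to Flux.train!.

using Flux

# A custom loss is just a function; here, cross-entropy plus a
# hypothetical L2 penalty on the model's parameters.
l2_penalty(m) = sum(p -> sum(abs2, p), Flux.params(m))
myloss(x, y) = Flux.crossentropy(model(x), y) + 1e-4 * l2_penalty(model)

# Drop-in replacement in the training loop:
# @epochs 100 Flux.train!(myloss, Flux.params(model), data, opt)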
For this reason, Keras is more practical for implementing known techniques and standard model training, while Flux is better suited to developing new techniques.
Speed
Photo by Florian Steciuk on Unsplash

Unfortunately, there is no published benchmark directly comparing Flux and Keras. There are a few resources that give us an idea, and we can use TensorFlow’s speed as a reference.
One benchmark found that, on both the GPU and the CPU, Flux is barely slower than TensorFlow. Keras has likewise been shown to be slightly slower than TensorFlow on the GPU. Unfortunately this doesn’t give us a clear winner, but it suggests that the speeds of the two packages are similar.
The Flux benchmark above was done before a major rework of Flux’s automatic differentiation package. The new package, Zygote.jl, sped up computations considerably. A more recent benchmark found that the improved Flux is faster than TensorFlow on the CPU. This suggests that Flux could be faster on the GPU as well, though winning on the CPU doesn’t necessarily imply a victory on the GPU. At the same time, it is still good evidence that Flux would beat Keras on the CPU.
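If you want a rough number on your own hardware, here is a timing sketch of my own using BenchmarkTools (it assumes the vgg19() model defined earlier; the batch size and random data are made up):

using BenchmarkTools
using Flux

model = vgg19()                   # the Chain defined above
x = rand(Float32, 32, 32, 3, 64)  # one fake CIFAR10-sized batch
@btime $model($x)                 # time a single forward pass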
Who Wins?
Both languages perform well in every category, and the differences between them are largely matters of taste. However, each language has an area where it holds the edge.
Python’s Edge
Python has a huge support community and offers time-tested libraries. It is reliable and standard, and deep learning in Python is much more common. Developers who use Python for deep learning will fit in well with the deep learning community.
Julia’s Edge
Julia is cleaner and more abstract. The deep learning code could well become faster, and improvements are in the works; Julia’s edge is its potential. Deep learning libraries in Python are much more complete and don’t have as much room to grow. Julia, with its richer base language, has the potential for many new ideas and much faster code in the future. Developers who adopt Julia will be closer to the frontier of programming, but will have to forge their own path.
Winner
Deep learning is difficult and requires a lot of troubleshooting, and it can be very hard to reach state-of-the-art accuracy. For this reason, Python wins this comparison. Julia does not yet have a strong level of online support for deep learning troubleshooting, which can make writing complicated deep learning scripts very difficult. Julia is excellent for many applications, but for deep learning, I would recommend Python.
Translated from: https://towardsdatascience.com/deep-learning-side-by-side-julia-v-s-python-5ac0645587f6