Visualizing Convolutional Neural Networks Using PyTorch
Filter and feature map. Image by the author.

When dealing with images and image data, CNNs are the go-to architectures. Convolutional neural networks have provided many state-of-the-art solutions in deep learning and computer vision. Image recognition, object detection, and self-driving cars would not be possible without them.
But when it comes to how CNNs see and recognize images the way they do, things can be trickier.
How does a CNN decide whether an image is a cat or a dog?
What makes a CNN more powerful than other models when it comes to image classification problems?
How, and what, do they see in an image?
These were some of the questions I had back when I first learned about CNNs. The questions only multiply as you dive deeper.
Back then I had heard the terms filters and feature maps, but did not know what they were or what they did. Later I learned what they are, but not what they look like. Now I know. When dealing with deep convolutional networks, filters and feature maps matter: the filters are what produce the feature maps, and the feature maps are what the model actually sees.
What are Filters and Feature Maps in a CNN?
Filters are sets of weights that are learned using the backpropagation algorithm. If you do a lot of practical deep learning coding, you may know them as kernels. Filter sizes are commonly 3×3 or 5×5, or even 7×7.
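As a quick illustration of this, a convolutional layer's filters are just a learnable weight tensor whose shape encodes the filter count and size; a minimal sketch in PyTorch:

```python
import torch.nn as nn

# A 7x7 convolution over RGB input, producing 64 output channels,
# holds 64 filters of shape 3x7x7 as its learnable weights.
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=7)
print(conv.weight.shape)  # torch.Size([64, 3, 7, 7])
```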
Filters in a CNN layer learn to detect abstract concepts such as the boundary of a face or the edges of a building. By stacking more and more CNN layers on top of each other, we can extract more abstract and in-depth information from a CNN.
7×7 and 3×3 filters

Feature maps are the results we get after applying a filter across the pixel values of the image. This is what the model sees in an image, and the process is called the convolution operation. The reason for visualizing the feature maps is to gain a deeper understanding of CNNs.
Feature map

Selecting the Model
We will use the ResNet-50 neural network model for visualizing filters and feature maps. ResNet-50 is not the easiest choice for this: resnet models in general are a bit complex, and traversing their inner convolutional layers can be quite difficult. But that is the point. You will learn how to access the inner convolutional layers of a tricky architecture, and in the future you will feel much more comfortable working with similar or more complex ones.
The image I used is a photo from Pexels. It is an image I collected to train my face-detection classifier.
Image from Pexels

Model Structure
At first glance, the model structure can look intimidating, but it is really easy to get what we want. Once you know how to extract the layers of this model, you will be able to extract layers from more complex models too. Below is the model structure.
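As a quick sketch of how this printout is produced (the pretrained flag matches older torchvision versions):

```python
import torchvision.models as models

# Load a ResNet-50 pre-trained on ImageNet; printing the model
# dumps the nested module structure shown below.
resnet = models.resnet50(pretrained=True)
print(resnet)
```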
```
ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): Bottleneck(
      (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    ...
    (2): Bottleneck(
      (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
  (fc): Linear(in_features=2048, out_features=1000, bias=True)
)
```
Extracting the CNN Layers
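The code gist for this section did not survive the page conversion, so here is a minimal reconstruction consistent with the line references below, assuming the resnet model loaded earlier (the exact line numbers of the original gist are approximate):

```python
import torch.nn as nn

model_weights = []  # the weight tensor of every conv layer found
conv_layers = []    # the conv layer modules themselves
counter = 0         # counts the convolutional layers (line 4)

for child in resnet.children():                    # walk the top level (line 6)
    if isinstance(child, nn.Conv2d):               # direct conv child (line 7)
        counter += 1
        model_weights.append(child.weight)
        conv_layers.append(child)
    elif isinstance(child, nn.Sequential):
        for bottleneck in child.children():        # Bottlenecks inside (line 10)
            for layer in bottleneck.children():    # convs inside each Bottleneck
                if isinstance(layer, nn.Conv2d):
                    counter += 1
                    model_weights.append(layer.weight)
                    conv_layers.append(layer)

# 49 for ResNet-50: the down-sampling convs sit one level deeper and are skipped.
print(f"Total convolutional layers: {counter}")
```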
First, at line 4, we initialize a counter variable to keep track of the number of convolutional layers.
Starting from line 6, we go through all the layers of the ResNet-50 model.
- Specifically, we check for convolutional layers at three levels of nesting:
Line 7 checks whether any direct child of the model is a convolutional layer.
Then, from line 10, we check whether any Bottleneck layer inside the Sequential blocks contains convolutional layers.
If either of the two conditions is satisfied, we append that child node and its weights to conv_layers and model_weights respectively.
The above code is simple and self-explanatory, but it is limited to pre-existing models such as the other resnet variants (resnet-18, 34, 101, 152). For a custom model, things are different: say there is a Sequential layer inside another Sequential layer; a CNN layer buried there will go unchecked by the program. This is where the extractor.py module I wrote can be useful.
The Extractor Class
The Extractor class can find every CNN layer (except down-sampling layers), including their weights, in any resnet model and in almost any custom resnet or vgg model. It is not limited to CNN layers: it can find Linear layers, and if the name of the down-sampling layer is given, it can find those too. It can also report useful information such as the number of CNN, Linear, and Sequential layers in a model.
How to Use
In the Extractor class, the model parameter takes in a model, and the DS_layer_name parameter is optional. DS_layer_name is used to find the down-sampling layer; in resnet models the name is normally 'downsample', so that is kept as the default.
```python
extractor = Extractor(model=resnet, DS_layer_name='downsample')
```

The call extractor.activate() activates the program.
You can get the relevant details in a dictionary by calling extractor.info():
```
{'Down-sample layers name': 'downsample',
 'Total CNN Layers': 49,
 'Total Sequential Layers': 4,
 'Total Downsampling Layers': 4,
 'Total Linear Layers': 1,
 'Total number of Bottleneck and Basicblock': 16,
 'Total Execution time': '0.00137 sec'}
```

Accessing the Weights and the Layers
```
extractor.CNN_layers -----> all the CNN layers in the model
extractor.Linear_layers --> all the Linear layers in the model
extractor.DS_layers ------> all the down-sample layers, if there are any
extractor.CNN_weights ----> the weights of every CNN layer in the model
extractor.Linear_weights -> the weights of every Linear layer in the model
```
Without writing any code of your own, you can get the CNN and Linear layers and their weights from almost every resnet model. Below is roughly what the class methods look like; there is more, so do go through the entire script.
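The gist with the class body did not survive the conversion; the following is a condensed sketch of what the core recursion could look like, using only the attribute and method names mentioned above. This is my reconstruction, not the author's actual extractor.py, which does more:

```python
import torch.nn as nn

class Extractor:
    """Recursively collect Conv2d and Linear layers (and their weights),
    however deeply the Sequential blocks are nested."""

    def __init__(self, model, DS_layer_name='downsample'):
        self.model = model
        self.DS_layer_name = DS_layer_name
        self.CNN_layers, self.CNN_weights = [], []
        self.Linear_layers, self.Linear_weights = [], []
        self.DS_layers = []

    def _extract(self, module):
        for name, child in module.named_children():
            if name == self.DS_layer_name:
                self.DS_layers.append(child)       # keep down-sample blocks apart
            elif isinstance(child, nn.Conv2d):
                self.CNN_layers.append(child)
                self.CNN_weights.append(child.weight)
            elif isinstance(child, nn.Linear):
                self.Linear_layers.append(child)
                self.Linear_weights.append(child.weight)
            else:
                self._extract(child)               # recurse into nested containers

    def activate(self):
        self._extract(self.model)

    def info(self):
        return {'Down-sample layers name': self.DS_layer_name,
                'Total CNN Layers': len(self.CNN_layers),
                'Total Downsampling Layers': len(self.DS_layers),
                'Total Linear Layers': len(self.Linear_layers)}
```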
Visualizing
Convolutional Layer Filters
Here we will visualize the convolutional layer filters. For simplicity, we will only visualize the filters of the first convolutional layer.
We loop through the model weights of the first layer. For the first layer, the filter size is 7×7 and there are 64 of them, one per output channel.
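A minimal plotting sketch, assuming matplotlib and the model_weights list built earlier; only the first input channel of each 7×7 filter is drawn:

```python
import matplotlib.pyplot as plt

# The 64 filters of the first conv layer, arranged in an 8x8 grid.
plt.figure(figsize=(20, 17))
for i, filt in enumerate(model_weights[0]):
    plt.subplot(8, 8, i + 1)
    plt.imshow(filt[0, :, :].detach(), cmap='gray')  # first input channel only
    plt.axis('off')
plt.show()
```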
7×7 filter

7×7 filters from the trained ResNet-50 model

The pixel values in each small box lie between 0 and 255, 0 being completely black and 255 white. The range can differ, for example 0 to 1, or -1 to 1 with a mean of 0.
The Feature Maps
Transforming
To visualize the feature maps, the image first needs to be converted to a tensor. Using the transforms from torchvision, the image can be normalized and transformed into a tensor.
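A sketch of the transform pipeline (the file name and the ImageNet normalization statistics are assumptions on my part; the 128×128 size matches the shapes quoted below):

```python
from PIL import Image
from torchvision import transforms

img = Image.open('face.jpg')  # hypothetical path to the Pexels photo

transform = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # assumed ImageNet stats
                         std=[0.229, 0.224, 0.225]),
])

img = transform(img).unsqueeze(0)  # add the batch dimension
print(img.size())                  # torch.Size([1, 3, 128, 128])
```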
The last line applies the transforms to the image. You can assign the result to a new variable instead, but make sure to change the variable name everywhere it is used. The .unsqueeze(0) adds an extra dimension to the tensor img. Adding the batch dimension is an important step: the size of the image becomes [1, 3, 128, 128] instead of [3, 128, 128], indicating that there is only one image in the batch.
Passing the Input Image Through Each Convolutional Layer
The code below passes the image through each convolutional layer.
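The gist is missing here as well; this reconstruction matches the step-by-step description that follows. Note that chaining bare convolutional layers like this only works because the down-sampling convolutions were excluded, so the channel counts line up:

```python
# line 1: the image goes into the first convolutional layer
featuremaps = [conv_layers[0](img)]

# pass each layer's output on to the next convolutional layer
for layer in conv_layers[1:]:
    featuremaps.append(layer(featuremaps[-1]))
```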
We first give the image as input to the first convolutional layer. After that, we use a for loop to pass the previous layer's output to the next layer, until we reach the last convolutional layer.
At line 1, we give the image as input to the first convolutional layer.
Then we iterate from the second to the last convolutional layer using a for loop.
We give the previous layer's output as the input to the next convolutional layer (featuremaps[-1]).
Also, we append each layer's output to the featuremaps list.
Visualizing the Feature Maps
This is the final step: writing the code to visualize the feature maps. Notice that the final CNN layers have many feature maps, in the range of 512 to 2048, but we will visualize only 64 feature maps from each layer; any more than that would make the output really cluttered.
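Again, a reconstruction matching the walkthrough below (the figure size and output file names are my own choices):

```python
# Plot up to 64 feature maps per convolutional layer.
for x in range(len(featuremaps)):
    layers = featuremaps[x][0, :, :, :].detach()  # drop the batch dimension
    fig = plt.figure(figsize=(30, 30))
    for i, fmap in enumerate(layers):
        if i == 64:                    # stop at the 64th feature map
            break
        plt.subplot(8, 8, i + 1)
        plt.imshow(fmap, cmap='gray')
        plt.axis('off')
    plt.savefig(f'layer_{x}.png')      # save if necessary
    plt.close(fig)
```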
Starting from line 2, we iterate through the featuremaps.
Then we get each layer's maps as featuremaps[x][0, :, :, :].detach().
Starting from line 5, we iterate through the feature maps of each layer, breaking out of the loop once we reach the 64th feature map.
- After that, we plot the feature maps and save them if necessary.
Results
Feature maps from the first convolutional layer of the ResNet-50 model

You can see that different filters focus on different aspects while creating the feature map of an image.
Some feature maps focus on the background of the image. Others create an outline of the image. A few filters create feature maps where the background is dark but the face is bright. This is due to the corresponding weights of the filters. The image above makes it clear that in these early layers, the neural network gets to see very detailed feature maps of the input image.
Let's take a look at a few other feature maps.
Feature maps from the 20th and 10th convolutional layers of the ResNet-50 model

Feature maps from the 40th and 30th convolutional layers of the ResNet-50 model

You can observe that as the image progresses through the layers, the details slowly disappear. The maps look like noise, yet surely there is a pattern in those feature maps that human eyes cannot detect, but a neural network can.
By the time the image reaches the last convolutional layer, it is impossible for a human to tell what it is. These last-layer outputs, however, are really important for the fully connected neurons that form the classification layers of a convolutional neural network.
Conclusions
A big thanks to @sovitrath5, the author of the machine learning blog DebuggerCafe, for the content.
Translated from: https://medium.com/swlh/visualizing-filters-and-feature-maps-in-convolutional-neural-networks-using-pytorch-110d4c1cfdeb