cv2 and interpolate: interpolation and align_corners

Published: 2023/12/20


torch interpolate

torch.nn.functional.interpolate(input, size=None, scale_factor=None, mode='nearest', align_corners=None, recompute_scale_factor=None)

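input (Tensor): the input data.
size (int or Tuple[int] or Tuple[int, int] or Tuple[int, int, int]): the spatial size of the output.
scale_factor (float or Tuple[float]): the scaling factor.
mode (str): the sampling algorithm.
align_corners (bool, optional): Geometrically, the input and output pixels are treated as squares rather than points. If set to True, the input and output tensors are aligned by the center points of their corner pixels, which preserves the values at the corner pixels. If set to False, the input and output tensors are aligned by the corner points of their corner pixels, and out-of-boundary values are filled by edge-value padding; this makes the operation independent of the input size when scale_factor is kept the same. It only takes effect when mode is 'linear', 'bilinear', 'bicubic' or 'trilinear'. The default is None, which behaves like False.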

Corner pixels: the pixels at the four corners of the image after scaling.

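Notes:

Only one of scale_factor and size may be set.

When scale_factor is set, the output size is rounded down. For example, with an input of size [2, 2] and scale_factor=2.1, the output size is floor([4.2, 4.2]) = [4, 4].

When scale_factor is set together with recompute_scale_factor, the scale_factor is recomputed from the actual output size.

The reason to use scale_factor instead of size is that scale_factor does not hard-code the output size, whereas size fixes it, which causes problems when handling input images of multiple resolutions.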
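The rounding and recompute_scale_factor notes above can be sketched in plain Python. This is a minimal illustration that does not call PyTorch: the helper names `output_size` and `recomputed_scale` are hypothetical, and the floor rule is the one stated in the note.

```python
import math


def output_size(input_size, scale_factor):
    """Floor each scaled spatial dimension, as described in the note above."""
    return tuple(math.floor(dim * scale_factor) for dim in input_size)


def recomputed_scale(input_size, out_size):
    """Scale implied by the actual sizes (hypothetical helper mirroring
    what recompute_scale_factor=True is described as doing)."""
    return tuple(o / i for i, o in zip(input_size, out_size))


in_size = (2, 2)
out_size = output_size(in_size, 2.1)          # floor(4.2) per dimension
effective = recomputed_scale(in_size, out_size)

print(out_size)    # (4, 4)
print(effective)   # (2.0, 2.0), not the 2.1 that was passed in
```

The gap between 2.1 and 2.0 is why results can differ slightly depending on whether the passed-in or the recomputed scale is used for the coordinate mapping.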

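input: the input Tensor.
size: the spatial size of the output Tensor after interpolation. The spatial size is what remains after removing the batch and channel dimensions; for NCHW, the spatial size is HW.
scale_factor (float or Tuple[float]): the multiplier for the spatial size. If it is a tuple, its length must match the number of spatial dimensions of the input.
mode (str): the upsampling mode, one of 'nearest' | 'linear' | 'bilinear' | 'bicubic' | 'trilinear' | 'area'. The default is 'nearest'.
align_corners (bool): see the description above. It only takes effect when mode is 'linear' | 'bilinear' | 'bicubic' | 'trilinear', and by default it behaves as False.
recompute_scale_factor (bool): whether to recompute the scale_factor used in the interpolation calculation. When scale_factor is passed as a parameter, it is used to compute output_size. If recompute_scale_factor is False or not specified, the passed-in scale_factor is used in the interpolation calculation. Otherwise, a new scale_factor is computed from the output and input sizes and used for interpolation (that is, the result is equivalent to passing output_size explicitly). Note that when scale_factor is a float, the recomputed scale_factor may differ from the one passed in because of rounding and precision issues.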

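Effect of opset_version on ONNX export:

With opset 9 and opset 10 the operation is exported as Upsample, while from opset 11 it becomes Resize.

Support for interpolate under different opset versions: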

F.interpolate	nearest	bilinear, align_corners=False	bilinear, align_corners=True	bicubic

op-9	Y	Y	N	N
op-10	Y	Y	N	N
op-11	Y	Y	Y	Y

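Behavior of align_corners:

With align_corners = True, pixels are treated as points on the grid and the corner pixels of input and output are aligned, so the sample points are equally spaced.
With align_corners = False, pixels are treated as squares with their values at the centers. The corner outputs still take the values of the original corner pixels, but the sample points in between are placed by the center-aligned rule, so the resulting values are not equally spaced.

OpenCV and PIL behave like align_corners=False, mxnet behaves like True, and torch and tensorflow let you choose.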

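First consider align_corners=False, which is the default in PyTorch's interpolate. Under this setting, each pixel value is regarded as sitting at the center of its pixel square:

[Figure: 3*3 input, values at pixel centers]

Upsampling it by a factor of two gives:

[Figure: 6*6 output, values at pixel centers]

Look first at the pixels inside the green box: they strictly follow the definition of bilinear interpolation. The four corner points keep the values of the original image. The points along the edges, whose sample positions fall outside the boundary, are filled from the nearest edge values. Viewed globally, the interior and the border therefore follow rather different rules.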

# align_corners = False
# x_ori is the coordinate in the original image
# x_up is the coordinate in the upsampled image
x_ori = (x_up + 0.5) / factor - 0.5
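As a quick check of the formula above, the following sketch evaluates it for a length-3 axis upsampled by a factor of 2. The function name `src_coord_half_pixel` is hypothetical, and the clamping step is a simplified stand-in for the edge-value padding described earlier.

```python
def src_coord_half_pixel(x_up, factor):
    """Source coordinate for align_corners=False (formula from the text)."""
    return (x_up + 0.5) / factor - 0.5


in_size, factor = 3, 2
coords = [src_coord_half_pixel(x, factor) for x in range(in_size * factor)]

# Coordinates outside [0, in_size - 1] are clamped, which reproduces the
# "fill from the edge value" behaviour at the border (simplified assumption).
clamped = [min(max(c, 0.0), in_size - 1.0) for c in coords]

print(coords)   # [-0.25, 0.25, 0.75, 1.25, 1.75, 2.25]
print(clamped)  # [0.0, 0.25, 0.75, 1.25, 1.75, 2.0]
```

The first and last coordinates fall outside the source range, which is why the border outputs simply copy the edge values while the interior follows ordinary linear interpolation.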

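Next, look at align_corners=True, visualizing the upsampling with the same drawing convention (5*5):

[Figure: upsampled output under align_corners=True, drawn with values at pixel centers]

Here the pixels show no sense of alignment at all, which is painful to look at. In fact, under the align_corners=True view of the world, the drawing above is wrong. In that view, pixel values sit on the grid points:

[Figure: 3*3 input, values on grid points]

Upsampling it by a factor of two then gives the following result:

[Figure: upsampled output, values on grid points]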
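The grid-point view can be made concrete with the usual align_corners=True mapping, x_ori = x_up * (in_size - 1) / (out_size - 1). The sketch below is an illustration only: the function name `src_coord_align_corners` is hypothetical, and the output lengths 5 and 6 are chosen for the example.

```python
def src_coord_align_corners(x_up, in_size, out_size):
    """Source coordinate when the first and last samples are pinned to the
    first and last input pixels (align_corners=True)."""
    if out_size == 1:
        return 0.0
    return x_up * (in_size - 1) / (out_size - 1)


# 3 grid points refined to 5: every new point lies halfway between neighbours.
grid_view = [src_coord_align_corners(x, 3, 5) for x in range(5)]

# 3 points resized to 6 (what scale_factor=2 yields): the step is 2/5.
six_points = [src_coord_align_corners(x, 3, 6) for x in range(6)]

print(grid_view)   # [0.0, 0.5, 1.0, 1.5, 2.0]
print(six_points)  # [0.0, 0.4, 0.8, 1.2, 1.6, 2.0]
```

In both cases the end points map exactly onto the first and last input pixels and the spacing is uniform, which matches the "equally spaced" behavior seen in the experiments.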

1. Experiment with the align_corners parameter (1*1 to 1*4)

import torch
import torch.nn.functional as F

s = [4.]  # the function needs floating-point input, hence the decimal point
s = torch.tensor(s).reshape(1, 1, 1, 1)  # reshape s to (N, C, H, W)
print(s)
# tensor([[[[4.]]]])

x = F.interpolate(s, size=(1, 4), mode='bilinear', align_corners=False)
print(x)
# tensor([[[[4., 4., 4., 4.]]]])

x = F.interpolate(s, size=(32, 32), mode='bilinear', align_corners=False)
print(x)
# tensor([[[[4., 4., 4., ..., 4., 4., 4.],
#           [4., 4., 4., ..., 4., 4., 4.],
#           [4., 4., 4., ..., 4., 4., 4.],
#           ...,
#           [4., 4., 4., ..., 4., 4., 4.],
#           [4., 4., 4., ..., 4., 4., 4.],
#           [4., 4., 4., ..., 4., 4., 4.]]]])

2. Experiment with the align_corners parameter (2*2 to 4*4)

import torch
import torch.nn.functional as F

a = [[1., 2.], [4., 5.]]
a = torch.tensor(a).reshape(1, 1, 2, 2)

x = F.interpolate(a, scale_factor=2, mode='bilinear', align_corners=True)
print(x)
# tensor([[[[1.0000, 1.3333, 1.6667, 2.0000],
#           [2.0000, 2.3333, 2.6667, 3.0000],
#           [3.0000, 3.3333, 3.6667, 4.0000],
#           [4.0000, 4.3333, 4.6667, 5.0000]]]])
# Equally spaced: pixels are treated as points on the grid and the corner pixels are aligned.

y = F.interpolate(a, scale_factor=2, mode='bilinear', align_corners=False)
print(y)
# tensor([[[[1.0000, 1.2500, 1.7500, 2.0000],
#           [1.7500, 2.0000, 2.5000, 2.7500],
#           [3.2500, 3.5000, 4.0000, 4.2500],
#           [4.0000, 4.2500, 4.7500, 5.0000]]]])
# Not equally spaced.
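The first row of each result above can be reproduced without PyTorch. The following is a minimal one-dimensional sketch, assuming edge clamping for align_corners=False; `resize_linear_1d` is a hypothetical helper, not a library function.

```python
import math


def resize_linear_1d(values, out_size, align_corners):
    """Linear interpolation along one axis under either coordinate convention."""
    in_size = len(values)
    out = []
    for x_up in range(out_size):
        if align_corners:
            src = x_up * (in_size - 1) / (out_size - 1) if out_size > 1 else 0.0
        else:
            src = (x_up + 0.5) * in_size / out_size - 0.5
            src = min(max(src, 0.0), in_size - 1.0)  # edge-value padding
        x0 = math.floor(src)
        x1 = min(x0 + 1, in_size - 1)
        t = src - x0  # fractional distance from x0 toward x1
        out.append(values[x0] * (1 - t) + values[x1] * t)
    return out


row_true = resize_linear_1d([1.0, 2.0], 4, align_corners=True)
row_false = resize_linear_1d([1.0, 2.0], 4, align_corners=False)

# Differences between neighbouring outputs show the spacing.
steps_true = [round(b - a, 4) for a, b in zip(row_true, row_true[1:])]
steps_false = [round(b - a, 4) for a, b in zip(row_false, row_false[1:])]

print([round(v, 4) for v in row_true])  # [1.0, 1.3333, 1.6667, 2.0]
print(row_false)                        # [1.0, 1.25, 1.75, 2.0]
print(steps_true, steps_false)
```

With align_corners=True every step is 1/3, while with align_corners=False the steps are 0.25, 0.5, 0.25, which is the unequal spacing noted in the comments.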

3. Experiment with the align_corners parameter (3*3 to 6*6)

import torch
import torch.nn.functional as F

a = [[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]]
a = torch.tensor(a).reshape(1, 1, 3, 3)
print(a)
# tensor([[[[1., 2., 3.],
#           [4., 5., 6.],
#           [7., 8., 9.]]]])

# ************* equivalent way to build the same tensor *************
r = torch.arange(1, 10, dtype=torch.float32).view(1, 1, 3, 3)
print(r)
# tensor([[[[1., 2., 3.],
#           [4., 5., 6.],
#           [7., 8., 9.]]]])
# ********************************************************************

x = F.interpolate(a, scale_factor=2, mode='bilinear', align_corners=True)
print(x)
# tensor([[[[1.0000, 1.4000, 1.8000, 2.2000, 2.6000, 3.0000],
#           [2.2000, 2.6000, 3.0000, 3.4000, 3.8000, 4.2000],
#           [3.4000, 3.8000, 4.2000, 4.6000, 5.0000, 5.4000],
#           [4.6000, 5.0000, 5.4000, 5.8000, 6.2000, 6.6000],
#           [5.8000, 6.2000, 6.6000, 7.0000, 7.4000, 7.8000],
#           [7.0000, 7.4000, 7.8000, 8.2000, 8.6000, 9.0000]]]])
# Equally spaced.

y = F.interpolate(a, scale_factor=2, mode='bilinear', align_corners=False)
print(y)
# tensor([[[[1.0000, 1.2500, 1.7500, 2.2500, 2.7500, 3.0000],
#           [1.7500, 2.0000, 2.5000, 3.0000, 3.5000, 3.7500],
#           [3.2500, 3.5000, 4.0000, 4.5000, 5.0000, 5.2500],
#           [4.7500, 5.0000, 5.5000, 6.0000, 6.5000, 6.7500],
#           [6.2500, 6.5000, 7.0000, 7.5000, 8.0000, 8.2500],
#           [7.0000, 7.2500, 7.7500, 8.2500, 8.7500, 9.0000]]]])
# Not equally spaced.
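To see where the 6*6 tables come from, here is a small pure-Python bilinear resize that applies the two coordinate conventions in both directions. It is a sketch for checking the numbers by hand, not PyTorch's implementation; `source_index` and `resize_bilinear` are hypothetical names, and border handling is assumed to be simple clamping.

```python
import math


def source_index(dst, in_size, out_size, align_corners):
    """Map an output index to a (possibly fractional) input coordinate."""
    if align_corners:
        return dst * (in_size - 1) / (out_size - 1) if out_size > 1 else 0.0
    src = (dst + 0.5) * in_size / out_size - 0.5
    return min(max(src, 0.0), in_size - 1.0)  # clamp to the valid range


def resize_bilinear(image, out_h, out_w, align_corners):
    """Bilinear resize of a 2D list of floats using the four nearest pixels."""
    in_h, in_w = len(image), len(image[0])
    result = []
    for y in range(out_h):
        sy = source_index(y, in_h, out_h, align_corners)
        y0 = math.floor(sy)
        y1 = min(y0 + 1, in_h - 1)
        ty = sy - y0
        row = []
        for x in range(out_w):
            sx = source_index(x, in_w, out_w, align_corners)
            x0 = math.floor(sx)
            x1 = min(x0 + 1, in_w - 1)
            tx = sx - x0
            top = image[y0][x0] * (1 - tx) + image[y0][x1] * tx
            bottom = image[y1][x0] * (1 - tx) + image[y1][x1] * tx
            row.append(top * (1 - ty) + bottom * ty)
        result.append(row)
    return result


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
x = resize_bilinear(a, 6, 6, align_corners=True)
y = resize_bilinear(a, 6, 6, align_corners=False)

for row in x:
    print([round(v, 4) for v in row])
for row in y:
    print([round(v, 4) for v in row])
```

Both conventions keep the four corner values 1, 3, 7 and 9; they differ only in how the points in between are placed.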

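Supplementary notes:

Because bilinear interpolation of an image only uses the 4 neighboring points, the denominators in the bilinear interpolation formula are all 1. The OpenCV implementation applies several optimizations, such as replacing float arithmetic with integer arithmetic (the weights are multiplied by 2048, which turns an 11-bit fraction into an integer; two such weights are multiplied together, so the result is finally shifted right by 22 bits). It also aligns the geometric centers of the source and destination images.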
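The integer trick described above can be illustrated in a few lines. This is a simplified sketch and not OpenCV's actual code: the function name `bilinear_fixed_point`, the constant names, and the added rounding term are assumptions made for the example.

```python
COEF_BITS = 11                # fractional bits kept per weight
COEF_SCALE = 1 << COEF_BITS   # 2048


def bilinear_fixed_point(p00, p01, p10, p11, fx, fy):
    """Blend four integer neighbours using integer weights only.

    fx and fy are the fractional offsets in [0, 1] from p00 toward p01
    (horizontal) and toward p10 (vertical).
    """
    wx1 = int(round(fx * COEF_SCALE))
    wx0 = COEF_SCALE - wx1
    wy1 = int(round(fy * COEF_SCALE))
    wy0 = COEF_SCALE - wy1
    acc = (p00 * wx0 * wy0 + p01 * wx1 * wy0
           + p10 * wx0 * wy1 + p11 * wx1 * wy1)
    # Each term carries two 11-bit weights, so the sum is scaled by 2**22.
    # Adding half of that before shifting rounds to nearest (an assumption
    # of this sketch).
    return (acc + (1 << (2 * COEF_BITS - 1))) >> (2 * COEF_BITS)


print(bilinear_fixed_point(0, 10, 20, 30, 0.5, 0.5))    # 15
print(bilinear_fixed_point(0, 100, 0, 100, 0.25, 0.0))  # 25
```

Because the two horizontal weights and the two vertical weights each sum to 2048, the four products always sum to 2**22, so a constant image stays constant after the shift.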
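The center-aligned mapping is:
SrcX = (dstX + 0.5) * (srcWidth / dstWidth) - 0.5
SrcY = (dstY + 0.5) * (srcHeight / dstHeight) - 0.5
This point deserves emphasis. The origin (0, 0) of both the source and destination images is the top-left corner, and each destination pixel is computed from the interpolation formula. Suppose you need to shrink a 5x5 image to 3x3; the correspondence between source and destination pixels is then as follows. Without center alignment, the basic formula gives the result on the left; with alignment, it gives the result on the right:

[Figure: left, 5x5 to 3x3 mapping without center alignment; right, the same mapping with center alignment]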

The original interpolation formula:

(original) srcX = dstX * (srcW / dstW), e.g. srcX = 0 * (5/3) = 0

(center-aligned) srcX = (0 + 0.5) * (5/3) - 0.5 = 1/3
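The 5x5 to 3x3 example can be worked through for all three destination indices with exact fractions. The helper names `naive_src` and `centered_src` below are hypothetical; they implement the two formulas exactly as written above.

```python
from fractions import Fraction


def naive_src(dst, src_size, dst_size):
    """Original mapping: srcX = dstX * (srcW / dstW)."""
    return dst * Fraction(src_size, dst_size)


def centered_src(dst, src_size, dst_size):
    """Center-aligned mapping: srcX = (dstX + 0.5) * (srcW / dstW) - 0.5."""
    return (dst + Fraction(1, 2)) * Fraction(src_size, dst_size) - Fraction(1, 2)


naive = [naive_src(d, 5, 3) for d in range(3)]
centered = [centered_src(d, 5, 3) for d in range(3)]

print([str(v) for v in naive])     # ['0', '5/3', '10/3']
print([str(v) for v in centered])  # ['1/3', '2', '11/3']

# The centre of a 5-pixel axis is index 2. Only the centered mapping
# places its samples symmetrically around it.
print(sum(naive) / 3, sum(centered) / 3)  # 5/3 2
```

The naive mapping samples 0, 5/3 and 10/3, which leans toward the top-left corner, while the center-aligned mapping samples 1/3, 2 and 11/3, symmetric about the image center.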

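Notes on center-aligned resizing when designing convolutional network structures

OpenCV resizes images with center alignment.
In PyTorch, mode='bilinear', align_corners=False is consistent with OpenCV.
In PyTorch, mode='bilinear', align_corners=True is consistent with TensorFlow under align_corners=True.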

TensorFlow's resize_bilinear is not center-aligned. Its coordinates are computed as follows.

align_corners=False:

srcX = dstX * (srcWidth / dstWidth)
srcY = dstY * (srcHeight / dstHeight)

align_corners=True:

srcX = dstX * ((srcWidth - 1) / (dstWidth - 1))
srcY = dstY * ((srcHeight - 1) / (dstHeight - 1))
