图像局部显著性—点特征(SURF)
?????? 1999年的SIFT(ICCV 1999,并改進(jìn)發(fā)表于IJCV 2004,本文描述);參考描述:圖像特征點(diǎn)描述。
?????? 參考原文:SURF特征提取分析 本文有大量刪除,如有疑義,請(qǐng)參考原文。????
SURF對(duì)SIFT的改進(jìn):
????? 引用Wiki百科中對(duì)SURF描述為:“ SURF (Speeded Up Robust Features) is a robust local feature detector, first presented by Herbert Bay et al. in 2006, that can be used in computer vision tasks likeobject recognition or 3D reconstruction. It is partly inspired by the SIFT descriptor.The standard version of SURF is several times faster than SIFT and claimed by its authors to be more robust against different image transformations than SIFT. SURF is based on sums of2D Haar wavelet responses and makes an efficient use ofintegral images.It uses an integer approximation to the determinant of Hessian blob detector, which can be computed extremely quickly with an integral image (3 integer operations). For features, it uses the sum of the Haar wavelet response around the point of interest. Again, these can be computed with the aid of the integral image".
?????? 從上述對(duì)SURF描述,可知:第一、SURF算法是對(duì)SIFT算法加強(qiáng)版,同時(shí)加速的具有魯棒性的特征。第二、標(biāo)準(zhǔn)的SURF算子比SIFT算子快好幾倍,并且在多幅圖片下具有更好的魯棒性。SURF最大的特征在于采用了harr特征以及積分圖像integral image的概念,這大大加快了程序的運(yùn)行時(shí)間。
算法描述:
? ? ? ? 為了實(shí)現(xiàn)尺度不變性的特征點(diǎn)檢測(cè)與匹配,SURF算法則先利用Hessian矩陣確定候選點(diǎn),然后進(jìn)行非極大抑制,計(jì)算復(fù)雜度降低多了。整個(gè)算法由以下幾個(gè)部分組成。
OpenCV源代碼Nonfree包含了SURF的源代碼,并添加了GPU實(shí)現(xiàn):
1.使用Hessian矩陣構(gòu)建,進(jìn)行特征?提取
void SURF::operator()(InputArray _img, InputArray _mask,CV_OUT vector<KeyPoint>& keypoints,OutputArray _descriptors,bool useProvidedKeypoints) const {Mat img = _img.getMat(), mask = _mask.getMat(), mask1, sum, msum;bool doDescriptors = _descriptors.needed();CV_Assert(!img.empty() && img.depth() == CV_8U);if( img.channels() > 1 )cvtColor(img, img, COLOR_BGR2GRAY);CV_Assert(mask.empty() || (mask.type() == CV_8U && mask.size() == img.size()));CV_Assert(hessianThreshold >= 0);CV_Assert(nOctaves > 0);CV_Assert(nOctaveLayers > 0);integral(img, sum, CV_32S);// 1. 使用Hessian矩陣尋找候選點(diǎn)集合:// Compute keypoints only if we are not asked for evaluating the descriptors are some given locations:if( !useProvidedKeypoints ){if( !mask.empty() ){cv::min(mask, 1, mask1);integral(mask1, msum, CV_32S);}fastHessianDetector( sum, msum, keypoints, nOctaves, nOctaveLayers, (float)hessianThreshold );}int i, j, N = (int)keypoints.size();if( N > 0 ){Mat descriptors;bool _1d = false;int dcols = extended ? 128 : 64;size_t dsize = dcols*sizeof(float);if( doDescriptors ){_1d = _descriptors.kind() == _InputArray::STD_VECTOR && _descriptors.type() == CV_32F;if( _1d ){_descriptors.create(N*dcols, 1, CV_32F);descriptors = _descriptors.getMat().reshape(1, N);}else{_descriptors.create(N, dcols, CV_32F);descriptors = _descriptors.getMat();}}//2.使用SURFInvoker計(jì)算特征描述子// we call SURFInvoker in any case, even if we do not need descriptors,// since it computes orientation of each feature.parallel_for_(Range(0, N), SURFInvoker(img, sum, keypoints, descriptors, extended, upright) );// remove keypoints that were marked for deletionfor( i = j = 0; i < N; i++ ){if( keypoints[i].size > 0 ){if( i > j ){keypoints[j] = keypoints[i];if( doDescriptors )memcpy( descriptors.ptr(j), descriptors.ptr(i), dsize);}j++;}}if( N > j ){N = j;keypoints.resize(N);if( doDescriptors ){Mat d = descriptors.rowRange(0, N);if( _1d )d = d.reshape(1, N*dcols);d.copyTo(_descriptors);}}} }?
?
2.尋找多尺度空間的描述子
??????在進(jìn)行高斯模糊時(shí),SIFT的高斯模板大小是始終不變的,只是在不同的octave之間改變圖片的大小。而在SURF中,圖片的大小是一直不變的,不同的octave層得到的待檢測(cè)圖片是改變高斯模糊尺寸大小得到的,當(dāng)然了,同一個(gè)octave中個(gè)的圖片用到的高斯模板尺度也不同。算法允許尺度空間多層圖像同時(shí)被處理,不需對(duì)圖像進(jìn)行二次抽樣,從而提高算法性能
?????? SURF算法使原始圖像保持不變而只改變?yōu)V波器大小。SURF采用這種方法節(jié)省了降采樣過程,其處理速度自然也就提上去了?
Invoke的代碼:
// Multi-threaded search of the scale-space pyramid for keypoints // 2.多線程尋找多尺度的特征點(diǎn)描述子 struct SURFFindInvoker : ParallelLoopBody {SURFFindInvoker( const Mat& _sum, const Mat& _mask_sum,const vector<Mat>& _dets, const vector<Mat>& _traces,const vector<int>& _sizes, const vector<int>& _sampleSteps,const vector<int>& _middleIndices, vector<KeyPoint>& _keypoints,int _nOctaveLayers, float _hessianThreshold ){sum = &_sum;mask_sum = &_mask_sum;dets = &_dets;traces = &_traces;sizes = &_sizes;sampleSteps = &_sampleSteps;middleIndices = &_middleIndices;keypoints = &_keypoints;nOctaveLayers = _nOctaveLayers;hessianThreshold = _hessianThreshold;}static void findMaximaInLayer( const Mat& sum, const Mat& mask_sum,const vector<Mat>& dets, const vector<Mat>& traces,const vector<int>& sizes, vector<KeyPoint>& keypoints,int octave, int layer, float hessianThreshold, int sampleStep );void operator()(const Range& range) const{for( int i=range.start; i<range.end; i++ ){int layer = (*middleIndices)[i];int octave = i / nOctaveLayers;findMaximaInLayer( *sum, *mask_sum, *dets, *traces, *sizes,*keypoints, octave, layer, hessianThreshold,(*sampleSteps)[layer] );}}const Mat *sum;const Mat *mask_sum;const vector<Mat>* dets;const vector<Mat>* traces;const vector<int>* sizes;const vector<int>* sampleSteps;const vector<int>* middleIndices;vector<KeyPoint>* keypoints;int nOctaveLayers;float hessianThreshold;static Mutex findMaximaInLayer_m; };?
3.計(jì)算特征描述子:
?
????? 采用3維線性插值法得到亞像素級(jí)的特征點(diǎn),同時(shí)也去掉那些值小于一定閾值的點(diǎn),增加極值使檢測(cè)到的特征點(diǎn)數(shù)量減少,最終只有幾個(gè)特征最強(qiáng)點(diǎn)會(huì)被檢測(cè)出來。
?
4. 選取特征點(diǎn)主方向確定
????? 為了保證旋轉(zhuǎn)不變性,在SURF中,不統(tǒng)計(jì)其梯度直方圖,而是統(tǒng)計(jì)特征點(diǎn)領(lǐng)域內(nèi)的Harr小波特征。即以特征點(diǎn)為中心,計(jì)算半徑為6s(S為特征點(diǎn)所在的尺度值)的鄰域內(nèi),統(tǒng)計(jì)60度扇形內(nèi)所有點(diǎn)在x(水平)和y(垂直)方向的Haar小波響應(yīng)總和(Haar小波邊長取4s),并給這些響應(yīng)值賦高斯權(quán)重系數(shù),使得靠近特征點(diǎn)的響應(yīng)貢獻(xiàn)大,而遠(yuǎn)離特征點(diǎn)的響應(yīng)貢獻(xiàn)小,然后60度范圍內(nèi)的響應(yīng)相加以形成新的矢量,遍歷整個(gè)圓形區(qū)域,選擇最長矢量的方向?yàn)樵撎卣鼽c(diǎn)的主方向。這樣,通過特征點(diǎn)逐個(gè)進(jìn)行計(jì)算,得到每一個(gè)特征點(diǎn)的主方向
?
5. 構(gòu)造SURF特征點(diǎn)描述算子
????? 在SURF中,也是在特征點(diǎn)周圍取一個(gè)正方形框,框的邊長為20s(s是所檢測(cè)到該特征點(diǎn)所在的尺度)。該框帶方向,方向當(dāng)然就是第4步檢測(cè)出來的主方向了。然后把該框分為16個(gè)子區(qū)域,每個(gè)子區(qū)域統(tǒng)計(jì)25個(gè)像素的水平方向和垂直方向的haar小波特征,這里的x(水平)和y(垂直)方向都是相對(duì)主方向而言的。該haar小波特征為x(水平)方向值之和,水平方向絕對(duì)值之和,垂直方向之和,垂直方向絕對(duì)值之和。
?
?
/** Find the maxima in the determinant of the Hessian in a layer of the* scale-space pyramid*/ void SURFFindInvoker::findMaximaInLayer( const Mat& sum, const Mat& mask_sum,const vector<Mat>& dets, const vector<Mat>& traces,const vector<int>& sizes, vector<KeyPoint>& keypoints,int octave, int layer, float hessianThreshold, int sampleStep ) {// Wavelet Dataconst int NM=1;const int dm[NM][5] = { {0, 0, 9, 9, 1} };SurfHF Dm;int size = sizes[layer];// The integral image 'sum' is one pixel bigger than the source imageint layer_rows = (sum.rows-1)/sampleStep;int layer_cols = (sum.cols-1)/sampleStep;// Ignore pixels without a 3x3x3 neighbourhood in the layer aboveint margin = (sizes[layer+1]/2)/sampleStep+1;if( !mask_sum.empty() )resizeHaarPattern( dm, &Dm, NM, 9, size, mask_sum.cols );int step = (int)(dets[layer].step/dets[layer].elemSize());for( int i = margin; i < layer_rows - margin; i++ ){const float* det_ptr = dets[layer].ptr<float>(i);const float* trace_ptr = traces[layer].ptr<float>(i);for( int j = margin; j < layer_cols-margin; j++ ){float val0 = det_ptr[j];if( val0 > hessianThreshold ){/* Coordinates for the start of the wavelet in the sum image. Thereis some integer division involved, so don't try to simplify this(cancel out sampleStep) without checking the result is the same */int sum_i = sampleStep*(i-(size/2)/sampleStep);int sum_j = sampleStep*(j-(size/2)/sampleStep);/* The 3x3x3 neighbouring samples around the maxima.The maxima is included at N9[1][4] */const float *det1 = &dets[layer-1].at<float>(i, j);const float *det2 = &dets[layer].at<float>(i, j);const float *det3 = &dets[layer+1].at<float>(i, j);float N9[3][9] = { { det1[-step-1], det1[-step], det1[-step+1],det1[-1] , det1[0] , det1[1],det1[step-1] , det1[step] , det1[step+1] },{ det2[-step-1], det2[-step], det2[-step+1],det2[-1] , det2[0] , det2[1],det2[step-1] , det2[step] , det2[step+1] },{ det3[-step-1], det3[-step], det3[-step+1],det3[-1] , det3[0] , det3[1],det3[step-1] , det3[step] , det3[step+1] } };/* Check the mask - why not just check the mask at the center of the wavelet? */if( !mask_sum.empty() ){const int* mask_ptr = &mask_sum.at<int>(sum_i, sum_j);float mval = calcHaarPattern( mask_ptr, &Dm, 1 );if( mval < 0.5 )continue;}//使用非極大值抑制/* Non-maxima suppression. val0 is at N9[1][4]*/if( val0 > N9[0][0] && val0 > N9[0][1] && val0 > N9[0][2] &&val0 > N9[0][3] && val0 > N9[0][4] && val0 > N9[0][5] &&val0 > N9[0][6] && val0 > N9[0][7] && val0 > N9[0][8] &&val0 > N9[1][0] && val0 > N9[1][1] && val0 > N9[1][2] &&val0 > N9[1][3] && val0 > N9[1][5] &&val0 > N9[1][6] && val0 > N9[1][7] && val0 > N9[1][8] &&val0 > N9[2][0] && val0 > N9[2][1] && val0 > N9[2][2] &&val0 > N9[2][3] && val0 > N9[2][4] && val0 > N9[2][5] &&val0 > N9[2][6] && val0 > N9[2][7] && val0 > N9[2][8] ){/* Calculate the wavelet center coordinates for the maxima */float center_i = sum_i + (size-1)*0.5f;float center_j = sum_j + (size-1)*0.5f;KeyPoint kpt( center_j, center_i, (float)sizes[layer],-1, val0, octave, CV_SIGN(trace_ptr[j]) );//對(duì)最顯著點(diǎn)位置進(jìn)行3*3插值/* Interpolate maxima location within the 3x3x3 neighbourhood */int ds = size - sizes[layer-1];int interp_ok = interpolateKeypoint( N9, sampleStep, sampleStep, ds, kpt );/* Sometimes the interpolation step gives a negative size etc. */if( interp_ok ){/*printf( "KeyPoint %f %f %d\n", point.pt.x, point.pt.y, point.size );*/cv::AutoLock lock(findMaximaInLayer_m);keypoints.push_back(kpt);}}}}} } //使用非極大值抑制/* Non-maxima suppression. val0 is at N9[1][4]*/if( val0 > N9[0][0] && val0 > N9[0][1] && val0 > N9[0][2] &&val0 > N9[0][3] && val0 > N9[0][4] && val0 > N9[0][5] &&val0 > N9[0][6] && val0 > N9[0][7] && val0 > N9[0][8] &&val0 > N9[1][0] && val0 > N9[1][1] && val0 > N9[1][2] &&val0 > N9[1][3] && val0 > N9[1][5] &&val0 > N9[1][6] && val0 > N9[1][7] && val0 > N9[1][8] &&val0 > N9[2][0] && val0 > N9[2][1] && val0 > N9[2][2] &&val0 > N9[2][3] && val0 > N9[2][4] && val0 > N9[2][5] &&val0 > N9[2][6] && val0 > N9[2][7] && val0 > N9[2][8] ){/* Calculate the wavelet center coordinates for the maxima */float center_i = sum_i + (size-1)*0.5f;float center_j = sum_j + (size-1)*0.5f;KeyPoint kpt( center_j, center_i, (float)sizes[layer],-1, val0, octave, CV_SIGN(trace_ptr[j]) );//對(duì)最顯著點(diǎn)位置進(jìn)行3*3插值/* Interpolate maxima location within the 3x3x3 neighbourhood */int ds = size - sizes[layer-1];int interp_ok = interpolateKeypoint( N9, sampleStep, sampleStep, ds, kpt );/* Sometimes the interpolation step gives a negative size etc. */if( interp_ok ){/*printf( "KeyPoint %f %f %d\n", point.pt.x, point.pt.y, point.size );*/cv::AutoLock lock(findMaximaInLayer_m);keypoints.push_back(kpt);}}}}} }?
改進(jìn)之處的可靠性?
1. 為什么選用高斯金字塔來作特征提取?
?????? 首先,采用DOG的金字塔原因是:因?yàn)樗咏麹OG,而LOG的極值點(diǎn)提供了最穩(wěn)定的特征,而且DOG方便計(jì)算(只要做減法).而LOG的極值點(diǎn)提供的特征最穩(wěn)定。其次,我們從直觀理解:特征明顯的點(diǎn)經(jīng)過不同尺度的高斯濾波器進(jìn)行濾波后,差別較大,所以用到的是DOG。但是直觀上怎么理解呢. 如果相鄰Octave的sigma不是兩倍關(guān)系還好理解:如果兩幅圖像只是縮放的關(guān)系,那么假設(shè)第一個(gè)Octave找到了小一倍圖像的極值點(diǎn),那么大一倍圖像的極值點(diǎn)會(huì)在下一個(gè)Octave找到相似的。但是現(xiàn)在,如果把大一倍圖像進(jìn)行一次下采樣(這樣和小的圖像就完全一樣了),進(jìn)行Gauss濾波時(shí),兩個(gè)圖像濾波系數(shù)(sigma)是不一樣的,不就找不到一樣的極值點(diǎn)了么.
2.Hessian矩陣為什么能用來篩選極值點(diǎn)?
?????? SIFT先利用非極大抑制,再用到Hessian矩陣進(jìn)行濾除。SURF先用Hessian矩陣,再進(jìn)行非極大抑制。SURF的順序可以加快篩選速度么(Hessian矩陣濾除的點(diǎn)更多?)至于SURF先用Hessian矩陣,再進(jìn)行非極大抑制的原因,是不管先極大值抑制還是判斷Hessian矩陣的行列式,金字塔上的點(diǎn)的行列式都是要計(jì)算出來的。先判斷是否大于0只要進(jìn)行1次判斷,而判斷是否是極大值點(diǎn)或者極小值點(diǎn)要與周圍26個(gè)點(diǎn)比較,只比較1次肯定快。而在SIFT中,構(gòu)建的高斯金字塔只有一座(不想SURF是有3座),要進(jìn)行非極大抑制可以直接用金字塔的結(jié)果進(jìn)行比較。而如果計(jì)算Hessian矩陣的行列式,還要再計(jì)算Dxx、Dxy、Dyy。因此先進(jìn)行非極大抑制。這兩個(gè)步驟的先后與SIFT/SURF的實(shí)際計(jì)算情況有關(guān)的,都是當(dāng)前算法下的最佳順序,而不是說哪種先計(jì)算一定更好。
3、為什么采用梯度特征作為局部不變特征?
?????? 這與人的視覺神經(jīng)相關(guān)。采用梯度作為描述子的原因是,人的視覺皮層上的神經(jīng)元對(duì)特定方向和空間頻率的梯度相應(yīng)很敏感,經(jīng)過SIFT作者的一些實(shí)驗(yàn)驗(yàn)證,用梯度的方法進(jìn)行匹配效果很好。
4、為什么可以采用某些特征點(diǎn)的局部不變特征進(jìn)行整幅圖像的匹配?
?????? 從直觀的人類視覺印象來看,人類視覺對(duì)物體的描述也是局部化的,基于局部不變特征的圖像識(shí)別方法十分接近于人類視覺機(jī)理,通過局部化的特征組合,形成對(duì)目標(biāo)物體的整體印象,這就為局部不變特征提取方法提供了生物學(xué)上的解釋,因此局部不變特征也得到了廣泛應(yīng)用。圖像中的每個(gè)局部區(qū)域的重要性和影響范圍并非同等重要,即特征不是同等顯著的,其主要理論來源是Marr的計(jì)算機(jī)視覺理論和Treisman的特征整合理論,一般也稱為“原子論”。該理論認(rèn)為視覺的過程開始于對(duì)物體的特征性質(zhì)和簡(jiǎn)單組成部分的分析,是從局部性質(zhì)到大范圍性質(zhì)。SIFT/SURF都是對(duì)特征點(diǎn)的局部區(qū)域的描述,這些特征點(diǎn)應(yīng)該是影響重要的點(diǎn),對(duì)這些點(diǎn)的分析更加重要。所以在局部不變特征的提取和描述時(shí)也遵循與人眼視覺注意選擇原理相類似的機(jī)制,所以SIFT/SURF用于匹配有效果。
5. 為什么Hessian矩陣可以用來判斷極大值/極小值?
?????? 在x0點(diǎn)上,hessian矩陣是正定的,且各分量的一階偏導(dǎo)數(shù)為0,則x0為極小值點(diǎn)。在x0點(diǎn)上,hessian矩陣是負(fù)定的,且各分量的一階偏導(dǎo)數(shù)為0,則x0為極大值點(diǎn)。
對(duì)于某個(gè)局部區(qū)域,若hessian矩陣是半正定的,則這個(gè)區(qū)域是凸的(反之依然成立);若負(fù)定,則這個(gè)區(qū)域是凹的(反之依然成立)。而對(duì)于正定和負(fù)定來說,Hessian矩陣的行列式總是大于等于0的。反過來就是說:某個(gè)點(diǎn)若是極大值/極小值,hessian矩陣的行列式必然要大于等于0,而大于等于0如果是滿足的,這個(gè)點(diǎn)不一定是極大值/極小值(還要判斷一階導(dǎo)數(shù))。所以后面還要進(jìn)行極大值抑制.
?
?
參考資料:
????? 1.SURF原始論文:Herbert Bay, Andreas Ess, Tinne Tuytelaars, Luc Van Gool, "SURF: Speeded Up Robust Features", Computer Vision and Image Understanding (CVIU), Vol. 110, No. 3, pp. 346--359, 2008.
?? ?? 2.Wiki百科: Speeded Up Robust Features(SURF) ,Wikipedia, the free encyclopedia .
? ? ? 3. The Website of SURF: Speeded Up Robust Features.
?
?
總結(jié)
以上是生活随笔為你收集整理的图像局部显著性—点特征(SURF)的全部內(nèi)容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: 图像的全局特征--用于目标检测
- 下一篇: 银行卡账户名称是什么意思