6D Pose Estimation, Solo-Queuing from Zero (Paper-Reading Rookie Series): Learning Analysis-by-Synthesis for 6D Pose Estimation in RGB-D Images...
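Finally, the first paper in this series that uses a CNN for pose estimation. I could cry.
This paper builds on 2014_Learning 6D Object Pose Estimation using 3D Object Coordinates (which we have already read). In the 2014 paper, the authors use a random forest to store two things derived from rendering the model: the possible object coordinates of each pixel, and the object each pixel may belong to. These pixel-level predictions are then combined into an estimate, and finally an energy function scores the error between the rendering of a pose hypothesis and the actual observation, which is used to optimize the pose. The main contribution of this paper is to keep the random forest but replace the energy function with a CNN: the CNN compares the rendered result with the actual observation and outputs an energy value, and that energy value is used to refine the estimated pose.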
Analysis-by-Synthesis: compare the observation with the output of a forward process, such as a rendered image of the object of interest in a particular pose.
We propose an approach that "learns to compare", while taking these difficulties (occlusion, complicated sensor noise) into account. This is done by describing the posterior density of a particular object pose with a CNN that compares an observed and rendered image.
Our goal is to estimate the pose \(H\) of a rigid object from a set of observations denoted by \(x\). Each pose \(H=(R,T)\) is a combination of two components. The rotational component \(R\) is a \(3\times3\) matrix describing the rotation around the center of the object. The translational component \(T\) is a 3D vector corresponding to the position of the object center in the camera coordinate system.
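To make the pose definition concrete, here is a minimal sketch of how \(H=(R,T)\) maps a point from the object frame into the camera frame, \(X_{cam}=RX_{obj}+T\). The helper names `make_pose` and `transform_points` are made up for illustration and do not come from the paper.

```python
import numpy as np

def make_pose(R, T):
    """Bundle a 3x3 rotation R and a 3-vector translation T into a pose H."""
    R = np.asarray(R, dtype=float)
    T = np.asarray(T, dtype=float)
    if R.shape != (3, 3) or T.shape != (3,):
        raise ValueError("R must be 3x3 and T must have 3 entries")
    return R, T

def transform_points(H, points_obj):
    """Map Nx3 object-frame points to the camera frame: X_cam = R X_obj + T."""
    R, T = H
    # Row-vector form of R @ x + T, applied to every row at once.
    return np.asarray(points_obj, dtype=float) @ R.T + T

# Example pose: rotate 90 degrees about the z axis, then place the
# object center one unit in front of the camera.
R_z90 = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
H = make_pose(R_z90, [0.0, 0.0, 1.0])

pts_cam = transform_points(H, [[0.0, 0.0, 0.0],   # object center
                               [1.0, 0.0, 0.0]])  # point on the object's x axis
```

The object center (the origin of the object frame) lands exactly at \(T\), which matches the definition of the translational component above.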
Sampling: we approximate the expected value with a set of pose samples: \(\mathbb{E}[\frac{\partial}{\partial\theta_j}E(H,x_i;\theta)|x_i;\theta]\approx \frac{1}{N}\sum^N_{k=1}\frac{\partial}{\partial\theta_j}E(H_k,x_i;\theta)\), where \(H_1,\dots,H_N\) are pose samples drawn independently from the posterior \(p(H|x_i;\theta)\) with the current parameters \(\theta\). The Metropolis algorithm generates a sequence of samples \(H_t\) by repeating two steps:
1. Draw a new proposed sample \(H'\) according to a proposal distribution \(Q(H'|H_t)\), which has to be symmetric, i.e. \(Q(H'|H_t)=Q(H_t|H')\).
2. Accept or reject the proposed sample according to an acceptance probability \(A(H'|H_t)=\min(1,\frac{p(H'|x;\theta)}{p(H_t|x;\theta)})\). If the proposed sample is accepted, set \(H_{t+1}=H'\); otherwise set \(H_{t+1}=H_t\).
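The two Metropolis steps can be sketched as follows. This is a toy illustration, not the authors' implementation. It assumes the posterior has the form \(p(H|x;\theta)\propto\exp(-E(H,x;\theta))\), so the acceptance ratio reduces to \(\exp(E(H_t)-E(H'))\) and the normalizing constant never has to be computed. The CNN energy is replaced by a made-up quadratic `toy_energy`, and the state is a bare 3D translation instead of a full pose; `TARGET`, `toy_energy` and `toy_propose` are all hypothetical names.

```python
import numpy as np

def metropolis(energy, propose, h0, n_iter, rng):
    """Metropolis sampler for a density proportional to exp(-energy(h)).

    propose(h, rng) must be symmetric: Q(h'|h) == Q(h|h').
    Returns the n_iter states visited after h0 (rejections repeat the state).
    """
    samples = []
    h, e_h = h0, energy(h0)
    for _ in range(n_iter):
        h_new = propose(h, rng)          # step 1: draw a proposal
        e_new = energy(h_new)
        # step 2: A = min(1, p(h')/p(h)) = min(1, exp(e_h - e_new))
        accept_prob = 1.0 if e_new <= e_h else np.exp(e_h - e_new)
        if rng.random() < accept_prob:
            h, e_h = h_new, e_new        # accepted: H_{t+1} = H'
        samples.append(h)                # rejected: H_{t+1} = H_t
    return samples

# Toy stand-ins (assumptions for illustration only).
TARGET = np.array([0.0, 0.0, 1.0])

def toy_energy(T):
    """Quadratic energy: the implied density is a Gaussian with std 0.1."""
    return 0.5 * np.sum((T - TARGET) ** 2) / 0.1 ** 2

def toy_propose(T, rng):
    """Symmetric proposal: isotropic Gaussian step around the current state."""
    return T + rng.normal(0.0, 0.05, size=3)

rng = np.random.default_rng(0)
samples = metropolis(toy_energy, toy_propose, TARGET + 0.3, 5000, rng)
posterior_mean = np.mean(samples[500:], axis=0)  # drop early samples
```

Because only energy differences enter the acceptance test, the sampler works with an unnormalized posterior, which is what makes it practical when the energy comes from a network.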
Proposal Distribution: We define \(Q(H'|H_t)\) implicitly by describing a sampling procedure and ensuring that it is symmetric. The translational component \(T'\) of the proposed sample is drawn directly from a 3D isotropic normal distribution \(\mathcal{N}(T_t,\Sigma_T)\) centered at the translational component \(T_t\) of the current sample \(H_t\). The rotational component \(R'\) of the proposed sample \(H'\) is generated by applying a random rotation \(\hat{R}\) to the rotational component \(R_t\) of the current sample: \(R'=\hat{R}R_t\). \(\hat{R}\) is the rotation matrix corresponding to an Euler vector \(e\), which is drawn from a 3D zero-centered isotropic normal distribution \(e\sim\mathcal{N}(0,\Sigma_R)\).
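The sampling procedure above might look roughly like this in code. It is a sketch under assumptions: the isotropic covariances are written as \(\Sigma_T=\sigma_T^2I\) and \(\Sigma_R=\sigma_R^2I\), the Euler vector is interpreted as axis times angle and converted with Rodrigues' formula, and the function names and the values of `sigma_T` and `sigma_R` are placeholders, not taken from the paper.

```python
import numpy as np

def euler_vector_to_matrix(e):
    """Rodrigues' formula: Euler vector e (unit axis * angle) -> 3x3 rotation."""
    e = np.asarray(e, dtype=float)
    angle = np.linalg.norm(e)
    if angle < 1e-12:
        return np.eye(3)                 # no rotation
    k = e / angle                        # unit rotation axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])   # cross-product matrix of k
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def propose(R_t, T_t, sigma_T, sigma_R, rng):
    """Draw H' = (R', T') around the current pose H_t = (R_t, T_t)."""
    # T' ~ N(T_t, sigma_T^2 I)
    T_new = np.asarray(T_t, dtype=float) + rng.normal(0.0, sigma_T, size=3)
    # e ~ N(0, sigma_R^2 I), R_hat = rotation for e, R' = R_hat R_t
    e = rng.normal(0.0, sigma_R, size=3)
    R_new = euler_vector_to_matrix(e) @ np.asarray(R_t, dtype=float)
    return R_new, T_new

rng = np.random.default_rng(0)
# Placeholder step sizes, chosen only for the demo.
R_new, T_new = propose(np.eye(3), np.zeros(3), sigma_T=0.01, sigma_R=0.05, rng=rng)
```

The symmetry requirement is met because both normal distributions are symmetric around their centers: the step from \(T_t\) to \(T'\) is as likely as the reverse step, and \(-e\) (which yields \(\hat{R}^{-1}\)) is as likely as \(e\).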
Initialization and Burn-in-phase: To find a good initialization we run our inference procedure using the current parameter set. We then perform the Metropolis algorithm for a total of 130 iterations, disregarding the samples from the first 30 iterations which are considered as burn-in-phase.
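Putting the numbers from this paragraph together with the sample average from the Sampling paragraph: of 130 Metropolis iterations, the first 30 are dropped and the remaining \(N=100\) samples are averaged. The sketch below assumes the per-sample gradients \(\frac{\partial}{\partial\theta}E(H_k,x_i;\theta)\) have already been computed and stacked into an array; the function name `expected_gradient` is hypothetical.

```python
import numpy as np

N_TOTAL = 130     # Metropolis iterations, as stated in the text
N_BURN_IN = 30    # iterations discarded as burn-in, as stated in the text

def expected_gradient(sample_gradients, n_burn_in=N_BURN_IN):
    """Monte Carlo estimate of the expected energy gradient.

    sample_gradients: array of shape (n_iterations, n_params), one row per
    Metropolis iteration. Rows from the burn-in phase are ignored and the
    rest are averaged, i.e. (1/N) * sum_k dE(H_k)/dtheta.
    """
    grads = np.asarray(sample_gradients, dtype=float)
    kept = grads[n_burn_in:]
    if kept.shape[0] == 0:
        raise ValueError("no samples left after discarding burn-in")
    return kept.mean(axis=0)

# Demo with fake gradients: burn-in rows are deliberately far off so that
# it is easy to see they do not influence the estimate.
fake_grads = np.ones((N_TOTAL, 2))
fake_grads[:N_BURN_IN] = 100.0
estimate = expected_gradient(fake_grads)
```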
Reposted from: https://www.cnblogs.com/LeeGoHigh/p/10512135.html