Real-Time Pose Estimation of a Textured Object
Nowadays, augmented reality is one of the top research topics in the computer vision and robotics fields. The most elemental problem in augmented reality is estimating the camera pose with respect to an object: in computer vision it is needed for subsequent 3D rendering, while in robotics the object pose is needed in order to grasp and manipulate the object. However, this is not a trivial problem to solve; in practice the most common difficulty in image processing is the computational cost of the many algorithms and mathematical operations applied to solve a single problem.
Goal:
This tutorial explains how to build a real-time application that estimates the camera pose in order to track a textured object with six degrees of freedom, given a 2D image and its 3D textured model.
The code has the following functional parts:
1) Read the 3D textured object model and the object mesh
2) Take input from a camera or a video file
3) Extract ORB features and descriptors from the scene
4) Match scene descriptors with model descriptors using a FLANN-based matcher
5) Pose estimation using PnP + RANSAC
6) Linear Kalman Filter for bad pose rejection
Theory
In computer vision, estimating the camera pose from n 3D-to-2D point correspondences is a fundamental and well-understood problem. In its most general version, it requires estimating the six degrees of freedom of the pose and five calibration parameters: focal length, principal point, aspect ratio and skew. It can be established with a minimum of 6 correspondences using the well-known Direct Linear Transform (DLT) algorithm. There are many simplifications of the problem that improve the accuracy of the DLT; the most common one is to calibrate the camera beforehand, which turns the task into the Perspective-n-Point problem:
Problem Formulation:
Given a set of 3D points p_i expressed in a world reference frame and their 2D projections u_i onto an image, we seek the pose (R and t) of the camera with respect to the world and the focal length f. OpenCV provides four different methods to solve the Perspective-n-Point problem, returning R and t. The following formula is then used to project 3D points onto the image plane:
\[
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} =
\begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
\]

For a complete explanation of this formula, see Camera Calibration and 3D Reconstruction.
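As a quick illustration of this formula, the sketch below (not part of the tutorial code; the intrinsic values and the pose are made-up placeholders) projects a single 3D point with cv::projectPoints, which applies exactly this model:

#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <iostream>

int main()
{
    // Assumed intrinsic parameters (fx, fy, cx, cy) -- placeholders only
    cv::Mat K = (cv::Mat_<double>(3,3) << 800, 0, 320,
                                            0, 800, 240,
                                            0,   0,   1);
    cv::Mat rvec = cv::Mat::zeros(3, 1, CV_64FC1);        // rotation as a Rodrigues vector (identity here)
    cv::Mat tvec = (cv::Mat_<double>(3,1) << 0, 0, 50);   // translation: 50 units in front of the camera

    std::vector<cv::Point3f> object_points = { cv::Point3f(1.f, 2.f, 0.f) };
    std::vector<cv::Point2f> image_points;

    // cv::projectPoints applies s*[u v 1]' = K [R|t] [X Y Z 1]' (no distortion passed here)
    cv::projectPoints(object_points, rvec, tvec, K, cv::noArray(), image_points);
    std::cout << "Projected pixel: " << image_points[0] << std::endl;
    return 0;
}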
Source code
The source code (OpenCV 3.1.0) can be found at samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/
Download: real_time_pose_estimation
The tutorial consists of two main programs:
1) Model registration
This application is intended for users who do not have a 3D textured model of the object to be detected. You can use this program to create your own textured 3D model. It currently works only for planar objects (i.e. objects whose surface is flat); to obtain a model of an object with a more complex shape, you will need dedicated 3D modelling software.
The program takes as input an image and the 3D mesh corresponding to that image. The intrinsic parameters of the camera with which the image was captured must also be provided, so that the image can be undistorted before being fed to the algorithm. All files have to be specified by an absolute path or a path relative to the working directory; if no file is specified, the program tries to use default parameters.
The application starts by extracting ORB features and descriptors from the input image. It then uses the mesh together with the Möller–Trumbore intersection algorithm to compute the 3D coordinates of the found features. Finally, the 3D points and the descriptors are stored in different lists of a YAML-format file, one row per point. For technical background on how to store and open these files, see the tutorial File Input and Output using XML and YAML files.
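For reference, the Möller–Trumbore step that assigns a 3D coordinate to each 2D keypoint boils down to a ray/triangle intersection test such as the sketch below. This is an illustrative stand-alone version, not the sample's own helper (which lives in its mesh/ray-tracing classes); the function name and interface are assumptions.

#include <opencv2/core.hpp>
#include <cmath>

// Returns true if the ray O + t*D hits the triangle (V0,V1,V2); the hit point is written to X.
bool intersectMollerTrumbore(const cv::Point3f &O, const cv::Point3f &D,
                             const cv::Point3f &V0, const cv::Point3f &V1,
                             const cv::Point3f &V2, cv::Point3f &X)
{
    const float EPSILON = 1e-6f;
    cv::Point3f E1 = V1 - V0, E2 = V2 - V0;   // triangle edges
    cv::Point3f P = D.cross(E2);
    float det = E1.dot(P);
    if (std::fabs(det) < EPSILON) return false;   // ray parallel to the triangle plane
    float inv_det = 1.0f / det;
    cv::Point3f T = O - V0;
    float u = T.dot(P) * inv_det;
    if (u < 0.0f || u > 1.0f) return false;       // outside first barycentric bound
    cv::Point3f Q = T.cross(E1);
    float v = D.dot(Q) * inv_det;
    if (v < 0.0f || u + v > 1.0f) return false;   // outside second barycentric bound
    float t = E2.dot(Q) * inv_det;
    if (t <= EPSILON) return false;               // intersection behind the ray origin
    X = O + t * D;                                // 3D coordinate of the hit point
    return true;
}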
2) Model detection
The aim of this program is to estimate, in real time, the object pose given its 3D textured model.
The detection program starts by loading the 3D textured model from the YAML file (whose structure is explained in the model registration program). ORB features and descriptors are then detected and extracted from the current scene and matched against the model descriptors using cv::FlannBasedMatcher together with cv::flann::GenericIndex. From the found matches, the camera parameters R and t are computed with cv::solvePnPRansac. Finally, a KalmanFilter is applied to reject bad pose estimates.
If you compiled the samples when building OpenCV, you can find the binary at opencv/build/bin/cpp-tutorial-pnp_detection. You can then run the application and experiment with its parameters:
This program shows how to detect an object given its 3D textured model. You can choose to use a recorded video or the webcam.
Usage:
  ./cpp-tutorial-pnp_detection -help
Keys:
  'esc' - to quit.
--------------------------------------------------------------------------
Usage: cpp-tutorial-pnp_detection [params]

  -c, --confidence (value:0.95)    RANSAC confidence
  -e, --error (value:2.0)          RANSAC reprojection error
  -f, --fast (value:true)          use of robust fast match
  -h, --help (value:true)          print this message
  --in, --inliers (value:30)       minimum inliers for Kalman update
  --it, --iterations (value:500)   RANSAC maximum iterations count
  -k, --keypoints (value:2000)     number of keypoints to detect
  --mesh                           path to ply mesh
  --method, --pnp (value:0)        PnP method: (0) ITERATIVE - (1) EPNP - (2) P3P - (3) DLS
  --model                          path to yml model
  -r, --ratio (value:0.7)          threshold for ratio test
  -v, --video                      path to recorded video

For example, you can change the PnP method when running the application:
./cpp-tutorial-pnp_detection --method=2

Code explanation
1) Read the 3D textured object model and the object mesh
To load the textured model, the Model class was implemented; its load() function opens a YAML file and reads the stored 3D points together with their corresponding descriptors. You can find an example of a 3D textured model in the OpenCV source tree at samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/Data/cookies_ORB.yml.
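As a rough sketch of what load() does (the YAML node names and member variables shown here are assumptions based on the description above; check cookies_ORB.yml and Model.cpp for the exact ones), reading the model back with cv::FileStorage looks like this:

/* Sketch: read the registered 3D points and descriptors from the YAML file */
void Model::load(const std::string &path)
{
    cv::Mat points3d_mat;

    cv::FileStorage storage(path, cv::FileStorage::READ);   // open the YAML file
    storage["points_3d"] >> points3d_mat;                   // assumed node: list of 3D coordinates
    storage["descriptors"] >> descriptors_;                 // assumed node: one ORB descriptor per 3D point
    storage.release();

    points3d_mat.copyTo(list_points3d_in_);                 // keep the 3D points as std::vector<cv::Point3f>
}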
In the main program the model is loaded as follows:

Model model;                // instantiate Model object
model.load(yml_read_path);  // load a 3D textured object model

To load the object mesh, the Mesh class was implemented; its load() function opens a *.ply file and reads the stored 3D vertices of the object as well as the triangles they compose. You can find an example in the OpenCV source tree at samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/Data/box.ply

/* Load a CSV with *.ply format */
void Mesh::load(const std::string path)
{
    // Create the reader
    CsvReader csvReader(path);

    // Clear previous data
    list_vertex_.clear();
    list_triangles_.clear();

    // Read from .ply file
    csvReader.readPLY(list_vertex_, list_triangles_);

    // Update mesh attributes
    num_vertexs_ = list_vertex_.size();
    num_triangles_ = list_triangles_.size();
}
In the main program the mesh is loaded like this:

Mesh mesh;                  // instantiate Mesh object
mesh.load(ply_read_path);   // load an object mesh
You can also load different models and meshes:
./cpp-tutorial-pnp_detection --mesh=/absolute_path_to_your_mesh.ply --model=/absolute_path_to_your_model.yml
2) Take input from a camera or a video file
Detection requires capturing video. This is done by loading a recorded video, passing the absolute path where it is located on your machine. To test the application, you can find a recorded video at samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/Data/box.mp4.
cv::VideoCapture cap;        // instantiate VideoCapture
cap.open(video_read_path);   // open a recorded video

if(!cap.isOpened())          // check if we succeeded
{
    std::cout << "Could not open the camera device" << std::endl;
    return -1;
}

The algorithm is then run frame by frame:

cv::Mat frame, frame_vis;

while(cap.read(frame) && cv::waitKey(30) != 27)   // capture frame until ESC is pressed
{
    frame_vis = frame.clone();                    // refresh visualisation frame

    // MAIN ALGORITHM
}
You can also load a different recorded video:

./cpp-tutorial-pnp_detection --video=/absolute_path_to_your_video.mp4
3) Extract ORB features and descriptors from the scene
The next step is to detect the scene features and extract their descriptors. For this task a RobustMatcher class was implemented, which has functions for keypoint detection and feature extraction. You can find it in samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/src/RobusMatcher.cpp. In your RobustMatcher object you can use any of the 2D feature detectors provided by OpenCV. In this case cv::ORB features are used, because ORB is based on cv::FAST for detecting the keypoints and on cv::xfeatures2d::BriefDescriptorExtractor for extracting the descriptors, which makes it fast and robust to rotations. You can find more detailed information about ORB in the documentation.
The following code shows how to instantiate and set the feature detector and the descriptor extractor:
RobustMatcher rmatcher;                                                      // instantiate RobustMatcher

cv::FeatureDetector * detector = new cv::OrbFeatureDetector(numKeyPoints);  // instantiate ORB feature detector
cv::DescriptorExtractor * extractor = new cv::OrbDescriptorExtractor();     // instantiate ORB descriptor extractor

rmatcher.setFeatureDetector(detector);                                      // set feature detector
rmatcher.setDescriptorExtractor(extractor);                                 // set descriptor extractor

The features and descriptors will be computed by the RobustMatcher inside the matching function.
4) Match scene descriptors with model descriptors using a FLANN matcher
This is the first step of our detection algorithm. The main idea is to match the scene descriptors with our model descriptors in order to know the 3D coordinates of the features found in the current scene.
Firstly, we have to choose which matcher to use. In this case cv::FlannBasedMatcher is used, which in terms of computational cost becomes faster than cv::BFMatcher as the trained collection of features grows. For the FLANN-based matcher, the index created is Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search, because ORB descriptors are binary.
You can tune the LSH and search parameters to improve the matching efficiency:
cv::Ptr<cv::flann::IndexParams> indexParams = cv::makePtr<cv::flann::LshIndexParams>(6, 12, 1); // instantiate LSH index parameters
cv::Ptr<cv::flann::SearchParams> searchParams = cv::makePtr<cv::flann::SearchParams>(50);       // instantiate flann search parameters

cv::DescriptorMatcher * matcher = new cv::FlannBasedMatcher(indexParams, searchParams);         // instantiate FlannBased matcher
rmatcher.setDescriptorMatcher(matcher);

Secondly, we have to call the matcher using either the robustMatch() or the fastRobustMatch() function. The difference between these two functions is their computational cost. The first method is slower but more robust at filtering good matches because it uses two ratio tests and a symmetry test. In contrast, the second method is faster but less robust because it only applies a single ratio test to the matches.
The following code gets the model 3D points and their descriptors and then calls the matcher in the main program:
// Get the MODEL INFO
std::vector<cv::Point3f> list_points3d_model = model.get_points3d();  // list with model 3D coordinates
cv::Mat descriptors_model = model.get_descriptors();                  // list with descriptors of each 3D coordinate

// -- Step 1: Robust matching between model descriptors and scene descriptors
std::vector<cv::DMatch> good_matches;       // to obtain the model 3D points in the scene
std::vector<cv::KeyPoint> keypoints_scene;  // to obtain the 2D points of the scene

if(fast_match)
{
    rmatcher.fastRobustMatch(frame, good_matches, keypoints_scene, descriptors_model);
}
else
{
    rmatcher.robustMatch(frame, good_matches, keypoints_scene, descriptors_model);
}
The following code corresponds to the robustMatch() function, which belongs to the RobustMatcher class. This function detects the keypoints and extracts the descriptors from the given image, matches the extracted descriptors with the given model descriptors (and vice versa) using two nearest neighbours, and then applies a ratio test to the matches in both directions in order to remove those matches whose distance ratio between the first and second best match is larger than a given threshold. Finally, a symmetry test is applied in order to remove non-symmetrical matches.

void RobustMatcher::robustMatch( const cv::Mat& frame, std::vector<cv::DMatch>& good_matches,
                                 std::vector<cv::KeyPoint>& keypoints_frame,
                                 const std::vector<cv::KeyPoint>& keypoints_model, const cv::Mat& descriptors_model )
{
    // 1a. Detection of the ORB features
    this->computeKeyPoints(frame, keypoints_frame);

    // 1b. Extraction of the ORB descriptors
    cv::Mat descriptors_frame;
    this->computeDescriptors(frame, keypoints_frame, descriptors_frame);

    // 2. Match the two image descriptors
    std::vector<std::vector<cv::DMatch> > matches12, matches21;

    // 2a. From image 1 to image 2
    matcher_->knnMatch(descriptors_frame, descriptors_model, matches12, 2); // return 2 nearest neighbours

    // 2b. From image 2 to image 1
    matcher_->knnMatch(descriptors_model, descriptors_frame, matches21, 2); // return 2 nearest neighbours

    // 3. Remove matches for which NN ratio is > than threshold
    // clean image 1 -> image 2 matches
    int removed1 = ratioTest(matches12);
    // clean image 2 -> image 1 matches
    int removed2 = ratioTest(matches21);

    // 4. Remove non-symmetrical matches
    symmetryTest(matches12, matches21, good_matches);
}

After the matches are filtered, we have to extract the 2D/3D correspondences from the found scene keypoints and our 3D model using the obtained DMatch vector. For more information about cv::DMatch check the documentation.
// -- Step 2: Find out the 2D/3D correspondences
std::vector<cv::Point3f> list_points3d_model_match;  // container for the model 3D coordinates found in the scene
std::vector<cv::Point2f> list_points2d_scene_match;  // container for the model 2D coordinates found in the scene

for(unsigned int match_index = 0; match_index < good_matches.size(); ++match_index)
{
    cv::Point3f point3d_model = list_points3d_model[ good_matches[match_index].trainIdx ]; // 3D point from model
    cv::Point2f point2d_scene = keypoints_scene[ good_matches[match_index].queryIdx ].pt;  // 2D point from the scene
    list_points3d_model_match.push_back(point3d_model);  // add 3D point
    list_points2d_scene_match.push_back(point2d_scene);  // add 2D point
}
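The ratioTest() helper called inside robustMatch() above is not listed in this tutorial. A minimal sketch of the idea (clearing knnMatch candidates whose best/second-best distance ratio exceeds the ratio_ threshold) could look like the following; it should be close to, but is not necessarily identical with, the sample's own implementation:

// Sketch of a ratio test over knnMatch results (k = 2); failed matches are cleared so they are ignored later.
int RobustMatcher::ratioTest(std::vector<std::vector<cv::DMatch> > &matches)
{
    int removed = 0;
    for (std::vector<std::vector<cv::DMatch> >::iterator it = matches.begin();
         it != matches.end(); ++it)
    {
        if (it->size() > 1)
        {
            // reject if the best match is not clearly better than the second best
            if ((*it)[0].distance / (*it)[1].distance > ratio_)
            {
                it->clear();
                removed++;
            }
        }
        else
        {
            it->clear();   // fewer than 2 neighbours: the test cannot be applied
            removed++;
        }
    }
    return removed;
}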
You can also change the ratio test threshold and the number of keypoints to detect, as well as enable or disable the robust matcher:
./cpp-tutorial-pnp_detection --ratio=0.8 --keypoints=1000 --fast=false

5) Pose estimation using PnP + RANSAC
Once we have the 2D/3D correspondences, we have to apply a PnP algorithm in order to estimate the camera pose. The reason for using cv::solvePnPRansac instead of cv::solvePnP is that, after the matching, not all of the found correspondences are correct; as likely as not, there are false correspondences, also called outliers. Random Sample Consensus (RANSAC) is a non-deterministic iterative method that estimates the parameters of a mathematical model from observed data, producing an increasingly good approximation as the number of iterations grows. After applying RANSAC, all the outliers are eliminated, and the camera pose is then estimated with a certain probability of obtaining a good solution.
For the camera pose estimation a PnPProblem class was implemented. This class has four attributes: a given calibration matrix, the rotation matrix, the translation matrix and the rotation-translation matrix. The intrinsic calibration parameters of the camera you are using to estimate the pose are required. To obtain them, see the Camera calibration with square chessboard and Camera calibration With OpenCV tutorials.
The following code shows how to declare the PnPProblem class in the main program:
// Intrinsic camera parameters: UVC WEBCAM
double f = 55;                    // focal length in mm
double sx = 22.3, sy = 14.9;      // sensor size
double width = 640, height = 480; // image size

double params_WEBCAM[] = { width*f/sx,   // fx
                           height*f/sy,  // fy
                           width/2,      // cx
                           height/2 };   // cy

PnPProblem pnp_detection(params_WEBCAM); // instantiate PnPProblem class

The following code shows how the PnPProblem class initialises its attributes:
// Custom constructor given the intrinsic camera parameters
PnPProblem::PnPProblem(const double params[])
{
    _A_matrix = cv::Mat::zeros(3, 3, CV_64FC1);  // intrinsic camera parameters
    _A_matrix.at<double>(0, 0) = params[0];      // [ fx   0  cx ]
    _A_matrix.at<double>(1, 1) = params[1];      // [  0  fy  cy ]
    _A_matrix.at<double>(0, 2) = params[2];      // [  0   0   1 ]
    _A_matrix.at<double>(1, 2) = params[3];
    _A_matrix.at<double>(2, 2) = 1;
    _R_matrix = cv::Mat::zeros(3, 3, CV_64FC1);  // rotation matrix
    _t_matrix = cv::Mat::zeros(3, 1, CV_64FC1);  // translation matrix
    _P_matrix = cv::Mat::zeros(3, 4, CV_64FC1);  // rotation-translation matrix
}

OpenCV provides four PnP methods: ITERATIVE, EPNP, P3P and DLS. The most suitable method depends on the application. For a real-time application, EPNP and P3P are preferable because they are faster than ITERATIVE and DLS at finding an optimal solution. However, EPNP and P3P are not especially robust on planar surfaces, and their pose estimates sometimes show a mirror effect. Therefore, this tutorial uses the ITERATIVE method, because the object to be detected has planar surfaces.
The OpenCV RANSAC implementation requires three parameters: the maximum number of iterations before stopping the algorithm, the maximum allowed distance between the observed and computed point projections for a point to be considered an inlier, and the confidence required to obtain a good result. You can tune these parameters to improve the algorithm's performance. Increasing the number of iterations gives a more accurate solution, but takes more time to find it. Increasing the reprojection error reduces the computation time, but the solution becomes less accurate. Decreasing the confidence makes the algorithm faster, but again the obtained solution is less accurate.
The following parameters work for this application:
// RANSAC parameters
int iterationsCount = 500;      // number of Ransac iterations.
float reprojectionError = 2.0;  // maximum allowed distance to consider it an inlier.
float confidence = 0.95;        // ransac successful confidence.

The following code corresponds to the estimatePoseRANSAC() function, which belongs to the PnPProblem class. Given a set of 2D/3D correspondences, the PnP method to use, the output inliers container and the RANSAC parameters, it estimates the rotation and translation matrices:
// Estimate the pose given a list of 2D/3D correspondences with RANSAC and the method to use
void PnPProblem::estimatePoseRANSAC( const std::vector<cv::Point3f> &list_points3d, // list with model 3D coordinates
                                     const std::vector<cv::Point2f> &list_points2d, // list with scene 2D coordinates
                                     int flags, cv::Mat &inliers, int iterationsCount, // PnP method; inliers container
                                     float reprojectionError, float confidence )       // Ransac parameters
{
    cv::Mat distCoeffs = cv::Mat::zeros(4, 1, CV_64FC1);  // vector of distortion coefficients
    cv::Mat rvec = cv::Mat::zeros(3, 1, CV_64FC1);        // output rotation vector
    cv::Mat tvec = cv::Mat::zeros(3, 1, CV_64FC1);        // output translation vector

    bool useExtrinsicGuess = false;  // if true the function uses the provided rvec and tvec values as
                                     // initial approximations of the rotation and translation vectors

    cv::solvePnPRansac( list_points3d, list_points2d, _A_matrix, distCoeffs, rvec, tvec,
                        useExtrinsicGuess, iterationsCount, reprojectionError, confidence,
                        inliers, flags );

    Rodrigues(rvec, _R_matrix);                // converts Rotation Vector to Matrix
    _t_matrix = tvec;                          // set translation matrix

    this->set_P_matrix(_R_matrix, _t_matrix);  // set rotation-translation matrix
}
The following code implements the 3rd and 4th steps of the main algorithm: first calling the above function, then using the inliers vector returned by RANSAC to collect the 2D scene points for drawing purposes. As seen in the code, we must make sure to apply RANSAC only if we have matches; otherwise cv::solvePnPRansac crashes due to an OpenCV bug.

if(good_matches.size() > 0) // None matches, then RANSAC crashes
{
    // -- Step 3: Estimate the pose using RANSAC approach
    pnp_detection.estimatePoseRANSAC( list_points3d_model_match, list_points2d_scene_match,
                                      pnpMethod, inliers_idx, iterationsCount, reprojectionError, confidence );

    // -- Step 4: Catch the inliers keypoints to draw
    for(int inliers_index = 0; inliers_index < inliers_idx.rows; ++inliers_index)
    {
        int n = inliers_idx.at<int>(inliers_index);          // i-inlier
        cv::Point2f point2d = list_points2d_scene_match[n];  // i-inlier point 2D
        list_points2d_inliers.push_back(point2d);            // add i-inlier to list
    }
}
Finally, once the camera pose has been estimated, we can use R and t to compute the 2D projection onto the image of a given 3D point expressed in a world reference frame, using the formula shown in the Theory section.
The following code corresponds to the backproject3DPoint() function, which belongs to the PnPProblem class. It backprojects a given 3D point, expressed in a world reference frame, onto the 2D image:
// Backproject a 3D point to 2D using the estimated pose parameters
cv::Point2f PnPProblem::backproject3DPoint(const cv::Point3f &point3d)
{
    // 3D point vector [x y z 1]'
    cv::Mat point3d_vec = cv::Mat(4, 1, CV_64FC1);
    point3d_vec.at<double>(0) = point3d.x;
    point3d_vec.at<double>(1) = point3d.y;
    point3d_vec.at<double>(2) = point3d.z;
    point3d_vec.at<double>(3) = 1;

    // 2D point vector [u v 1]'
    cv::Mat point2d_vec = cv::Mat(4, 1, CV_64FC1);
    point2d_vec = _A_matrix * _P_matrix * point3d_vec;

    // Normalization of [u v]'
    cv::Point2f point2d;
    point2d.x = point2d_vec.at<double>(0) / point2d_vec.at<double>(2);
    point2d.y = point2d_vec.at<double>(1) / point2d_vec.at<double>(2);

    return point2d;
}

The above function is used to backproject all the 3D points of the object mesh in order to show the pose of the object.
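For instance, the mesh overlay seen later in the drawing step can be produced by backprojecting every triangle of the mesh and drawing its edges. The sketch below illustrates the idea; the Mesh accessors (getTrianglesList(), getVertex()) are assumptions about the sample's interface rather than a verbatim copy of its Utils.cpp.

// Sketch: draw the object mesh on the image by backprojecting each triangle with the estimated pose
void drawObjectMesh(cv::Mat image, Mesh *mesh, PnPProblem *pnpProblem, cv::Scalar color)
{
    std::vector<std::vector<int> > list_triangles = mesh->getTrianglesList(); // assumed accessor
    for (size_t i = 0; i < list_triangles.size(); i++)
    {
        std::vector<int> tmp_triangle = list_triangles.at(i);

        // 3D vertices of the current triangle (assumed accessor)
        cv::Point3f point_3d_0 = mesh->getVertex(tmp_triangle[0]);
        cv::Point3f point_3d_1 = mesh->getVertex(tmp_triangle[1]);
        cv::Point3f point_3d_2 = mesh->getVertex(tmp_triangle[2]);

        // backproject them to the image plane with the estimated pose
        cv::Point2f point_2d_0 = pnpProblem->backproject3DPoint(point_3d_0);
        cv::Point2f point_2d_1 = pnpProblem->backproject3DPoint(point_3d_1);
        cv::Point2f point_2d_2 = pnpProblem->backproject3DPoint(point_3d_2);

        // draw the triangle edges on the visualisation frame
        cv::line(image, point_2d_0, point_2d_1, color, 1);
        cv::line(image, point_2d_1, point_2d_2, color, 1);
        cv::line(image, point_2d_2, point_2d_0, color, 1);
    }
}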
You can also change RANSAC parameters and PnP method:
./cpp-tutorial-pnp_detection --error=0.25 --confidence=0.90 --iterations=250 --method=3

6) Linear Kalman Filter for bad poses rejection
It is common in the computer vision and robotics fields that, after applying detection or tracking techniques, bad results are obtained due to sensor errors. To avoid these bad detections, this tutorial explains how to implement a linear Kalman filter. The Kalman filter is applied only after a given number of inliers has been detected.
You can find more background on what a Kalman filter is elsewhere. In this tutorial the OpenCV implementation, cv::KalmanFilter, is used, based on a linear Kalman filter for position and orientation tracking, to set the dynamics and measurement models.
Firstly, we have to define our state vector, which has 18 states: the position (x, y, z) with its first and second derivatives (velocity and acceleration), followed by the rotation in the form of three Euler angles (roll, pitch, yaw) together with their first and second derivatives (angular velocity and acceleration):
\[
X = (x, y, z, \dot{x}, \dot{y}, \dot{z}, \ddot{x}, \ddot{y}, \ddot{z}, \psi, \theta, \phi, \dot{\psi}, \dot{\theta}, \dot{\phi}, \ddot{\psi}, \ddot{\theta}, \ddot{\phi})^T
\]

Secondly, we have to define the number of measurements, which is 6: from R and t we can extract (x, y, z) and (ψ, θ, φ). In addition, we have to define the number of control actions applied to the system, which in this case is zero. Finally, we have to define the time step between measurements, which is 1/T, where T is the frame rate of the video.
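In the code below dt is hard-coded to 0.125 s; it could instead be derived from the frame rate reported by the cv::VideoCapture opened earlier (a small sketch, falling back to the tutorial's value when the container does not report an FPS):

double fps = cap.get(cv::CAP_PROP_FPS);     // frame rate of the recorded video, 0 if unknown
double dt = (fps > 0) ? 1.0 / fps : 0.125;  // time between measurements, e.g. 8 FPS -> 0.125 s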
cv::KalmanFilter KF;    // instantiate Kalman Filter
int nStates = 18;       // the number of states
int nMeasurements = 6;  // the number of measured states
int nInputs = 0;        // the number of action control
double dt = 0.125;      // time between measurements (1/FPS)

initKalmanFilter(KF, nStates, nMeasurements, nInputs, dt); // init function

The following code corresponds to the Kalman filter initialisation. First the process noise, the measurement noise and the error covariance matrix are set. Then the transition matrix (the dynamic model) is set, and finally the measurement matrix (the measurement model).
You can tune the process and measurement noise to improve the Kalman filter's performance. The smaller the measurement noise, the faster the filter converges, but it also becomes more sensitive to bad measurements.
void initKalmanFilter(cv::KalmanFilter &KF, int nStates, int nMeasurements, int nInputs, double dt)
{
    KF.init(nStates, nMeasurements, nInputs, CV_64F);                // init Kalman Filter

    cv::setIdentity(KF.processNoiseCov, cv::Scalar::all(1e-5));      // set process noise
    cv::setIdentity(KF.measurementNoiseCov, cv::Scalar::all(1e-4));  // set measurement noise
    cv::setIdentity(KF.errorCovPost, cv::Scalar::all(1));            // error covariance

    /* DYNAMIC MODEL */
    //  [1 0 0 dt  0  0 dt2   0   0 0 0 0  0  0  0   0   0   0]
    //  [0 1 0  0 dt  0   0 dt2   0 0 0 0  0  0  0   0   0   0]
    //  [0 0 1  0  0 dt   0   0 dt2 0 0 0  0  0  0   0   0   0]
    //  [0 0 0  1  0  0  dt   0   0 0 0 0  0  0  0   0   0   0]
    //  [0 0 0  0  1  0   0  dt   0 0 0 0  0  0  0   0   0   0]
    //  [0 0 0  0  0  1   0   0  dt 0 0 0  0  0  0   0   0   0]
    //  [0 0 0  0  0  0   1   0   0 0 0 0  0  0  0   0   0   0]
    //  [0 0 0  0  0  0   0   1   0 0 0 0  0  0  0   0   0   0]
    //  [0 0 0  0  0  0   0   0   1 0 0 0  0  0  0   0   0   0]
    //  [0 0 0  0  0  0   0   0   0 1 0 0 dt  0  0 dt2   0   0]
    //  [0 0 0  0  0  0   0   0   0 0 1 0  0 dt  0   0 dt2   0]
    //  [0 0 0  0  0  0   0   0   0 0 0 1  0  0 dt   0   0 dt2]
    //  [0 0 0  0  0  0   0   0   0 0 0 0  1  0  0  dt   0   0]
    //  [0 0 0  0  0  0   0   0   0 0 0 0  0  1  0   0  dt   0]
    //  [0 0 0  0  0  0   0   0   0 0 0 0  0  0  1   0   0  dt]
    //  [0 0 0  0  0  0   0   0   0 0 0 0  0  0  0   1   0   0]
    //  [0 0 0  0  0  0   0   0   0 0 0 0  0  0  0   0   1   0]
    //  [0 0 0  0  0  0   0   0   0 0 0 0  0  0  0   0   0   1]

    // position
    KF.transitionMatrix.at<double>(0,3) = dt;
    KF.transitionMatrix.at<double>(1,4) = dt;
    KF.transitionMatrix.at<double>(2,5) = dt;
    KF.transitionMatrix.at<double>(3,6) = dt;
    KF.transitionMatrix.at<double>(4,7) = dt;
    KF.transitionMatrix.at<double>(5,8) = dt;
    KF.transitionMatrix.at<double>(0,6) = 0.5*pow(dt,2);
    KF.transitionMatrix.at<double>(1,7) = 0.5*pow(dt,2);
    KF.transitionMatrix.at<double>(2,8) = 0.5*pow(dt,2);

    // orientation
    KF.transitionMatrix.at<double>(9,12) = dt;
    KF.transitionMatrix.at<double>(10,13) = dt;
    KF.transitionMatrix.at<double>(11,14) = dt;
    KF.transitionMatrix.at<double>(12,15) = dt;
    KF.transitionMatrix.at<double>(13,16) = dt;
    KF.transitionMatrix.at<double>(14,17) = dt;
    KF.transitionMatrix.at<double>(9,15) = 0.5*pow(dt,2);
    KF.transitionMatrix.at<double>(10,16) = 0.5*pow(dt,2);
    KF.transitionMatrix.at<double>(11,17) = 0.5*pow(dt,2);

    /* MEASUREMENT MODEL */
    //  [1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
    //  [0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
    //  [0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
    //  [0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0]
    //  [0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
    //  [0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0]

    KF.measurementMatrix.at<double>(0,0) = 1;  // x
    KF.measurementMatrix.at<double>(1,1) = 1;  // y
    KF.measurementMatrix.at<double>(2,2) = 1;  // z
    KF.measurementMatrix.at<double>(3,9) = 1;  // roll
    KF.measurementMatrix.at<double>(4,10) = 1; // pitch
    KF.measurementMatrix.at<double>(5,11) = 1; // yaw
}

The following code implements the 5th step of the main algorithm. When the number of inliers obtained after RANSAC is above the threshold, the measurements matrix is filled and the Kalman filter is then updated:

// -- Step 5: Kalman Filter

// GOOD MEASUREMENT
if( inliers_idx.rows >= minInliersKalman )
{
    // Get the measured translation
    cv::Mat translation_measured(3, 1, CV_64F);
    translation_measured = pnp_detection.get_t_matrix();

    // Get the measured rotation
    cv::Mat rotation_measured(3, 3, CV_64F);
    rotation_measured = pnp_detection.get_R_matrix();

    // fill the measurements vector
    fillMeasurements(measurements, translation_measured, rotation_measured);
}

// Instantiate estimated translation and rotation
cv::Mat translation_estimated(3, 1, CV_64F);
cv::Mat rotation_estimated(3, 3, CV_64F);

// update the Kalman filter with good measurements
updateKalmanFilter( KF, measurements,
                    translation_estimated, rotation_estimated);
The following code corresponds to the fillMeasurements() function, which converts the measured rotation matrix to Euler angles and fills the measurements matrix together with the measured translation vector:

void fillMeasurements( cv::Mat &measurements,
                       const cv::Mat &translation_measured, const cv::Mat &rotation_measured)
{
    // Convert rotation matrix to euler angles
    cv::Mat measured_eulers(3, 1, CV_64F);
    measured_eulers = rot2euler(rotation_measured);

    // Set measurement to predict
    measurements.at<double>(0) = translation_measured.at<double>(0); // x
    measurements.at<double>(1) = translation_measured.at<double>(1); // y
    measurements.at<double>(2) = translation_measured.at<double>(2); // z
    measurements.at<double>(3) = measured_eulers.at<double>(0);      // roll
    measurements.at<double>(4) = measured_eulers.at<double>(1);      // pitch
    measurements.at<double>(5) = measured_eulers.at<double>(2);      // yaw
}
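The rot2euler() conversion used above is implemented in the sample's Utils.cpp and is not listed here. A sketch of one common convention (R = Rz(yaw)·Ry(pitch)·Rx(roll)) is given below; the actual sample may use a different angle ordering, so treat this as illustrative only:

#include <cmath>
#include <opencv2/core.hpp>

// Sketch: rotation matrix -> Euler angles, assuming R = Rz(yaw) * Ry(pitch) * Rx(roll)
cv::Mat rot2euler(const cv::Mat &R)
{
    cv::Mat euler(3, 1, CV_64F);
    euler.at<double>(0) = std::atan2(R.at<double>(2,1), R.at<double>(2,2)); // roll
    euler.at<double>(1) = std::asin(-R.at<double>(2,0));                    // pitch
    euler.at<double>(2) = std::atan2(R.at<double>(1,0), R.at<double>(0,0)); // yaw
    return euler;
}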
The following code corresponds to the updateKalmanFilter() function, which updates the Kalman filter and sets the estimated rotation matrix and translation vector. The estimated rotation matrix is built from the estimated Euler angles:

void updateKalmanFilter( cv::KalmanFilter &KF, cv::Mat &measurement,
                         cv::Mat &translation_estimated, cv::Mat &rotation_estimated )
{
    // First predict, to update the internal statePre variable
    cv::Mat prediction = KF.predict();

    // The "correct" phase that is going to use the predicted value and our measurement
    cv::Mat estimated = KF.correct(measurement);

    // Estimated translation
    translation_estimated.at<double>(0) = estimated.at<double>(0);
    translation_estimated.at<double>(1) = estimated.at<double>(1);
    translation_estimated.at<double>(2) = estimated.at<double>(2);

    // Estimated euler angles
    cv::Mat eulers_estimated(3, 1, CV_64F);
    eulers_estimated.at<double>(0) = estimated.at<double>(9);
    eulers_estimated.at<double>(1) = estimated.at<double>(10);
    eulers_estimated.at<double>(2) = estimated.at<double>(11);

    // Convert estimated euler angles to rotation matrix
    rotation_estimated = euler2rot(eulers_estimated);
}
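Likewise, the euler2rot() helper is not listed in this tutorial. A sketch consistent with the rot2euler() convention sketched above (again, an assumption about the actual Utils.cpp implementation) would be:

#include <cmath>
#include <opencv2/core.hpp>

// Sketch: Euler angles -> rotation matrix, R = Rz(yaw) * Ry(pitch) * Rx(roll)
cv::Mat euler2rot(const cv::Mat &euler)
{
    double roll  = euler.at<double>(0);
    double pitch = euler.at<double>(1);
    double yaw   = euler.at<double>(2);
    double cr = std::cos(roll),  sr = std::sin(roll);
    double cp = std::cos(pitch), sp = std::sin(pitch);
    double cy = std::cos(yaw),   sy = std::sin(yaw);

    cv::Mat R = (cv::Mat_<double>(3,3) <<
        cy*cp,  cy*sp*sr - sy*cr,  cy*sp*cr + sy*sr,
        sy*cp,  sy*sp*sr + cy*cr,  sy*sp*cr - cy*sr,
        -sp,    cp*sr,             cp*cr);
    return R;
}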
The 6th step is to set the estimated rotation-translation matrix:
// -- Step 6: Set estimated projection matrix
pnp_detection_est.set_P_matrix(rotation_estimated, translation_estimated);

// -- Step X: Draw pose
drawObjectMesh(frame_vis, &mesh, &pnp_detection, green);      // draw current pose
drawObjectMesh(frame_vis, &mesh, &pnp_detection_est, yellow); // draw estimated pose

double l = 5;
std::vector<cv::Point2f> pose_points2d;
pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(0,0,0))); // axis center
pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(l,0,0))); // axis x
pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(0,l,0))); // axis y
pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(0,0,l))); // axis z
draw3DCoordinateAxes(frame_vis, pose_points2d);
You can also modify the minimum number of inliers required to update the Kalman filter:
./cpp-tutorial-pnp_detection --inliers=20

Results
The following video shows the result of real-time pose estimation using the detection algorithm and parameters explained above:
The resulting video can be found on YouTube.