[Face Pose] Two approaches to 2D face pose estimation: solvePnP and 3DMM parameters
Results first:
(Result images: face, man1)
1. solvePnP pose estimation
1.1 Introduction
Pose estimation here means estimating the orientation of the face relative to the camera. The key is to find the mapping between 2D pixel points and the corresponding 3D points. That mapping is the combination of a rotation and a translation. We first give the 3D-to-3D relation, i.e. the transformation from the world coordinate system to the camera coordinate system (the camera extrinsics); this transformation is exactly the orientation of the face relative to the camera:

$$\begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix} = R \begin{bmatrix} X_w \\ Y_w \\ Z_w \end{bmatrix} + t$$

However, we do not know the coordinates of the points in the camera frame, so we also need the relation between camera 3D points and 2D image points (the camera intrinsics):

$$s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix}$$

where $f_x, f_y$ are the focal lengths and $(c_x, c_y)$ is the optical center (we ignore lens distortion for now). Composing the two gives the 2D-to-3D relation:

$$s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \, [R \mid t] \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}$$

Expanded:

$$s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}$$
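To make the chain concrete, here is a minimal numpy sketch that projects one 3D world point to pixel coordinates via $K[R \mid t]$. The intrinsics, rotation and translation values are made up purely for illustration:

```python
import numpy as np
import cv2

# Hypothetical intrinsics for a 640x480 image, focal length in pixels
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Hypothetical extrinsics: small rotation about y, object 1000 units in front
rvec = np.array([0.0, 0.1, 0.0])
R, _ = cv2.Rodrigues(rvec)           # rotation vector -> 3x3 rotation matrix
t = np.array([0.0, 0.0, 1000.0])

X_w = np.array([-225.0, 170.0, -135.0])  # e.g. a left eye corner in the 3D model

X_c = R @ X_w + t        # world frame -> camera frame
uvw = K @ X_c            # camera frame -> homogeneous pixel coordinates
u, v = uvw[:2] / uvw[2]  # divide by s = Z_c to get pixel coordinates
print(u, v)
```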
1.2 Intrinsic calibration
The intrinsic matrix can be obtained by calibrating the camera with a chessboard pattern:
```python
import cv2
import numpy as np
import glob

# Sub-pixel corner refinement criteria: stop after at most 30 iterations
# or when the error drops below 0.001
criteria = (cv2.TERM_CRITERIA_MAX_ITER | cv2.TERM_CRITERIA_EPS, 30, 0.001)

# Object points of the chessboard corners. The world frame is attached to
# the board, so all Z coordinates are 0 and only x and y need to be filled in.
objp = np.zeros((4 * 6, 3), np.float32)
objp[:, :2] = np.mgrid[0:6, 0:4].T.reshape(-1, 2)

obj_points = []  # 3D points
img_points = []  # 2D points

images = glob.glob("image4/*.jpg")
i = 0
for fname in images:
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    size = gray.shape[::-1]
    ret, corners = cv2.findChessboardCorners(gray, (6, 4), None)
    if ret:
        obj_points.append(objp)
        # Refine the detected corners to sub-pixel accuracy
        corners2 = cv2.cornerSubPix(gray, corners, (5, 5), (-1, -1), criteria)
        if corners2 is not None:
            img_points.append(corners2)
        else:
            img_points.append(corners)
        cv2.drawChessboardCorners(img, (6, 4), corners, ret)  # draws in place
        i += 1
        cv2.imwrite('conimg' + str(i) + '.jpg', img)
        cv2.waitKey(1500)

print(len(img_points))
cv2.destroyAllWindows()

# Calibrate
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points, size, None, None)

print("ret:", ret)
print("mtx:\n", mtx)      # intrinsic matrix
print("dist:\n", dist)    # distortion coefficients (k_1, k_2, p_1, p_2, k_3)
print("rvecs:\n", rvecs)  # rotation vectors (extrinsics)
print("tvecs:\n", tvecs)  # translation vectors (extrinsics)
print("-----------------------------------------------------")

img = cv2.imread(images[2])
h, w = img.shape[:2]
# Keep the full field of view (undistortion normally crops part of the image)
newcameramtx, roi = cv2.getOptimalNewCameraMatrix(mtx, dist, (w, h), 1, (w, h))
print(newcameramtx)

print("------------------ undistort ------------------")
dst = cv2.undistort(img, mtx, dist, None, newcameramtx)
x, y, w, h = roi
dst1 = dst[y:y + h, x:x + w]
cv2.imwrite('calibresult3.jpg', dst1)
print("undistorted image size:", dst1.shape)
```

1.3 Estimation method
The Euler angles of the pose are derived from the rotation matrix, which is obtained with the solvePnP algorithm from the correspondence between 2D image points and 3D model points: 2D image points relate to 3D camera coordinates through the intrinsic matrix, while the transformation from the 3D camera frame to the 3D world frame consists of R (rotation matrix) and T (translation vector). The rotation matrix can then be converted into Euler angles, i.e. the angles around the horizontal, vertical and depth axes. The concrete steps:

1) Define a 3D face model with n landmarks; n can be chosen according to how much accuracy you need. The example below uses a 6-landmark 3D face model (left eye corner, right eye corner, nose tip, left mouth corner, right mouth corner, chin);

2) Run face detection and facial landmark detection to obtain the 2D landmarks corresponding to those 3D model points;

3) Solve for the rotation vector with OpenCV's solvePnP function;

4) Convert the rotation vector into Euler angles.
OpenCV provides both solvePnP and solvePnPRansac for pose estimation. solvePnPRansac applies random sample consensus (RANSAC), which gives a more robust pose but is slower and unsuitable for real-time use, so we focus on solvePnP here; see the OpenCV documentation for the parameter details.

Note 1: solvePnP offers three solvers: P3P, EPnP, and the iterative method (the default). In OpenCV 2 the corresponding flags are CV_P3P, CV_EPNP and CV_ITERATIVE (OpenCV 3 adds DLS and UPnP solvers).

Note 2: solvePnP needs at least 3 point pairs. P3P uses exactly 4: 3 pairs produce multiple candidate solutions and the 4th selects the best one. EPnP works with 3 or more pairs. The iterative method calls cvFindExtrinsicCameraParams2, which uses an SVD decomposition and calls cvFindHomography, and cvFindHomography needs at least 4 point pairs.
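As an aside: the implementation below converts the rotation vector to Euler angles via a quaternion, but an alternative route is to convert it to a rotation matrix with cv2.Rodrigues and decompose that matrix. A minimal sketch, assuming `rvec` is the rotation vector returned by solvePnP:

```python
import cv2

def rvec_to_euler_degrees(rvec):
    # Rotation vector -> 3x3 rotation matrix
    rmat, _ = cv2.Rodrigues(rvec)
    # RQDecomp3x3 returns the three Euler angles (in degrees) as its first value
    angles, mtxR, mtxQ, Qx, Qy, Qz = cv2.RQDecomp3x3(rmat)
    return angles  # (pitch, yaw, roll)
```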
Implementation:
```python
import numpy as np
import math
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("./point68.dat")

# 3D model points.
model_points = np.array([
    (0.0, 0.0, 0.0),           # Nose tip
    (0.0, -330.0, -65.0),      # Chin
    (-225.0, 170.0, -135.0),   # Left eye left corner
    (225.0, 170.0, -135.0),    # Right eye right corner
    (-150.0, -150.0, -125.0),  # Left mouth corner
    (150.0, -150.0, -125.0)    # Right mouth corner
])

# Convert a rotation vector to Euler angles
def get_euler_angle(rotation_vector):
    # rotation angle
    theta = cv2.norm(rotation_vector, cv2.NORM_L2)

    # transform to a quaternion
    w = math.cos(theta / 2)
    x = math.sin(theta / 2) * rotation_vector[0][0] / theta
    y = math.sin(theta / 2) * rotation_vector[1][0] / theta
    z = math.sin(theta / 2) * rotation_vector[2][0] / theta

    ysqr = y * y

    # pitch (x-axis rotation)
    t0 = 2.0 * (w * x + y * z)
    t1 = 1.0 - 2.0 * (x * x + ysqr)
    print('t0:{}, t1:{}'.format(t0, t1))
    pitch = math.atan2(t0, t1)

    # yaw (y-axis rotation)
    t2 = 2.0 * (w * y - z * x)
    if t2 > 1.0:
        t2 = 1.0
    if t2 < -1.0:
        t2 = -1.0
    yaw = math.asin(t2)

    # roll (z-axis rotation)
    t3 = 2.0 * (w * z + x * y)
    t4 = 1.0 - 2.0 * (ysqr + z * z)
    roll = math.atan2(t3, t4)

    print('pitch:{}, yaw:{}, roll:{}'.format(pitch, yaw, roll))

    # convert radians to degrees
    Y = int((pitch / math.pi) * 180)
    X = int((yaw / math.pi) * 180)
    Z = int((roll / math.pi) * 180)
    return X, Y, Z

cap = cv2.VideoCapture(0)
while True:
    flag, img = cap.read()
    size = img.shape

    # Approximate camera internals: focal length = image width, principal
    # point = image center
    focal_length = size[1]
    center = (size[1] / 2, size[0] / 2)
    camera_matrix = np.array([
        [focal_length, 0, center[0]],
        [0, focal_length, center[1]],
        [0, 0, 1]
    ], dtype="double")

    # grayscale
    img_gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    # detected faces
    rects = detector(img_gray, 0)

    point_list = []
    if len(rects) > 0:
        landmarks = list((p.x, p.y) for p in predictor(img, rects[0]).parts())
        point_list.append(landmarks[30])  # nose tip
        point_list.append(landmarks[8])   # chin
        point_list.append(landmarks[36])  # left eye left corner
        point_list.append(landmarks[45])  # right eye right corner
        point_list.append(landmarks[48])  # left mouth corner
        point_list.append(landmarks[54])  # right mouth corner
        print(point_list)

        for idx, point in enumerate(point_list):
            cv2.circle(img, point, 2, color=(0, 255, 0))

        image_points = np.array(point_list, dtype="double")
        dist_coeffs = np.zeros((4, 1))  # assuming no lens distortion
        success, rotation_vector, translation_vector = cv2.solvePnP(
            model_points, image_points, camera_matrix, dist_coeffs,
            flags=cv2.SOLVEPNP_ITERATIVE)  # or SOLVEPNP_P3P

        X1, Y1, Z1 = get_euler_angle(rotation_vector)
        print("=====================>", X1, Y1, Z1)

        # Project a 3D point (0, 0, 1000.0) onto the image plane.
        # We use this to draw a line sticking out of the nose.
        nose_end_point2D, jacobian = cv2.projectPoints(
            np.array([(0.0, 0.0, 1000.0)]), rotation_vector,
            translation_vector, camera_matrix, dist_coeffs)

        p1 = (int(image_points[0][0]), int(image_points[0][1]))
        p2 = (int(nose_end_point2D[0][0][0]), int(nose_end_point2D[0][0][1]))
        cv2.line(img, p1, p2, (255, 0, 0), 2)

    cv2.imshow("src", img)
    cv2.waitKey(50)
    cv2.imwrite("dst.jpg", img)

cap.release()
```

1.4 Detection results:
(Result image: pnp)
2. Face pose estimation with 3DMM parameters
2.1 Parameter estimation
3DMM parameters are commonly used for face reconstruction: a fixed standard (mean) model is deformed by a linear transformation into a face model with a specific shape and texture. An arbitrary face model can be expressed as a weighted combination over the m face models of the dataset (after PCA, the mean plus weighted basis offsets):

$$S = \bar{S} + \sum_{i=1}^{m} \alpha_i s_i, \qquad T = \bar{T} + \sum_{i=1}^{m} \beta_i t_i$$

where $\bar{S}$, $\bar{T}$ are the mean shape and texture, $s_i$, $t_i$ the basis offsets, and $\alpha_i$, $\beta_i$ the weights.
Here "shape" covers both pose and expression.
?這里我們加載標準模型和CNN模型推理出3DMM參數(估計出的模型要根據標準模型的方差和平均模型估算),這里包括62個3DMM參數(12個姿態向量,40個姿態向量,10個表情向量),然后選用3DMM參數中姿態向量轉化提取出旋轉向量,再用旋轉向量轉化為歐拉角就得出姿態。
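For orientation, here is a sketch of how the 62 regressed parameters are typically split and used to rebuild a dense face shape. The names `u_base`, `w_shp`, `w_exp` and the vertex layout are assumptions for illustration; in 3DDFA-style code they come from the morphable model files:

```python
import numpy as np

def reconstruct_vertices(param, u_base, w_shp, w_exp):
    """Rebuild 3D vertices from a 62-dim 3DMM parameter vector.

    param: (62,) = 12 pose + 40 shape + 10 expression parameters.
    u_base: (3N, 1) mean shape; w_shp: (3N, 40), w_exp: (3N, 10) PCA bases.
    """
    R_t = param[:12].reshape(3, 4)           # pose: scaled rotation + translation
    alpha_shp = param[12:52].reshape(-1, 1)  # 40 shape coefficients
    alpha_exp = param[52:62].reshape(-1, 1)  # 10 expression coefficients

    # Linear combination: mean shape + shape basis + expression basis
    vertices = (u_base + w_shp @ alpha_shp + w_exp @ alpha_exp).reshape(-1, 3).T  # (3, N)

    # Apply the pose: scaled rotation plus translation
    vertices = R_t[:, :3] @ vertices + R_t[:, 3:]
    return vertices
```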
```python
# Crop the image and forward it through the network to get the 62 parameters
# (excerpt from the model's inference routine)
param_lst = []
roi_box_lst = []

crop_policy = kvs.get('crop_policy', 'box')
for obj in objs:
    if crop_policy == 'box':
        # crop by face box
        roi_box = parse_roi_box_from_bbox(obj)
    elif crop_policy == 'landmark':
        # crop by landmarks
        roi_box = parse_roi_box_from_landmark(obj)
    else:
        raise ValueError(f'Unknown crop policy {crop_policy}')

    roi_box_lst.append(roi_box)
    img = crop_img(img_ori, roi_box)
    img = cv2.resize(img, dsize=(self.size, self.size), interpolation=cv2.INTER_LINEAR)
    inp = self.transform(img).unsqueeze(0)

    if self.gpu_mode:
        inp = inp.cuda(device=self.gpu_id)

    if kvs.get('timer_flag', False):
        end = time.time()
        param = self.model(inp)
        elapse = f'Inference: {(time.time() - end) * 1000:.1f}ms'
        print(elapse)
    else:
        param = self.model(inp)

    param = param.squeeze().cpu().numpy().flatten().astype(np.float32)
    param = param * self.param_std + self.param_mean  # re-scale
    param_lst.append(param)
```

2.2 Parameter conversion
The 12 pose parameters are reshaped into a 3×4 camera matrix, decomposed, and the rotation part converted into Euler angles, using the helper functions below.
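First, a hypothetical usage sketch (`param` is the rescaled 62-dim vector from the previous step; `calc_pose` is defined just below):

```python
# Decompose the pose part of the 62-dim parameter vector into Euler angles
P, pose = calc_pose(param)
yaw, pitch, roll = pose  # degrees, per matrix2angle's return order
print(f'yaw: {yaw:.1f}, pitch: {pitch:.1f}, roll: {roll:.1f}')
```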
```python
from math import asin, atan2, cos
import numpy as np

def P2sRt(P):
    """Decompose an affine camera matrix P.

    Args:
        P: (3, 4). Affine camera matrix.
    Returns:
        s: scale factor.
        R: (3, 3). rotation matrix.
        t3d: (3,). 3d translation.
    """
    t3d = P[:, 3]
    R1 = P[0:1, :3]
    R2 = P[1:2, :3]
    s = (np.linalg.norm(R1) + np.linalg.norm(R2)) / 2.0
    r1 = R1 / np.linalg.norm(R1)
    r2 = R2 / np.linalg.norm(R2)
    r3 = np.cross(r1, r2)
    R = np.concatenate((r1, r2, r3), 0)
    return s, R, t3d

def matrix2angle(R):
    """Compute three Euler angles from a rotation matrix.

    Ref: http://www.gregslabaugh.net/publications/euler.pdf
    Refined by: https://stackoverflow.com/questions/43364900/rotation-matrix-to-euler-angles-with-opencv

    Args:
        R: (3, 3). rotation matrix.
    Returns:
        x: yaw
        y: pitch
        z: roll
    """
    if R[2, 0] > 0.998:
        # gimbal lock: looking straight up/down
        z = 0
        x = np.pi / 2
        y = z + atan2(-R[0, 1], -R[0, 2])
    elif R[2, 0] < -0.998:
        z = 0
        x = -np.pi / 2
        y = -z + atan2(R[0, 1], R[0, 2])
    else:
        x = asin(R[2, 0])
        y = atan2(R[2, 1] / cos(x), R[2, 2] / cos(x))
        z = atan2(R[1, 0] / cos(x), R[0, 0] / cos(x))
    return x, y, z

def calc_pose(param):
    P = param[:12].reshape(3, -1)  # camera matrix
    s, R, t3d = P2sRt(P)
    P = np.concatenate((R, t3d.reshape(3, -1)), axis=1)  # without scale
    pose = matrix2angle(R)
    pose = [p * 180 / np.pi for p in pose]  # radians -> degrees
    return P, pose
```

2.3 Detection results
(Result image: face)
Summary

Both routes end at the same final step: obtain a rotation, either from solvePnP's rotation vector or from the 12 pose parameters of the 3DMM, and convert it into yaw, pitch and roll.