Training
The discussion here mainly follows the MATLAB implementation of 3000fps.
Overall goal
- For each of the 68 landmarks we need to train 5 random trees, each 4 levels deep. Together these trees form the so-called random forest.
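The sizes implied by this goal can be worked out directly. The sketch below is written in Python rather than MATLAB so that it runs standalone. It assumes that the root counts as depth 1 (as `rf{t}.depth(1) = 1` in the training code suggests) and that a tree is at most a full binary tree; the function name is made up for illustration.

```python
def forest_dimensions(num_landmarks=68, trees_per_landmark=5, max_depth=4):
    """Return (total_trees, max_nodes_per_tree, max_leaves_per_tree).

    Assumes the root sits at depth 1 and trees are at most full binary
    trees, so depth d allows 2**d - 1 nodes and 2**(d - 1) leaves.
    """
    total_trees = num_landmarks * trees_per_landmark
    max_nodes_per_tree = 2 ** max_depth - 1
    max_leaves_per_tree = 2 ** (max_depth - 1)
    return total_trees, max_nodes_per_tree, max_leaves_per_tree


if __name__ == "__main__":
    print(forest_dimensions())  # (340, 15, 8)
```

The node count 15 matches `max_numnodes = 2^params.max_depth - 1` in the training code when `params.max_depth` is 4.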
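Starting training
1. Assigning samples
To train a random forest for each landmark, we draw a subset of the available samples and features and train several trees from it.
We have N samples (here N = 1622), each an image with its shape, and a practically unlimited number of pixel-difference features. The general method is as follows: for each tree, randomly pick a number of samples from the N samples by sampling with replacement, then randomly pick M features, and train the tree on that material. To keep things simple, however, this implementation splits the N samples evenly into 5 parts that are allowed to overlap. The parts assigned this way are shared by all 68 landmarks.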
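The general method described above (not the simplified one the MATLAB code uses) can be sketched as follows. This is a minimal Python illustration; the function names, the `rng` parameter and the idea of addressing candidate features by index are assumptions made for the example.

```python
import random


def bootstrap_indices(num_samples, draw_count, rng):
    """Draw draw_count sample indices from range(num_samples) WITH replacement."""
    return [rng.randrange(num_samples) for _ in range(draw_count)]


def random_feature_subset(num_candidates, m, rng):
    """Pick m distinct candidate-feature indices (without replacement)."""
    return rng.sample(range(num_candidates), m)


if __name__ == "__main__":
    rng = random.Random(0)
    samples = bootstrap_indices(1622, 1622, rng)
    features = random_feature_subset(500, 10, rng)
    print(len(samples), len(set(samples)), features)
```

Because the samples are drawn with replacement, some indices repeat and others are never drawn, which is what makes the trees differ from one another.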
Diagram: (image not included)
Code:
dbsize = length(Tr_Data);
% rf = cell(1, params.max_numtrees);
overlap_ratio = params.bagging_overlap; % overlap ratio
Q = floor(double(dbsize)/((1-params.bagging_overlap)*(params.max_numtrees))); % number of samples assigned to each tree
Data = cell(1, params.max_numtrees); % sample data prepared for training each tree
for t = 1:params.max_numtrees
    % calculate the number of samples for each random tree
    % train t-th random tree
    is = max(floor((t-1)*Q - (t-1)*Q*overlap_ratio + 1), 1);
    ie = min(is + Q, dbsize);
    Data{t} = Tr_Data(is:ie); % samples for the t-th tree
end
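To see which samples each tree receives, the index arithmetic of the loop above can be reproduced in a few lines of Python. This is only a sketch: the function name is invented, and the overlap ratio of 0.4 used in the example is an assumed value, since the text does not state what `params.bagging_overlap` is set to.

```python
import math


def tree_sample_ranges(dbsize, num_trees, overlap_ratio):
    """Return 1-based inclusive (start, end) ranges, one per tree.

    Mirrors the MATLAB loop: Q samples per step, consecutive windows
    shifted by Q * (1 - overlap_ratio), the last windows clipped to dbsize.
    """
    q = math.floor(dbsize / ((1 - overlap_ratio) * num_trees))
    ranges = []
    for t in range(1, num_trees + 1):
        start = max(math.floor((t - 1) * q - (t - 1) * q * overlap_ratio + 1), 1)
        end = min(start + q, dbsize)
        ranges.append((start, end))
    return ranges


if __name__ == "__main__":
    # 0.4 is an assumed overlap ratio, used only for illustration.
    for start, end in tree_sample_ranges(1622, 5, 0.4):
        print(start, end)
```

Note that `is:ie` is inclusive at both ends, so an unclipped window holds Q + 1 samples rather than Q.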
2. The full random forest training procedure
Code:
params.radius = ([0:1/30:1]');
params.angles = 2*pi*[0:1/36:1]';
rfs = cell(length(params.meanshape), params.max_numtrees);
for i = 1:length(params.meanshape)
    rf = cell(1, params.max_numtrees);
    disp(strcat(num2str(i), 'th landmark is processing...'));
    for t = 1:params.max_numtrees
        is = max(floor((t-1)*Q - (t-1)*Q*overlap_ratio + 1), 1);
        ie = min(is + Q, dbsize);
        max_numnodes = 2^params.max_depth - 1;
        % preallocate the per-node storage of the t-th tree
        rf{t}.ind_samples = cell(max_numnodes, 1);
        rf{t}.issplit = zeros(max_numnodes, 1);
        rf{t}.pnode = zeros(max_numnodes, 1);
        rf{t}.depth = zeros(max_numnodes, 1);
        rf{t}.cnodes = zeros(max_numnodes, 2);
        rf{t}.isleafnode = zeros(max_numnodes, 1);
        rf{t}.feat = zeros(max_numnodes, 4);
        rf{t}.thresh = zeros(max_numnodes, 1);
        % initialize the root node with all (augmented) samples of this tree
        rf{t}.ind_samples{1} = 1:(ie - is + 1)*(params.augnumber);
        rf{t}.issplit(1) = 0;
        rf{t}.pnode(1) = 0;
        rf{t}.depth(1) = 1;
        rf{t}.cnodes(1, 1:2) = [0 0];
        rf{t}.isleafnode(1) = 1;
        rf{t}.feat(1, :) = zeros(1, 4);
        rf{t}.thresh(1) = 0;
        num_nodes = 1;
        num_leafnodes = 1;
        stop = 0;
        while(~stop)
            num_nodes_iter = num_nodes;
            num_split = 0;
            for n = 1:num_nodes_iter
                if ~rf{t}.issplit(n)
                    if rf{t}.depth(n) == params.max_depth
                        % maximum depth reached: the node stays a leaf
                        if rf{t}.depth(n) == 1
                            rf{t}.depth(n) = 1;
                        end
                        rf{t}.issplit(n) = 1;
                    else
                        [thresh, feat, lcind, rcind, isvalid] = splitnode(i, rf{t}.ind_samples{n}, Data{t}, params, stage);
                        if ~isvalid
                            % no valid split: mark the node as a finished leaf
                            rf{t}.feat(n, :) = [0 0 0 0];
                            rf{t}.thresh(n) = 0;
                            rf{t}.issplit(n) = 1;
                            rf{t}.cnodes(n, :) = [0 0];
                            rf{t}.isleafnode(n) = 1;
                            continue;
                        end
                        % record the split on the current node
                        rf{t}.feat(n, :) = feat;
                        rf{t}.thresh(n) = thresh;
                        rf{t}.issplit(n) = 1;
                        rf{t}.cnodes(n, :) = [num_nodes+1 num_nodes+2];
                        rf{t}.isleafnode(n) = 0;
                        % left child
                        rf{t}.ind_samples{num_nodes+1} = lcind;
                        rf{t}.issplit(num_nodes+1) = 0;
                        rf{t}.pnode(num_nodes+1) = n;
                        rf{t}.depth(num_nodes+1) = rf{t}.depth(n) + 1;
                        rf{t}.cnodes(num_nodes+1, :) = [0 0];
                        rf{t}.isleafnode(num_nodes+1) = 1;
                        % right child
                        rf{t}.ind_samples{num_nodes+2} = rcind;
                        rf{t}.issplit(num_nodes+2) = 0;
                        rf{t}.pnode(num_nodes+2) = n;
                        rf{t}.depth(num_nodes+2) = rf{t}.depth(n) + 1;
                        rf{t}.cnodes(num_nodes+2, :) = [0 0];
                        rf{t}.isleafnode(num_nodes+2) = 1;
                        num_split = num_split + 1;
                        num_leafnodes = num_leafnodes + 1;
                        num_nodes = num_nodes + 2;
                    end
                end
            end
            if num_split == 0
                stop = 1;
            else
                rf{t}.num_leafnodes = num_leafnodes;
                rf{t}.num_nodes = num_nodes;
                rf{t}.id_leafnodes = find(rf{t}.isleafnode == 1);
            end
        end
    end
    rfs(i, :) = rf;
end
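The control flow of the loop above is easier to follow with the bookkeeping stripped away. The following Python sketch grows a single tree in the same sweep-by-sweep fashion. It is a simplified illustration, not a port: node ids are 0-based, nodes are stored as dictionaries instead of preallocated arrays, and `split_fn` is a hypothetical callback standing in for `splitnode`.

```python
def grow_tree(root_samples, max_depth, split_fn):
    """Grow one tree sweep by sweep and return its list of nodes.

    split_fn(samples) is a hypothetical stand-in for splitnode and must
    return (threshold, left_samples, right_samples, is_valid).
    The root has depth 1, as in the MATLAB code.
    """
    def new_node(samples, depth, parent):
        return {"samples": list(samples), "depth": depth, "parent": parent,
                "children": None, "is_leaf": True, "is_split": False,
                "thresh": 0.0}

    nodes = [new_node(root_samples, 1, None)]
    while True:
        num_split = 0
        # range() is evaluated once, so nodes appended during this sweep
        # are only visited in the next sweep.
        for n in range(len(nodes)):
            node = nodes[n]
            if node["is_split"]:
                continue
            node["is_split"] = True  # every node is considered exactly once
            if node["depth"] == max_depth:
                continue  # depth limit reached, node stays a leaf
            thresh, left, right, is_valid = split_fn(node["samples"])
            if not is_valid:
                continue  # no usable split, node stays a leaf
            node["thresh"] = thresh
            node["is_leaf"] = False
            node["children"] = (len(nodes), len(nodes) + 1)
            nodes.append(new_node(left, node["depth"] + 1, n))
            nodes.append(new_node(right, node["depth"] + 1, n))
            num_split += 1
        if num_split == 0:
            return nodes
```

Each sweep only looks at nodes that existed when the sweep started, so the tree grows one level per sweep and the loop ends once a sweep produces no new split.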
3. The full node-splitting procedure
Flowchart: (image not included)
Code:
function [thresh, feat, lcind, rcind, isvalid] = splitnode(lmarkID, ind_samples, Tr_Data, params, stage)
if isempty(ind_samples)
    thresh = 0;
    feat = [0 0 0 0];
    rcind = [];
    lcind = [];
    isvalid = 1;
    return;
end

% generate candidate pixel pairs in polar coordinates
[radiuspairs, anglepairs] = getproposals(params.max_numfeats(stage), params.radius, params.angles);
angles_cos = cos(anglepairs);
angles_sin = sin(anglepairs);
pdfeats = zeros(params.max_numfeats(stage), length(ind_samples));
shapes_residual = zeros(length(ind_samples), 2);

for i = 1:length(ind_samples)
    % s: index of the original sample, k: index of its augmented copy
    s = floor((ind_samples(i)-1)/(params.augnumber)) + 1;
    k = mod(ind_samples(i)-1, (params.augnumber)) + 1;
    % polar offsets scaled by the bounding-box width and height
    pixel_a_x_imgcoord = (angles_cos(:, 1)).*radiuspairs(:, 1)*params.max_raio_radius(stage)*Tr_Data{s}.intermediate_bboxes{stage}(k, 3);
    pixel_a_y_imgcoord = (angles_sin(:, 1)).*radiuspairs(:, 1)*params.max_raio_radius(stage)*Tr_Data{s}.intermediate_bboxes{stage}(k, 4);
    pixel_b_x_imgcoord = (angles_cos(:, 2)).*radiuspairs(:, 2)*params.max_raio_radius(stage)*Tr_Data{s}.intermediate_bboxes{stage}(k, 3);
    pixel_b_y_imgcoord = (angles_sin(:, 2)).*radiuspairs(:, 2)*params.max_raio_radius(stage)*Tr_Data{s}.intermediate_bboxes{stage}(k, 4);
    pixel_a_x_lmcoord = pixel_a_x_imgcoord;
    pixel_a_y_lmcoord = pixel_a_y_imgcoord;
    pixel_b_x_lmcoord = pixel_b_x_imgcoord;
    pixel_b_y_lmcoord = pixel_b_y_imgcoord;
    % map the offsets from the mean-shape frame to the current shape
    [pixel_a_x_lmcoord, pixel_a_y_lmcoord] = transformPointsForward(Tr_Data{s}.meanshape2tf{k}, pixel_a_x_imgcoord', pixel_a_y_imgcoord');
    pixel_a_x_lmcoord = pixel_a_x_lmcoord';
    pixel_a_y_lmcoord = pixel_a_y_lmcoord';
    [pixel_b_x_lmcoord, pixel_b_y_lmcoord] = transformPointsForward(Tr_Data{s}.meanshape2tf{k}, pixel_b_x_imgcoord', pixel_b_y_imgcoord');
    pixel_b_x_lmcoord = pixel_b_x_lmcoord';
    pixel_b_y_lmcoord = pixel_b_y_lmcoord';
    % add the current landmark position to get image coordinates
    pixel_a_x = int32(bsxfun(@plus, pixel_a_x_lmcoord, Tr_Data{s}.intermediate_shapes{stage}(lmarkID, 1, k)));
    pixel_a_y = int32(bsxfun(@plus, pixel_a_y_lmcoord, Tr_Data{s}.intermediate_shapes{stage}(lmarkID, 2, k)));
    pixel_b_x = int32(bsxfun(@plus, pixel_b_x_lmcoord, Tr_Data{s}.intermediate_shapes{stage}(lmarkID, 1, k)));
    pixel_b_y = int32(bsxfun(@plus, pixel_b_y_lmcoord, Tr_Data{s}.intermediate_shapes{stage}(lmarkID, 2, k)));
    % clamp to the image and take the pixel differences via linear indexing
    width = (Tr_Data{s}.width);
    height = (Tr_Data{s}.height);
    pixel_a_x = max(1, min(pixel_a_x, width));
    pixel_a_y = max(1, min(pixel_a_y, height));
    pixel_b_x = max(1, min(pixel_b_x, width));
    pixel_b_y = max(1, min(pixel_b_y, height));
    pdfeats(:, i) = double(Tr_Data{s}.img_gray(pixel_a_y + (pixel_a_x-1)*height)) - double(Tr_Data{s}.img_gray(pixel_b_y + (pixel_b_x-1)*height));
    shapes_residual(i, :) = Tr_Data{s}.shapes_residual(lmarkID, :, k);
end

% total variance of the residuals before splitting
E_x_2 = mean(shapes_residual(:, 1).^2);
E_x = mean(shapes_residual(:, 1));
E_y_2 = mean(shapes_residual(:, 2).^2);
E_y = mean(shapes_residual(:, 2));
var_overall = length(ind_samples)*((E_x_2 - E_x^2) + (E_y_2 - E_y^2));
max_step = 1;
var_reductions = zeros(params.max_numfeats(stage), max_step);
thresholds = zeros(params.max_numfeats(stage), max_step);
[pdfeats_sorted] = sort(pdfeats, 2);
for i = 1:params.max_numfeats(stage)
    t = 1;
    % random threshold taken from the sorted feature values
    ind = ceil(length(ind_samples)*(0.5 + 0.9*(rand(1) - 0.5)));
    threshold = pdfeats_sorted(i, ind);
    thresholds(i, t) = threshold;
    ind_lc = (pdfeats(i, :) < threshold);
    ind_rc = (pdfeats(i, :) >= threshold);
    E_x_2_lc = mean(shapes_residual(ind_lc, 1).^2);
    E_x_lc = mean(shapes_residual(ind_lc, 1));
    E_y_2_lc = mean(shapes_residual(ind_lc, 2).^2);
    E_y_lc = mean(shapes_residual(ind_lc, 2));
    var_lc = (E_x_2_lc + E_y_2_lc) - (E_x_lc^2 + E_y_lc^2);
    % right-child moments follow from the overall and left-child moments
    E_x_2_rc = (E_x_2*length(ind_samples) - E_x_2_lc*sum(ind_lc))/sum(ind_rc);
    E_x_rc = (E_x*length(ind_samples) - E_x_lc*sum(ind_lc))/sum(ind_rc);
    E_y_2_rc = (E_y_2*length(ind_samples) - E_y_2_lc*sum(ind_lc))/sum(ind_rc);
    E_y_rc = (E_y*length(ind_samples) - E_y_lc*sum(ind_lc))/sum(ind_rc);
    var_rc = (E_x_2_rc + E_y_2_rc) - (E_x_rc^2 + E_y_rc^2);
    var_reduce = var_overall - sum(ind_lc)*var_lc - sum(ind_rc)*var_rc;
    var_reductions(i, t) = var_reduce;
end
[var_max, ind_colmax] = max(var_reductions);
ind_max = 1;
% NOTE: isvalid is forced to 1 right after this check, so the check has no effect
if var_max <= 0
    isvalid = 0;
else
    isvalid = 1;
end
isvalid = 1;
thresh = thresholds(ind_colmax(ind_max), ind_max);
feat = [anglepairs(ind_colmax(ind_max), :) radiuspairs(ind_colmax(ind_max), :)];
lcind = ind_samples(find(pdfeats(ind_colmax(ind_max), :) < thresh));
rcind = ind_samples(find(pdfeats(ind_colmax(ind_max), :) >= thresh));
end
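The split criterion in `splitnode` is the reduction in total variance of the 2-D shape residuals. The Python sketch below computes the same quantity for one candidate feature and threshold. It uses sums of squared deviations, which equal the `n * (E[x^2] - E[x]^2)` form used in the MATLAB code. One deliberate difference, stated as an assumption of this sketch: an empty child contributes 0 here, whereas the MATLAB code takes `mean` of an empty set and so produces NaN. The function name is invented for the example.

```python
def variance_reduction(residuals, feature_values, threshold):
    """Variance reduction from splitting 2-D residuals by feature < threshold.

    residuals: list of (dx, dy) pairs, one per sample.
    feature_values: one pixel-difference value per sample.
    """
    def total_variance(points):
        # Sum of squared distances to the mean, i.e. the number of points
        # times the sum of the per-axis population variances.
        n = len(points)
        if n == 0:
            return 0.0  # assumption of this sketch; MATLAB would yield NaN
        mean_x = sum(p[0] for p in points) / n
        mean_y = sum(p[1] for p in points) / n
        return sum((p[0] - mean_x) ** 2 + (p[1] - mean_y) ** 2 for p in points)

    left = [r for r, f in zip(residuals, feature_values) if f < threshold]
    right = [r for r, f in zip(residuals, feature_values) if f >= threshold]
    return total_variance(residuals) - total_variance(left) - total_variance(right)
```

A threshold that separates two tight clusters of residuals gives a large reduction, while a threshold that sends every sample to the same child gives a reduction of zero.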
Question: during training, whenever a node can be split it is always split into two parts. Could it then happen that the chosen threshold sends all of the remaining samples to the same side?
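Reading the code suggests that this can happen. The threshold is one of the sorted feature values, and the left child takes only the samples that are strictly smaller. If the chosen value is the minimum, or if all values are tied, the left child is empty. The Python sketch below mimics the threshold selection to show this; `u` stands in for `rand(1)` and the function name is made up.

```python
import math


def split_by_rank(feature_values, u):
    """Mimic the threshold choice of splitnode for one feature.

    u plays the role of rand(1), with 0 <= u < 1. Returns
    (threshold, left_values, right_values).
    """
    n = len(feature_values)
    rank = math.ceil(n * (0.5 + 0.9 * (u - 0.5)))  # 1-based rank in sorted order
    threshold = sorted(feature_values)[rank - 1]
    left = [v for v in feature_values if v < threshold]
    right = [v for v in feature_values if v >= threshold]
    return threshold, left, right


if __name__ == "__main__":
    values = [5, 3, 8, 1, 9, 2, 7, 4, 6, 10]
    print(split_by_rank(values, 0.0))  # threshold is the minimum, left is empty
    print(split_by_rank(values, 0.5))
```

In that case the empty child reaches the `isempty(ind_samples)` branch of `splitnode` on the next sweep, which returns `isvalid = 1` with two empty children, so the tree keeps growing with empty nodes until the depth limit is reached.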
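Explanation:
As the figure shows, the outermost frame is the current coordinate system. Inside it are the centered, normalized coordinates of mean_shape, and innermost is a polar coordinate system centered on one landmark. The code uses (r, θ) to describe the positions of any two pixels sampled near the landmark. There are therefore three coordinate systems, which we call coordinate systems 1, 2 and 3 in the order just listed. The two inner systems share the same scale but have different origins.
Suppose a pixel has coordinates (x, y) in coordinate system 3 and the landmark has coordinates (x0, y0) in coordinate system 2. Writing the pixel's coordinates in coordinate system 2 as (x̃, ỹ), we have:
$$(\tilde{x}, \tilde{y}) = (x, y) + (x_0, y_0)$$
From the similarity transform carried out in the previous article, 《face alignment by 3000 fps系列学习总结(二)》, we know that converting from the normalized, centered coordinates of mean_shape to the centered coordinates of current_shape requires the meanshape2tf transform.
That is:
$$\frac{(\tilde{x}, \tilde{y})}{cR}$$
Undoing the centering, that is, adding back the mean of immediate_shape, then gives
$$\frac{(\tilde{x}, \tilde{y})}{cR} + \mathrm{mean}(\mathrm{immediate\_shape}) = \frac{(x, y) + (x_0, y_0)}{cR} + \mathrm{mean}(\mathrm{immediate\_shape}) = \frac{(x, y)}{cR} + \frac{(x_0, y_0)}{cR} + \mathrm{mean}(\mathrm{immediate\_shape}) = \frac{(x, y)}{cR} + \mathrm{immediate\_shape\_at}(x_0, y_0)$$
We also know that:
$$cR = \frac{\tilde{c}\tilde{R}}{\mathrm{immediate\_bbox}}$$
so the expression above equals
$$\frac{(x, y) \cdot \mathrm{immediate\_bbox}}{\tilde{c}\tilde{R}} + \mathrm{immediate\_shape\_at}(x_0, y_0)$$
This last expression accounts for the steps in the code: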
pixel_a_x_imgcoord = (angles_cos(:, 1)).*radiuspairs(:, 1)*params.max_raio_radius(stage)*Tr_Data{s}.intermediate_bboxes{stage}(k, 3);
pixel_a_y_imgcoord = (angles_sin(:, 1)).*radiuspairs(:, 1)*params.max_raio_radius(stage)*Tr_Data{s}.intermediate_bboxes{stage}(k, 4);
pixel_b_x_imgcoord = (angles_cos(:, 2)).*radiuspairs(:, 2)*params.max_raio_radius(stage)*Tr_Data{s}.intermediate_bboxes{stage}(k, 3);
pixel_b_y_imgcoord = (angles_sin(:, 2)).*radiuspairs(:, 2)*params.max_raio_radius(stage)*Tr_Data{s}.intermediate_bboxes{stage}(k, 4);
pixel_a_x_lmcoord = pixel_a_x_imgcoord;
pixel_a_y_lmcoord = pixel_a_y_imgcoord;
pixel_b_x_lmcoord = pixel_b_x_imgcoord;
pixel_b_y_lmcoord = pixel_b_y_imgcoord;
[pixel_a_x_lmcoord, pixel_a_y_lmcoord] = transformPointsForward(Tr_Data{s}.meanshape2tf{k}, pixel_a_x_imgcoord', pixel_a_y_imgcoord');
pixel_a_x_lmcoord = pixel_a_x_lmcoord';
pixel_a_y_lmcoord = pixel_a_y_lmcoord';
[pixel_b_x_lmcoord, pixel_b_y_lmcoord] = transformPointsForward(Tr_Data{s}.meanshape2tf{k}, pixel_b_x_imgcoord', pixel_b_y_imgcoord');
pixel_b_x_lmcoord = pixel_b_x_lmcoord';
pixel_b_y_lmcoord = pixel_b_y_lmcoord';
pixel_a_x = int32(bsxfun(@plus, pixel_a_x_lmcoord, Tr_Data{s}.intermediate_shapes{stage}(lmarkID, 1, k)));
pixel_a_y = int32(bsxfun(@plus, pixel_a_y_lmcoord, Tr_Data{s}.intermediate_shapes{stage}(lmarkID, 2, k)));
pixel_b_x = int32(bsxfun(@plus, pixel_b_x_lmcoord, Tr_Data{s}.intermediate_shapes{stage}(lmarkID, 1, k)));
pixel_b_y = int32(bsxfun(@plus, pixel_b_y_lmcoord, Tr_Data{s}.intermediate_shapes{stage}(lmarkID, 2, k)));
width = (Tr_Data{s}.width);
height = (Tr_Data{s}.height);
pixel_a_x = max(1, min(pixel_a_x, width));
pixel_a_y = max(1, min(pixel_a_y, height));
pixel_b_x = max(1, min(pixel_b_x, width));
pixel_b_y = max(1, min(pixel_b_y, height));
pdfeats(:, i) = double(Tr_Data{s}.img_gray(pixel_a_y + (pixel_a_x-1)*height)) - double(Tr_Data{s}.img_gray(pixel_b_y + (pixel_b_x-1)*height));
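The chain from a polar offset to a pixel-difference feature can be traced for a single pixel pair with plain arithmetic. The Python sketch below does this under loudly stated assumptions: the similarity transform is represented by a scale and a rotation angle as a stand-in for `meanshape2tf` (the actual MATLAB transform object is not reproduced), the image is a flat list in column-major order, coordinates are 1-based as in MATLAB, and all function and parameter names are invented for the example.

```python
import math


def offset_to_pixel(radius, angle, max_radius_ratio, bbox_w, bbox_h,
                    scale, rotation, landmark_xy):
    """Map a polar offset around a landmark to image coordinates."""
    # Polar offset scaled by the bounding-box width and height.
    dx = math.cos(angle) * radius * max_radius_ratio * bbox_w
    dy = math.sin(angle) * radius * max_radius_ratio * bbox_h
    # Similarity transform (scale + rotation), a stand-in for meanshape2tf.
    cos_r, sin_r = math.cos(rotation), math.sin(rotation)
    tx = scale * (cos_r * dx - sin_r * dy)
    ty = scale * (sin_r * dx + cos_r * dy)
    # Shift by the current landmark position.
    return landmark_xy[0] + tx, landmark_xy[1] + ty


def intensity_at(img_flat, height, width, x, y):
    """Read a pixel from a column-major flat image with 1-based (x, y)."""
    # Round to nearest, then clamp into the image bounds.
    col = max(1, min(math.floor(x + 0.5), width))
    row = max(1, min(math.floor(y + 0.5), height))
    linear = row + (col - 1) * height  # 1-based column-major linear index
    return float(img_flat[linear - 1])


def pixel_difference(img_flat, height, width, a_xy, b_xy):
    """Pixel-difference feature: intensity at a minus intensity at b."""
    return (intensity_at(img_flat, height, width, *a_xy)
            - intensity_at(img_flat, height, width, *b_xy))
```

The expression `row + (col - 1) * height` is the same linear index as `pixel_a_y + (pixel_a_x-1)*height` in the MATLAB code, which works because MATLAB stores matrices column by column.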
With that, we have worked through the entire training procedure.