py-faster-rcnn end-to-end training on Windows
I. Preparing the dataset
1. Training images
Whether you collect images from the web or reuse someone else's dataset, keep one thing in mind: the images must not be too small — width and height should ideally be no less than 150 pixels — and they must be JPEGs.
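As a quick sanity check before training, a small filter like the following can screen out images that are too small or not JPEG. The function name and the way it takes pre-read dimensions are my own illustration, not part of py-faster-rcnn:

```python
def is_usable_image(filename, width, height, min_side=150):
    """Return True if the image is a JPEG whose shorter side is at least min_side px."""
    is_jpeg = filename.lower().endswith(('.jpg', '.jpeg'))
    return is_jpeg and min(width, height) >= min_side
```

Run it over your image list (e.g. with dimensions read via PIL) and drop anything it rejects.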
2. Creating the XML annotations
1) LabelImg
If your dataset is fairly small, you can draw the bounding boxes by hand with LabelImg (https://github.com/tzutalin/labelImg). I won't go into the details of using labelImg here; plenty of guides are available online. The XML files labelImg produces can be fed to Faster R-CNN for training as-is.
2) Generating the XML yourself
For a small dataset, the manual approach above works fine. But if you have 10,000+ images, you should consider generating the XML files automatically. Most write-ups online generate the XML from the coordinates with MATLAB; here is a Python version instead.
```python
# Requires lxml (for pretty_print); adjust the hard-coded paths to your own setup.
from lxml.etree import Element, SubElement, tostring

def write_xml(bbox, w, h, stem):
    # bbox: list of dicts with keys 'cls', 'xmin', 'ymin', 'xmax', 'ymax'
    # stem: the image filename without extension
    root = Element("annotation")
    folder = SubElement(root, "folder")
    folder.text = "JPEGImages"
    filename = SubElement(root, "filename")
    filename.text = stem
    path = SubElement(root, "path")
    path.text = 'D:\\py-faster-rcnn\\data\\VOCdevkit2007\\VOC2007\\JPEGImages' + '\\' + stem + '.jpg'
    source = SubElement(root, "source")
    database = SubElement(source, "database")
    database.text = "Unknown"
    size = SubElement(root, "size")
    width = SubElement(size, "width")
    height = SubElement(size, "height")
    depth = SubElement(size, "depth")
    width.text = str(w)
    height.text = str(h)
    depth.text = '3'
    segmented = SubElement(root, "segmented")
    segmented.text = '0'
    for i in bbox:
        obj = SubElement(root, "object")
        name = SubElement(obj, "name")
        name.text = i['cls']
        pose = SubElement(obj, "pose")
        pose.text = "Unspecified"
        truncated = SubElement(obj, "truncated")
        truncated.text = '0'
        difficult = SubElement(obj, "difficult")
        difficult.text = '0'
        bndbox = SubElement(obj, "bndbox")
        xmin = SubElement(bndbox, "xmin")
        ymin = SubElement(bndbox, "ymin")
        xmax = SubElement(bndbox, "xmax")
        ymax = SubElement(bndbox, "ymax")
        xmin.text = str(i['xmin'])
        ymin.text = str(i['ymin'])
        xmax.text = str(i['xmax'])
        ymax.text = str(i['ymax'])
    xml = tostring(root, pretty_print=True)
    with open('D:/py-faster-rcnn/data/VOCdevkit2007/VOC2007/Annotations/' + stem + '.xml', 'w+') as f:
        f.write(xml)
```

3. Creating the train/test/val splits
這個網(wǎng)上可以參考的資料比較多,我直接copy一個小咸魚的用matlab的代碼
我建議train和trainval的部分占得比例可以更大一點(diǎn)
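If you would rather stay in Python, the split logic of the MATLAB script below can be sketched like this (my own equivalent, not from the original post; the function name and fixed seed are illustrative):

```python
import random

def split_dataset(stems, trainval_percent=0.5, train_percent=0.5, seed=0):
    """Split annotation file stems into trainval/test, then trainval into train/val,
    mirroring the MATLAB script's percentages."""
    rng = random.Random(seed)
    stems = sorted(stems)
    trainval = sorted(rng.sample(stems, int(len(stems) * trainval_percent)))
    test = sorted(set(stems) - set(trainval))
    train = sorted(rng.sample(trainval, int(len(trainval) * train_percent)))
    val = sorted(set(trainval) - set(train))
    return {'trainval': trainval, 'test': test, 'train': train, 'val': val}

# Example: 100 fake annotation stems
splits = split_dataset(['img%03d' % i for i in range(100)])
```

Write each list to the corresponding .txt file under ImageSets\Main, one stem per line.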
```matlab
%% Generate trainval.txt, train.txt, test.txt and val.txt for the VOC2007
% dataset from the already-generated XML annotations.
% trainval is 50% of the whole set, test the other 50%; train is 50% of
% trainval, val the other 50%. Adjust these percentages for your own
% dataset; if it is small, test and val can be smaller.
%% Edit the following four values
xmlfilepath = 'E:\Annotations';
txtsavepath = 'E:\ImageSets\Main\';
trainval_percent = 0.5;  % trainval share of the whole set; the rest is test
train_percent = 0.5;     % train share of trainval; the rest is val

%%
xmlfile = dir(xmlfilepath);
numOfxml = length(xmlfile) - 2;  % subtract . and .. ; total dataset size

trainval = sort(randperm(numOfxml, floor(numOfxml * trainval_percent)));
test = sort(setdiff(1:numOfxml, trainval));

trainvalsize = length(trainval);  % size of trainval
train = sort(trainval(randperm(trainvalsize, floor(trainvalsize * train_percent))));
val = sort(setdiff(trainval, train));

ftrainval = fopen([txtsavepath 'trainval.txt'], 'w');
ftest = fopen([txtsavepath 'test.txt'], 'w');
ftrain = fopen([txtsavepath 'train.txt'], 'w');
fval = fopen([txtsavepath 'val.txt'], 'w');

for i = 1:numOfxml
    if ismember(i, trainval)
        fprintf(ftrainval, '%s\n', xmlfile(i+2).name(1:end-4));
        if ismember(i, train)
            fprintf(ftrain, '%s\n', xmlfile(i+2).name(1:end-4));
        else
            fprintf(fval, '%s\n', xmlfile(i+2).name(1:end-4));
        end
    else
        fprintf(ftest, '%s\n', xmlfile(i+2).name(1:end-4));
    end
end
fclose(ftrainval);
fclose(ftrain);
fclose(fval);
fclose(ftest);
```

4. File locations
Save the .jpg, .txt, and .xml files under data\VOCdevkit2007\VOC2007\ in the JPEGImages, ImageSets\Main, and Annotations folders respectively.
II. Adapting the files to your dataset
1. Model configuration files
I train with the end-to-end scheme, using VGG_CNN_M_1024 as the example. So first open models\pascal_voc\VGG_CNN_M_1024\faster_rcnn_end2end\train.prototxt; four places need changing.
```
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 3"  # change to your number of classes + 1
  }
}
```

```
layer {
  name: 'roi-data'
  type: 'Python'
  bottom: 'rpn_rois'
  bottom: 'gt_boxes'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'rpn.proposal_target_layer'
    layer: 'ProposalTargetLayer'
    param_str: "'num_classes': 3"  # change to your number of classes + 1
  }
}
```

```
layer {
  name: "cls_score"
  type: "InnerProduct"
  bottom: "fc7"
  top: "cls_score"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 3  # change to your number of classes + 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "fc7"
  top: "bbox_pred"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 12  # change to (your number of classes + 1) * 4
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
```

Then modify models\pascal_voc\VGG_CNN_M_1024\faster_rcnn_end2end\test.prototxt:

```
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "cls_score"
  type: "InnerProduct"
  bottom: "fc7"
  top: "cls_score"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 3  # change to your number of classes + 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "fc7"
  top: "bbox_pred"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 12  # change to (your number of classes + 1) * 4
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
```
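To avoid miscounting the two num_output values across these files, a trivial helper (my own, not part of the repo) computes both from the number of foreground classes:

```python
def frcnn_num_outputs(num_foreground_classes):
    """Return (cls_score num_output, bbox_pred num_output) for Faster R-CNN:
    the classifier predicts classes + 1 background label, and the regressor
    predicts 4 box coordinates per class."""
    num_classes = num_foreground_classes + 1  # add the __background__ class
    return num_classes, num_classes * 4
```

For this tutorial's two classes (cn-character and seal) that gives 3 and 12, matching the prototxt snippets above.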
The solver file also lets you tune the learning rate and other training parameters; I won't cover that in this article.
================== The changes below are to files under lib ==================
2. Modifying imdb.py
```python
def append_flipped_images(self):
    num_images = self.num_images
    widths = [PIL.Image.open(self.image_path_at(i)).size[0]
              for i in xrange(num_images)]
    for i in xrange(num_images):
        boxes = self.roidb[i]['boxes'].copy()
        oldx1 = boxes[:, 0].copy()
        oldx2 = boxes[:, 2].copy()
        boxes[:, 0] = widths[i] - oldx2 - 1
        boxes[:, 2] = widths[i] - oldx1 - 1
        # if flipping produced an invalid box, clamp xmin back to 0
        for b in range(len(boxes)):
            if boxes[b][2] < boxes[b][0]:
                boxes[b][0] = 0
        assert (boxes[:, 2] >= boxes[:, 0]).all()
        entry = {'boxes': boxes,
                 'gt_overlaps': self.roidb[i]['gt_overlaps'],
                 'gt_classes': self.roidb[i]['gt_classes'],
                 'flipped': True}
        self.roidb.append(entry)
    self._image_index = self._image_index * 2
```

Find this function and replace it with the version above.
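To see why the clamping loop is needed: the roidb stores boxes as uint16, so when an annotation's xmax reaches the image width, the flipped xmin comes out as -1, wraps around to 65535, and the assertion fires. A pure-Python sketch of the wrap-around and the fix (illustrative only, not code from the repo):

```python
def flip_xmin_xmax(x1, x2, width):
    """Mirror x-coordinates horizontally, mimicking the uint16 wrap-around
    that makes the unpatched append_flipped_images assert fail."""
    new_x1 = (width - x2 - 1) % 65536  # uint16 arithmetic: -1 wraps to 65535
    new_x2 = (width - x1 - 1) % 65536
    if new_x2 < new_x1:  # wrapped: the annotation touched/passed the right edge
        new_x1 = 0       # the patch resets xmin to 0
    return new_x1, new_x2
```

A normal box flips cleanly; a box whose xmax equals the width only survives because of the clamp.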
3. Modifying the five files under the rpn layer
Under lib\rpn (where these files live), change every occurrence of param_str_ in them to param_str.
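Rather than editing the five files by hand, a small script can do the rename. This is my own sketch (the demo below writes into a temporary directory; point it at lib\rpn yourself and keep a backup first):

```python
import os
import tempfile

def rename_param_str(directory):
    """Replace every occurrence of 'param_str_' with 'param_str' in all .py
    files under directory; return the names of the files that changed."""
    changed = []
    for root, _, files in os.walk(directory):
        for fname in files:
            if not fname.endswith('.py'):
                continue
            path = os.path.join(root, fname)
            with open(path) as f:
                text = f.read()
            if 'param_str_' in text:
                with open(path, 'w') as f:
                    f.write(text.replace('param_str_', 'param_str'))
                changed.append(fname)
    return changed

# Demo on a throwaway directory with one fake layer file
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, 'layer.py'), 'w') as f:
    f.write("self.param_str_ = layer_params\n")
changed_files = rename_param_str(demo_dir)
```

Usage on the real tree would be `rename_param_str('lib/rpn')` from the py-faster-rcnn root.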
4. Modifying config.py
Change the proposal method for both training and testing to gt:
```python
# Train using these proposals
__C.TRAIN.PROPOSAL_METHOD = 'gt'
# Test using these proposals
__C.TEST.PROPOSAL_METHOD = 'gt'
```

5. Modifying pascal_voc.py
Since we train on VOC-format data, this is the main training file we need to modify.
```python
def __init__(self, image_set, year, devkit_path=None):
    imdb.__init__(self, 'voc_' + year + '_' + image_set)
    self._year = year
    self._image_set = image_set
    self._devkit_path = self._get_default_path() if devkit_path is None \
                        else devkit_path
    self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
    self._classes = ('__background__',  # always index 0
                     'cn-character', 'seal')
    self._class_to_ind = dict(zip(self.classes, xrange(self.num_classes)))
    self._image_ext = '.jpg'
    self._image_index = self._load_image_set_index()
    # Default to roidb handler
    self._roidb_handler = self.selective_search_roidb
    self._salt = str(uuid.uuid4())
    self._comp_id = 'comp4'
```

In self._classes, '__background__' is the background class — leave it alone; replace the entries after it with your own labels. You also need to modify the following two functions, otherwise the test stage is guaranteed to fail.
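The dict(zip(...)) line above is what turns the tuple of class names into contiguous label indices, with background fixed at 0. A standalone sketch of that mapping (the helper name is my own):

```python
def make_class_to_ind(classes):
    """Map class names to contiguous label indices; background is always 0,
    mirroring pascal_voc.py's dict(zip(self.classes, xrange(self.num_classes)))."""
    return dict(zip(classes, range(len(classes))))

class_to_ind = make_class_to_ind(('__background__', 'cn-character', 'seal'))
```

This is also why cls_score's num_output is classes + 1: the network predicts one score per entry of this tuple.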
```python
def _get_voc_results_file_template(self):
    # Results go next to the image-set lists; '{}' is filled with the class name.
    path = os.path.join(
        self._devkit_path,
        'VOC' + self._year,
        'ImageSets',
        'Main',
        '{}' + '_test.txt')
    return path
```
```python
def _write_voc_results_file(self, all_boxes):
    for cls_ind, cls in enumerate(self.classes):
        if cls == '__background__':
            continue
        print 'Writing {} VOC results file'.format(cls)
        filename = self._get_voc_results_file_template().format(cls)
        with open(filename, 'w+') as f:
            for im_ind, index in enumerate(self.image_index):
                dets = all_boxes[cls_ind][im_ind]
                if dets == []:
                    continue
                # the VOCdevkit expects 1-based indices
                for k in xrange(dets.shape[0]):
                    f.write('{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}\n'.
                            format(index, dets[k, -1],
                                   dets[k, 0] + 1, dets[k, 1] + 1,
                                   dets[k, 2] + 1, dets[k, 3] + 1))
```
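Each line the function writes is "image_id score xmin ymin xmax ymax", with the box shifted to the devkit's 1-based coordinate convention. A standalone sketch of that formatting (the helper name is my own):

```python
def format_det_line(index, score, x1, y1, x2, y2):
    """One line of a VOC results file: image id, confidence, then the box
    converted from 0-based to the devkit's 1-based coordinates."""
    return '{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}'.format(
        index, score, x1 + 1, y1 + 1, x2 + 1, y2 + 1)
```

Knowing this format helps when debugging the evaluation step: each results file holds one class's detections over all test images.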
III. End-to-end training
1. Delete the cache files
Before each training run, delete everything under data\cache and data\VOCdevkit2007\annotations_cache.
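Deleting the two cache directories can be scripted so it is never forgotten. This is my own sketch (directory names as given above, paths assumed relative to the py-faster-rcnn root; the demo runs on a throwaway directory):

```python
import os
import shutil
import tempfile

def clear_faster_rcnn_caches(root):
    """Delete cached roidb/annotation files so a changed dataset is re-read;
    return the relative paths that were actually removed."""
    removed = []
    for rel in ('data/cache', 'data/VOCdevkit2007/annotations_cache'):
        cache_dir = os.path.join(root, rel)
        if os.path.isdir(cache_dir):
            shutil.rmtree(cache_dir)
            removed.append(rel)
    return removed

# Demo on a temporary tree containing only data/cache
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, 'data', 'cache'))
cleared = clear_faster_rcnn_caches(demo_root)
```

On the real tree, call `clear_faster_rcnn_caches('.')` from the py-faster-rcnn root before each run.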
2. Start training
Open Git Bash in the py-faster-rcnn root directory and run:
```shell
./experiments/scripts/faster_rcnn_end2end.sh 0 VGG_CNN_M_1024 pascal_voc
```

You can of course tune the training parameters in experiments\scripts\faster_rcnn_end2end.sh, or train with the VGG16 or ZF models instead; here I stick with the default parameters.
Once the per-iteration training log starts scrolling, training is running successfully. VGG_CNN_M_1024 trains quite fast — it depends on your hardware, but on a 1080 Ti it only takes about 85 minutes; I didn't even let it run to completion.
IV. Testing
1. Create your own demo.py
The easiest route is to copy the existing demo.py, change its labels to your own, and point it at your own model.
Here are the class and model parts of my demo, for reference.
```python
CLASSES = ('__background__',
           'cn-character', 'seal')
NETS = {'vgg16': ('VGG16',
                  'vgg16_faster_rcnn_iter_70000.caffemodel'),
        'vgg1024': ('VGG_CNN_M_1024',
                    'vgg_cnn_m_1024_faster_rcnn_iter_70000.caffemodel'),
        'zf': ('ZF',
               'ZF_faster_rcnn_final.caffemodel')}
```

```python
if __name__ == '__main__':
    cfg.TEST.HAS_RPN = True  # use RPN for proposals
    args = parse_args()

    prototxt = os.path.join(cfg.MODELS_DIR, NETS[args.demo_net][0],
                            'faster_rcnn_end2end', 'test.prototxt')
    caffemodel = os.path.join(cfg.DATA_DIR, 'faster_rcnn_models',
                              NETS[args.demo_net][1])

    if not os.path.isfile(caffemodel):
        raise IOError(('{:s} not found.\nDid you run ./data/script/'
                       'fetch_faster_rcnn_models.sh?').format(caffemodel))

    if args.cpu_mode:
        caffe.set_mode_cpu()
    else:
        caffe.set_mode_gpu()
        caffe.set_device(args.gpu_id)
        cfg.GPU_ID = args.gpu_id
    net = caffe.Net(prototxt, caffemodel, caffe.TEST)

    print '\n\nLoaded network {:s}'.format(caffemodel)

    # Warm up on a dummy image (one forward pass per call)
    im = 128 * np.ones((300, 500, 3), dtype=np.uint8)
    for i in xrange(2):
        _, _ = im_detect(net, im)

    im_names = ['f1.jpg', 'f8.jpg', 'f7.jpg', 'f6.jpg', 'f5.jpg',
                'f4.jpg', 'f3.jpg', 'f2.jpg']
    for im_name in im_names:
        print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'
        print 'Demo for data/demo/{}'.format(im_name)
        demo(net, im_name)

    plt.show()
```

In this part, list the images you want to test in im_names and put them in the data\demo folder.
2. The trained model
Copy the caffemodel you just trained from output\ into data\faster_rcnn_models.
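If you snapshot every few thousand iterations, a helper like this (my own, illustrative) picks the latest caffemodel out of the output directory listing by its iteration number:

```python
import re

def latest_snapshot(filenames):
    """From Caffe snapshot names like
    'vgg_cnn_m_1024_faster_rcnn_iter_70000.caffemodel', return the one with
    the highest iteration count, or None if nothing matches."""
    best, best_iter = None, -1
    for name in filenames:
        m = re.search(r'_iter_(\d+)\.caffemodel$', name)
        if m and int(m.group(1)) > best_iter:
            best, best_iter = name, int(m.group(1))
    return best
```

Feed it `os.listdir(...)` of the output directory and copy the returned file into data\faster_rcnn_models.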
3. Results
Run your own demo.py to get the detection results.
My preliminary results on Chinese character detection are decent, but they still need some further improvement.