FCN: Data

Published: 2023/12/10


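Starting with this post, we document the whole process of using a fully convolutional network (FCN) for semantic segmentation.
Code: https://github.com/shelhamer/fcn.berkeleyvision.org

We cover three topics below:
1. The officially provided public datasets
2. How to prepare your own dataset, mainly how to annotate the labels
3. How to colorize the results once training is done.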

Public datasets

This section covers the SiftFlowDataset and PASCAL VOC datasets in turn.
1. PASCAL VOC
The README in the pascal folder under data in the FCN code says:

# PASCAL VOC and SBD

PASCAL VOC is a standard recognition dataset and benchmark with detection and semantic segmentation challenges.
The semantic segmentation challenge annotates 20 object classes and background.
The Semantic Boundary Dataset (SBD) is a further annotation of the PASCAL VOC data that provides more semantic segmentation and instance segmentation masks.

PASCAL VOC has a private test set and [leaderboard for semantic segmentation](http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=6).

The train/val/test splits of PASCAL VOC segmentation challenge and SBD diverge.
Most notably VOC 2011 segval intersects with SBD train.
Care must be taken for proper evaluation by excluding images from the train or val splits.

We train on the 8,498 images of SBD train.
We validate on the non-intersecting set defined in the included `seg11valid.txt`.

Refer to `classes.txt` for the listing of classes in model output order.
Refer to `../voc_layers.py` for the Python data layer for this dataset.

See the dataset sites for download:

- PASCAL VOC 2012: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/
- SBD: see [homepage](http://home.bharathh.info/home/sbd) or [direct download](http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz)

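We can download the training set (SBD) and the test set (PASCAL VOC 2012).
Then go to fcn/data, create an sbdd folder if it does not exist, extract the dataset from the benchmark archive into sbdd, and extract VOC2012 into the pascal folder under data. These two folders already come with train.txt for training and seg11valid.txt for evaluation.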
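Because the README warns that the VOC and SBD splits overlap, it can be worth checking your own split files before training. The following is a minimal sketch, not part of the FCN code: `read_ids` and `overlap` are hypothetical helper names, and the file paths in the usage note are placeholders.

```python
def read_ids(path):
    """Return the non-empty, stripped lines of a split file as a list of IDs."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def overlap(train_ids, val_ids):
    """Return the sorted IDs that appear in both splits."""
    return sorted(set(train_ids) & set(val_ids))
```

For example, `overlap(read_ids('train.txt'), read_ids('seg11valid.txt'))` should come back empty if the two lists really are disjoint.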
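2. SIFT-Flow
Download the dataset (download link),
then extract it under /fcn.berkeleyvision.org/data/, overwriting the folder named sift-flow.
The FCN source already ships train.txt and the other split files, so there is no need to regenerate them.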

Preparing your own dataset

Training your own image segmentation model with FCN takes roughly three steps:

1. Create labels for your data;

2. Split your data into train, val, and test sets;

3. Write your own input data layer, modeled on voc_layers.py.
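The part of a data layer that is specific to your dataset is mostly bookkeeping: read the IDs of a split and turn each ID into an image path and a label path. The sketch below shows only that part and deliberately leaves Caffe out. `SegSplit` is a hypothetical class, and the folder names `JPEGImages` and `SegmentationClass` are assumptions borrowed from the VOC layout; adjust them to your own directory structure.

```python
import os


class SegSplit:
    """Pair image and label paths for one split (hypothetical helper)."""

    def __init__(self, root, split, image_dir="JPEGImages",
                 label_dir="SegmentationClass"):
        self.root = root
        self.image_dir = image_dir
        self.label_dir = label_dir
        # One image ID per line, as in train.txt / val.txt.
        with open(os.path.join(root, split + ".txt")) as f:
            self.ids = [line.strip() for line in f if line.strip()]

    def paths(self, index):
        """Return (image_path, label_path) for the index-th ID."""
        name = self.ids[index]
        image = os.path.join(self.root, self.image_dir, name + ".jpg")
        label = os.path.join(self.root, self.label_dir, name + ".png")
        return image, label
```

A real data layer would additionally load both files, apply the same preprocessing as the inference script, and copy the arrays into the network's top blobs.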

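FCN places no restriction on image size. If the images in a dataset differ in size, only one image can be trained at a time; this is the default in the FCN code, i.e. batch_size=1. To train in batches, all images in the dataset must have the same size, so we need to resize them. We usually scale the originals to 256*256 or 500*500.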

1. Resizing images

Here are a few resize functions taken from the web: http://blog.csdn.net/u010402786/article/details/72883421
(1) Resizing a single image

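from PIL import Image  # Pillow; the old top-level "import Image" no longer works

def convert(width, height):
    # Resize one image to exactly width x height, overwriting the original.
    im = Image.open("C:\\xxx\\test.jpg")
    # Image.LANCZOS is the current name of the former Image.ANTIALIAS filter.
    out = im.resize((width, height), Image.LANCZOS)
    out.save("C:\\xxx\\test.jpg")

if __name__ == '__main__':
    convert(256, 256)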

(2) Resizing every image in a folder

import os
from PIL import Image

def convert(dir, width, height):
    # Resize every file in dir to width x height, overwriting the originals.
    file_list = os.listdir(dir)
    print(file_list)
    for filename in file_list:
        # os.path.join works whether or not dir ends with a separator.
        path = os.path.join(dir, filename)
        im = Image.open(path)
        out = im.resize((width, height), Image.LANCZOS)
        print("%s has been resized!" % filename)
        out.save(path)

if __name__ == '__main__':
    dir = input('please input the operate dir:')
    convert(dir, 256, 256)
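One caveat if the label images are resized together with the photos: an interpolating filter such as the antialiasing one blends neighbouring pixels and can produce class indices that were never annotated. Nearest-neighbor sampling avoids this. The NumPy sketch below illustrates the idea; `resize_nearest` is a hypothetical helper and not part of the original post or the FCN code.

```python
import numpy as np


def resize_nearest(label, new_h, new_w):
    """Resize a 2-D label array by nearest-neighbor sampling."""
    h, w = label.shape
    # For each output row/column, pick the source row/column it falls in.
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return label[rows[:, None], cols]
```

With Pillow, passing `Image.NEAREST` as the filter to `resize` has the same effect: every output pixel copies an existing label value.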

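(3) Resizing proportionally

from PIL import Image

def convert(width, height):
    # Scale to the given width; height is ignored and derived from the aspect ratio.
    im = Image.open("C:\\workspace\\PythonLearn1\\test_1.jpg")
    (x, y) = im.size
    x_s = width
    y_s = y * x_s // x  # integer division: resize() needs integer sizes
    out = im.resize((x_s, y_s), Image.LANCZOS)
    out.save("C:\\workspace\\PythonLearn1\\test_1_out.jpg")

if __name__ == '__main__':
    convert(256, 256)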
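The size arithmetic behind proportional resizing can be isolated and checked without opening any image. This is a small sketch under the assumption that you fix the target width and round the height to the nearest pixel; `scaled_size` is a hypothetical name.

```python
def scaled_size(orig_w, orig_h, target_w):
    """Return (target_w, height) preserving the original aspect ratio."""
    # Round to the nearest pixel and never return a zero height.
    height = max(1, round(orig_h * target_w / orig_w))
    return target_w, height
```

For a 500x375 image and a target width of 256 this gives (256, 192), since 375 * 256 / 500 = 192. Note that proportionally resized images still differ in height, so they do not by themselves allow batch training.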

Creating image labels

Step 1: annotate with an open-source tool from GitHub

URL: https://github.com/wkentaro/labelme

Usage

Annotation

Run labelme --help for detail.

labelme  # Open GUI
labelme static/apc2016_obj3.jpg  # Specify file
labelme static/apc2016_obj3.jpg -O static/apc2016_obj3.json  # Close window after the save

The annotations are saved as a JSON file. The
file includes the image itself.

Visualization

To view the json file quickly, you can use utility script:

labelme_draw_json static/apc2016_obj3.json

Convert to Dataset

To convert the json to set of image and label, you can run following:

labelme_json_to_dataset static/apc2016_obj3.json
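Before converting, it can help to see which class names a JSON file actually contains. The sketch below assumes the annotation file has a top-level "shapes" list whose entries each carry a "label" string; that layout is an assumption about the labelme version in use, so check one of your own files first. `list_labels` is a hypothetical helper.

```python
import json


def list_labels(json_path):
    """Return the sorted, de-duplicated label names in an annotation file.

    Assumes a top-level "shapes" list whose items carry a "label" string.
    """
    with open(json_path) as f:
        data = json.load(f)
    return sorted({shape["label"] for shape in data.get("shapes", [])})
```

Running this over all annotation files is a quick way to catch misspelled class names before they end up as extra label values.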

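Step 2: colorize the annotated label.png
After the annotation tool converts the json file to a dataset, it produces a label.png file, which is a 16-bit grayscale image.
We therefore need to color it to match the VOC segmentation colors, and the colors must be exact. MATLAB code: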

function cmap = labelcolormap(N)
% Build the N-entry VOC colormap, one [R G B] row per label, scaled to [0, 1].
if nargin==0
    N=256;
end
cmap = zeros(N,3);
for i=1:N
    id = i-1; r=0;g=0;b=0;
    for j=0:7
        % Spread the three lowest bits of id over R, G, B, high bit first.
        r = bitor(r, bitshift(bitget(id,1),7 - j));
        g = bitor(g, bitshift(bitget(id,2),7 - j));
        b = bitor(b, bitshift(bitget(id,3),7 - j));
        id = bitshift(id,-3);
    end
    cmap(i,1)=r; cmap(i,2)=g; cmap(i,3)=b;
end
cmap = cmap / 255;

Or in Python:

import numpy as np

# Get the specified bit value
def bitget(byteval, idx):
    return ((byteval & (1 << idx)) != 0)

# Create label-color map, label --- [R G B]
# 0 --- [ 0 0 0], 1 --- [128 0 0], 2 --- [ 0 128 0]
# 3 --- [128 128 0], 4 --- [ 0 0 128], 5 --- [128 0 128]
# 6 --- [ 0 128 128], 7 --- [128 128 128], 8 --- [ 64 0 0]
# 9 --- [192 0 0], 10 --- [ 64 128 0], 11 --- [192 128 0]
# 12 --- [ 64 0 128], 13 --- [192 0 128], 14 --- [ 64 128 128]
# 15 --- [192 128 128], 16 --- [ 0 64 0], 17 --- [128 64 0]
# 18 --- [ 0 192 0], 19 --- [128 192 0], 20 --- [ 0 64 128]
def labelcolormap(N=256):
    color_map = np.zeros((N, 3))
    for n in range(N):
        id_num = n
        r, g, b = 0, 0, 0
        for pos in range(8):
            # Spread the three lowest bits of id_num over R, G, B, high bit first.
            r = np.bitwise_or(r, (bitget(id_num, 0) << (7-pos)))
            g = np.bitwise_or(g, (bitget(id_num, 1) << (7-pos)))
            b = np.bitwise_or(b, (bitget(id_num, 2) << (7-pos)))
            id_num = (id_num >> 3)
        color_map[n, 0] = r
        color_map[n, 1] = g
        color_map[n, 2] = b
    return color_map/255

if __name__ == "__main__":
    color_map = labelcolormap(21)
    print(color_map)

The code above produces the following matrix (Python output shown):

[[ 0.          0.          0.        ]
 [ 0.50196078  0.          0.        ]
 [ 0.          0.50196078  0.        ]
 [ 0.50196078  0.50196078  0.        ]
 [ 0.          0.          0.50196078]
 [ 0.50196078  0.          0.50196078]
 [ 0.          0.50196078  0.50196078]
 [ 0.50196078  0.50196078  0.50196078]
 [ 0.25098039  0.          0.        ]
 [ 0.75294118  0.          0.        ]
 [ 0.25098039  0.50196078  0.        ]
 [ 0.75294118  0.50196078  0.        ]
 [ 0.25098039  0.          0.50196078]
 [ 0.75294118  0.          0.50196078]
 [ 0.25098039  0.50196078  0.50196078]
 [ 0.75294118  0.50196078  0.50196078]
 [ 0.          0.25098039  0.        ]
 [ 0.50196078  0.25098039  0.        ]
 [ 0.          0.75294118  0.        ]
 [ 0.50196078  0.75294118  0.        ]
 [ 0.          0.25098039  0.50196078]]

The rows correspond, in order, to the Pascal VOC colormap:

background     0   0   0
aeroplane    128   0   0
bicycle        0 128   0
bird         128 128   0
boat           0   0 128
bottle       128   0 128
bus            0 128 128
car          128 128 128
cat           64   0   0
chair        192   0   0
cow           64 128   0
diningtable  192 128   0
dog           64   0 128
horse        192   0 128
motorbike     64 128 128
person       192 128 128
pottedplant    0  64   0
sheep        128  64   0
sofa           0 192   0
train        128 192   0
tvmonitor      0  64 128
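Since exact colors matter here, the bit-spreading rule can be cross-checked against a few rows of the table in plain Python, with integer channels instead of floats. This is a sketch for verification only; `voc_colormap` and `VOC_TABLE` are names introduced for this example, and only four table rows are spot-checked.

```python
def voc_colormap(n=256):
    """Return a list of n (R, G, B) tuples with 8-bit integer channels."""
    colors = []
    for label in range(n):
        r = g = b = 0
        bits = label
        for pos in range(8):
            # Spread bits 0, 1, 2 of the label over R, G, B, high bit first.
            r |= ((bits >> 0) & 1) << (7 - pos)
            g |= ((bits >> 1) & 1) << (7 - pos)
            b |= ((bits >> 2) & 1) << (7 - pos)
            bits >>= 3
        colors.append((r, g, b))
    return colors


# A few rows copied from the table: background, aeroplane, person, tvmonitor.
VOC_TABLE = {
    0: (0, 0, 0), 1: (128, 0, 0), 15: (192, 128, 128), 20: (0, 64, 128),
}
cmap = voc_colormap(21)
mismatches = [k for k, rgb in VOC_TABLE.items() if cmap[k] != rgb]
print(mismatches)  # an empty list means the generated colors match the table
```

Working with integers avoids the rounding question that comes with values such as 0.50196078.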

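Here a function generates the color for each label, where the labels are 0, 1, 2, ..., 20 (Pascal VOC has 21 classes in total, including background).
The label.png produced in step 1 holds exactly these values, 0, 1, 2, ..., 20, with at most 256 distinct values, and is normally stored as a grayscale image.
We therefore need to use this colormap to convert the grayscale image into an RGB image.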
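The conversion itself is a table lookup: each pixel's label value selects one row of the colormap. The sketch below does this with plain NumPy indexing, so the result depends only on the label values and not on how any library assigns colors. `colorize` is a hypothetical helper, and the colormap is assumed to hold 8-bit integer channels (0 to 255).

```python
import numpy as np


def colorize(label, colormap):
    """Map an H x W array of class indices to an H x W x 3 uint8 image."""
    colormap = np.asarray(colormap, dtype=np.uint8)
    # Fancy indexing replaces every label value by its colormap row.
    return colormap[label]
```

Every label value in the image must be a valid row index of the colormap, otherwise NumPy raises an IndexError, which is a useful early warning about stray values.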

Method 1: patch skimage's colormap
skimage already ships some colormaps, but not in the Pascal VOC format, so we have to specify ours separately.
Locate the following path:

/*/anaconda2/lib/python2.7/site-packages/skimage/color/

Edit colorlabel.py and add

DEFAULT_COLORS1 = ('maroon', 'lime', 'olive', 'navy', 'purple', 'teal',
                   'gray', 'fcncat', 'fcnchair', 'fcncow', 'fcndining',
                   'fcndog', 'fcnhorse', 'fcnmotor', 'fcnperson', 'fcnpotte',
                   'fcnsheep', 'fcnsofa', 'fcntrain', 'fcntv')

Then change the _label2rgb_overlay function:

if colors is None:
    colors = DEFAULT_COLORS1

Finally, add the following variables to rgb_colors.py:

fcnchair = (0.753, 0, 0)
fcncat = (0.251, 0, 0)
fcncow = (0.251, 0.502, 0)
fcndining = (0.753, 0.502, 0)
fcndog = (0.251, 0, 0.502)
fcnhorse = (0.753, 0, 0.502)
fcnmotor = (0.251, 0.502, 0.502)
fcnperson = (0.753, 0.502, 0.502)
fcnpotte = (0, 0.251, 0)
fcnsheep = (0.502, 0.251, 0)
fcnsofa = (0, 0.753, 0)
fcntrain = (0.502, 0.753, 0)
fcntv = (0, 0.251, 0.502)

If that is too much trouble, just download: https://github.com/315386775/FCN_train
and replace skimage's color folder with skimge-color from Add_colortoimg.
Finally, run the conversion:

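#!/usr/bin/python
# -*- coding:utf-8 -*-
import PIL.Image
import numpy as np
from skimage import io, color, img_as_ubyte

img = PIL.Image.open('xxx.png')
img = np.array(img)
# With the patched colorlabel.py, label2rgb falls back to DEFAULT_COLORS1.
dst = color.label2rgb(img, bg_label=0, bg_color=(0, 0, 0))
# label2rgb returns floats in [0, 1]; convert to 8-bit before saving.
io.imsave('xxx.png', img_as_ubyte(dst))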

Method 2: without modifying the library source

#!/usr/bin/python
# -*- coding:utf-8 -*-
import PIL.Image
import numpy as np
from skimage import io, color, img_as_ubyte

# Get the specified bit value
def bitget(byteval, idx):
    return ((byteval & (1 << idx)) != 0)

# Create label-color map, label --- [R G B]
# 0 --- [ 0 0 0], 1 --- [128 0 0], 2 --- [ 0 128 0]
# 3 --- [128 128 0], 4 --- [ 0 0 128], 5 --- [128 0 128]
# 6 --- [ 0 128 128], 7 --- [128 128 128], 8 --- [ 64 0 0]
# 9 --- [192 0 0], 10 --- [ 64 128 0], 11 --- [192 128 0]
# 12 --- [ 64 0 128], 13 --- [192 0 128], 14 --- [ 64 128 128]
# 15 --- [192 128 128], 16 --- [ 0 64 0], 17 --- [128 64 0]
# 18 --- [ 0 192 0], 19 --- [128 192 0], 20 --- [ 0 64 128]
def labelcolormap(N=256):
    color_map = np.zeros((N, 3))
    for n in range(N):
        id_num = n
        r, g, b = 0, 0, 0
        for pos in range(8):
            r = np.bitwise_or(r, (bitget(id_num, 0) << (7-pos)))
            g = np.bitwise_or(g, (bitget(id_num, 1) << (7-pos)))
            b = np.bitwise_or(b, (bitget(id_num, 2) << (7-pos)))
            id_num = (id_num >> 3)
        color_map[n, 0] = r
        color_map[n, 1] = g
        color_map[n, 2] = b
    return color_map/255

color_map = labelcolormap(21)

img = PIL.Image.open('label.png')
img = np.array(img)
# Skip row 0 of the colormap: the background color is passed separately.
dst = color.label2rgb(img, colors=color_map[1:], bg_label=0, bg_color=(0, 0, 0))
# label2rgb returns floats in [0, 1]; convert to 8-bit before saving.
io.imsave('xxx.png', img_as_ubyte(dst))

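This method loads the colormap directly, which is simpler and clearer.

Note that Method 1 alters part of the colormap. For example, the second color in DEFAULT_COLORS1 should be (0 128 0), i.e. (0, 0.502, 0), which skimage calls green, but lime = (0, 1, 0) is used here instead. The difference is small, though.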

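Step 3: the most critical step
Convert the 24-bit png into an 8-bit png. Here is the MATLAB code:

dirs = dir('F:/xxx/*.png');
map = labelcolormap(256);
for n = 1:numel(dirs)
    strname = strcat('F:/xxx/', dirs(n).name);
    img = imread(strname);
    % Map every RGB pixel to its index in the colormap.
    x = rgb2ind(img, map);
    % Write the indexed image back under the same name, with the palette attached.
    newname = strcat('F:/xxx/', dirs(n).name);
    imwrite(x, map, newname, 'png');
end

This gives us 8-bit color images.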
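For readers without MATLAB, the core of this step can be approximated in NumPy. The sketch below is not a port of `rgb2ind`: it only accepts exact color matches and marks every other pixel with index 255, whereas `rgb2ind` picks the closest palette color. `rgb_to_index` is a hypothetical helper, and the colormap is assumed to hold fewer than 256 entries with 8-bit integer channels.

```python
import numpy as np


def rgb_to_index(rgb, colormap):
    """Convert an H x W x 3 uint8 image to H x W palette indices.

    Pixels whose color is not in the colormap get index 255.
    """
    colormap = np.asarray(colormap, dtype=np.int64)
    rgb = np.asarray(rgb, dtype=np.int64)
    # Pack each color into one integer so colors compare with ==.
    packed = (rgb[..., 0] << 16) | (rgb[..., 1] << 8) | rgb[..., 2]
    keys = (colormap[:, 0] << 16) | (colormap[:, 1] << 8) | colormap[:, 2]
    index = np.full(packed.shape, 255, dtype=np.uint8)
    for i, key in enumerate(keys):
        index[packed == key] = i
    return index
```

The resulting index array can then be written as a paletted PNG, for instance with Pillow's `putpalette`, which the inference script later in this post also uses.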

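As a check, we can read back the generated image and see whether the output below matches what a VOC label produces.

In [23]: img = PIL.Image.open('F:/DL/000001_json/test/dstfcn.png')
In [24]: np.unique(img)
Out[24]: array([0, 1, 2], dtype=uint8)

The thing to look for is the [0, 1, 2]: if the output consists of small class indices like these, the label was generated successfully.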
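This check can be automated across a whole label folder. The sketch below takes the unique values found in one label image and reports any that are neither a class index nor an ignore value. `unexpected_labels` is a hypothetical helper; treating 255 as an ignore label follows the VOC convention for void pixels and is an assumption for your own data.

```python
def unexpected_labels(values, num_classes, ignore_label=255):
    """Return the sorted label values that are neither a class nor ignore."""
    allowed = set(range(num_classes)) | {ignore_label}
    return sorted(set(int(v) for v in values) - allowed)
```

For instance, `unexpected_labels(np.unique(img), 21)` returns an empty list for the image checked above; values such as 128 would indicate that an RGB channel was saved instead of class indices.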

So far we have gone through: generating the grayscale label image --> generating the colormap --> converting to RGB --> converting to an 8-bit (indexed-color) image.

Next, we need to prepare the following files for training:
test.txt is the test set, train.txt is the training set, val.txt is the validation set, and trainval.txt is the training and validation sets combined.
Here we can follow the proportions used for Faster R-CNN: in VOC2007, trainval is about 50% of the whole dataset and test is about 50%; train is about 50% of trainval and val is about 50% of trainval. The following code can serve as a reference:

Reference: http://blog.csdn.net/sinat_30071459/article/details/50723212

%%
% This script builds trainval.txt, train.txt, test.txt and val.txt for a
% VOC2007-style dataset from the xml files that have already been generated.
% trainval is 50% of the whole dataset and test is 50%;
% train is 50% of trainval and val is 50% of trainval.
% Adjust these percentages to your own dataset; if the dataset is small,
% test and val can be made smaller.
%%
% Remember to change the four values below
xmlfilepath='E:\Annotations';
txtsavepath='E:\ImageSets\Main\';
trainval_percent=0.5; % fraction of the whole dataset used as trainval; the rest is test
train_percent=0.5; % fraction of trainval used as train; the rest is val

%%
xmlfile=dir(xmlfilepath);
numOfxml=length(xmlfile)-2; % subtract . and .. to get the dataset size

trainval=sort(randperm(numOfxml,floor(numOfxml*trainval_percent)));
test=sort(setdiff(1:numOfxml,trainval));

trainvalsize=length(trainval); % size of trainval
train=sort(trainval(randperm(trainvalsize,floor(trainvalsize*train_percent))));
val=sort(setdiff(trainval,train));

ftrainval=fopen([txtsavepath 'trainval.txt'],'w');
ftest=fopen([txtsavepath 'test.txt'],'w');
ftrain=fopen([txtsavepath 'train.txt'],'w');
fval=fopen([txtsavepath 'val.txt'],'w');

for i=1:numOfxml
    if ismember(i,trainval)
        fprintf(ftrainval,'%s\n',xmlfile(i+2).name(1:end-4));
        if ismember(i,train)
            fprintf(ftrain,'%s\n',xmlfile(i+2).name(1:end-4));
        else
            fprintf(fval,'%s\n',xmlfile(i+2).name(1:end-4));
        end
    else
        fprintf(ftest,'%s\n',xmlfile(i+2).name(1:end-4));
    end
end
fclose(ftrainval);
fclose(ftrain);
fclose(fval);
fclose(ftest);

This script works from the xml files, but we can just as well work directly from the image folder.
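The same split can be sketched in Python, starting from a list of image names rather than xml files. This is an illustrative sketch, not a translation of the MATLAB script: `split_dataset` is a hypothetical function, the two percentages mirror the variables above, and the fixed `seed` is an added assumption so that the split is reproducible.

```python
import random


def split_dataset(names, trainval_percent=0.5, train_percent=0.5, seed=0):
    """Split names into train / val / trainval / test lists of sorted names."""
    rng = random.Random(seed)
    names = sorted(names)
    # Draw trainval from the whole set, then train from trainval.
    trainval = set(rng.sample(names, int(len(names) * trainval_percent)))
    train = set(rng.sample(sorted(trainval),
                           int(len(trainval) * train_percent)))
    return {
        "train": sorted(train),
        "val": sorted(trainval - train),
        "trainval": sorted(trainval),
        "test": sorted(set(names) - trainval),
    }
```

The names would typically come from the image folder, for example `[os.path.splitext(f)[0] for f in os.listdir(image_dir)]`, and each list is then written to its .txt file with one name per line.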

Colorizing the test results

This step mostly comes down to modifying infer.py.
Method 1:

import numpy as np
from PIL import Image
import caffe

# load image, switch to BGR, subtract mean, and make dims C x H x W for Caffe
im = Image.open('pascal/VOC2010/JPEGImages/2007_000129.jpg')
in_ = np.array(im, dtype=np.float32)
in_ = in_[:,:,::-1]
in_ -= np.array((104.00698793,116.66876762,122.67891434))
in_ = in_.transpose((2,0,1))

# load net
net = caffe.Net('voc-fcn8s/deploy.prototxt', 'voc-fcn8s/fcn8s-heavy-pascal.caffemodel', caffe.TEST)
# shape for input (data blob is N x C x H x W), set data
net.blobs['data'].reshape(1, *in_.shape)
net.blobs['data'].data[...] = in_
# run net and take argmax for prediction
net.forward()
out = net.blobs['score'].data[0].argmax(axis=0)

arr = out.astype(np.uint8)
im = Image.fromarray(arr)

# start from a gray ramp, then overwrite the first 21 entries with the VOC colors
palette = []
for i in range(256):
    palette.extend((i, i, i))
palette[:3*21] = np.array([[0, 0, 0],
                           [128, 0, 0],
                           [0, 128, 0],
                           [128, 128, 0],
                           [0, 0, 128],
                           [128, 0, 128],
                           [0, 128, 128],
                           [128, 128, 128],
                           [64, 0, 0],
                           [192, 0, 0],
                           [64, 128, 0],
                           [192, 128, 0],
                           [64, 0, 128],
                           [192, 0, 128],
                           [64, 128, 128],
                           [192, 128, 128],
                           [0, 64, 0],
                           [128, 64, 0],
                           [0, 192, 0],
                           [128, 192, 0],
                           [0, 64, 128]], dtype='uint8').flatten().tolist()

im.putpalette(palette)
im.show()
im.save('test.png')
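The palette handed to `putpalette` is just a flat list of 256 x 3 integers. The sketch below isolates that construction so it can be checked without Caffe or an image; `build_palette` is a hypothetical helper that takes any sequence of (R, G, B) class colors with 8-bit channels.

```python
def build_palette(class_colors):
    """Return a flat 768-entry palette list for an 8-bit paletted image."""
    # Start from a gray ramp so unused indices still map to a valid color.
    palette = []
    for i in range(256):
        palette.extend((i, i, i))
    # Overwrite the first entries with the class colors.
    for index, (r, g, b) in enumerate(class_colors):
        palette[3 * index:3 * index + 3] = [int(r), int(g), int(b)]
    return palette
```

Converting every channel with `int()` keeps the list free of NumPy scalar types, which is the same reason the listing above calls `tolist()`.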

Or use the same approach as in data preparation:

import numpy as np
from PIL import Image
import caffe
from skimage.color import label2rgb

# Get the specified bit value
def bitget(byteval, idx):
    return ((byteval & (1 << idx)) != 0)

# Create label-color map, label --- [R G B]
# 0 --- [ 0 0 0], 1 --- [128 0 0], 2 --- [ 0 128 0]
# 3 --- [128 128 0], 4 --- [ 0 0 128], 5 --- [128 0 128]
# 6 --- [ 0 128 128], 7 --- [128 128 128], 8 --- [ 64 0 0]
# 9 --- [192 0 0], 10 --- [ 64 128 0], 11 --- [192 128 0]
# 12 --- [ 64 0 128], 13 --- [192 0 128], 14 --- [ 64 128 128]
# 15 --- [192 128 128], 16 --- [ 0 64 0], 17 --- [128 64 0]
# 18 --- [ 0 192 0], 19 --- [128 192 0], 20 --- [ 0 64 128]
def labelcolormap(N=256):
    color_map = np.zeros((N, 3))
    for n in range(N):
        id_num = n
        r, g, b = 0, 0, 0
        for pos in range(8):
            r = np.bitwise_or(r, (bitget(id_num, 0) << (7-pos)))
            g = np.bitwise_or(g, (bitget(id_num, 1) << (7-pos)))
            b = np.bitwise_or(b, (bitget(id_num, 2) << (7-pos)))
            id_num = (id_num >> 3)
        color_map[n, 0] = r
        color_map[n, 1] = g
        color_map[n, 2] = b
    return color_map  # channels stay in 0..255 here

def main():
    # load image, switch to BGR, subtract mean, and make dims C x H x W for Caffe
    im = Image.open('data/pascal/VOCdevkit/VOC2012/JPEGImages/2007_000346.jpg')
    in_ = np.array(im, dtype=np.float32)
    in_ = in_[:,:,::-1]
    in_ -= np.array((104.00698793,116.66876762,122.67891434))
    in_ = in_.transpose((2,0,1))

    # load net
    net = caffe.Net('voc-fcn8s/deploy.prototxt', 'ilsvrc-nets/fcn8s-heavy-pascal.caffemodel', caffe.TEST)
    # shape for input (data blob is N x C x H x W), set data
    net.blobs['data'].reshape(1, *in_.shape)
    net.blobs['data'].data[...] = in_
    # run net and take argmax for prediction
    net.forward()
    out = net.blobs['score'].data[0].argmax(0).astype(np.uint8)

    color_map = labelcolormap(21)
    label_mask = label2rgb(out, colors=color_map[1:], bg_label=0)
    label_mask[out == 0] = [0, 0, 0]
    # scipy.misc.imsave has been removed from SciPy; save through Pillow instead
    Image.fromarray(label_mask.astype(np.uint8)).save('data/pascal/VOCdevkit/VOC2012/JPEGImages/test_prediction.png')

if __name__ == '__main__':
    main()
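Both scripts share two array manipulations that are easy to get wrong: the input preprocessing and the argmax over class scores. The sketch below reproduces them on plain NumPy arrays so the shapes can be checked without Caffe. `preprocess` and `predict_labels` are hypothetical names; the mean values are the ones used in the scripts above.

```python
import numpy as np

# Per-channel mean in BGR order, as used in the inference scripts.
MEAN_BGR = np.array((104.00698793, 116.66876762, 122.67891434),
                    dtype=np.float32)


def preprocess(rgb):
    """Turn an H x W x 3 RGB array into a C x H x W mean-subtracted BGR array."""
    bgr = np.asarray(rgb, dtype=np.float32)[:, :, ::-1]
    return (bgr - MEAN_BGR).transpose((2, 0, 1))


def predict_labels(score):
    """Reduce C x H x W class scores to an H x W map of class indices."""
    return score.argmax(axis=0).astype(np.uint8)
```

The output of `predict_labels` has the same layout as a label.png: one class index per pixel, ready for either coloring method.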

