Image Cropping and Data Augmentation (Partially Optimized for DOTA v1.5)

Published: 2024/3/24

1. Cropping algorithm with an added center crop
2. Automatic extraction of label data
- Label extraction rules
3. Augmenting classes with few samples
- 3.1 DOTA v1.5 sample counts
- 3.2 Selecting classes with fewer than 1000 samples for augmentation
- 3.3 Data augmentation

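Overview: Tiles are cropped without overlap, because overlapping crops greatly inflate the number of small objects such as small vehicles and widen the gap between class counts. In addition, one tile is cropped from the center of each image: objects usually appear near the middle of an image, so this keeps larger objects located at the center from being cut apart.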

1. Cropping algorithm with an added center crop

caijian.py is as follows:

# -*- coding: utf-8 -*-
import os

import cv2

PAD_VALUE = (113, 113, 113)  # gray fill for areas outside the source image


def caijian(path, path_out, size_w=1024, size_h=1024, step=768):
    ims_list = os.listdir(path)
    count = 0
    for im_list in ims_list:
        number = 0   # tiles cut from the regular grid
        numberz = 0  # tiles cut from the image center
        name = os.path.splitext(im_list)[0]
        print(name)
        # The original read from the global ims_path instead of the path argument.
        img = cv2.imread(os.path.join(path, im_list))
        if img is None:  # not a file OpenCV can decode
            continue
        size = img.shape
        shao_w = size[1] % step
        shao_h = size[0] % step
        # Pad the bottom and right edges so that every grid tile is complete.
        img0 = cv2.copyMakeBorder(img, 0, size_h - shao_h, 0, size_w - shao_w,
                                  cv2.BORDER_CONSTANT, value=PAD_VALUE)
        size0 = img0.shape
        count = count + 1
        for h in range(0, size[0] - (size_h - step), step):
            star_h = h
            for w in range(0, size[1] - (size_w - step), step):
                star_w = w
                end_h = star_h + size_h
                end_w = star_w + size_w
                cropped = img0[star_h:end_h, star_w:end_w]
                name_img = name + '_' + str(star_h) + '_' + str(star_w)
                cv2.imwrite('{}/{}.png'.format(path_out, name_img), cropped)
                number = number + 1

        # One extra tile centered on the image. Nothing is added when the image
        # is smaller than the tile in both dimensions.
        if size[0] >= size_h or size[1] >= size_w:
            # Pad only the dimension that is shorter than the tile.
            imgc = cv2.copyMakeBorder(img, 0, max(0, size_h - size[0]),
                                      0, max(0, size_w - size[1]),
                                      cv2.BORDER_CONSTANT, value=PAD_VALUE)
            star_h = int(int(imgc.shape[0] / 2) - size_h / 2)
            star_w = int(int(imgc.shape[1] / 2) - size_w / 2)
            cropped = imgc[star_h:star_h + size_h, star_w:star_w + size_w]
            name_img = name + '_' + str(star_h) + '_' + str(star_w)
            cv2.imwrite('{}/{}.png'.format(path_out, name_img), cropped)
            numberz = numberz + 1

        print('Image {}: width x height = {} x {}'.format(name, size[1], size[0]))
        print('Image {}: padded width x height = {} x {}'.format(name, size0[1], size0[0]))
        print('Image {}: {} grid tiles'.format(name, number))
        print('Image {}: {} center tiles'.format(name, numberz))
    print('Processed {} images'.format(count))


if __name__ == '__main__':
    # Input image directory
    ims_path = '/media/xuejunda/2c8076c4-abf2-4d0f-89e3-4568b4f029cf/dataset/detection/DOTA/train/images/images/'
    # Output directory
    path = '/home/xuejunda/data/VOCdevkit/mydataset/train_all_1024img/'
    caijian(ims_path, path, size_w=1024, size_h=1024, step=1024)
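To see which tiles the cropping routine produces without touching any image data, the offset arithmetic can be isolated. The sketch below is only an illustration and not part of the original scripts: `tile_origins` is a hypothetical helper that mirrors the loop bounds and the center-crop formula of `caijian`.

```python
def tile_origins(height, width, size_h=1024, size_w=1024, step=1024):
    """Return (grid, center) tile origins as (row, col) offsets.

    Hypothetical helper for illustration; it mirrors the bounds used in caijian.
    """
    grid = [(h, w)
            for h in range(0, height - (size_h - step), step)
            for w in range(0, width - (size_w - step), step)]
    center = None
    if height >= size_h or width >= size_w:
        # A dimension shorter than the tile is padded up to the tile size first.
        padded_h = max(height, size_h)
        padded_w = max(width, size_w)
        center = (int(int(padded_h / 2) - size_h / 2),
                  int(int(padded_w / 2) - size_w / 2))
    return grid, center


grid, center = tile_origins(2048, 3000)
print(grid)    # six grid tiles for a 2048 x 3000 image
print(center)  # origin of the extra center tile
```

With `step` equal to the tile size the grid tiles do not overlap each other; only the center tile overlaps its neighbours.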

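2. Automatic extraction of label data

Label extraction rules:

1. A bounding box that crosses one of the four edges of a tile (top, bottom, left, right; boxes 1, 2, 3 and 4 in the figure above) is kept in the tile that contains more than 1/2 of its area.
2. A bounding box that crosses one of the four corners of a tile (boxes 5, 6, 7 and 8 in the figure above) is kept in the tile that contains more than 4/9 of its area (see the enlarged view of box 6 in the figure above).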
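The two thresholds follow from a per-axis limit: an edge box may overhang the tile by less than 1/2 of its extent, while a corner box may overhang by less than 1/3 in each direction, so more than 2/3 × 2/3 = 4/9 of its area stays inside. The sketch below restates the rules for a single axis-aligned box. It is a simplified illustration under stated assumptions: `clip_box_to_tile` is a hypothetical name, and the box is assumed to be given as `xmin, ymin, xmax, ymax` in source-image pixels.

```python
def clip_box_to_tile(xmin, ymin, xmax, ymax, w, h, size_w=1024, size_h=1024):
    """Return the box in tile coordinates, or None if too little of it is inside.

    Hypothetical helper: (w, h) is the tile's column/row offset in the source image.
    """
    right, bottom = w + size_w, h + size_h
    left_out, right_out = xmin <= w, xmax > right
    top_out, bottom_out = ymin <= h, ymax > bottom
    if (left_out and right_out) or (top_out and bottom_out):
        return None  # box sticks out on opposite sides; not handled here
    corner = (left_out or right_out) and (top_out or bottom_out)
    # Edge boxes: overhang below 1/2 of the extent. Corner boxes: below 1/3 on both axes.
    div = 3.0 if corner else 2.0
    lim_w = int((xmax - xmin) / div)
    lim_h = int((ymax - ymin) / div)
    if left_out and not xmin > w - lim_w:
        return None
    if right_out and not xmax <= right + lim_w:
        return None
    if top_out and not ymin > h - lim_h:
        return None
    if bottom_out and not ymax <= bottom + lim_h:
        return None
    # Clamp the sides that overhang to just inside the tile border.
    x1 = 1 if left_out else xmin - w
    y1 = 1 if top_out else ymin - h
    x2 = size_w - 1 if right_out else xmax - w
    y2 = size_h - 1 if bottom_out else ymax - h
    return (x1, y1, x2, y2)


print(clip_box_to_tile(-40, 100, 60, 200, 0, 0))  # edge box, 40 of 100 px outside
print(clip_box_to_tile(-40, -40, 60, 60, 0, 0))   # corner box, 40 of 100 px outside on both axes
```

The same 40-pixel overhang is accepted for the edge box but rejected for the corner box, because the corner case applies the stricter one-third limit.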
txttq.py is as follows:

# -*- coding: utf-8 -*-
import os


def write_box(f, coords, label, kunnan):
    # One annotation line: eight coordinates, category, difficulty flag.
    f.write(' '.join(str(float(c)) for c in coords) + ' {} {}\n'.format(label, kunnan))


def tqtxt(path, path_txt, path_out, size_h=1024, size_w=1024):
    for im_list in os.listdir(path):
        name = os.path.splitext(im_list)[0]
        # Tile names have the form <source image>_<row offset>_<column offset>.
        name_list = name.split('_')
        if len(name_list) < 3:  # the original tested < 2 but reads index 2 below
            continue
        h = int(name_list[1])
        w = int(name_list[2])
        txtpath = os.path.join(path_txt, name_list[0] + '.txt')
        txt_outpath = os.path.join(path_out, name + '.txt')
        # 'w' instead of the original 'a', so that a rerun does not duplicate labels.
        with open(txtpath, 'r') as f_in, open(txt_outpath, 'w') as f:
            for i, line in enumerate(f_in):
                if i < 2:  # copy the two header lines unchanged
                    f.write(line)
                    continue
                splitline = line.strip().split(' ')
                if len(splitline) < 10:
                    continue
                label = splitline[8]
                kunnan = splitline[9]
                x1, y1, x2, y2, x3, y3, x4, y4 = [int(float(v)) for v in splitline[:8]]
                bw = max(x1, x2, x3, x4) - min(x1, x2, x3, x4)
                bh = max(y1, y2, y3, y4) - min(y1, y2, y3, y4)
                half_w, half_h = int(bw / 2.0), int(bh / 2.0)
                third_w, third_h = int(bw / 3.0), int(bh / 3.0)
                right, bottom = w + size_w, h + size_h
                # Coordinates relative to the tile, and the clamp values for the far edges.
                rx1, ry1, rx2, ry2 = x1 - w, y1 - h, x2 - w, y2 - h
                rx3, ry3, rx4, ry4 = x3 - w, y3 - h, x4 - w, y4 - h
                edge_r, edge_b = size_w - 1, size_h - 1

                # The checks treat (x1, y1) as the top-left corner, x2 as the right
                # side and y3 as the bottom side, so they assume axis-aligned (hbb)
                # labels listed in that corner order.
                x1_in, x2_in = w < x1 <= right, w < x2 <= right
                y1_in, y3_in = h < y1 <= bottom, h < y3 <= bottom
                all_in = (x1_in and x2_in and w < x3 <= right and w < x4 <= right and
                          y1_in and y3_in and h < y2 <= bottom and h < y4 <= bottom)

                if all_in:
                    # Completely inside the tile.
                    write_box(f, (rx1, ry1, rx2, ry2, rx3, ry3, rx4, ry4), label, kunnan)
                elif w - half_w < x1 <= w and x2_in and y1_in and y3_in:
                    # Crosses the left edge, more than 1/2 inside.
                    write_box(f, (1, ry1, rx2, ry2, rx3, ry3, 1, ry4), label, kunnan)
                elif w - third_w < x1 <= w and x2_in and h - third_h < y1 <= h and y3_in:
                    # Crosses the top-left corner, more than 4/9 inside.
                    write_box(f, (1, 1, rx2, 1, rx3, ry3, 1, ry4), label, kunnan)
                elif w - third_w < x1 <= w and x2_in and y1_in and bottom < y3 <= bottom + third_h:
                    # Crosses the bottom-left corner, more than 4/9 inside.
                    write_box(f, (1, ry1, rx2, ry2, rx3, edge_b, 1, edge_b), label, kunnan)
                elif h - half_h < y1 <= h and y3_in and x1_in and x2_in:
                    # Crosses the top edge, more than 1/2 inside.
                    write_box(f, (rx1, 1, rx2, 1, rx3, ry3, rx4, ry4), label, kunnan)
                elif x1_in and right < x2 <= right + third_w and h - third_h < y1 <= h and y3_in:
                    # Crosses the top-right corner, more than 4/9 inside.
                    write_box(f, (rx1, 1, edge_r, 1, edge_r, ry3, rx4, ry4), label, kunnan)
                elif x1_in and right < x2 <= right + half_w and y1_in and y3_in:
                    # Crosses the right edge, more than 1/2 inside.
                    write_box(f, (rx1, ry1, edge_r, ry2, edge_r, ry3, rx4, ry4), label, kunnan)
                elif x1_in and right < x2 <= right + third_w and y1_in and bottom < y3 <= bottom + third_h:
                    # Crosses the bottom-right corner, more than 4/9 inside.
                    write_box(f, (rx1, ry1, edge_r, ry2, edge_r, edge_b, rx4, edge_b), label, kunnan)
                elif bottom < y3 <= bottom + half_h and y1_in and x1_in and x2_in:
                    # Crosses the bottom edge, more than 1/2 inside.
                    write_box(f, (rx1, ry1, rx2, ry2, rx3, edge_b, rx4, edge_b), label, kunnan)


if __name__ == '__main__':
    # Directory of cropped tiles
    ims_path = '/media/xuejunda/2c8076c4-abf2-4d0f-89e3-4568b4f029cf/dataset/detection/DOTA/train/images/img512/'
    # Directory of the source label files
    txt_path = '/media/xuejunda/2c8076c4-abf2-4d0f-89e3-4568b4f029cf/dataset/detection/DOTA/train/hbb/'
    # Output directory for the per-tile label files
    path = '/media/xuejunda/2c8076c4-abf2-4d0f-89e3-4568b4f029cf/dataset/detection/DOTA/train/txt512/'
    tqtxt(ims_path, txt_path, path, size_h=512, size_w=512)

3. Augmenting classes with few samples

3.1 DOTA v1.5 sample counts:

plane: 8072.0
small-vehicle: 126501.0
large-vehicle: 22218.0
roundabout: 437.0
bridge: 2075.0
soccer-ball-field: 338.0
helicopter: 635.0
ground-track-field: 331.0
baseball-diamond: 412.0
storage-tank: 5346.0
tennis-court: 2425.0
swimming-pool: 2181.0
ship: 32973.0
harbor: 6016.0
basketball-court: 529.0
container-crane: 142.0
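The counting script itself is not shown in the post. A minimal sketch of how such per-class counts could be produced is given below, assuming label files in the same layout the other scripts read (header lines first, then one box per line with the category in the ninth field). `count_labels` and the demo file contents are hypothetical.

```python
import os
import tempfile
from collections import Counter


def count_labels(label_dir):
    """Count annotated instances per category in a directory of label txt files."""
    counts = Counter()
    for fname in os.listdir(label_dir):
        if not fname.endswith('.txt'):
            continue
        with open(os.path.join(label_dir, fname), 'r') as f:
            for line in f:
                fields = line.split()
                # Header lines have fewer than 10 fields and are skipped.
                if len(fields) >= 10:
                    counts[fields[8]] += 1
    return counts


# Small self-contained demo on a temporary directory with made-up labels.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, 'demo.txt'), 'w') as f:
        f.write('imagesource:GoogleEarth\ngsd:0.15\n'
                '10 10 50 10 50 40 10 40 plane 0\n'
                '60 60 90 60 90 80 60 80 ship 1\n'
                '5 5 9 5 9 9 5 9 plane 0\n')
    demo_counts = count_labels(d)
print(demo_counts)
```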

3.2 Selecting classes with fewer than 1000 samples for augmentation

I selected from the cropped tiles; the selection code is as follows:

# -*- coding: utf-8 -*-
import os
import shutil

category_set = ['roundabout', 'soccer-ball-field', 'helicopter', 'ground-track-field',
                'baseball-diamond', 'basketball-court', 'container-crane']


def shaixuan(path_txt, path_txt_out, path_img, path_img_out):
    for im_list in os.listdir(path_txt):
        name = os.path.splitext(im_list)[0]
        txtpath = os.path.join(path_txt, im_list)
        txtpathout = os.path.join(path_txt_out, im_list)
        imgpath = os.path.join(path_img, name + '.png')
        imgpathout = os.path.join(path_img_out, name + '.png')
        with open(txtpath, 'r') as f:
            for i, line in enumerate(f):
                if i < 2:  # skip the two header lines
                    continue
                splitline = line.strip().split(' ')
                if len(splitline) < 10:
                    continue
                label = splitline[8]
                # Copy the tile and its label file as soon as one box of a
                # listed category is found.
                if label in category_set:
                    shutil.copyfile(txtpath, txtpathout)
                    shutil.copyfile(imgpath, imgpathout)
                    break


if __name__ == '__main__':
    path_txt = '/home/xuejunda/data/VOCdevkit/mydataset/train_all_1024txt/'
    path_txt_out = '/home/xuejunda/data/VOCdevkit/mydataset/kuochongtxt/'
    path_img = '/home/xuejunda/data/VOCdevkit/mydataset/train_all_1024img/'
    path_img_out = '/home/xuejunda/data/VOCdevkit/mydataset/kuochongimg/'
    shaixuan(path_txt, path_txt_out, path_img, path_img_out)
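The `category_set` list in the selection script corresponds to the classes that fall below the 1000-instance threshold in the table of section 3.1. A small check of that correspondence, using the counts from the table (the variable names are illustrative only):

```python
# Per-class instance counts copied from the table in section 3.1.
dota_counts = {
    'plane': 8072, 'small-vehicle': 126501, 'large-vehicle': 22218,
    'roundabout': 437, 'bridge': 2075, 'soccer-ball-field': 338,
    'helicopter': 635, 'ground-track-field': 331, 'baseball-diamond': 412,
    'storage-tank': 5346, 'tennis-court': 2425, 'swimming-pool': 2181,
    'ship': 32973, 'harbor': 6016, 'basketball-court': 529,
    'container-crane': 142,
}

THRESHOLD = 1000
# Keep the classes with fewer instances than the threshold, in table order.
rare = [name for name, n in dota_counts.items() if n < THRESHOLD]
print(rare)
```

Because the script copies a whole tile whenever it contains at least one rare-class box, boxes of the common classes in that tile are carried along too.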

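3.3 Data augmentation

The data is augmented by rotating and mirroring the images. As the figure above shows, this produces 7 additional distinct images besides the original: the 3 rotations of the original, plus the 4 rotations of either mirror image.
The augmentation code is as follows: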

# -*- coding: utf-8 -*-
import os

import cv2


def kuochong(img1, img_out, txt, txt_out, size_h=1024, size_w=1024):
    # One entry per augmented copy:
    #   (file suffix, image transform, point transform,
    #    order in which the source corners are written so that the output
    #    again runs top-left, top-right, bottom-right, bottom-left)
    variants = [
        # transpose
        ('_zz', lambda im: cv2.transpose(im),
         lambda x, y: (y, x), (0, 3, 2, 1)),
        # transpose + horizontal flip = 90 degrees clockwise
        ('_zs90', lambda im: cv2.flip(cv2.transpose(im), 1),
         lambda x, y: (size_h - y, x), (3, 0, 1, 2)),
        # flip on both axes = 180 degrees
        ('_zs180', lambda im: cv2.flip(im, -1),
         lambda x, y: (size_w - x, size_h - y), (2, 3, 0, 1)),
        # transpose + vertical flip = 90 degrees counterclockwise
        ('_zs270', lambda im: cv2.flip(cv2.transpose(im), 0),
         lambda x, y: (y, size_w - x), (1, 2, 3, 0)),
        # horizontal mirror
        ('_zz90', lambda im: cv2.flip(im, 1),
         lambda x, y: (size_w - x, y), (1, 0, 3, 2)),
        # transpose + flip on both axes
        ('_zz180', lambda im: cv2.flip(cv2.transpose(im), -1),
         lambda x, y: (size_h - y, size_w - x), (2, 1, 0, 3)),
        # vertical mirror
        ('_zz270', lambda im: cv2.flip(im, 0),
         lambda x, y: (x, size_h - y), (3, 2, 1, 0)),
    ]

    for im_list in os.listdir(img1):
        name = os.path.splitext(im_list)[0]
        txtpath = os.path.join(txt, name + '.txt')
        img = cv2.imread(os.path.join(img1, im_list))
        with open(txtpath, 'r') as f_in:
            lines = f_in.readlines()

        for suffix, img_fn, pt_fn, order in variants:
            cv2.imwrite(os.path.join(img_out, name + suffix + '.png'), img_fn(img))
            # 'w' instead of the original 'a', so that a rerun does not duplicate labels.
            with open(os.path.join(txt_out, name + suffix + '.txt'), 'w') as f:
                for i, line in enumerate(lines):
                    if i < 2:  # copy the two header lines unchanged
                        f.write(line)
                        continue
                    splitline = line.strip().split(' ')
                    if len(splitline) < 10:
                        continue
                    label = splitline[8]
                    kunnan = splitline[9]
                    v = [int(float(c)) for c in splitline[:8]]
                    pts = [(v[0], v[1]), (v[2], v[3]), (v[4], v[5]), (v[6], v[7])]
                    out = []
                    for k in order:
                        out.extend(pt_fn(*pts[k]))
                    f.write(' '.join(str(float(c)) for c in out)
                            + ' {} {}\n'.format(label, kunnan))


if __name__ == '__main__':
    # Directory of the selected tiles
    img1 = '/home/xuejunda/data/VOCdevkit/mydataset/kuochongimg/'
    img_out = '/home/xuejunda/data/VOCdevkit/mydataset/kuochonghouimg/'
    txt = '/home/xuejunda/data/VOCdevkit/mydataset/kuochongtxt/'
    txt_out = '/home/xuejunda/data/VOCdevkit/mydataset/kuochonghoutxt/'
    kuochong(img1, img_out, txt, txt_out, size_h=1024, size_w=1024)
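That transposition combined with flips yields exactly eight distinct orientations can be checked on a tiny array. The sketch below uses plain Python lists in place of OpenCV images; the helper names are hypothetical and each one imitates the OpenCV call named in its comment. The dictionary keys reuse the file suffixes of the augmentation script.

```python
def transpose(m):
    # Swap rows and columns, like cv2.transpose.
    return [list(row) for row in zip(*m)]


def flip_h(m):
    # Mirror left-right, like cv2.flip(img, 1).
    return [row[::-1] for row in m]


def flip_v(m):
    # Mirror top-bottom, like cv2.flip(img, 0).
    return [row[:] for row in m[::-1]]


def all_variants(m):
    t = transpose(m)
    return {
        'original': m,
        '_zz': t,
        '_zs90': flip_h(t),            # 90 degrees clockwise
        '_zs180': flip_v(flip_h(m)),   # 180 degrees
        '_zs270': flip_v(t),           # 90 degrees counterclockwise
        '_zz90': flip_h(m),            # horizontal mirror
        '_zz180': flip_v(flip_h(t)),   # transpose flipped on both axes
        '_zz270': flip_v(m),           # vertical mirror
    }


tile = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
variants = all_variants(tile)
distinct = {tuple(map(tuple, v)) for v in variants.values()}
print(len(distinct))  # the original plus seven augmented copies
```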

Sample counts after augmentation:

plane: 9862.0
small-vehicle: 179726.0
large-vehicle: 30701.0
roundabout: 3768.0
bridge: 2619.0
soccer-ball-field: 2432.0
helicopter: 5040.0
ground-track-field: 2320.0
baseball-diamond: 3624.0
storage-tank: 6410.0
tennis-court: 7256.0
swimming-pool: 2932.0
ship: 39350.0
harbor: 6734.0
basketball-court: 4824.0
container-crane: 1208.0
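One way to read the two tables side by side is the ratio between the most and the least frequent class. The sketch below only does arithmetic on the figures reported above; the ratio is an illustrative measure, not one the original post reports.

```python
# Counts copied from the tables before (section 3.1) and after augmentation.
before = {
    'plane': 8072, 'small-vehicle': 126501, 'large-vehicle': 22218,
    'roundabout': 437, 'bridge': 2075, 'soccer-ball-field': 338,
    'helicopter': 635, 'ground-track-field': 331, 'baseball-diamond': 412,
    'storage-tank': 5346, 'tennis-court': 2425, 'swimming-pool': 2181,
    'ship': 32973, 'harbor': 6016, 'basketball-court': 529,
    'container-crane': 142,
}
after = {
    'plane': 9862, 'small-vehicle': 179726, 'large-vehicle': 30701,
    'roundabout': 3768, 'bridge': 2619, 'soccer-ball-field': 2432,
    'helicopter': 5040, 'ground-track-field': 2320, 'baseball-diamond': 3624,
    'storage-tank': 6410, 'tennis-court': 7256, 'swimming-pool': 2932,
    'ship': 39350, 'harbor': 6734, 'basketball-court': 4824,
    'container-crane': 1208,
}


def imbalance(counts):
    # Ratio of the largest class to the smallest class.
    return max(counts.values()) / min(counts.values())


ratio_before = imbalance(before)
ratio_after = imbalance(after)
print(round(ratio_before, 1), round(ratio_after, 1))
```

In both tables the extremes are small-vehicle and container-crane; the gap between them narrows from roughly 891:1 to roughly 149:1.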
