Training a custom model with TensorFlow Object Detection on Windows
- 1. Environment
- 2. Generate CSV files from the XML files, then generate record files
- 2.1 Use the script below on both the training and test sets to generate their CSV files
- 2.2 Generate a record file from each of the two CSV files
- 3. Modify the config file
- 4. Train and save the model
- 5. Validate the model
- 6. Real-time detection with a ZED camera
- 7. Real-time detection on Android
- 7.1 Convert the pb file to a tflite file
- 7.2 Install Android Studio
- 8. Augment the training images
- 9. Apply the model to a network camera
- Project files
- References
1. Environment
1.1 Create a Python 3.7 virtual environment, install tensorflow-gpu==1.13.1, and install PIL (pip install pillow).
1.2 Download labelImg and use it to annotate your images; saving produces one XML file per image. (Three handy shortcuts: Ctrl+S to save, D for the next image, W for the box tool. Labels are best kept as plain string tags.)
1.3 Create four folders: train (training images), train_xml (the labelImg XML files for the training images), and likewise test and test_xml.
1.4 Clone TensorFlow's models repository; its models and config files are what we use to train on our own data.
1.5.1 Download the protoc release matching your setup, unzip it, and copy protoc.exe from its bin folder into C:\Windows.
1.5.2 Open a command window in the models\research\ directory and run:
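This compiles the protobuf definitions the API depends on; for the r1.13 models repo the standard command is the one below (protoc 3.4+ expands the wildcard itself on Windows):

```
protoc object_detection/protos/*.proto --python_out=.
```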
1.5.3 Under 'This PC' - 'Properties' - 'Advanced system settings' - 'Environment Variables' - 'System variables', create a new variable named PYTHONPATH and add the full paths of both models/research/ and models/research/slim, separated by a semicolon.
1.5.4 Copy the nets folder from the slim folder into research/object_detection.
1.5.5 Open a terminal in the slim folder and run:
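For the r1.13 repo this is the slim package build and install (the step where the BUILD file mentioned below can get in the way):

```
python setup.py build
python setup.py install
```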
If this fails, rename the BUILD file in the slim folder (it collides with the build directory that setup.py creates).
1.5.6 Test the API by running:
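A common smoke test for the installation, run from models\research:

```
python object_detection/builders/model_builder_test.py
```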
If no errors are reported, the installation works.
2. Generate CSV files from the XML files, then generate record files
2.1 Use the script below on both the training and test sets to generate their CSV files
```python
"""
Convert the XML annotation files to a CSV file (written into the XML folder itself).
Change the three marked places.
"""
import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET

os.chdir(r'F:\bomb\test_xml')  # 1. change to your train (or test) xml folder
path = r'F:\bomb\test_xml'     # 2. same as above


def xml_to_csv(path):
    xml_list = []
    for xml_file in glob.glob(path + '\\*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            # labelImg layout: member[0] is the class name, member[4] is the bndbox
            try:
                value = (
                    xml_file.split('\\')[-1].split('.')[0] + '.jpg',
                    int(root.find('size')[0].text),   # width
                    int(root.find('size')[1].text),   # height
                    member[0].text,                   # class
                    int(member[4][0].text),           # xmin
                    int(member[4][1].text),           # ymin
                    int(member[4][2].text),           # xmax
                    int(member[4][3].text))           # ymax
                # append inside the try so a malformed object is skipped
                # instead of re-appending the previous row
                xml_list.append(value)
            except Exception:
                pass
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df


def main():
    image_path = path
    xml_df = xml_to_csv(image_path)
    xml_df.to_csv('test.csv', index=None)  # 3. output name: train.csv or test.csv
    print('Successfully converted xml to csv.')


main()
```

2.2 Generate a record file from each of the two CSV files
Create a new generate_tfrecord.py file and adjust the paths in it to your own:
```python
# -*- coding: utf-8 -*-
"""
Convert a CSV file into a TFRecord file.

Usage:
  # From tensorflow/models/
  # Create train data:
  python generate_tfrecord.py --csv_input=train.csv --output_path=train.record
  python generate_tfrecord.py --csv_input=F:\bomb\test.csv --output_path=F:\bomb\test.record
"""
# Change three places.
import sys
sys.path.append(r'C:\models-r1.13.0\research\object_detection\utils')  # 1. path to research\object_detection\utils in your models clone
import dataset_util

import os
import io
import pandas as pd
import tensorflow as tf
from PIL import Image
# from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict

flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
# first arg: flag name, second: default value, third: description
FLAGS = flags.FLAGS


# 2. Replace these labels with your own classes!
def class_text_to_int(row_label):
    if row_label == 'class3':
        return 1
    elif row_label == 'class4':
        return 2
    elif row_label == 'class5':
        return 3
    elif row_label == 'class6':
        return 4
    elif row_label == 'class7':
        return 5
    else:
        return 0


def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins, xmaxs, ymins, ymaxs = [], [], [], []
    classes_text, classes = [], []

    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example


def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(os.getcwd(), r'F:\bomb\train')  # 3. your train (or test) image folder
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())
    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))


if __name__ == '__main__':
    tf.app.run()
```

Save this py file, then run it from the terminal in the virtual environment (activate tf1.13). Hold Shift and right-click in the folder to choose "Open command window here", then run it as follows:
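For example (paths taken from the usage notes in the script's own docstring):

```
python generate_tfrecord.py --csv_input=F:\bomb\train.csv --output_path=F:\bomb\train.record
```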
csv_input is the path to your CSV file; output_path is the path and name of the record file to generate. (If it errors out, try writing the paths as absolute paths.)
3. Modify the config file
1. Create your own pbtxt label map (bomb.pbtxt). I wrote mine by hand, following the files already in research\object_detection\data and changing the entries to my own classes.
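A minimal sketch of such a label map, using the example class names from generate_tfrecord.py above; the ids must match what class_text_to_int returns:

```
item {
  id: 1
  name: 'class3'
}
item {
  id: 2
  name: 'class4'
}
item {
  id: 3
  name: 'class5'
}
item {
  id: 4
  name: 'class6'
}
item {
  id: 5
  name: 'class7'
}
```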
2. Create your own project folder (bomb) under object_detection. In the object_detection/samples/configs folder, find the config file of the model you want, copy it into the project folder you just created, and edit it for your own classes.
3. The main edits: num_classes is the number of classes; pick a batch_size that suits your data and GPU (the learning rate should be scaled up or down in proportion to the batch size); num_steps depends on how long you want to train; fine_tune_checkpoint is for transfer learning and must be commented out if you have no model.ckpt file; input_path points to your record files; label_map_path points to your pbtxt file.
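A sketch of the fields in question for an SSD-style config; the folder and file names here are this post's examples, not fixed values:

```
model {
  ssd {
    num_classes: 5        # number of your own classes
    # ...
  }
}
train_config: {
  batch_size: 24          # scale to your GPU memory; scale the learning rate with it
  num_steps: 200000       # depends on how long you can train
  fine_tune_checkpoint: "ssdlite_mobilenet_v1_coco/model.ckpt"  # comment out if you have no checkpoint
}
train_input_reader: {
  tf_record_input_reader {
    input_path: "bomb/train.record"
  }
  label_map_path: "bomb/bomb.pbtxt"
}
eval_input_reader: {
  tf_record_input_reader {
    input_path: "bomb/test.record"
  }
  label_map_path: "bomb/bomb.pbtxt"
}
```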
4. Train and save the model
First run python setup.py install in the models/research/ directory, then run python setup.py install in the models/research/slim directory.
1. Train the model: open the virtual environment in the object_detection folder.
train_dir is your project folder; pipeline_config_path is the config file you just edited inside it.
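A typical invocation (in the r1.13 repo train.py sits in object_detection/legacy; the folder names below are this post's examples):

```
python legacy/train.py --logtostderr --train_dir=bomb --pipeline_config_path=bomb/ssdlite_mobilenet_v1_coco.config
```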
Here I ran into out-of-GPU-memory errors; lowering batch_size in the config wasn't enough, so I added the following at the top of legacy/train.py:
```python
# Put this with the imports at the top of the file.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
```

or:
```python
import tensorflow as tf
config = tf.compat.v1.ConfigProto(gpu_options=tf.compat.v1.GPUOptions(allow_growth=True))
sess = tf.compat.v1.Session(config=config)
```

To monitor training with TensorBoard:
```
tensorboard --logdir=bomb831
```

2. Save (export) the model.
After training finishes, open the virtual environment in the object_detection folder. Here bomb2/ssdlite_mobilenet_v1_coco.config is the config file in your project folder, bomb2/model.ckpt-100 is the checkpoint produced by the previous step, and bomb2_model is the folder the exported model is saved to.
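The export step uses export_inference_graph.py from object_detection; with this post's names the command would look like:

```
python export_inference_graph.py --input_type image_tensor --pipeline_config_path bomb2/ssdlite_mobilenet_v1_coco.config --trained_checkpoint_prefix bomb2/model.ckpt-100 --output_directory bomb2_model
```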
5. Validate the model
Put your validation images in a folder; I put mine in object_detection/test_images.
```python
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf

config = tf.compat.v1.ConfigProto(gpu_options=tf.compat.v1.GPUOptions(allow_growth=True))
sess = tf.compat.v1.Session(config=config)

import zipfile
from distutils.version import StrictVersion
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
import cv2

# Five main places to change; review the rest yourself.
# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from object_detection.utils import ops as utils_ops
from utils import label_map_util
from utils import visualization_utils as vis_util

# Model preparation: any model exported with export_inference_graph.py can be
# loaded by pointing PATH_TO_FROZEN_GRAPH at the new .pb file. By default an
# "SSD with MobileNet" model is used; see the detection model zoo for other
# out-of-the-box models with varying speed and accuracy.
MODEL_NAME = 'bomb2_model'  # 1. folder of your exported model
# Path to the frozen detection graph, the actual model used for detection.
PATH_TO_FROZEN_GRAPH = MODEL_NAME + '/frozen_inference_graph.pb'
# Label map used to attach the correct label to each box.
PATH_TO_LABELS = os.path.join('data', 'bomb2.pbtxt')  # 2. your pbtxt; I put mine in object_detection\data\bomb2.pbtxt

# Load the (frozen) TensorFlow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

# Load the label map: it maps indices to category names, so that when the
# network predicts 5 we know which class that corresponds to.
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)


def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)


PATH_TO_TEST_IMAGES_DIR = 'test_images'  # 3. folder of the test images, here object_detection\test_images
TEST_IMAGE_PATHS = [os.path.join(PATH_TO_TEST_IMAGES_DIR, '{}.jpg'.format(i)) for i in range(1, 3)]  # 4. image names, here 1.jpg and 2.jpg
# TEST_IMAGE_PATHS = [os.path.join(PATH_TO_TEST_IMAGES_DIR, '1.png')]

IMAGE_SIZE = (20, 14)  # size, in inches, of the output images


def run_inference_for_single_image(image, graph):
    with graph.as_default():
        with tf.Session() as sess:
            # Get handles to the input and output tensors
            ops = tf.get_default_graph().get_operations()
            all_tensor_names = {output.name for op in ops for output in op.outputs}
            tensor_dict = {}
            for key in ['num_detections', 'detection_boxes', 'detection_scores',
                        'detection_classes', 'detection_masks']:
                tensor_name = key + ':0'
                if tensor_name in all_tensor_names:
                    tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(tensor_name)
            if 'detection_masks' in tensor_dict:
                # The following processing is only for a single image.
                detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
                detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
                # Reframe translates masks from box coordinates to image coordinates.
                real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
                detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
                detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
                detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
                    detection_masks, detection_boxes, image.shape[1], image.shape[2])
                detection_masks_reframed = tf.cast(tf.greater(detection_masks_reframed, 0.5), tf.uint8)
                # Follow the convention by adding back the batch dimension.
                tensor_dict['detection_masks'] = tf.expand_dims(detection_masks_reframed, 0)
            image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

            # Run inference
            output_dict = sess.run(tensor_dict, feed_dict={image_tensor: image})

            # All outputs are float32 numpy arrays, so convert types as appropriate.
            output_dict['num_detections'] = int(output_dict['num_detections'][0])
            output_dict['detection_classes'] = output_dict['detection_classes'][0].astype(np.int64)
            output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
            output_dict['detection_scores'] = output_dict['detection_scores'][0]
            if 'detection_masks' in output_dict:
                output_dict['detection_masks'] = output_dict['detection_masks'][0]
    return output_dict


i = 40
for image_path in TEST_IMAGE_PATHS:
    image = Image.open(image_path)
    # The array-based representation of the image is used to draw boxes and labels.
    image_np = load_image_into_numpy_array(image)
    # Expand dimensions: the model expects images of shape [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Actual detection.
    output_dict = run_inference_for_single_image(image_np_expanded, detection_graph)
    # Visualize the detection results.
    image = vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        # instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        min_score_thresh=0.1,  # 5. confidence threshold
        line_thickness=4)
    cv2.imwrite(f'test_images\\{i}.jpg', image)
    i += 1  # increment so successive results don't overwrite each other
    cv2.namedWindow('1', 0)
    cv2.imshow('1', image_np)
    cv2.waitKey(0)
    # plt.figure(figsize=IMAGE_SIZE)
    # plt.imshow(image_np)
    # plt.show()
```

6. Real-time detection with a ZED camera
For the ZED camera you can refer to my earlier blog post. In the downloaded zed_sdk folder, open your virtual environment and run the Python API installer:
```
python get_python_api.py
```

It will then prompt you to download a whl file.
Follow its prompts to download and install, after which you can call the ZED Python API.
Below, the trained model is used with the ZED camera for real-time object detection; a frame-grabbing sketch follows.
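A minimal frame-grabbing sketch, assuming the ZED SDK 3.x Python API (pyzed); the detection itself is the same sess.run(...) loop as in sections 5 and 9:

```python
import pyzed.sl as sl
import cv2

zed = sl.Camera()
init_params = sl.InitParameters()
if zed.open(init_params) != sl.ERROR_CODE.SUCCESS:
    raise RuntimeError("cannot open ZED camera")

runtime = sl.RuntimeParameters()
mat = sl.Mat()
while True:
    if zed.grab(runtime) == sl.ERROR_CODE.SUCCESS:
        zed.retrieve_image(mat, sl.VIEW.LEFT)                 # left-eye RGBA image
        frame = cv2.cvtColor(mat.get_data(), cv2.COLOR_BGRA2BGR)
        # feed `frame` into the frozen graph exactly as in the RTSP loop in section 9
        cv2.imshow("zed", frame)
        if cv2.waitKey(1) == ord('q'):
            break
zed.close()
```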
7. Real-time detection on Android
7.1 Convert the pb file to a tflite file
2. Download the Windows version of Bazel.
3. Rename the downloaded exe to bazel.exe, put it in a folder, and add that folder to the system PATH environment variable; bazel will then run successfully.
Then build with bazel to obtain the input/output array names and tensor shape parameters of the .pb model. I didn't get this to work; see my mentor's blog for details. If you use one of TensorFlow's own models, though, you can skip the build, because the names and shapes are already known.
4. Run a py file like the following to generate the tflite file.
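A minimal conversion sketch, assuming TF 1.13 and a graph exported with object_detection/export_tflite_ssd_graph.py; the array names and the 300x300 input shape are the standard ones for that export and may differ for your model:

```python
import tensorflow as tf

# The SSD postprocess op is not a standard TFLite op, so custom ops are allowed.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file='tflite_graph.pb',
    input_arrays=['normalized_input_image_tensor'],
    output_arrays=['TFLite_Detection_PostProcess',
                   'TFLite_Detection_PostProcess:1',
                   'TFLite_Detection_PostProcess:2',
                   'TFLite_Detection_PostProcess:3'],
    input_shapes={'normalized_input_image_tensor': [1, 300, 300, 3]})
converter.allow_custom_ops = True
tflite_model = converter.convert()
with open('detect.tflite', 'wb') as f:
    f.write(tflite_model)
```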
7.2 Install Android Studio
Install Java and Android Studio. On first run it errored out with no SDK found, so in the Android Studio install directory open \bin\idea.properties and append disable.android.first.run=true at the end, which skips the SDK check on first launch.
Then run it, download the SDK and JDK, and finally open the official demo project.
7.2.1 Put your generated tflite file in android/app/src/main/assets, and create a txt file in the same directory holding your recognition labels.
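For example, with this post's class names (some versions of the demo expect a leading '???' placeholder for the background class — check your checkout):

```
???
class3
class4
class5
class6
class7
```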
7.2.2 Remove one comment line in the gradle file (in the TF Lite demo this is typically the apply from: 'download_model.gradle' line, so the build stops overwriting your own model).
7.2.3 Modify three places (copy the keyword, right-click app >> Find in Files >> enter the keyword to locate the file).
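In the official TF Lite detection demo these three values live as constants in DetectorActivity.java; a sketch of what they might look like after editing (names and values to be verified against your checkout):

```java
// DetectorActivity.java -- constants from the TF Lite detection demo
private static final int TF_OD_API_INPUT_SIZE = 300;                 // model input resolution
private static final boolean TF_OD_API_IS_QUANTIZED = false;         // false for a float tflite model
private static final String TF_OD_API_MODEL_FILE = "detect.tflite";  // your model in assets/
private static final String TF_OD_API_LABELS_FILE = "file:///android_asset/labelmap.txt";  // your label file
```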
While running the program I hit an SDK incompatibility: the little green app icon showed a cross, and the build failed with "No variants found for 'app'. Check build files to ensure at least one variant exists.". The fix: in SDK Manager, select and download Android 10, then File -> Sync Project with Gradle Files.
After that it can run on the phone; enable Developer options in the phone's settings.
8. Augment the training images
First pip install imgaug, then run the script to augment the training set. I hit an encoding problem along the way; the fix was to add encoding='utf-8' to the open call (I believe I edited elementtree.py). Run this script as the main of a fresh project: it misbehaves when run inside this one.
```python
import xml.etree.ElementTree as ET
import pickle
import os
from os import getcwd
import numpy as np
from PIL import Image
import imgaug as ia
from imgaug import augmenters as iaa

ia.seed(1)


def read_xml_annotation(root, image_id):
    in_file = open(os.path.join(root, image_id), encoding='UTF-8')
    tree = ET.parse(in_file)
    root = tree.getroot()
    bndboxlist = []
    for object in root.findall('object'):  # all 'object' nodes under root
        bndbox = object.find('bndbox')
        xmin = int(bndbox.find('xmin').text)
        xmax = int(bndbox.find('xmax').text)
        ymin = int(bndbox.find('ymin').text)
        ymax = int(bndbox.find('ymax').text)
        bndboxlist.append([xmin, ymin, xmax, ymax])
    return bndboxlist


def change_xml_list_annotation(root, image_id, new_target, saveroot, id, h, w):
    in_file = open(os.path.join(root, str(image_id) + '.xml'), encoding='UTF-8')
    tree = ET.parse(in_file)
    xmlroot = tree.getroot()
    index = 0
    aaa = xmlroot.find('path')
    # aaa.text = 'C:\\Users\\YFZX\\Desktop\\6#\\img_aug\\' + str(image_id) + "_aug_" + str(id) + '.jpg'
    for object in xmlroot.findall('object'):
        bndbox = object.find('bndbox')
        new_xmin = new_target[index][0]
        new_ymin = new_target[index][1]
        new_xmax = new_target[index][2]
        new_ymax = new_target[index][3]
        bndbox.find('xmin').text = str(new_xmin)
        bndbox.find('ymin').text = str(new_ymin)
        bndbox.find('xmax').text = str(new_xmax)
        bndbox.find('ymax').text = str(new_ymax)
        index = index + 1
    # Only write the XML if the (last) augmented box stays inside the image.
    if new_xmin > 0 and new_ymin > 0 and new_xmax < w and new_ymax < h:
        tree.write(os.path.join(saveroot, str(image_id) + "_aug_" + str(id) + '.xml'))


def mkdir(path):
    path = path.strip().rstrip("\\")
    if not os.path.exists(path):
        os.makedirs(path)
        print(path + ' created')
        return True
    else:
        print(path + ' already exists')
        return False


if __name__ == "__main__":
    IMG_DIR = r"D:\models-r1.13.0\models-r1.13.0\research\object_detection\bomb819\test"              # original images
    XML_DIR = r"D:\models-r1.13.0\models-r1.13.0\research\object_detection\bomb819\test_xml"          # original xml
    AUG_XML_DIR = r"D:\models-r1.13.0\models-r1.13.0\research\object_detection\bomb819\test_xml_aug"  # folder for the augmented XML
    mkdir(AUG_XML_DIR)
    AUG_IMG_DIR = r"D:\models-r1.13.0\models-r1.13.0\research\object_detection\bomb819\test_aug"      # folder for the augmented images
    mkdir(AUG_IMG_DIR)

    AUGLOOP = 60  # number of augmented copies per image
    boxes_img_aug_list = []
    new_bndbox = []
    new_bndbox_list = []

    # Image augmentation pipeline
    seq = iaa.Sequential([
        iaa.Flipud(0.5),  # vertically flip 50% of the images
        iaa.Fliplr(0.5),  # mirror
        # iaa.Multiply((1.2, 1.5), per_channel=0.2),  # change brightness, doesn't affect BBs
        # iaa.GaussianBlur(sigma=(0, 3.0)),
        # iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.3 * 255), per_channel=0.5),  # loc: noise mean, scale: variance; white noise per channel with 50% probability
        iaa.Multiply((0.75, 1.5), per_channel=1),  # multiply pixel values by 0.75-1.5 to vary brightness/color
        # iaa.Affine(
        #     translate_px={"x": 15, "y": 15},
        #     scale=(0.8, 0.95),
        #     rotate=(-30, 30)
        # ),  # translate/scale/rotate; affects BBs
        iaa.Crop(percent=(0, 0.1), keep_size=True),  # crop 0-10% of width/height, keep the original size
        iaa.Affine(
            scale=(0.8, 1.5),
            translate_percent=None,
            translate_px=None,
            rotate=(-180, 180),
            shear=0.0,
            order=1,
            cval=0,
            mode='constant',
        )
    ], random_order=True)

    for root, sub_folders, files in os.walk(XML_DIR):
        for name in files:
            bndbox = read_xml_annotation(XML_DIR, name)
            for epoch in range(AUGLOOP):
                seq_det = seq.to_deterministic()  # transform boxes and image identically, not randomly
                # Read the image
                img = Image.open(os.path.join(IMG_DIR, name[:-4] + '.jpg'))
                img = np.array(img)
                h = img.shape[0]
                w = img.shape[1]
                # Augment the bounding-box coordinates
                for i in range(len(bndbox)):
                    bbs = ia.BoundingBoxesOnImage([
                        ia.BoundingBox(x1=bndbox[i][0], y1=bndbox[i][1], x2=bndbox[i][2], y2=bndbox[i][3]),
                    ], shape=img.shape)
                    bbs_aug = seq_det.augment_bounding_boxes([bbs])[0]
                    boxes_img_aug_list.append(bbs_aug)
                    # new_bndbox_list: [[x1, y1, x2, y2], ...]
                    new_bndbox_list.append([int(bbs_aug.bounding_boxes[0].x1),
                                            int(bbs_aug.bounding_boxes[0].y1),
                                            int(bbs_aug.bounding_boxes[0].x2),
                                            int(bbs_aug.bounding_boxes[0].y2)])
                # Save the augmented image
                image_aug = seq_det.augment_images([img])[0]
                path = os.path.join(AUG_IMG_DIR, str(name[:-4]) + "_aug_" + str(epoch) + '.jpg')
                # image_auged = bbs.draw_on_image(image_aug, thickness=0)
                Image.fromarray(image_aug).save(path)
                # Save the augmented XML
                change_xml_list_annotation(XML_DIR, name[:-4], new_bndbox_list, AUG_XML_DIR, epoch, h, w)
                print(str(name[:-4]) + "_aug_" + str(epoch) + '.jpg')
                new_bndbox_list = []
```

Draw the boxes on the augmented images to check them:
```python
import os
import cv2 as cv
import xml.etree.ElementTree as ET


def xml_to_jpg(imgs_path, xmls_path, out_path):
    imgs_list = os.listdir(imgs_path)  # image list
    xmls_list = os.listdir(xmls_path)  # xml list
    if len(imgs_list) <= len(xmls_list):
        # Fewer (or as many) images than xml files: for each image, find the matching xml.
        for imgName in imgs_list:
            temp1 = imgName.split('.')[0]   # file stem, e.g. 123.jpg -> 123
            temp1_ = imgName.split('.')[1]  # extension
            if temp1_ != 'jpg':
                continue
            for xmlName in xmls_list:
                temp2 = xmlName.split('.')[0]
                temp2_ = xmlName.split('.')[1]
                if temp2_ != 'xml':
                    continue
                if temp2 != temp1:  # names differ: keep looking
                    continue
                else:
                    # Names match: read the box coordinates from the xml and draw them.
                    img_path = os.path.join(imgs_path, imgName)
                    xml_path = os.path.join(xmls_path, xmlName)
                    img = cv.imread(img_path)
                    labelled = img
                    root = ET.parse(xml_path).getroot()
                    for obj in root.iter('object'):
                        bbox = obj.find('bndbox')
                        xmin = int(bbox.find('xmin').text.strip())
                        ymin = int(bbox.find('ymin').text.strip())
                        xmax = int(bbox.find('xmax').text.strip())
                        ymax = int(bbox.find('ymax').text.strip())
                        labelled = cv.rectangle(labelled, (xmin, ymin), (xmax, ymax), (0, 0, 255), 2)
                    cv.imwrite(out_path + '\\' + imgName, labelled)
                    break
    else:
        # Fewer xml files than images: for each xml, find the matching image (same logic as above).
        for xmlName in xmls_list:
            temp1 = xmlName.split('.')[0]
            temp1_ = xmlName.split('.')[1]
            if temp1_ != 'xml':
                continue
            for imgName in imgs_list:
                temp2 = imgName.split('.')[0]
                temp2_ = imgName.split('.')[1]  # image extension
                if temp2_ != 'jpg':
                    continue
                if temp2 != temp1:
                    continue
                else:
                    img_path = os.path.join(imgs_path, imgName)
                    xml_path = os.path.join(xmls_path, xmlName)
                    img = cv.imread(img_path)
                    labelled = img
                    root = ET.parse(xml_path).getroot()
                    for obj in root.iter('object'):
                        bbox = obj.find('bndbox')
                        xmin = int(bbox.find('xmin').text.strip())
                        ymin = int(bbox.find('ymin').text.strip())
                        xmax = int(bbox.find('xmax').text.strip())
                        ymax = int(bbox.find('ymax').text.strip())
                        labelled = cv.rectangle(labelled, (xmin, ymin), (xmax, ymax), (0, 0, 255), 1)
                    cv.imwrite(out_path + '\\' + imgName, labelled)
                    break


if __name__ == '__main__':
    # Use English paths; Chinese paths fail to load.
    imgs_path = r'C:\Users\YFZX\Desktop\models-r1.13.0\models-r1.13.0\research\object_detection\bomb819\train'              # images
    xmls_path = r'C:\Users\YFZX\Desktop\models-r1.13.0\models-r1.13.0\research\object_detection\bomb819\train_xml'          # xml
    retangele_img_path = r'C:\Users\YFZX\Desktop\models-r1.13.0\models-r1.13.0\research\object_detection\bomb819\train_xml_tojpg'  # output of boxed images
    xml_to_jpg(imgs_path, xmls_path, retangele_img_path)
```

9. Apply the model to a network camera
```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from time import sleep
import numpy as np
import os
import sys
import tensorflow as tf

config = tf.compat.v1.ConfigProto(gpu_options=tf.compat.v1.GPUOptions(allow_growth=True))
sess = tf.compat.v1.Session(config=config)
import cv2

os.chdir(r'E:\models-r1.13.0\models-r1.13.0\research\object_detection')
sys.path.append("..")

# Object detection imports
from utils import label_map_util
from utils import visualization_utils as vis_util

# Model preparation
MODEL_NAME = 'bomb819_model'
PATH_TO_CKPT = MODEL_NAME + '/1_frozen_inference_graph.pb'
PATH_TO_LABELS = os.path.join('data', '1_bomb.pbtxt')
NUM_CLASSES = 5

# Load the (frozen) TensorFlow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

# Load the label map
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES,
                                                            use_display_name=True)
category_index = label_map_util.create_category_index(categories)


# Helper code
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)


with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        # Input and output tensors of detection_graph
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        # Each box represents a part of the image where an object was detected.
        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        # Each score is the confidence level, shown on the result image with the class label.
        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')

        # The RTSP stream to run detection on
        # url = "rtsp://admin:yzwlgzw123@192.168.0.99/12"
        url = "rtsp://admin:Yfzx6666@192.168.3.144/11"
        vidcap = cv2.VideoCapture(url)
        # The default frame resolution is system dependent.
        while (1):
            sleep(0.001)
            ret, image = vidcap.read()
            if ret == True:
                image_np = image
                # x, y = image_np.shape[0:2]
                # image_np = cv2.resize(image_np, (int(y * 2), int(x * 2)))
                # Expand dimensions: the model expects images of shape [1, None, None, 3]
                image_np_expanded = np.expand_dims(image_np, axis=0)
                # Actual detection.
                (boxes, scores, classes, num) = sess.run(
                    [detection_boxes, detection_scores, detection_classes, num_detections],
                    feed_dict={image_tensor: image_np_expanded})
                # Visualization of the results of a detection.
                vis_util.visualize_boxes_and_labels_on_image_array(
                    image_np,
                    np.squeeze(boxes),
                    np.squeeze(classes).astype(np.int32),
                    np.squeeze(scores),
                    category_index,
                    use_normalized_coordinates=True,
                    line_thickness=2)
                # print(scores)
                cv2.imshow("capture", image_np)
                if cv2.waitKey(1) == ord('q'):
                    break
        vidcap.release()
        cv2.destroyAllWindows()
```

A note: I'm new to the workplace, and this project was the second task of my job, completed entirely under the guidance of my mentor Yang. I'll keep working hard to learn from him!
Project files
The files are too large to attach; leave a comment or send me a private message to get them.
References
[1] https://blog.csdn.net/weixin_42232538/article/details/111141445
[2] https://my.oschina.net/u/3732258/blog/4698658