RealSense D455 Depth Camera + YOLO V5 for Object Detection (Part 2)

Published: 2024/3/24, category: object detection

1. Code source
2. Environment setup
3. Code walkthrough
- 3.1 Converting detect.py into realsensedetect.py
- 3.2 Tools for comparing files and folders
4. Reflections and closing remarks

First article in this series: RealSense D455 Depth Camera + YOLO V5 for Object Detection (Part 1)

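Why a second article on combining the RealSense D455 with YOLO V5? The previous article was written after I found a project on GitHub and got it running. Later I discovered that I could not apply it to the YOLO V5 code I had cloned myself; something was still missing. After learning from various sources, I integrated the camera into the unmodified YOLO v5 code from GitHub, and detection now works well.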

The result combines a D435 or D455 depth camera with YOLO v5, so that while an object is being recognized, its distance from the camera is measured as well.

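Why is this worth doing?

1. Why a RealSense D455 depth camera? It pairs an ordinary color camera with infrared-based depth sensing. Like any 2D camera, it gives the projection of the 3D world onto the pixel plane, that is, an image. But a projection loses the depth dimension: an apple can appear as large as a football, because the image alone does not tell us how far each object is from the camera. A depth camera supplies that missing distance.
2. Why YOLO? Its speed and accuracy are both good enough for real-time use, which makes it practical in industrial and agricultural settings.

That is why combining the two is worthwhile.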
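To make the apple-versus-football point concrete, here is a minimal sketch based on the pinhole camera model. The focal length, object sizes, and distances are illustrative assumptions, not values measured from a D455, and the function names are made up for this example.

```python
def projected_size_px(real_size_m: float, distance_m: float, focal_px: float) -> float:
    """Approximate on-image size (pixels) of an object under the pinhole model."""
    return focal_px * real_size_m / distance_m


def real_size_m(size_px: float, distance_m: float, focal_px: float) -> float:
    """Invert the projection: recover physical size once the depth is known."""
    return size_px * distance_m / focal_px


# Assumed values for illustration only (not D455 calibration data).
FOCAL_PX = 600.0          # hypothetical focal length in pixels
APPLE_DIAMETER_M = 0.08   # roughly 8 cm
BALL_DIAMETER_M = 0.22    # roughly 22 cm

apple_px = projected_size_px(APPLE_DIAMETER_M, 0.8, FOCAL_PX)
ball_px = projected_size_px(BALL_DIAMETER_M, 2.2, FOCAL_PX)

# A nearby apple and a distant ball can cover the same number of pixels,
# so the 2D image alone cannot tell them apart.
print(f"apple at 0.8 m: {apple_px:.1f} px")
print(f"ball at 2.2 m:  {ball_px:.1f} px")

# With the measured depth, the physical size can be recovered again.
print(f"recovered ball size: {real_size_m(ball_px, 2.2, FOCAL_PX):.2f} m")
```

Once the depth is known, the same relation can be inverted, which is exactly the information the depth camera adds.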

1. Code source

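This is the first time I have put modified code on GitHub, so stars are very welcome. The main change is that detect.py was rewritten as realsensedetect.py. To use the code, git clone it from https://github.com/wenyishengkingkong/realsense-D455-YOLOV5.git (the URL is written out here in case the link does not work).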

2. Environment setup

Set up the environment the same way as for YOLO V5, or follow the brief setup instructions in the previous article.

Then cd into the project folder and run:

python realsensedetect.py

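As noted above, the main change is that detect.py was rewritten as realsensedetect.py. When the script runs, a window shows the color stream with each detected box labeled with its class name and its distance in meters.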

3. Code walkthrough

3.1 Converting detect.py into realsensedetect.py. You can also replace the contents of your own detect.py with the file below and run it directly.

    import argparse
    import os
    import shutil
    import time
    from pathlib import Path

    import cv2
    import numpy as np
    import pyrealsense2 as rs
    import torch
    import torch.backends.cudnn as cudnn
    from numpy import random

    from models.experimental import attempt_load
    from utils.datasets import letterbox
    from utils.general import (check_img_size, non_max_suppression, apply_classifier, scale_coords,
                               xyxy2xywh, plot_one_box, strip_optimizer, set_logging)
    from utils.torch_utils import select_device, load_classifier, time_synchronized


    def detect(save_img=False):
        out, source, weights, view_img, save_txt, imgsz = \
            opt.save_dir, opt.source, opt.weights, opt.view_img, opt.save_txt, opt.img_size
        webcam = source == '0' or source.startswith(('rtsp://', 'rtmp://', 'http://')) or source.endswith('.txt')

        # Initialize
        set_logging()
        device = select_device(opt.device)
        if os.path.exists(out):  # output dir
            shutil.rmtree(out)  # delete dir
        os.makedirs(out)  # make new dir
        half = device.type != 'cpu'  # half precision only supported on CUDA

        # Load model
        model = attempt_load(weights, map_location=device)  # load FP32 model
        imgsz = check_img_size(imgsz, s=model.stride.max())  # check img_size
        if half:
            model.half()  # to FP16

        # Set Dataloader
        vid_path, vid_writer = None, None
        view_img = True
        cudnn.benchmark = True  # set True to speed up constant image size inference
        # dataset = LoadStreams(source, img_size=imgsz)

        # Get names and colors
        names = model.module.names if hasattr(model, 'module') else model.names
        colors = [[random.randint(0, 255) for _ in range(3)] for _ in range(len(names))]

        # Run inference
        t0 = time.time()
        img = torch.zeros((1, 3, imgsz, imgsz), device=device)  # init img
        _ = model(img.half() if half else img) if device.type != 'cpu' else None  # run once

        pipeline = rs.pipeline()
        # Create the config object
        config = rs.config()
        # config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
        config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 60)
        config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 60)
        # Start streaming
        pipeline.start(config)
        align_to_color = rs.align(rs.stream.color)  # align depth pixels to the color image
        try:
            while True:
                start = time.time()
                # Wait for a coherent pair of frames: depth and color
                frames = pipeline.wait_for_frames()
                frames = align_to_color.process(frames)
                depth_frame = frames.get_depth_frame()
                color_frame = frames.get_color_frame()
                if not depth_frame or not color_frame:  # skip incomplete frame sets
                    continue
                color_image = np.asanyarray(color_frame.get_data())
                depth_image = np.asanyarray(depth_frame.get_data())
                mask = np.zeros([color_image.shape[0], color_image.shape[1]], dtype=np.uint8)
                mask[0:480, 320:640] = 255

                # Feed the color frame to YOLO the same way LoadStreams would
                sources = [source]
                imgs = [None]
                path = sources
                imgs[0] = color_image
                im0s = imgs.copy()
                img = [letterbox(x, new_shape=imgsz)[0] for x in im0s]
                img = np.stack(img, 0)
                img = img[:, :, :, ::-1].transpose(0, 3, 1, 2)  # BGR to RGB, to 3x416x416, uint8 to float32
                img = np.ascontiguousarray(img, dtype=np.float16 if half else np.float32)
                img /= 255.0  # 0 - 255 to 0.0 - 1.0

                # Get detections
                img = torch.from_numpy(img).to(device)
                if img.ndimension() == 3:
                    img = img.unsqueeze(0)
                t1 = time_synchronized()
                pred = model(img, augment=opt.augment)[0]

                # Apply NMS
                pred = non_max_suppression(pred, opt.conf_thres, opt.iou_thres,
                                           classes=opt.classes, agnostic=opt.agnostic_nms)
                t2 = time_synchronized()

                for i, det in enumerate(pred):  # detections per image
                    p, s, im0 = path[i], '%g: ' % i, im0s[i].copy()
                    s += '%gx%g ' % img.shape[2:]  # print string
                    gn = torch.tensor(im0.shape)[[1, 0, 1, 0]]  # normalization gain whwh
                    if det is not None and len(det):
                        # Rescale boxes from img_size to im0 size
                        det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()

                        # Print results
                        for c in det[:, -1].unique():
                            n = (det[:, -1] == c).sum()  # detections per class
                            s += '%g %ss, ' % (n, names[int(c)])  # add to string

                        # Write results
                        for *xyxy, conf, cls in reversed(det):
                            xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
                            line = (cls, conf, *xywh) if opt.save_conf else (cls, *xywh)  # label format
                            distance_list = []
                            # Center pixel of the box: (top-left + bottom-right) / 2
                            mid_pos = [int((int(xyxy[0]) + int(xyxy[2])) / 2),
                                       int((int(xyxy[1]) + int(xyxy[3])) / 2)]
                            # Shorter box side, used to size the depth search range
                            min_val = min(abs(int(xyxy[2]) - int(xyxy[0])), abs(int(xyxy[3]) - int(xyxy[1])))
                            randnum = 40
                            for _ in range(randnum):  # "_" avoids shadowing the outer loop index i
                                # "+ 1" keeps the range non-empty even for very small boxes
                                bias = random.randint(-min_val // 4, min_val // 4 + 1)
                                dist = depth_frame.get_distance(int(mid_pos[0] + bias), int(mid_pos[1] + bias))
                                if dist:  # 0 means no valid depth at this pixel
                                    distance_list.append(dist)
                            # Sort the valid samples and keep the middle half (a trimmed mean)
                            valid = len(distance_list)
                            distance_list = np.sort(np.array(distance_list))[valid // 4:valid - valid // 4]
                            if len(distance_list):
                                label = '%s %.2f%s' % (names[int(cls)], np.mean(distance_list), 'm')
                            else:
                                label = names[int(cls)]  # no valid depth for this box
                            plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3)

                    # Print time (inference + NMS)
                    print('%sDone. (%.3fs)' % (s, t2 - t1))

                    # Stream results
                    if view_img:
                        cv2.imshow(p, im0)
                        if cv2.waitKey(1) == ord('q'):  # q to quit
                            return
        finally:
            # Release the camera and windows however the loop ends
            pipeline.stop()
            cv2.destroyAllWindows()
            print('Done. (%.3fs)' % (time.time() - t0))


    if __name__ == '__main__':
        parser = argparse.ArgumentParser()
        parser.add_argument('--weights', nargs='+', type=str, default='yolov5m.pt', help='model.pt path(s)')
        parser.add_argument('--source', type=str, default='inference/images', help='source')  # file/folder, 0 for webcam
        parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)')
        parser.add_argument('--conf-thres', type=float, default=0.25, help='object confidence threshold')
        parser.add_argument('--iou-thres', type=float, default=0.45, help='IOU threshold for NMS')
        parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
        parser.add_argument('--view-img', action='store_true', help='display results')
        parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
        parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
        parser.add_argument('--save-dir', type=str, default='inference/output', help='directory to save results')
        parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --class 0, or --class 0 2 3')
        parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
        parser.add_argument('--augment', action='store_true', help='augmented inference')
        parser.add_argument('--update', action='store_true', help='update all models')
        opt = parser.parse_args()
        print(opt)

        with torch.no_grad():  # context manager: code inside it does not track gradients
            detect()

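All this code may look daunting, but only a few lines actually changed; the rest is mostly reordering and moving things around. If comparing by eye is tedious, two tools can help you diff the files. (The code above is based on release v3.1 of YOLO V5. I expect other releases to work with similar changes, although helper functions may live in different modules there. I have not tried other object detection algorithms, but the approach should carry over.)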
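The part of realsensedetect.py that is specific to the depth camera is the distance estimate: it samples 40 random pixels near the box center and averages the middle portion of the sorted readings. The sketch below isolates that idea on a plain NumPy depth map, so it can be tried without a camera. The function name `estimate_box_distance`, the synthetic depth map, and the guard for boxes with no valid depth are my own assumptions for illustration, not part of the original script.

```python
import numpy as np


def estimate_box_distance(depth_m, xyxy, n_samples=40, rng=None):
    """Trimmed-mean depth (meters) near the center of a bounding box.

    depth_m: 2D array of depth in meters, where 0 means "no reading".
    xyxy:    (x1, y1, x2, y2) box in pixel coordinates.
    Returns None when no valid depth sample is found.
    """
    rng = np.random.default_rng() if rng is None else rng
    x1, y1, x2, y2 = (int(v) for v in xyxy)
    h, w = depth_m.shape
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    # Search radius: a quarter of the shorter box side, at least 1 pixel.
    radius = max(min(abs(x2 - x1), abs(y2 - y1)) // 4, 1)

    # As in the script, one offset is applied to both x and y (a diagonal walk).
    offsets = rng.integers(-radius, radius + 1, size=n_samples)
    xs = np.clip(cx + offsets, 0, w - 1)
    ys = np.clip(cy + offsets, 0, h - 1)
    samples = depth_m[ys, xs]
    samples = np.sort(samples[samples > 0])  # drop invalid zero readings
    if samples.size == 0:
        return None

    # Keep the middle half of the valid samples, then average them.
    lo, hi = samples.size // 4, samples.size - samples.size // 4
    return float(samples[lo:hi].mean())


# Synthetic scene: background at 3.0 m, an object at 1.2 m, a patch of dropouts.
depth = np.full((480, 640), 3.0, dtype=np.float32)
depth[200:300, 250:400] = 1.2
depth[240:250, 320:330] = 0.0  # simulated missing depth near the box center

dist = estimate_box_distance(depth, (250, 200, 400, 300), rng=np.random.default_rng(0))
print(f"estimated distance: {dist:.2f} m")
```

Discarding the lowest and highest quarter of the readings makes the estimate less sensitive to pixels that fall on the background or on depth holes, which is presumably why the script does not simply read the single center pixel.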

3.2 Tools for comparing files and folders

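On both Windows and Ubuntu, PyCharm works well for this: select the files or folders, right-click, and choose the "Compare With" option to see the differences. Comparing realsensedetect.py above with the original detect.py shows exactly how much was changed. The second option, on Windows, is Diffinity, which should also do the job nicely.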
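If you prefer a script to a GUI, Python's standard `difflib` module can produce the same kind of comparison. The sketch below is a minimal example; the two short line lists stand in for detect.py and realsensedetect.py and are placeholders, not the real file contents.

```python
import difflib

# Placeholder snippets standing in for the two files (not the real contents).
detect_lines = [
    "import cv2\n",
    "import torch\n",
    "dataset = LoadStreams(source, img_size=imgsz)\n",
]
realsense_lines = [
    "import cv2\n",
    "import torch\n",
    "import pyrealsense2 as rs\n",
    "pipeline = rs.pipeline()\n",
]

# unified_diff yields the familiar "---/+++/@@" format, one line at a time.
diff = list(
    difflib.unified_diff(
        detect_lines,
        realsense_lines,
        fromfile="detect.py",
        tofile="realsensedetect.py",
    )
)
print("".join(diff), end="")
```

To compare the real files, read each one with `Path(name).read_text().splitlines(keepends=True)` from `pathlib` and pass the resulting lists in place of the placeholders.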

4. Reflections and closing remarks

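Why use a RealSense depth camera at all? As the previous article explained, it adds one more dimension: distance. What is that extra dimension good for? First, social-distance monitoring: if a person is detected without a mask, you also know how far that person is from the camera, so you can remind them to put a mask on before they reach a crowded entrance and reduce the risk of cross-infection. That is one practical example. Second, 3D reconstruction: with an object's 2D pixel coordinates and its distance values, the object can be rebuilt in three dimensions, which matters a great deal. Finally, the same data can be fed into the PCL library for 3D modeling and more accurate distance calculation in real-world applications.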
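As a rough illustration of the social-distancing idea, the sketch below converts two detections (box center plus measured depth) into 3D camera coordinates with the pinhole model and measures the gap between them. The intrinsics `FX`, `FY`, `CX`, `CY`, the helper `deproject`, the two sample detections, and the 1 m threshold are all assumed values for illustration. On a real camera the intrinsics come from the stream profile, and pyrealsense2 offers `rs2_deproject_pixel_to_point` for the same step.

```python
import math

# Assumed pinhole intrinsics for a 640x480 stream (illustrative, not calibrated).
FX, FY = 600.0, 600.0   # focal lengths in pixels
CX, CY = 320.0, 240.0   # principal point in pixels


def deproject(u, v, depth_m):
    """Map pixel (u, v) with depth Z in meters to a 3D point in the camera frame."""
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return (x, y, depth_m)


# Two hypothetical detections: box center (u, v) and measured depth in meters.
person_a = deproject(200, 240, 2.0)
person_b = deproject(440, 240, 2.0)

# Straight-line distance between the two 3D points.
gap_m = math.dist(person_a, person_b)
print(f"distance between the two people: {gap_m:.2f} m")

MIN_GAP_M = 1.0  # assumed threshold for this example
if gap_m < MIN_GAP_M:
    print("too close")
```

The pixel gap alone would not be enough here: the same 240-pixel separation corresponds to a larger physical gap when both people stand farther from the camera.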

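This is the first time I have pushed my own code to GitHub. I hope it helps. If you are interested in my work, feel free to follow me; it may come in handy one day.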
