當前位置：首頁 > 编程资源 > 编程问答 >内容正文

编程问答

Waymo自动驾驶数据集介绍与使用教程

發布時間：2023/12/10 编程问答 36 豆豆

生活随笔收集整理的這篇文章主要介紹了 Waymo自动驾驶数据集介绍与使用教程小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

本文將對Waymo自動駕駛數據集（Waymo Open Dataset）進行介紹。

論文鏈接為：https://arxiv.org/abs/1912.04838v7

項目鏈接為：https://github.com/waymo-research/waymo-open-dataset

數據集鏈接為：https://waymo.com/open

1. 自動駕駛感知可擴展性：Waymo開放數據集

這里首先對論文進行解讀。

1.1 Abstract

整個數據集包含1150個場景，每個場景時長為20秒，且LiDAR和Camera是經過同步和標定處理過的。對圖像和激光雷達的bounding box進行了仔細的標注，并在各幀之間使用了一致的標識符。

1.2 Introduction

整個數據集包含約1200萬個LiDAR注釋框和約1200萬個圖像注釋框，從而產生了約113k個LiDAR物體軌跡和約25萬個圖像軌跡。整個數據集劃分為1000個訓練集和150個測試集。

1.3 Waymo Open Dataset

1）Sensor Specifications

數據收集使用五個LiDAR和五個高分辨率針孔相機完成。限制了LiDAR數據范圍，并且每個激光脈沖有兩個回波（雙回波）。相機圖像是通過卷簾快門掃描捕獲的，其中確切的掃描模式可能因場景而異。所有相機圖像都將進行降采樣并從原始圖像中裁剪出來。

下圖和表格是傳感器安裝布置圖和規格介紹。

2）Coordinate Systems

整個數據集坐標系統遵守右手法則。全局坐標系為ENU坐標：Up(z)軸與重力方向一直，向上為正；East(x)沿著緯度指向正東，North(y)指向北極。車輛坐標系隨汽車移動，x軸指向前方，y軸指向左側，z軸向上為正。傳感器坐標可以通過旋轉矩陣從車輛坐標獲得，可以把這看作是外參矩陣。

圖像坐標是二維坐標，x軸表示圖像寬度，y軸表示圖像高度，圖像左上角為坐標原點。激光雷達得到的點的坐標可以表示為 $(x, y, z)$ ，轉換成距離，方位角，傾角公式為：

$range?=x2+y2+z2azimuth?=atan?2(y,x)inclination?=atan?2(z,x2+y2)\begin{aligned}\text { range } &=\sqrt{x^{2}+y^{2}+z^{2}} \\\text { azimuth } &=\operatorname{atan} 2(y, x) \\\text { inclination } &=\operatorname{atan} 2\left(z, \sqrt{x^{2}+y^{2}}\right) \end{aligned}$

3）Ground Truth Labels

數據集中對汽車、行人、交通標志、自行車人員進行了詳細標注。對于激光雷達數據，將每一個物體標注為7自由度3D bbox： $(cx,cy,cz,l,w,h,θ)(cx,cy,cz,l,w,h,\theta)$ 。其中 $c x, c y, c z$ 表示為bbox中心坐標， $l, w, h$ 表示為物體長寬高， $θ\theta$ 表示為物體偏航角，此外對于每一個物體還標注了一個唯一的追蹤ID編號。

圖像標注中將每一個物體標注為4自由度2D bbox： $(c x, c y, l, w)$ 。其中 $c x, c y$ 表示為bbox中心圖像坐標， $l$ 表示為物體長度， $w$ 表示為物體寬度。

此外，還將標注物體劃分為了兩個難度：LEVEL_2為物體對應激光雷達點數少于5個，其余則劃分為LEVEL_1。

4）Sensor Data

LiDAR 數據在數據集中被編碼為距離圖像，每個 LiDAR 返回一張；提供前兩次回波的數據。距離圖像格式類似于卷簾快門相機圖像，從左到右逐列填充。每個距離圖像像素對應一個 LiDAR 返回波。高度和寬度由 LiDAR 傳感器框架中傾角和方位角的分辨率決定。

此外，對于圖像和激光雷達，投影如下：

5）Dataset Analysis

下面兩張表格給出了數據集分布情況，不同城市，不同天氣，不同類別目標的統計情況。

1.4 Tasks

數據集任務劃分為2D和3D物體檢測和追蹤任務，訓練集場景有798個，驗證集場景有202個，測試集場景有150個。

1.5 Experiments

對于3D物體檢測，Waymo提供了一個Baseline，其使用的檢測方法是PointPillars。對于2D物體檢測，使用了Faster R-CNN作為baseline，對于3D物體追蹤，使用了AB3DMOT作為Baseline。

下面幾張表格給出了Baseline結果：行人和汽車檢測和追蹤結果，域適應檢測結果，數據集大小對檢測結果的影響。

Baseline APH and AP for vehicles and pedestriansBaseline multi-object tracking metrics for vehicles and pedestrians

2. Waymo Open Dataset Tutorial

下面介紹Waymo官放提供的數據使用教程，以下程序是在Colab上運行的。

2.1 Install waymo_open_dataset package

首先是安裝數據包：

!rm -rf waymo-od > /dev/null !git clone https://github.com/waymo-research/waymo-open-dataset.git waymo-od !cd waymo-od && git branch -a !cd waymo-od && git checkout remotes/origin/master !pip3 install --upgrade pip!pip3 install waymo-open-dataset-tf-2-1-0==1.2.0

然后是導入需要用到的庫：

import os import tensorflow.compat.v1 as tf import math import numpy as np import itertoolstf.enable_eager_execution()from waymo_open_dataset.utils import range_image_utils from waymo_open_dataset.utils import transform_utils from waymo_open_dataset.utils import frame_utils from waymo_open_dataset import dataset_pb2 as open_dataset

2.2 Read one frame

數據集中的每一個文件包含幀序列數據，按照時間戳排列，這里提取了兩幀來演示：

FILENAME = '/content/waymo-od/tutorial/frames' dataset = tf.data.TFRecordDataset(FILENAME, compression_type='') for data in dataset:frame = open_dataset.Frame()frame.ParseFromString(bytearray(data.numpy()))break(range_images, camera_projections, range_image_top_pose) = frame_utils.parse_range_image_and_camera_projection(frame)

可以查看每一幀包含的信息：

print(frame.context)

2.3 Visualize Camera Images and Camera Labels

圖片數據及其標簽可視化：

import matplotlib.pyplot as plt import matplotlib.patches as patchesdef show_camera_image(camera_image, camera_labels, layout, cmap=None):"""Show a camera image and the given camera labels."""ax = plt.subplot(*layout)# Draw the camera labels.for camera_labels in frame.camera_labels:# Ignore camera labels that do not correspond to this camera.if camera_labels.name != camera_image.name:continue# Iterate over the individual labels.for label in camera_labels.labels:# Draw the object bounding box.ax.add_patch(patches.Rectangle(xy=(label.box.center_x - 0.5 * label.box.length,label.box.center_y - 0.5 * label.box.width),width=label.box.length,height=label.box.width,linewidth=1,edgecolor='red',facecolor='none'))# Show the camera image.plt.imshow(tf.image.decode_jpeg(camera_image.image), cmap=cmap)plt.title(open_dataset.CameraName.Name.Name(camera_image.name))plt.grid(False)plt.axis('off')plt.figure(figsize=(25, 20))for index, image in enumerate(frame.images):show_camera_image(image, frame.camera_labels, [3, 3, index+1])

2.4 Visualize Range Images

Range圖片可視化：

plt.figure(figsize=(64, 20)) def plot_range_image_helper(data, name, layout, vmin = 0, vmax=1, cmap='gray'):"""Plots range image.Args:data: range image dataname: the image titlelayout: plt layoutvmin: minimum value of the passed datavmax: maximum value of the passed datacmap: color map"""plt.subplot(*layout)plt.imshow(data, cmap=cmap, vmin=vmin, vmax=vmax)plt.title(name)plt.grid(False)plt.axis('off')def get_range_image(laser_name, return_index):"""Returns range image given a laser name and its return index."""return range_images[laser_name][return_index]def show_range_image(range_image, layout_index_start = 1):"""Shows range image.Args:range_image: the range image data from a given lidar of type MatrixFloat.layout_index_start: layout offset"""range_image_tensor = tf.convert_to_tensor(range_image.data)range_image_tensor = tf.reshape(range_image_tensor, range_image.shape.dims)lidar_image_mask = tf.greater_equal(range_image_tensor, 0)range_image_tensor = tf.where(lidar_image_mask, range_image_tensor,tf.ones_like(range_image_tensor) * 1e10)range_image_range = range_image_tensor[...,0] range_image_intensity = range_image_tensor[...,1]range_image_elongation = range_image_tensor[...,2]plot_range_image_helper(range_image_range.numpy(), 'range',[8, 1, layout_index_start], vmax=75, cmap='gray')plot_range_image_helper(range_image_intensity.numpy(), 'intensity',[8, 1, layout_index_start + 1], vmax=1.5, cmap='gray')plot_range_image_helper(range_image_elongation.numpy(), 'elongation',[8, 1, layout_index_start + 2], vmax=1.5, cmap='gray') frame.lasers.sort(key=lambda laser: laser.name) show_range_image(get_range_image(open_dataset.LaserName.TOP, 0), 1) show_range_image(get_range_image(open_dataset.LaserName.TOP, 1), 4)

2.5 Point Cloud Conversion and Visualization

點云轉換和可視化：

points, cp_points = frame_utils.convert_range_image_to_point_cloud(frame, range_images, camera_projections, range_image_top_pose) points_ri2, cp_points_ri2 = frame_utils.convert_range_image_to_point_cloud(frame, range_images, camera_projections, range_image_top_pose, ri_index=1)# 3d points in vehicle frame. points_all = np.concatenate(points, axis=0) points_all_ri2 = np.concatenate(points_ri2, axis=0) # camera projection corresponding to each point. cp_points_all = np.concatenate(cp_points, axis=0) cp_points_all_ri2 = np.concatenate(cp_points_ri2, axis=0)''' print(points_all.shape) print(cp_points_all.shape) print(points_all[0:2]) for i in range(5):print(points[i].shape)print(cp_points[i].shape)print(points_all_ri2.shape) print(cp_points_all_ri2.shape) print(points_all_ri2[0:2]) for i in range(5):print(points_ri2[i].shape)print(cp_points_ri2[i].shape) ''' from IPython.display import Image, display display(Image('/content/waymo-od/tutorial/3d_point_cloud.png'))

2.6 Visualize Camera Projection

點云數據投影：

images = sorted(frame.images, key=lambda i:i.name) cp_points_all_concat = np.concatenate([cp_points_all, points_all], axis=-1) cp_points_all_concat_tensor = tf.constant(cp_points_all_concat)# The distance between lidar points and vehicle frame origin. points_all_tensor = tf.norm(points_all, axis=-1, keepdims=True) cp_points_all_tensor = tf.constant(cp_points_all, dtype=tf.int32)mask = tf.equal(cp_points_all_tensor[..., 0], images[0].name)cp_points_all_tensor = tf.cast(tf.gather_nd(cp_points_all_tensor, tf.where(mask)), dtype=tf.float32) points_all_tensor = tf.gather_nd(points_all_tensor, tf.where(mask))projected_points_all_from_raw_data = tf.concat([cp_points_all_tensor[..., 1:3], points_all_tensor], axis=-1).numpy()def rgba(r):"""Generates a color based on range.Args:r: the range value of a given point.Returns:The color for a given range"""c = plt.get_cmap('jet')((r % 20.0) / 20.0)c = list(c)c[-1] = 0.5 # alphareturn cdef plot_image(camera_image):"""Plot a cmaera image."""plt.figure(figsize=(20, 12))plt.imshow(tf.image.decode_jpeg(camera_image.image))plt.grid("off")def plot_points_on_image(projected_points, camera_image, rgba_func,point_size=5.0):"""Plots points on a camera image.Args:projected_points: [N, 3] numpy array. The inner dims are[camera_x, camera_y, range].camera_image: jpeg encoded camera image.rgba_func: a function that generates a color from a range value.point_size: the point size."""plot_image(camera_image)xs = []ys = []colors = []for point in projected_points:xs.append(point[0]) # width, colys.append(point[1]) # height, rowcolors.append(rgba_func(point[2]))plt.scatter(xs, ys, c=colors, s=point_size, edgecolors="none")plot_points_on_image(projected_points_all_from_raw_data,images[0], rgba, point_size=5.0)

總結

以上是生活随笔為你收集整理的Waymo自动驾驶数据集介绍与使用教程的全部內容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網站內容還不錯，歡迎將生活随笔推薦給好友。

上一篇： kencron.exe是什么进程 ken
下一篇： java hive demo_java