Waymo自动驾驶数据集介绍与使用教程
本文將對Waymo自動駕駛數據集(Waymo Open Dataset)進行介紹。
論文鏈接為:https://arxiv.org/abs/1912.04838v7
項目鏈接為:https://github.com/waymo-research/waymo-open-dataset
數據集鏈接為:https://waymo.com/open
1. 自動駕駛感知可擴展性:Waymo開放數據集
這里首先對論文進行解讀。
1.1 Abstract
整個數據集包含1150個場景,每個場景時長為20秒,且LiDAR和Camera是經過同步和標定處理過的。 對圖像和激光雷達的bounding box進行了仔細的標注,并在各幀之間使用了一致的標識符。
1.2 Introduction
整個數據集包含約1200萬個LiDAR注釋框和約1200萬個圖像注釋框,從而產生了約113k個LiDAR物體軌跡和約25萬個圖像軌跡。整個數據集劃分為1000個訓練集和150個測試集。
1.3 Waymo Open Dataset
- 1)Sensor Specifications
數據收集使用五個LiDAR和五個高分辨率針孔相機完成。 限制了LiDAR數據范圍,并且每個激光脈沖有兩個回波(雙回波)。相機圖像是通過卷簾快門掃描捕獲的,其中確切的掃描模式可能因場景而異。 所有相機圖像都將進行降采樣并從原始圖像中裁剪出來。
下圖和表格是傳感器安裝布置圖和規格介紹。
- 2)Coordinate Systems
整個數據集坐標系統遵守右手法則。全局坐標系為ENU坐標:Up(z)軸與重力方向一直,向上為正;East(x)沿著緯度指向正東,North(y)指向北極。車輛坐標系隨汽車移動,x軸指向前方,y軸指向左側,z軸向上為正。傳感器坐標可以通過旋轉矩陣從車輛坐標獲得,可以把這看作是外參矩陣。
圖像坐標是二維坐標,x軸表示圖像寬度,y軸表示圖像高度,圖像左上角為坐標原點。激光雷達得到的點的坐標可以表示為(x,y,z)(x,y,z)(x,y,z),轉換成距離,方位角,傾角公式為:
range?=x2+y2+z2azimuth?=atan?2(y,x)inclination?=atan?2(z,x2+y2)\begin{aligned}\text { range } &=\sqrt{x^{2}+y^{2}+z^{2}} \\\text { azimuth } &=\operatorname{atan} 2(y, x) \\\text { inclination } &=\operatorname{atan} 2\left(z, \sqrt{x^{2}+y^{2}}\right) \end{aligned}?range??azimuth??inclination??=x2+y2+z2?=atan2(y,x)=atan2(z,x2+y2?)?
- 3)Ground Truth Labels
數據集中對汽車、行人、交通標志、自行車人員進行了詳細標注。對于激光雷達數據,將每一個物體標注為7自由度3D bbox:(cx,cy,cz,l,w,h,θ)(cx,cy,cz,l,w,h,\theta)(cx,cy,cz,l,w,h,θ)。其中cx,cy,czcx,cy,czcx,cy,cz表示為bbox中心坐標,l,w,hl,w,hl,w,h表示為物體長寬高,θ\thetaθ表示為物體偏航角,此外對于每一個物體還標注了一個唯一的追蹤ID編號。
圖像標注中將每一個物體標注為4自由度2D bbox:(cx,cy,l,w)(cx,cy,l,w)(cx,cy,l,w)。其中cx,cycx,cycx,cy表示為bbox中心圖像坐標,lll表示為物體長度,www表示為物體寬度。
此外,還將標注物體劃分為了兩個難度:LEVEL_2為物體對應激光雷達點數少于5個,其余則劃分為LEVEL_1。
- 4)Sensor Data
LiDAR 數據在數據集中被編碼為距離圖像,每個 LiDAR 返回一張; 提供前兩次回波的數據。 距離圖像格式類似于卷簾快門相機圖像,從左到右逐列填充。 每個距離圖像像素對應一個 LiDAR 返回波。 高度和寬度由 LiDAR 傳感器框架中傾角和方位角的分辨率決定。
此外,對于圖像和激光雷達, 投影如下:
- 5)Dataset Analysis
下面兩張表格給出了數據集分布情況,不同城市,不同天氣,不同類別目標的統計情況。
1.4 Tasks
數據集任務劃分為2D和3D物體檢測和追蹤任務,訓練集場景有798個,驗證集場景有202個,測試集場景有150個。
1.5 Experiments
對于3D物體檢測,Waymo提供了一個Baseline,其使用的檢測方法是PointPillars。對于2D物體檢測,使用了Faster R-CNN作為baseline,對于3D物體追蹤,使用了AB3DMOT作為Baseline。
下面幾張表格給出了Baseline結果:行人和汽車檢測和追蹤結果,域適應檢測結果,數據集大小對檢測結果的影響。
2. Waymo Open Dataset Tutorial
下面介紹Waymo官放提供的數據使用教程,以下程序是在Colab上運行的。
2.1 Install waymo_open_dataset package
首先是安裝數據包:
!rm -rf waymo-od > /dev/null !git clone https://github.com/waymo-research/waymo-open-dataset.git waymo-od !cd waymo-od && git branch -a !cd waymo-od && git checkout remotes/origin/master !pip3 install --upgrade pip!pip3 install waymo-open-dataset-tf-2-1-0==1.2.0然后是導入需要用到的庫:
import os import tensorflow.compat.v1 as tf import math import numpy as np import itertoolstf.enable_eager_execution()from waymo_open_dataset.utils import range_image_utils from waymo_open_dataset.utils import transform_utils from waymo_open_dataset.utils import frame_utils from waymo_open_dataset import dataset_pb2 as open_dataset2.2 Read one frame
數據集中的每一個文件包含幀序列數據,按照時間戳排列,這里提取了兩幀來演示:
FILENAME = '/content/waymo-od/tutorial/frames' dataset = tf.data.TFRecordDataset(FILENAME, compression_type='') for data in dataset:frame = open_dataset.Frame()frame.ParseFromString(bytearray(data.numpy()))break(range_images, camera_projections, range_image_top_pose) = frame_utils.parse_range_image_and_camera_projection(frame)可以查看每一幀包含的信息:
print(frame.context)2.3 Visualize Camera Images and Camera Labels
圖片數據及其標簽可視化:
import matplotlib.pyplot as plt import matplotlib.patches as patchesdef show_camera_image(camera_image, camera_labels, layout, cmap=None):"""Show a camera image and the given camera labels."""ax = plt.subplot(*layout)# Draw the camera labels.for camera_labels in frame.camera_labels:# Ignore camera labels that do not correspond to this camera.if camera_labels.name != camera_image.name:continue# Iterate over the individual labels.for label in camera_labels.labels:# Draw the object bounding box.ax.add_patch(patches.Rectangle(xy=(label.box.center_x - 0.5 * label.box.length,label.box.center_y - 0.5 * label.box.width),width=label.box.length,height=label.box.width,linewidth=1,edgecolor='red',facecolor='none'))# Show the camera image.plt.imshow(tf.image.decode_jpeg(camera_image.image), cmap=cmap)plt.title(open_dataset.CameraName.Name.Name(camera_image.name))plt.grid(False)plt.axis('off')plt.figure(figsize=(25, 20))for index, image in enumerate(frame.images):show_camera_image(image, frame.camera_labels, [3, 3, index+1])2.4 Visualize Range Images
Range圖片可視化:
plt.figure(figsize=(64, 20)) def plot_range_image_helper(data, name, layout, vmin = 0, vmax=1, cmap='gray'):"""Plots range image.Args:data: range image dataname: the image titlelayout: plt layoutvmin: minimum value of the passed datavmax: maximum value of the passed datacmap: color map"""plt.subplot(*layout)plt.imshow(data, cmap=cmap, vmin=vmin, vmax=vmax)plt.title(name)plt.grid(False)plt.axis('off')def get_range_image(laser_name, return_index):"""Returns range image given a laser name and its return index."""return range_images[laser_name][return_index]def show_range_image(range_image, layout_index_start = 1):"""Shows range image.Args:range_image: the range image data from a given lidar of type MatrixFloat.layout_index_start: layout offset"""range_image_tensor = tf.convert_to_tensor(range_image.data)range_image_tensor = tf.reshape(range_image_tensor, range_image.shape.dims)lidar_image_mask = tf.greater_equal(range_image_tensor, 0)range_image_tensor = tf.where(lidar_image_mask, range_image_tensor,tf.ones_like(range_image_tensor) * 1e10)range_image_range = range_image_tensor[...,0] range_image_intensity = range_image_tensor[...,1]range_image_elongation = range_image_tensor[...,2]plot_range_image_helper(range_image_range.numpy(), 'range',[8, 1, layout_index_start], vmax=75, cmap='gray')plot_range_image_helper(range_image_intensity.numpy(), 'intensity',[8, 1, layout_index_start + 1], vmax=1.5, cmap='gray')plot_range_image_helper(range_image_elongation.numpy(), 'elongation',[8, 1, layout_index_start + 2], vmax=1.5, cmap='gray') frame.lasers.sort(key=lambda laser: laser.name) show_range_image(get_range_image(open_dataset.LaserName.TOP, 0), 1) show_range_image(get_range_image(open_dataset.LaserName.TOP, 1), 4)2.5 Point Cloud Conversion and Visualization
點云轉換和可視化:
points, cp_points = frame_utils.convert_range_image_to_point_cloud(frame, range_images, camera_projections, range_image_top_pose) points_ri2, cp_points_ri2 = frame_utils.convert_range_image_to_point_cloud(frame, range_images, camera_projections, range_image_top_pose, ri_index=1)# 3d points in vehicle frame. points_all = np.concatenate(points, axis=0) points_all_ri2 = np.concatenate(points_ri2, axis=0) # camera projection corresponding to each point. cp_points_all = np.concatenate(cp_points, axis=0) cp_points_all_ri2 = np.concatenate(cp_points_ri2, axis=0)''' print(points_all.shape) print(cp_points_all.shape) print(points_all[0:2]) for i in range(5):print(points[i].shape)print(cp_points[i].shape)print(points_all_ri2.shape) print(cp_points_all_ri2.shape) print(points_all_ri2[0:2]) for i in range(5):print(points_ri2[i].shape)print(cp_points_ri2[i].shape) ''' from IPython.display import Image, display display(Image('/content/waymo-od/tutorial/3d_point_cloud.png'))2.6 Visualize Camera Projection
點云數據投影:
images = sorted(frame.images, key=lambda i:i.name) cp_points_all_concat = np.concatenate([cp_points_all, points_all], axis=-1) cp_points_all_concat_tensor = tf.constant(cp_points_all_concat)# The distance between lidar points and vehicle frame origin. points_all_tensor = tf.norm(points_all, axis=-1, keepdims=True) cp_points_all_tensor = tf.constant(cp_points_all, dtype=tf.int32)mask = tf.equal(cp_points_all_tensor[..., 0], images[0].name)cp_points_all_tensor = tf.cast(tf.gather_nd(cp_points_all_tensor, tf.where(mask)), dtype=tf.float32) points_all_tensor = tf.gather_nd(points_all_tensor, tf.where(mask))projected_points_all_from_raw_data = tf.concat([cp_points_all_tensor[..., 1:3], points_all_tensor], axis=-1).numpy()def rgba(r):"""Generates a color based on range.Args:r: the range value of a given point.Returns:The color for a given range"""c = plt.get_cmap('jet')((r % 20.0) / 20.0)c = list(c)c[-1] = 0.5 # alphareturn cdef plot_image(camera_image):"""Plot a cmaera image."""plt.figure(figsize=(20, 12))plt.imshow(tf.image.decode_jpeg(camera_image.image))plt.grid("off")def plot_points_on_image(projected_points, camera_image, rgba_func,point_size=5.0):"""Plots points on a camera image.Args:projected_points: [N, 3] numpy array. The inner dims are[camera_x, camera_y, range].camera_image: jpeg encoded camera image.rgba_func: a function that generates a color from a range value.point_size: the point size."""plot_image(camera_image)xs = []ys = []colors = []for point in projected_points:xs.append(point[0]) # width, colys.append(point[1]) # height, rowcolors.append(rgba_func(point[2]))plt.scatter(xs, ys, c=colors, s=point_size, edgecolors="none")plot_points_on_image(projected_points_all_from_raw_data,images[0], rgba, point_size=5.0)總結
以上是生活随笔為你收集整理的Waymo自动驾驶数据集介绍与使用教程的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: kencron.exe是什么进程 ken
- 下一篇: java hive demo_java