Implementing a 3D Object Detection Algorithm from Scratch (3): PointPillars Backbone Implementation (work in progress)
In the previous article, "Implementing a 3D Object Detection Algorithm from Scratch (2): Point Cloud Data Preprocessing", we finished preprocessing the point cloud data.
Starting with this article, we begin the actual implementation of the PointPillars network, following the network structure described in the first article of this series.
Table of Contents
- 1. PyTorch Basic Modules
- 1.1 Empty Module
- 1.2 Sequential Module
- 2. Pillar Feature Net Implementation
- 2.1 VFE Module
- 2.2 Pillar Scatter Module
- 3. Summary
1. PyTorch Basic Modules
In engineering practice, to make the network easier to build and modify, two basic modules are usually implemented on top of PyTorch: an empty layer module (Empty) and a sequential container module (Sequential). Both live in pytorch_utils.py.
1.1 Empty Module
As the name suggests, Empty is a layer that does nothing. Here it only serves to keep the network definition complete (we will see where it is used later) and does not take part in any computation, although it could be modified to do so. Its construction is simple: we create a class named Empty that inherits from nn.Module:
```python
import torch
import torch.nn as nn
import sys
from collections import OrderedDict


class Empty(torch.nn.Module):
    def __init__(self, *args, **kwargs):
        super(Empty, self).__init__()

    def forward(self, *args, **kwargs):
        if len(args) == 1:
            return args[0]
        elif len(args) == 0:
            return None
        return args
```
In this code, `*args` packs the positional arguments into a tuple for the function body, while `**kwargs` packs the keyword arguments into a dict. When defining a function, the positions of the three kinds of parameters are fixed: the order must be (arg, *args, **kwargs), otherwise Python raises an error. You can run the following code to see the output:
```python
def function(arg, *args, **kwargs):
    print(arg, args, kwargs)

function(6, 7, 8, 9, a=1, b=2, c=3)
# prints: 6 (7, 8, 9) {'a': 1, 'b': 2, 'c': 3}
```
1.2 Sequential Module
PyTorch already ships with such a container. The reason we reimplement it here is that, in the configuration files, all network hyperparameters are given as dictionaries; rewriting Sequential makes it convenient to load those hyperparameters later, so that modifying the network model only requires changing entries in the dictionary. We create a class named Sequential:
```python
class Sequential(torch.nn.Module):
    """A sequential container.

    Modules will be added to it in the order they are passed in the constructor.
    Alternatively, an ordered dict of modules can also be passed in.
    """

    def __init__(self, *args, **kwargs):
        super(Sequential, self).__init__()
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            for key, module in args[0].items():
                self.add_module(key, module)
        else:
            for idx, module in enumerate(args):
                self.add_module(str(idx), module)
        for name, module in kwargs.items():
            if sys.version_info < (3, 6):
                raise ValueError("kwargs only supported in py36+")
            if name in self._modules:
                raise ValueError("name exists.")
            self.add_module(name, module)

    def __getitem__(self, idx):
        if not (-len(self) <= idx and idx < len(self)):
            raise IndexError('index {} is out of range'.format(idx))
        if idx < 0:
            idx += len(self)
        it = iter(self._modules.values())
        for i in range(idx):
            next(it)
        return next(it)

    def __len__(self):
        return len(self._modules)

    def add(self, module, name=None):
        if name is None:
            name = str(len(self._modules))
            if name in self._modules:
                raise KeyError("name exists")
        self.add_module(name, module)

    def forward(self, input):
        for module in self._modules.values():
            input = module(input)
        return input
```
Below are three equivalent ways of constructing a network with the Sequential class; you can run them and check that the results match:
```python
model = Sequential(nn.Conv2d(1, 20, 5), nn.ReLU(),
                   nn.Conv2d(20, 64, 5), nn.ReLU())

model = Sequential(OrderedDict([('conv1', nn.Conv2d(1, 20, 5)), ('relu1', nn.ReLU()),
                                ('conv2', nn.Conv2d(20, 64, 5)), ('relu2', nn.ReLU())]))

model = Sequential(conv1=nn.Conv2d(1, 20, 5), relu1=nn.ReLU(),
                   conv2=nn.Conv2d(20, 64, 5), relu2=nn.ReLU())
```
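In addition to the three constructor forms above, this Sequential also provides an add method for appending modules after construction, which is convenient when the layer list comes from a config dictionary. A small sketch of how it behaves (the layers below are chosen arbitrarily and reuse the class defined above):

```python
# Sketch: appending modules after construction with the custom `add` method.
model = Sequential(nn.Conv2d(1, 20, 5), nn.ReLU())
model.add(nn.Conv2d(20, 64, 5), name='conv2')   # explicitly named module
model.add(nn.ReLU())                            # auto-named '3' (current container length)
print(len(model))                               # 4
print(model[2])                                 # Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
```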
2. Pillar Feature Net Implementation
Now we start implementing the first part of the PointPillars network, the Pillar Feature Net. This part generates the pseudo image and consists of two modules, the VFE module and the Pillar Scatter module, implemented in vfe_utils.py.
2.1 VFE Module
The VFE module partitions the scattered, unordered point cloud into pillars and then learns features on them, as shown in the figure below.
First, we import the required packages, including PyTorch and the Empty class we wrote in the previous section.
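The import block itself is not reproduced in the post; a plausible header for vfe_utils.py, assuming Empty lives in the pytorch_utils.py file from section 1, would look like this:

```python
# Assumed imports for vfe_utils.py (module/file names follow section 1; adjust to your layout):
import torch
import torch.nn as nn
import torch.nn.functional as F   # F.relu is used inside PFNLayer below

from pytorch_utils import Empty   # the do-nothing layer from section 1.1
```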
Next, we define a VoxelFeatureExtractor base class, which does not perform any operation by itself:
```python
class VoxelFeatureExtractor(nn.Module):
    def __init__(self, **kwargs):
        super().__init__()

    def get_output_feature_dim(self):
        raise NotImplementedError

    def forward(self, **kwargs):
        raise NotImplementedError
```
Next, we define the get_paddings_indicator function, which builds a boolean mask telling which entries of each padded pillar correspond to real points:
```python
def get_paddings_indicator(actual_num, max_num, axis=0):
    """Create a boolean mask from the actual number of points in a padded tensor.

    Args:
        actual_num: number of valid points in each pillar (voxel).
        max_num: maximum number of points a pillar can hold.

    Returns:
        Boolean mask of shape [batch_size, max_num]; True where a point is real.
    """
    actual_num = torch.unsqueeze(actual_num, axis + 1)
    print('actual_num shape is: ', actual_num.shape)
    # tiled_actual_num: [N, M, 1]
    max_num_shape = [1] * len(actual_num.shape)
    max_num_shape[axis + 1] = -1
    max_num = torch.arange(max_num, dtype=torch.int, device=actual_num.device).view(max_num_shape)
    # tiled_actual_num: [[3,3,3,3,3], [4,4,4,4,4], [2,2,2,2,2]]
    # tiled_max_num: [[0,1,2,3,4], [0,1,2,3,4], [0,1,2,3,4]]
    paddings_indicator = actual_num.int() > max_num
    # paddings_indicator shape: [batch_size, max_num]
    return paddings_indicator
```
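To make the mask concrete, here is a tiny sketch with made-up pillar sizes: three pillars holding 3, 4 and 2 valid points each, padded to 5 slots.

```python
# Toy illustration (values invented): which slots of each padded pillar hold real points?
actual_num = torch.tensor([3, 4, 2])
mask = get_paddings_indicator(actual_num, max_num=5)
print(mask)
# tensor([[ True,  True,  True, False, False],
#         [ True,  True,  True,  True, False],
#         [ True,  True, False, False, False]])
```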
Next, we define a PFNLayer class. This is a simplified PointNet layer with 10 input features and 64 output features; as in the paper, it is a linear network with only one layer. The code is as follows:
```python
class PFNLayer(nn.Module):
    def __init__(self, in_channels, out_channels, use_norm=True, last_layer=False):
        """Pillar Feature Net Layer.

        The Pillar Feature Net could be composed of a series of these layers, but the
        PointPillars paper results only used a single PFNLayer.

        :param in_channels: <int>. Number of input channels.
        :param out_channels: <int>. Number of output channels.
        :param use_norm: <bool>. Whether to include BatchNorm.
        :param last_layer: <bool>. If last_layer, there is no concatenation of features.
        """
        super().__init__()
        self.name = 'PFNLayer'
        self.last_vfe = last_layer
        if not self.last_vfe:
            out_channels = out_channels // 2
        self.units = out_channels
        if use_norm:
            self.linear = nn.Linear(in_channels, self.units, bias=False)
            self.norm = nn.BatchNorm1d(self.units, eps=1e-3, momentum=0.01)
        else:
            self.linear = nn.Linear(in_channels, self.units, bias=True)
            self.norm = Empty(self.units)

    def forward(self, inputs):
        x = self.linear(inputs)
        total_points, voxel_points, channels = x.shape
        x = self.norm(x.view(-1, channels)).view(total_points, voxel_points, channels)
        x = F.relu(x)
        # Max-pool over the points inside each pillar
        x_max = torch.max(x, dim=1, keepdim=True)[0]
        if self.last_vfe:
            return x_max
        else:
            # Intermediate layers concatenate the pooled feature back onto every point
            x_repeat = x_max.repeat(1, inputs.shape[1], 1)
            x_concatenated = torch.cat([x, x_repeat], dim=2)
            return x_concatenated
```
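As a quick shape check, here is a rough sketch of what flows through a single PFNLayer (the pillar count and the dummy input are invented for illustration):

```python
# Rough shape check with dummy data (6000 pillars, up to 100 points each, 10 decorated features):
pfn = PFNLayer(in_channels=10, out_channels=64, use_norm=True, last_layer=True)
dummy = torch.rand(6000, 100, 10)
out = pfn(dummy)
print(out.shape)   # torch.Size([6000, 1, 64]) -- one 64-dim feature per pillar after max-pooling
```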
Next we implement the PillarFeatureNetOld2 class. Its job is to generate the pillars and to expand the original 4-dimensional point features (x, y, z, r) into 10-dimensional decorated features (x, y, z, r, x_c, y_c, z_c, x_p, y_p, z_p), where the c subscripts denote offsets from the arithmetic mean of the points in a pillar and the p subscripts denote offsets from the pillar center. The code is as follows:
```python
class PillarFeatureNetOld2(VoxelFeatureExtractor):
    def __init__(self, num_input_features=4, use_norm=True, num_filters=(64, ), with_distance=False,
                 voxel_size=(0.2, 0.2, 4), pc_range=(0, -40, -3, 70.4, 40, 1)):
        """Pillar Feature Net.

        The network prepares the pillar features and performs forward pass through PFNLayers.

        :param num_input_features: <int>. Number of input features, either x, y, z or x, y, z, r.
        :param use_norm: <bool>. Whether to include BatchNorm.
        :param num_filters: (<int>: N). Number of features in each of the N PFNLayers.
        :param with_distance: <bool>. Whether to include Euclidean distance to points.
        :param voxel_size: (<float>: 3). Size of voxels, only utilize x and y size.
        :param pc_range: (<float>: 6). Point cloud range, only utilize x and y min.
        """
        super().__init__()
        self.name = 'PillarFeatureNetOld2'
        assert len(num_filters) > 0
        num_input_features += 6
        if with_distance:
            num_input_features += 1
        self.with_distance = with_distance
        self.num_filters = num_filters

        # Create PillarFeatureNet layers
        num_filters = [num_input_features] + list(num_filters)
        pfn_layers = []
        for i in range(len(num_filters) - 1):
            in_filters = num_filters[i]
            out_filters = num_filters[i + 1]
            if i < len(num_filters) - 2:
                last_layer = False
            else:
                last_layer = True
            pfn_layers.append(PFNLayer(in_filters, out_filters, use_norm, last_layer=last_layer))
        self.pfn_layers = nn.ModuleList(pfn_layers)

        # Need pillar (voxel) size and x/y offset in order to calculate pillar offset
        self.vx = voxel_size[0]
        self.vy = voxel_size[1]
        self.vz = voxel_size[2]
        self.x_offset = self.vx / 2 + pc_range[0]
        self.y_offset = self.vy / 2 + pc_range[1]
        self.z_offset = self.vz / 2 + pc_range[2]

    def get_output_feature_dim(self):
        return self.num_filters[-1]  # 64

    def forward(self, features, num_voxels, coords):
        """
        :param features: (N, max_points_of_each_voxel, 3 + C)
        :param num_voxels: (N)
        :param coords: (N, 4), each row is (batch_idx, z, y, x)
        :return:
        """
        dtype = features.dtype
        # Find distance of x, y, and z from cluster center (x, y, z mean)
        points_mean = features[:, :, :3].sum(dim=1, keepdim=True) / num_voxels.type_as(features).view(-1, 1, 1)
        print('points_mean shape is: ', points_mean.shape)
        f_cluster = features[:, :, :3] - points_mean

        # Find distance of x, y, and z from pillar center
        f_center = torch.zeros_like(features[:, :, :3])
        f_center[:, :, 0] = features[:, :, 0] - (coords[:, 3].to(dtype).unsqueeze(1) * self.vx + self.x_offset)
        f_center[:, :, 1] = features[:, :, 1] - (coords[:, 2].to(dtype).unsqueeze(1) * self.vy + self.y_offset)
        f_center[:, :, 2] = features[:, :, 2] - (coords[:, 1].to(dtype).unsqueeze(1) * self.vz + self.z_offset)
        print('f_center shape is: ', f_center.shape)

        # Combine together feature decorations
        features_ls = [features, f_cluster, f_center]
        if self.with_distance:  # False
            points_dist = torch.norm(features[:, :, :3], 2, 2, keepdim=True)
            features_ls.append(points_dist)
        features = torch.cat(features_ls, dim=-1)

        # The feature decorations were calculated without regard to whether pillar was empty.
        # Need to ensure that empty pillars remain set to zeros.
        voxel_count = features.shape[1]
        mask = get_paddings_indicator(num_voxels, voxel_count, axis=0)
        mask = torch.unsqueeze(mask, -1).type_as(features)
        features *= mask
        print('features shape is: ', features.shape)

        # Forward pass through PFNLayers
        for pfn in self.pfn_layers:
            features = pfn(features)
        return features.squeeze()
```
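For reference, a rough end-to-end sketch of this feature net on dummy voxelized input (all shapes and values here are invented; in practice they come from the point cloud preprocessing of the previous article):

```python
# Dummy run of the Pillar Feature Net (shapes invented; default voxel_size / pc_range used):
pfn_net = PillarFeatureNetOld2(num_input_features=4)
features = torch.rand(6000, 100, 4)             # (num_pillars, max_points_per_pillar, x y z r)
num_voxels = torch.randint(1, 100, (6000,))     # valid points per pillar
coords = torch.randint(0, 400, (6000, 4))       # (batch_idx, z, y, x) grid indices
out = pfn_net(features, num_voxels, coords)
print(out.shape)                                # torch.Size([6000, 64]) -- one feature vector per pillar
```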
2.2 Pillar Scatter Module
This module scatters the learned pillar features back onto a canvas to generate the pseudo image, whose dimensions are (1, 64, 496, 432). The code lives in pillar_scatter.py:
```python
import torch
import torch.nn as nn


class PointPillarsScatter(nn.Module):
    def __init__(self, input_channels=64, **kwargs):
        """Point Pillar's Scatter.

        Converts learned features from dense tensor to sparse pseudo image.

        :param output_shape: ([int]: 4). Required output shape of features.
        :param num_input_features: <int>. Number of input features.
        """
        super().__init__()
        self.nchannels = input_channels

    def forward(self, voxel_features, coords, batch_size, **kwargs):
        output_shape = kwargs['output_shape']
        nz, ny, nx = output_shape

        # batch_canvas will be the final output.
        batch_canvas = []
        for batch_itt in range(batch_size):
            # Create the canvas for this sample
            canvas = torch.zeros(self.nchannels, nz * nx * ny, dtype=voxel_features.dtype,
                                 device=voxel_features.device)

            # Only include non-empty pillars
            batch_mask = coords[:, 0] == batch_itt
            this_coords = coords[batch_mask, :]
            indices = this_coords[:, 1].type(torch.long) * nz + this_coords[:, 2].type(torch.long) * nx + \
                this_coords[:, 3].type(torch.long)
            indices = indices.type(torch.long)
            voxels = voxel_features[batch_mask, :]
            voxels = voxels.t()

            # Now scatter the blob back to the canvas.
            canvas[:, indices] = voxels

            # Append to a list for later stacking.
            batch_canvas.append(canvas)

        # Stack to 3-dim tensor (batch-size, nchannels, nrows*ncols)
        batch_canvas = torch.stack(batch_canvas, 0)

        # Undo the column stacking to final 4-dim tensor
        batch_canvas = batch_canvas.view(batch_size, self.nchannels * nz, ny, nx)
        return batch_canvas
```
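To tie the two modules together, here is a minimal sketch of running the scatter module on dummy pillar features (all sizes are invented; output_shape is passed as a keyword argument, as expected by the forward above):

```python
# Dummy scatter run producing the (1, 64, 496, 432) pseudo image (inputs invented):
scatter = PointPillarsScatter(input_channels=64)
voxel_features = torch.rand(6000, 64)              # one 64-dim feature per non-empty pillar
coords = torch.zeros(6000, 4, dtype=torch.long)    # (batch_idx, z, y, x); batch_idx = 0, z = 0
coords[:, 2] = torch.randint(0, 496, (6000,))      # y index on the BEV grid
coords[:, 3] = torch.randint(0, 432, (6000,))      # x index on the BEV grid
canvas = scatter(voxel_features, coords, batch_size=1, output_shape=(1, 496, 432))
print(canvas.shape)                                # torch.Size([1, 64, 496, 432])
```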
At this point, we have implemented the feature network part of PointPillars and generated the pseudo image.
3. Summary