Implementing a 3D Object Detection Algorithm from Scratch (3): PointPillars Backbone Implementation (work in progress)
In the previous article, "Implementing a 3D Object Detection Algorithm from Scratch (2): Point Cloud Data Preprocessing", we finished preprocessing the point cloud data.
Starting with this article, we begin the actual implementation of the PointPillars network, following the network structure described in the first article of this series.
Table of Contents
- 1. PyTorch Basic Modules
- 1.1 Empty Module
- 1.2 Sequential Module
- 2. Pillar Feature Net Implementation
- 2.1 VFE Module
- 2.2 Pillar Scatter Module
- 3. Summary
1. PyTorch Basic Modules
In engineering practice, to make the network easier to build and modify, two basic modules are usually implemented on top of PyTorch: an empty layer module (Empty) and a sequential container module (Sequential). Both live in pytorch_utils.py.
1.1 Empty Module
As the name suggests, Empty is a layer that does nothing. Here it only serves to keep the network definition complete (we will see where it is used later) and does not take part in any computation, although it could be modified to do so. Its construction is simple: we create a class named Empty that inherits from nn.Module:
```python
import torch
import torch.nn as nn
import sys
from collections import OrderedDict


class Empty(torch.nn.Module):
    def __init__(self, *args, **kwargs):
        super(Empty, self).__init__()

    def forward(self, *args, **kwargs):
        if len(args) == 1:
            return args[0]
        elif len(args) == 0:
            return None
        return args
```
In this code, `*args` packs the positional arguments into a tuple for the function body, while `**kwargs` packs the keyword arguments into a dict. When defining a function, the positions of the three kinds of parameters are fixed: the order must be (arg, *args, **kwargs), otherwise Python raises an error. You can run the following code to see the output:
```python
def function(arg, *args, **kwargs):
    print(arg, args, kwargs)

function(6, 7, 8, 9, a=1, b=2, c=3)
# prints: 6 (7, 8, 9) {'a': 1, 'b': 2, 'c': 3}
```
1.2 Sequential Module
PyTorch already ships with such a container. The reason we reimplement it here is that, in the configuration files, all network hyperparameters are given as dictionaries; rewriting Sequential makes it convenient to load those hyperparameters later, so that modifying the network model only requires changing entries in the dictionary. We create a class named Sequential:
```python
class Sequential(torch.nn.Module):
    """A sequential container.

    Modules will be added to it in the order they are passed in the constructor.
    Alternatively, an ordered dict of modules can also be passed in.
    """

    def __init__(self, *args, **kwargs):
        super(Sequential, self).__init__()
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            for key, module in args[0].items():
                self.add_module(key, module)
        else:
            for idx, module in enumerate(args):
                self.add_module(str(idx), module)
        for name, module in kwargs.items():
            if sys.version_info < (3, 6):
                raise ValueError("kwargs only supported in py36+")
            if name in self._modules:
                raise ValueError("name exists.")
            self.add_module(name, module)

    def __getitem__(self, idx):
        if not (-len(self) <= idx and idx < len(self)):
            raise IndexError('index {} is out of range'.format(idx))
        if idx < 0:
            idx += len(self)
        it = iter(self._modules.values())
        for i in range(idx):
            next(it)
        return next(it)

    def __len__(self):
        return len(self._modules)

    def add(self, module, name=None):
        if name is None:
            name = str(len(self._modules))
            if name in self._modules:
                raise KeyError("name exists")
        self.add_module(name, module)

    def forward(self, input):
        for module in self._modules.values():
            input = module(input)
        return input
```
Below are three equivalent ways of constructing a network with the Sequential class; you can run them and check that the results match:
```python
model = Sequential(nn.Conv2d(1, 20, 5), nn.ReLU(),
                   nn.Conv2d(20, 64, 5), nn.ReLU())

model = Sequential(OrderedDict([('conv1', nn.Conv2d(1, 20, 5)), ('relu1', nn.ReLU()),
                                ('conv2', nn.Conv2d(20, 64, 5)), ('relu2', nn.ReLU())]))

model = Sequential(conv1=nn.Conv2d(1, 20, 5), relu1=nn.ReLU(),
                   conv2=nn.Conv2d(20, 64, 5), relu2=nn.ReLU())
```
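In addition to the three constructor forms above, this Sequential also provides an add method for appending modules after construction, which is convenient when the layer list comes from a config dictionary. A small sketch of how it behaves (the layers below are chosen arbitrarily and reuse the class defined above):

```python
# Sketch: appending modules after construction with the custom `add` method.
model = Sequential(nn.Conv2d(1, 20, 5), nn.ReLU())
model.add(nn.Conv2d(20, 64, 5), name='conv2')   # explicitly named module
model.add(nn.ReLU())                            # auto-named '3' (current container length)
print(len(model))                               # 4
print(model[2])                                 # Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
```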
2. Pillar Feature Net Implementation
Now we start implementing the first part of the PointPillars network, the Pillar Feature Net. This part generates the pseudo image and consists of two modules, the VFE module and the Pillar Scatter module, implemented in vfe_utils.py.
2.1 VFE Module
The VFE module partitions the scattered, unordered point cloud into pillars and then learns features on them, as shown in the figure below.
First, we import the required packages, including PyTorch and the Empty class we wrote in the previous section.
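The import block itself is not reproduced in the post; a plausible header for vfe_utils.py, assuming Empty lives in the pytorch_utils.py file from section 1, would look like this:

```python
# Assumed imports for vfe_utils.py (module/file names follow section 1; adjust to your layout):
import torch
import torch.nn as nn
import torch.nn.functional as F   # F.relu is used inside PFNLayer below

from pytorch_utils import Empty   # the do-nothing layer from section 1.1
```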
Next, we define a VoxelFeatureExtractor base class, which does not perform any operation by itself:
```python
class VoxelFeatureExtractor(nn.Module):
    def __init__(self, **kwargs):
        super().__init__()

    def get_output_feature_dim(self):
        raise NotImplementedError

    def forward(self, **kwargs):
        raise NotImplementedError
```
Next, we define the get_paddings_indicator function, which builds a boolean mask telling which entries of each padded pillar correspond to real points:
```python
def get_paddings_indicator(actual_num, max_num, axis=0):
    """Create a boolean mask from the actual number of points in a padded tensor.

    Args:
        actual_num: number of valid points in each pillar (voxel).
        max_num: maximum number of points a pillar can hold.

    Returns:
        Boolean mask of shape [batch_size, max_num]; True where a point is real.
    """
    actual_num = torch.unsqueeze(actual_num, axis + 1)
    print('actual_num shape is: ', actual_num.shape)
    # tiled_actual_num: [N, M, 1]
    max_num_shape = [1] * len(actual_num.shape)
    max_num_shape[axis + 1] = -1
    max_num = torch.arange(max_num, dtype=torch.int, device=actual_num.device).view(max_num_shape)
    # tiled_actual_num: [[3,3,3,3,3], [4,4,4,4,4], [2,2,2,2,2]]
    # tiled_max_num: [[0,1,2,3,4], [0,1,2,3,4], [0,1,2,3,4]]
    paddings_indicator = actual_num.int() > max_num
    # paddings_indicator shape: [batch_size, max_num]
    return paddings_indicator
```
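To make the mask concrete, here is a tiny sketch with made-up pillar sizes: three pillars holding 3, 4 and 2 valid points each, padded to 5 slots.

```python
# Toy illustration (values invented): which slots of each padded pillar hold real points?
actual_num = torch.tensor([3, 4, 2])
mask = get_paddings_indicator(actual_num, max_num=5)
print(mask)
# tensor([[ True,  True,  True, False, False],
#         [ True,  True,  True,  True, False],
#         [ True,  True, False, False, False]])
```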
Next, we define a PFNLayer class. This is a simplified PointNet layer with 10 input features and 64 output features; as in the paper, it is a linear network with only one layer. The code is as follows:
```python
class PFNLayer(nn.Module):
    def __init__(self, in_channels, out_channels, use_norm=True, last_layer=False):
        """Pillar Feature Net Layer.

        The Pillar Feature Net could be composed of a series of these layers, but the
        PointPillars paper results only used a single PFNLayer.

        :param in_channels: <int>. Number of input channels.
        :param out_channels: <int>. Number of output channels.
        :param use_norm: <bool>. Whether to include BatchNorm.
        :param last_layer: <bool>. If last_layer, there is no concatenation of features.
        """
        super().__init__()
        self.name = 'PFNLayer'
        self.last_vfe = last_layer
        if not self.last_vfe:
            out_channels = out_channels // 2
        self.units = out_channels
        if use_norm:
            self.linear = nn.Linear(in_channels, self.units, bias=False)
            self.norm = nn.BatchNorm1d(self.units, eps=1e-3, momentum=0.01)
        else:
            self.linear = nn.Linear(in_channels, self.units, bias=True)
            self.norm = Empty(self.units)

    def forward(self, inputs):
        x = self.linear(inputs)
        total_points, voxel_points, channels = x.shape
        x = self.norm(x.view(-1, channels)).view(total_points, voxel_points, channels)
        x = F.relu(x)
        # Max-pool over the points inside each pillar
        x_max = torch.max(x, dim=1, keepdim=True)[0]
        if self.last_vfe:
            return x_max
        else:
            # Intermediate layers concatenate the pooled feature back onto every point
            x_repeat = x_max.repeat(1, inputs.shape[1], 1)
            x_concatenated = torch.cat([x, x_repeat], dim=2)
            return x_concatenated
```
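As a quick shape check, here is a rough sketch of what flows through a single PFNLayer (the pillar count and the dummy input are invented for illustration):

```python
# Rough shape check with dummy data (6000 pillars, up to 100 points each, 10 decorated features):
pfn = PFNLayer(in_channels=10, out_channels=64, use_norm=True, last_layer=True)
dummy = torch.rand(6000, 100, 10)
out = pfn(dummy)
print(out.shape)   # torch.Size([6000, 1, 64]) -- one 64-dim feature per pillar after max-pooling
```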
Next we implement the PillarFeatureNetOld2 class. Its job is to generate the pillars and to expand the original 4-dimensional point features (x, y, z, r) into 10-dimensional decorated features (x, y, z, r, x_c, y_c, z_c, x_p, y_p, z_p), where the c subscripts denote offsets from the arithmetic mean of the points in a pillar and the p subscripts denote offsets from the pillar center. The code is as follows:
```python
class PillarFeatureNetOld2(VoxelFeatureExtractor):
    def __init__(self, num_input_features=4, use_norm=True, num_filters=(64, ), with_distance=False,
                 voxel_size=(0.2, 0.2, 4), pc_range=(0, -40, -3, 70.4, 40, 1)):
        """Pillar Feature Net.

        The network prepares the pillar features and performs forward pass through PFNLayers.

        :param num_input_features: <int>. Number of input features, either x, y, z or x, y, z, r.
        :param use_norm: <bool>. Whether to include BatchNorm.
        :param num_filters: (<int>: N). Number of features in each of the N PFNLayers.
        :param with_distance: <bool>. Whether to include Euclidean distance to points.
        :param voxel_size: (<float>: 3). Size of voxels, only utilize x and y size.
        :param pc_range: (<float>: 6). Point cloud range, only utilize x and y min.
        """
        super().__init__()
        self.name = 'PillarFeatureNetOld2'
        assert len(num_filters) > 0
        num_input_features += 6
        if with_distance:
            num_input_features += 1
        self.with_distance = with_distance
        self.num_filters = num_filters

        # Create PillarFeatureNet layers
        num_filters = [num_input_features] + list(num_filters)
        pfn_layers = []
        for i in range(len(num_filters) - 1):
            in_filters = num_filters[i]
            out_filters = num_filters[i + 1]
            if i < len(num_filters) - 2:
                last_layer = False
            else:
                last_layer = True
            pfn_layers.append(PFNLayer(in_filters, out_filters, use_norm, last_layer=last_layer))
        self.pfn_layers = nn.ModuleList(pfn_layers)

        # Need pillar (voxel) size and x/y offset in order to calculate pillar offset
        self.vx = voxel_size[0]
        self.vy = voxel_size[1]
        self.vz = voxel_size[2]
        self.x_offset = self.vx / 2 + pc_range[0]
        self.y_offset = self.vy / 2 + pc_range[1]
        self.z_offset = self.vz / 2 + pc_range[2]

    def get_output_feature_dim(self):
        return self.num_filters[-1]  # 64

    def forward(self, features, num_voxels, coords):
        """
        :param features: (N, max_points_of_each_voxel, 3 + C)
        :param num_voxels: (N)
        :param coords: (N, 4), each row is (batch_idx, z, y, x)
        :return:
        """
        dtype = features.dtype
        # Find distance of x, y, and z from cluster center (x, y, z mean)
        points_mean = features[:, :, :3].sum(dim=1, keepdim=True) / num_voxels.type_as(features).view(-1, 1, 1)
        print('points_mean shape is: ', points_mean.shape)
        f_cluster = features[:, :, :3] - points_mean

        # Find distance of x, y, and z from pillar center
        f_center = torch.zeros_like(features[:, :, :3])
        f_center[:, :, 0] = features[:, :, 0] - (coords[:, 3].to(dtype).unsqueeze(1) * self.vx + self.x_offset)
        f_center[:, :, 1] = features[:, :, 1] - (coords[:, 2].to(dtype).unsqueeze(1) * self.vy + self.y_offset)
        f_center[:, :, 2] = features[:, :, 2] - (coords[:, 1].to(dtype).unsqueeze(1) * self.vz + self.z_offset)
        print('f_center shape is: ', f_center.shape)

        # Combine together feature decorations
        features_ls = [features, f_cluster, f_center]
        if self.with_distance:  # False
            points_dist = torch.norm(features[:, :, :3], 2, 2, keepdim=True)
            features_ls.append(points_dist)
        features = torch.cat(features_ls, dim=-1)

        # The feature decorations were calculated without regard to whether pillar was empty.
        # Need to ensure that empty pillars remain set to zeros.
        voxel_count = features.shape[1]
        mask = get_paddings_indicator(num_voxels, voxel_count, axis=0)
        mask = torch.unsqueeze(mask, -1).type_as(features)
        features *= mask
        print('features shape is: ', features.shape)

        # Forward pass through PFNLayers
        for pfn in self.pfn_layers:
            features = pfn(features)
        return features.squeeze()
```
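For reference, a rough end-to-end sketch of this feature net on dummy voxelized input (all shapes and values here are invented; in practice they come from the point cloud preprocessing of the previous article):

```python
# Dummy run of the Pillar Feature Net (shapes invented; default voxel_size / pc_range used):
pfn_net = PillarFeatureNetOld2(num_input_features=4)
features = torch.rand(6000, 100, 4)             # (num_pillars, max_points_per_pillar, x y z r)
num_voxels = torch.randint(1, 100, (6000,))     # valid points per pillar
coords = torch.randint(0, 400, (6000, 4))       # (batch_idx, z, y, x) grid indices
out = pfn_net(features, num_voxels, coords)
print(out.shape)                                # torch.Size([6000, 64]) -- one feature vector per pillar
```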
2.2 Pillar Scatter Module
This module scatters the learned pillar features back onto a canvas to generate the pseudo image, whose dimensions are (1, 64, 496, 432). The code lives in pillar_scatter.py:
```python
import torch
import torch.nn as nn


class PointPillarsScatter(nn.Module):
    def __init__(self, input_channels=64, **kwargs):
        """Point Pillar's Scatter.

        Converts learned features from dense tensor to sparse pseudo image.

        :param output_shape: ([int]: 4). Required output shape of features.
        :param num_input_features: <int>. Number of input features.
        """
        super().__init__()
        self.nchannels = input_channels

    def forward(self, voxel_features, coords, batch_size, **kwargs):
        output_shape = kwargs['output_shape']
        nz, ny, nx = output_shape

        # batch_canvas will be the final output.
        batch_canvas = []
        for batch_itt in range(batch_size):
            # Create the canvas for this sample
            canvas = torch.zeros(self.nchannels, nz * nx * ny, dtype=voxel_features.dtype,
                                 device=voxel_features.device)

            # Only include non-empty pillars
            batch_mask = coords[:, 0] == batch_itt
            this_coords = coords[batch_mask, :]
            indices = this_coords[:, 1].type(torch.long) * nz + this_coords[:, 2].type(torch.long) * nx + \
                this_coords[:, 3].type(torch.long)
            indices = indices.type(torch.long)
            voxels = voxel_features[batch_mask, :]
            voxels = voxels.t()

            # Now scatter the blob back to the canvas.
            canvas[:, indices] = voxels

            # Append to a list for later stacking.
            batch_canvas.append(canvas)

        # Stack to 3-dim tensor (batch-size, nchannels, nrows*ncols)
        batch_canvas = torch.stack(batch_canvas, 0)

        # Undo the column stacking to final 4-dim tensor
        batch_canvas = batch_canvas.view(batch_size, self.nchannels * nz, ny, nx)
        return batch_canvas
```
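To tie the two modules together, here is a minimal sketch of running the scatter module on dummy pillar features (all sizes are invented; output_shape is passed as a keyword argument, as expected by the forward above):

```python
# Dummy scatter run producing the (1, 64, 496, 432) pseudo image (inputs invented):
scatter = PointPillarsScatter(input_channels=64)
voxel_features = torch.rand(6000, 64)              # one 64-dim feature per non-empty pillar
coords = torch.zeros(6000, 4, dtype=torch.long)    # (batch_idx, z, y, x); batch_idx = 0, z = 0
coords[:, 2] = torch.randint(0, 496, (6000,))      # y index on the BEV grid
coords[:, 3] = torch.randint(0, 432, (6000,))      # x index on the BEV grid
canvas = scatter(voxel_features, coords, batch_size=1, output_shape=(1, 496, 432))
print(canvas.shape)                                # torch.Size([1, 64, 496, 432])
```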
At this point, we have implemented the feature network part of PointPillars and generated the pseudo image.
3. Summary