End-to-End AlexNet from PyTorch to ONNX
Here is a simple script which exports a pretrained AlexNet, as defined in torchvision, into ONNX. It runs a single round of inference and then saves the resulting traced model to alexnet.onnx:
import torch
import torchvision

dummy_input = torch.randn(10, 3, 224, 224, device='cuda')
model = torchvision.models.alexnet(pretrained=True).cuda()

# Providing input and output names sets the display names for values
# within the model's graph. Setting these does not change the semantics
# of the graph; it is only for readability.
#
# The inputs to the network consist of the flat list of inputs (i.e.
# the values you would pass to the forward() method) followed by the
# flat list of parameters. You can partially specify names, i.e. provide
# a list here shorter than the number of inputs to the model, and we will
# only set that subset of names, starting from the beginning.
input_names = ["actual_input_1"] + ["learned_%d" % i for i in range(16)]
output_names = ["output1"]

torch.onnx.export(model, dummy_input, "alexnet.onnx", verbose=True,
                  input_names=input_names, output_names=output_names)
The resulting alexnet.onnx is a binary protobuf file which contains both the network structure and the parameters of the model you exported (in this case, AlexNet). The keyword argument verbose=True causes the exporter to print out a human-readable representation of the network:
# These are the inputs and parameters to the network, which have taken on
# the names we specified earlier.
graph(%actual_input_1 : Float(10, 3, 224, 224)
      %learned_0 : Float(64, 3, 11, 11)
      %learned_1 : Float(64)
      %learned_2 : Float(192, 64, 5, 5)
      %learned_3 : Float(192)
      # ---- omitted for brevity ----
      %learned_14 : Float(1000, 4096)
      %learned_15 : Float(1000)) {
  # Every statement consists of some output tensors (and their types),
  # the operator to be run (with its attributes, e.g., kernels, strides,
  # etc.), its input tensors (%actual_input_1, %learned_0, %learned_1)
  %17 : Float(10, 64, 55, 55) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[11, 11], pads=[2, 2, 2, 2], strides=[4, 4]](%actual_input_1, %learned_0, %learned_1), scope: AlexNet/Sequential[features]/Conv2d[0]
  %18 : Float(10, 64, 55, 55) = onnx::Relu(%17), scope: AlexNet/Sequential[features]/ReLU[1]
  %19 : Float(10, 64, 27, 27) = onnx::MaxPool[kernel_shape=[3, 3], pads=[0, 0, 0, 0], strides=[2, 2]](%18), scope: AlexNet/Sequential[features]/MaxPool2d[2]
  # ---- omitted for brevity ----
  %29 : Float(10, 256, 6, 6) = onnx::MaxPool[kernel_shape=[3, 3], pads=[0, 0, 0, 0], strides=[2, 2]](%28), scope: AlexNet/Sequential[features]/MaxPool2d[12]
  # Dynamic means that the shape is not known. This may be because of a
  # limitation of our implementation (which we would like to fix in a
  # future release) or shapes which are truly dynamic.
  %30 : Dynamic = onnx::Shape(%29), scope: AlexNet
  %31 : Dynamic = onnx::Slice[axes=[0], ends=[1], starts=[0]](%30), scope: AlexNet
  %32 : Long() = onnx::Squeeze[axes=[0]](%31), scope: AlexNet
  %33 : Long() = onnx::Constant[value={9216}](), scope: AlexNet
  # ---- omitted for brevity ----
  %output1 : Float(10, 1000) = onnx::Gemm[alpha=1, beta=1, broadcast=1, transB=1](%45, %learned_14, %learned_15), scope: AlexNet/Sequential[classifier]/Linear[6]
  return (%output1);
}
You can also verify the protobuf using the onnx library. You can install onnx with conda:
conda install -c conda-forge onnx
Then, you can run:
import onnx

# Load the ONNX model
model = onnx.load("alexnet.onnx")

# Check that the IR is well formed
onnx.checker.check_model(model)

# Print a human readable representation of the graph
print(onnx.helper.printable_graph(model.graph))
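The checker only validates well-formedness; the loaded protobuf can also be inspected directly. A small sketch using standard onnx protobuf fields (note that with this exporter the learned parameters also appear as named graph inputs):

# Sketch: list the declared inputs (with shapes) and outputs of the graph.
for inp in model.graph.input:
    dims = [d.dim_value for d in inp.type.tensor_type.shape.dim]
    print("input:", inp.name, dims)
for out in model.graph.output:
    print("output:", out.name)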
To run the exported script with Caffe2, you will need to install Caffe2: if you don't have one already, please follow the install instructions. Once these are installed, you can use the backend for Caffe2:
# ...continuing from above
import caffe2.python.onnx.backend as backend
import numpy as np

rep = backend.prepare(model, device="CUDA:0")  # or "CPU"
# For the Caffe2 backend:
#     rep.predict_net is the Caffe2 protobuf for the network
#     rep.workspace is the Caffe2 workspace for the network
#       (see the class caffe2.python.onnx.backend.Workspace)
outputs = rep.run(np.random.randn(10, 3, 224, 224).astype(np.float32))
# To run networks with more than one input, pass a tuple
# rather than a single numpy ndarray.
print(outputs[0])
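Because backend implementations can differ numerically from PyTorch (see the Limitations section below), it is worth comparing outputs on a fixed batch. A sketch, which recreates the PyTorch module because the name model was rebound to the ONNX proto above; the tolerances are assumptions:

# Sketch: compare the Caffe2 backend against the original PyTorch model.
torch_model = torchvision.models.alexnet(pretrained=True).cuda().eval()
batch = np.random.randn(10, 3, 224, 224).astype(np.float32)
caffe2_out = rep.run(batch)[0]
with torch.no_grad():
    torch_out = torch_model(torch.from_numpy(batch).cuda()).cpu().numpy()
# Small numeric differences are expected; the tolerances are an assumption.
np.testing.assert_allclose(torch_out, caffe2_out, rtol=1e-3, atol=1e-5)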
You can also run the exported model with ONNX Runtime; you will need to install ONNX Runtime by following these instructions. Once installed, you can use the backend for ONNX Runtime:
# ...continuing from above
import onnxruntime as ort

ort_session = ort.InferenceSession('alexnet.onnx')
outputs = ort_session.run(None, {'actual_input_1': np.random.randn(10, 3, 224, 224).astype(np.float32)})
print(outputs[0])
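Rather than hard-coding 'actual_input_1', the session can report its expected inputs. A small sketch:

# Sketch: query input metadata from the session instead of hard-coding names.
inp = ort_session.get_inputs()[0]
print(inp.name, inp.shape, inp.type)
outputs = ort_session.run(None, {inp.name: np.random.randn(10, 3, 224, 224).astype(np.float32)})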
Here is another tutorial of exporting the SuperResolution model to ONNX. In the future, there will also be backends for other frameworks.
Tracing vs. Scripting
The ONNX exporter can be both trace-based and script-based.
· Trace-based means that it operates by executing your model once and exporting the operators which were actually run during this run. This means that if your model is dynamic, e.g., changes behavior depending on input data, the export won't be accurate. Similarly, a trace is likely to be valid only for a specific input size, which is one reason why we require explicit inputs on tracing (see the dynamic_axes sketch after this list for one way to relax this). We recommend examining the model trace and making sure the traced operators look reasonable. If your model contains control flow like for loops and if conditions, a trace-based exporter will unroll the loops and if conditions, exporting a static graph that is exactly the same as this run. If you want to export your model with dynamic control flow, you will need to use the script-based exporter.
· Script-based means that the model you are trying to export is a ScriptModule. ScriptModule is the core data structure in TorchScript, and TorchScript is a subset of the Python language that creates serializable and optimizable models from PyTorch code.
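Since a trace is tied to the shapes of its example input, the exporter's dynamic_axes argument can mark chosen dimensions as variable. A minimal sketch, reusing the torchvision AlexNet module and dummy_input from the first snippet (the axis label "batch" is an arbitrary name we chose):

# Sketch: mark the batch dimension of the traced AlexNet as dynamic so the
# exported graph accepts batch sizes other than 10.
torch.onnx.export(model, dummy_input, "alexnet_dyn.onnx",
                  input_names=["actual_input_1"], output_names=["output1"],
                  dynamic_axes={"actual_input_1": {0: "batch"},
                                "output1": {0: "batch"}})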
Mixing tracing and scripting is allowed. You can compose tracing and scripting to suit the particular requirements of part of a model. Have a look at this example:
import torch

# Trace-based only
class LoopModel(torch.nn.Module):
    def forward(self, x, y):
        for i in range(y):
            x = x + i
        return x

model = LoopModel()
dummy_input = torch.ones(2, 3, dtype=torch.long)
loop_count = torch.tensor(5, dtype=torch.long)

torch.onnx.export(model, (dummy_input, loop_count), 'loop.onnx', verbose=True)
With the trace-based exporter, we get the resulting ONNX graph, which unrolls the for loop:
graph(%0 : Long(2, 3),
      %1 : Long()):
  %2 : Tensor = onnx::Constant[value={1}]()
  %3 : Tensor = onnx::Add(%0, %2)
  %4 : Tensor = onnx::Constant[value={2}]()
  %5 : Tensor = onnx::Add(%3, %4)
  %6 : Tensor = onnx::Constant[value={3}]()
  %7 : Tensor = onnx::Add(%5, %6)
  %8 : Tensor = onnx::Constant[value={4}]()
  %9 : Tensor = onnx::Add(%7, %8)
  return (%9)
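To see the consequence of unrolling concretely, here is a sketch (assuming ONNX Runtime; the input names are read back from the session rather than assumed) showing that the exported graph computes x + (0+1+2+3+4) regardless of the loop count fed at runtime:

import numpy as np
import onnxruntime as ort

# Sketch: the unrolled graph ignores the runtime loop count.
sess = ort.InferenceSession('loop.onnx')
names = [i.name for i in sess.get_inputs()]
x = np.ones((2, 3), dtype=np.int64)
for n in (2, 5, 9):
    out = sess.run(None, {names[0]: x, names[1]: np.array(n, dtype=np.int64)})[0]
    print(n, out[0, 0])  # prints 11 every time (1 + 0+1+2+3+4), under these assumptions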
To capture the dynamic loop with the script-based exporter, we can write the loop in script and call it from a regular nn.Module:
# Mixing tracing and scripting
@torch.jit.script
def loop(x, y):
    for i in range(int(y)):
        x = x + i
    return x

class LoopModel2(torch.nn.Module):
    def forward(self, x, y):
        return loop(x, y)

model = LoopModel2()
dummy_input = torch.ones(2, 3, dtype=torch.long)
loop_count = torch.tensor(5, dtype=torch.long)

torch.onnx.export(model, (dummy_input, loop_count), 'loop.onnx', verbose=True,
                  input_names=['input_data', 'loop_range'])
Now the exported ONNX graph becomes:
graph(%input_data : Long(2, 3),
      %loop_range : Long()):
  %2 : Long() = onnx::Constant[value={1}](), scope: LoopModel2/loop
  %3 : Tensor = onnx::Cast[to=9](%2)
  %4 : Long(2, 3) = onnx::Loop(%loop_range, %3, %input_data), scope: LoopModel2/loop # custom_loop.py:240:5
    block0(%i.1 : Long(), %cond : bool, %x.6 : Long(2, 3)):
      %8 : Long(2, 3) = onnx::Add(%x.6, %i.1), scope: LoopModel2/loop # custom_loop.py:241:13
      %9 : Tensor = onnx::Cast[to=9](%cond)
      -> (%9, %8)
  return (%4)
The dynamic control flow is captured correctly, and we can verify it in backends with different loop ranges. With a loop range of 9, each element of the all-ones input becomes 1 + (0 + 1 + ... + 8) = 37:
import caffe2.python.onnx.backend as backend
import numpy as np
import onnx

model = onnx.load('loop.onnx')
rep = backend.prepare(model)
outputs = rep.run((dummy_input.numpy(), np.array(9).astype(np.int64)))
print(outputs[0])
# [[37 37 37]
#  [37 37 37]]
import onnxruntime as ort

ort_sess = ort.InferenceSession('loop.onnx')
outputs = ort_sess.run(None, {'input_data': dummy_input.numpy(),
                              'loop_range': np.array(9).astype(np.int64)})
print(outputs)
# [array([[37, 37, 37],
#         [37, 37, 37]], dtype=int64)]
To avoid exporting a variable scalar tensor as a fixed-value constant baked into the ONNX model, avoid using torch.Tensor.item(); torch supports implicit casting of single-element tensors to numbers. For example:
class LoopModel(torch.nn.Module):
    def forward(self, x, y):
        res = []
        arr = x.split(2, 0)
        for i in range(int(y)):
            res += [arr[i].sum(0, False)]
        return torch.stack(res)

model = torch.jit.script(LoopModel())
inputs = (torch.randn(16), torch.tensor(8))

out = model(*inputs)
torch.onnx.export(model, inputs, 'loop_and_list.onnx',
                  opset_version=11, example_outputs=out)
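A quick sketch (assuming ONNX Runtime; the input names are read from the session, since we did not pass input_names to this export) to confirm that the scripted loop respects a runtime loop bound smaller than the one used at export time:

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession('loop_and_list.onnx')
names = [i.name for i in sess.get_inputs()]
outs = sess.run(None, {names[0]: np.random.randn(16).astype(np.float32),
                       names[1]: np.array(4, dtype=np.int64)})
print(outs[0].shape)  # expect (4,): one summed chunk per loop iteration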
TorchVision Support
All TorchVision models, except those that are quantized, can be exported to ONNX. More details can be found in the TorchVision documentation.
Limitations
· Only tuples, lists, and Variables are supported as JIT inputs/outputs. Dictionaries and strings are accepted as well, but their usage is not recommended. Users need to verify their dict inputs carefully, and keep in mind that dynamic lookups are not available.
· PyTorch and the ONNX backends (Caffe2, ONNX Runtime, etc.) often have implementations of operators with some numeric differences. Depending on model structure, these differences may be negligible, but they can also cause major divergences in behavior (especially on untrained models). We allow Caffe2 to call directly to the Torch implementations of operators, to help you smooth over these differences when precision is important, and to also document these differences.