Evaluate performance of ONNX Runtime (VGG16)

ONNX Runtime quantization is under active development. Please use version 1.6.0 or later to get more quantization support.

This example loads an image classification model exported from PyTorch and measures its accuracy and speed on the ILSVRC2012 ImageNet validation dataset. You need to download this dataset yourself.

Environment

onnx: 1.9.0
onnxruntime: 1.10.0
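A quick way to confirm the installed versions against the ones listed above is sketched below. The helper names `version_tuple` and `installed_version` are made up for this illustration, and the lookup assumes the packages were installed under the distribution names `onnx` and `onnxruntime` (a GPU build such as `onnxruntime-gpu` would need its own name). Versions are compared as integer tuples because a plain string comparison would rank "1.10.0" below "1.6.0".

```python
from importlib import metadata


def version_tuple(version):
    """Turn '1.10.0' into (1, 10, 0), ignoring suffixes such as 'rc1' or '+cpu'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


MIN_ONNXRUNTIME = (1, 6, 0)  # minimum version recommended in this README

for name in ("onnx", "onnxruntime"):
    found = installed_version(name)
    print(f"{name}: {found or 'not installed'}")

ort_version = installed_version("onnxruntime")
if ort_version is not None and version_tuple(ort_version) < MIN_ONNXRUNTIME:
    print("onnxruntime is older than 1.6.0; quantization support may be limited")
```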

Prepare model

Please refer to the official PyTorch guide for details on model export. The following is a simple example:

import torch
import torchvision

batch_size = 1
# Newer torchvision releases replace pretrained=True with the weights= argument.
model = torchvision.models.vgg16(pretrained=True)
model.eval()  # switch to inference mode so dropout is disabled
x = torch.randn(batch_size, 3, 224, 224, requires_grad=True)
torch_out = model(x)

# Export the model
torch.onnx.export(model,                     # model being run
                  x,                         # model input (or a tuple for multiple inputs)
                  "vgg16.onnx",              # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=11,          # the ONNX opset version to export to; use at least 11
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names=['input'],     # the model's input names
                  output_names=['output'],   # the model's output names
                  dynamic_axes={'input': {0: 'batch_size'},    # variable length axes
                                'output': {0: 'batch_size'}})
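Before quantizing, it can be worth checking that the exported file reproduces the PyTorch outputs. The following is a minimal sketch, not part of this example's scripts: the function name `verify_export` and the tolerances are assumptions, and it relies on the file name `vgg16.onnx` and the tensor names `input` and `output` used in the export call above. The heavy imports sit inside the function so that defining it does not require the packages to be installed.

```python
def verify_export(onnx_path="vgg16.onnx", batch_size=1, rtol=1e-3, atol=1e-5):
    """Compare ONNX Runtime output with PyTorch output on one random batch."""
    import numpy as np
    import onnx
    import onnxruntime as ort
    import torch
    import torchvision

    # Structural validation of the exported graph.
    onnx.checker.check_model(onnx.load(onnx_path))

    # Reference output from the original PyTorch model in inference mode.
    model = torchvision.models.vgg16(pretrained=True).eval()
    x = torch.randn(batch_size, 3, 224, 224)
    with torch.no_grad():
        expected = model(x).numpy()

    # Output of the exported model, run on CPU with ONNX Runtime.
    session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    (actual,) = session.run(["output"], {"input": x.numpy()})

    # Raises AssertionError if the two outputs differ beyond the tolerances.
    np.testing.assert_allclose(expected, actual, rtol=rtol, atol=atol)
    return float(np.abs(expected - actual).max())


if __name__ == "__main__":
    print("max absolute difference:", verify_export())
```

Running the check with a `batch_size` other than 1 also exercises the dynamic batch axis declared during export.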

Quantization

Quantize model with QLinearOps:

# --input_model is the path to the model (*.onnx)
bash run_tuning.sh --input_model=path/to/model \
                   --config=vgg16.yaml \
                   --output_model=path/to/save

Quantize model with QDQ mode:

# --input_model is the path to the model (*.onnx)
bash run_tuning.sh --input_model=path/to/model \
                   --config=vgg16_qdq.yaml \
                   --output_model=path/to/save
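The two modes differ mainly in how the quantized graph is represented: as generally described for ONNX Runtime, the QLinearOps style replaces float operators with quantized counterparts such as QLinearConv, while QDQ keeps the original operators and wraps tensors in QuantizeLinear/DequantizeLinear pairs. Both rest on the same affine mapping between floats and 8-bit integers. The sketch below illustrates that arithmetic for a single uint8 value, following the ONNX QuantizeLinear and DequantizeLinear definitions. The function names and the scale and zero-point values are chosen for illustration only and are not what the tuning script would pick.

```python
def quantize_linear(x, scale, zero_point, qmin=0, qmax=255):
    """Map a float to uint8: round (ties to even), shift by zero_point, saturate."""
    q = round(x / scale) + zero_point  # Python's round() also rounds ties to even
    return max(qmin, min(qmax, q))


def dequantize_linear(q, scale, zero_point):
    """Map a quantized integer back to an approximate float."""
    return (q - zero_point) * scale


scale, zero_point = 0.5, 128  # illustrative values only
for value in (-1.0, 0.3, 1000.0):
    q = quantize_linear(value, scale, zero_point)
    print(value, "->", q, "->", dequantize_linear(q, scale, zero_point))
```

Values inside the representable range come back within half a scale step of the original, while values outside it (1000.0 here) saturate at the end of the uint8 range.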

Benchmark

# --input_model is the path to the model (*.onnx)
bash run_benchmark.sh --input_model=path/to/model \
                      --config=vgg16.yaml \
                      --mode=performance # or accuracy
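The internals of run_benchmark.sh are not shown here, so the following is only a rough sketch of what the two modes typically report: top-1 accuracy over labeled samples, and average latency over repeated runs after a warm-up. The helper names `top1_accuracy` and `measure_latency` are hypothetical, and the timed callable is a stand-in for a real inference call.

```python
import time


def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class matches the label."""
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


def measure_latency(run_once, warmup=5, iterations=20):
    """Average wall-clock seconds per call, excluding warm-up calls."""
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    return (time.perf_counter() - start) / iterations


# Toy data: the first prediction matches its label, the second does not.
scores = [[0.1, 0.9], [0.8, 0.2]]
labels = [1, 1]
print("top-1 accuracy:", top1_accuracy(scores, labels))  # 0.5

# Stand-in workload; replace with a real inference call to time a model.
latency = measure_latency(lambda: sum(range(10_000)))
print(f"average latency: {latency * 1e3:.3f} ms")
```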