- Goal : Model quantization with PTQ & QAT
- Process :
  - Train a PyTorch model on a custom dataset
  - Calibrate the pytorch-quantization model for PTQ (a calibration sketch follows this list)
  - Fine-tune the pytorch-quantization model for QAT
  - Generate a TensorRT INT8 model from the pytorch-quantization model
  - Generate a TensorRT INT8 model using a TensorRT calibration class
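
Below is a minimal sketch of the PTQ calibration step, following the usage pattern documented for NVIDIA's pytorch-quantization tool; the repo's `ptq.py` may differ in details, and the checkpoint name and `calib_loader` are placeholders:

```python
import torch
import torchvision
from pytorch_quantization import quant_modules, calib
from pytorch_quantization import nn as quant_nn

quant_modules.initialize()  # monkey-patch torch layers with quantized versions
model = torchvision.models.resnet18(num_classes=100).cuda()
model.load_state_dict(torch.load("resnet18_imagenet100.pth"))  # placeholder checkpoint
model.eval()

def collect_stats(model, data_loader, num_batches=32):
    """Feed calibration data through the model while calibrators record ranges."""
    for module in model.modules():
        if isinstance(module, quant_nn.TensorQuantizer):
            if module._calibrator is not None:
                module.disable_quant()
                module.enable_calib()
            else:
                module.disable()
    with torch.no_grad():
        for i, (image, _) in enumerate(data_loader):
            model(image.cuda())
            if i + 1 >= num_batches:
                break
    for module in model.modules():
        if isinstance(module, quant_nn.TensorQuantizer):
            if module._calibrator is not None:
                module.enable_quant()
                module.disable_calib()
            else:
                module.enable()

def compute_amax(model, **kwargs):
    """Load the calibrated amax (range) values into every quantizer."""
    for module in model.modules():
        if isinstance(module, quant_nn.TensorQuantizer):
            if module._calibrator is not None:
                if isinstance(module._calibrator, calib.MaxCalibrator):
                    module.load_calib_amax()
                else:
                    module.load_calib_amax(**kwargs)

collect_stats(model, calib_loader)  # calib_loader: an imagenet100 DataLoader (placeholder)
compute_amax(model, method="percentile", percentile=99.99)
```

QAT then fine-tunes this calibrated model for a few epochs with the quantizers enabled, which is typically what recovers the small accuracy drop seen after PTQ.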
- Sample Model : ResNet-18
- Dataset : imagenet100
- Device
  - Windows 10 laptop
  - CPU : i7-11375H
  - GPU : RTX 3060
- Dependency
  - CUDA 12.1
  - cuDNN 8.9.2
  - TensorRT 8.6.1
  - PyTorch 2.1.0+cu121
Quantization_EX/
├── calibrator.py     # calibration class for TensorRT PTQ
├── common.py         # utilities for TensorRT
├── infer.py          # base model inference
├── onnx_export.py    # ONNX export
├── ptq.py            # Post-Training Quantization
├── qat.py            # Quantization-Aware Training
├── quant_utils.py    # utilities for quantization
├── train.py          # base model training
├── trt_infer.py      # TensorRT model inference
├── utils.py          # misc utilities
├── LICENSE
└── README.md
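
The `onnx_export.py` step follows the standard pytorch-quantization export pattern: switch the quantizers to fake-quantize mode so Q/DQ nodes appear in the graph, then export with a recent opset. A minimal sketch, with the output file name as a placeholder:

```python
import torch
from pytorch_quantization import nn as quant_nn

# Emit ONNX QuantizeLinear/DequantizeLinear nodes instead of custom ops
quant_nn.TensorQuantizer.use_fb_fake_quant = True

model.eval()
dummy = torch.randn(1, 3, 224, 224, device="cuda")
torch.onnx.export(
    model, dummy, "resnet18_qat.onnx",   # placeholder output name
    input_names=["input"], output_names=["output"],
    opset_version=13,                    # Q/DQ export requires opset >= 13
)
```

TensorRT can build an INT8 engine directly from this Q/DQ ONNX graph. The alternative path (`calibrator.py`) instead builds the engine from an FP32 ONNX model and lets a TensorRT INT8 calibrator (e.g. IInt8EntropyCalibrator2) determine the scales.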
- Measured over 10,000 iterations with a single input of shape [1, 3, 224, 224]
|                     | TRT    | TRT     | TRT PTQ | PT-Q PTQ | PT-Q PTQ w/ bnf | PT-Q QAT | PT-Q QAT w/ bnf |
|---------------------|--------|---------|---------|----------|-----------------|----------|-----------------|
| Precision           | FP32   | FP16    | INT8    | INT8     | INT8            | INT8     | INT8            |
| Acc Top-1 [%]       | 83.08  | 83.04   | 83.12   | 83.18    | 82.64           | 83.42    | 82.80           |
| Avg Latency [ms]    | 1.188  | 0.527   | 0.418   | 0.566    | 0.545           | 0.577    | 0.534           |
| Avg FPS [frame/sec] | 841.74 | 1896.01 | 2388.33 | 1764.55  | 1834.69         | 1730.89  | 1870.99         |
| GPU Memory [MB]     | 179    | 135     | 123     | 129      | 129             | 129      | 129             |
- PT-Q : pytorch-quantization
- TRT : TensorRT
- bnf : batch normalization folding (conv + bn -> conv'); a sketch follows below
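
Batch normalization folding merges a BatchNorm into the preceding convolution so the quantizer sees a single conv. A minimal sketch of the arithmetic, for illustration only (`quant_utils.py` may implement it differently):

```python
import torch

def fold_bn_into_conv(conv: torch.nn.Conv2d, bn: torch.nn.BatchNorm2d) -> torch.nn.Conv2d:
    """Return conv' with bn folded in: conv'(x) == bn(conv(x)) in eval mode."""
    fused = torch.nn.Conv2d(
        conv.in_channels, conv.out_channels, conv.kernel_size,
        stride=conv.stride, padding=conv.padding, dilation=conv.dilation,
        groups=conv.groups, bias=True,
    )
    # scale = gamma / sqrt(running_var + eps), applied per output channel
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.data = conv.weight.data * scale.reshape(-1, 1, 1, 1)
    conv_bias = conv.bias.data if conv.bias is not None else torch.zeros_like(bn.running_mean)
    # b' = (b - running_mean) * scale + beta
    fused.bias.data = (conv_bias - bn.running_mean) * scale + bn.bias.data
    return fused
```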
- Run order : infer -> train -> ptq -> qat -> onnx_export -> trt_infer -> trt_infer_acc
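
The latency/FPS figures above are the kind produced by a warmed-up timing loop over a fixed input; a sketch of such a loop is below (the repo's `trt_infer.py` is the authoritative measurement, and `measure` is an illustrative helper):

```python
import time
import torch

@torch.no_grad()
def measure(run_once, iterations=10000, warmup=100):
    """Average latency [ms] and throughput [FPS] of run_once() over many iterations."""
    for _ in range(warmup):        # warm up: CUDA context, caches, clocks
        run_once()
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    torch.cuda.synchronize()       # wait for all queued GPU work before stopping the clock
    elapsed = time.perf_counter() - start
    return elapsed / iterations * 1e3, iterations / elapsed

# Example with a PyTorch model; a TensorRT engine would wrap its execution call instead:
# x = torch.randn(1, 3, 224, 224, device="cuda")
# avg_ms, fps = measure(lambda: model(x))
```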
- pytorch-quantization : https://github.com/NVIDIA/TensorRT/tree/master/tools/pytorch-quantization
- imagenet100 : https://www.kaggle.com/datasets/ambityga/imagenet100