# YOLOv8-ONNX-TensorRT


👀 Run YOLOv8 models exported to ONNX or TensorRT (FP16, INT8) on a real-time camera feed

## 🏆 Performance

> [!NOTE]
> - Tested on an NVIDIA Jetson Orin Nano

### ⭐ ONNX (CPU)

#### YOLOv8n

| Model        | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
| ------------ | ------------ | --- | ---------- | ----------------------- |
| yolov8n.pt   | -            | 2   | 535.8      | 37.1                    |
| yolov8n.onnx | FP16         | 7   | 146        | 37                      |

#### YOLOv8s

| Model        | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
| ------------ | ------------ | --- | ---------- | ----------------------- |
| yolov8s.pt   | -            | 1   | 943.9      | 44.7                    |
| yolov8s.onnx | FP16         | 3   | 347.6      | 44.7                    |

#### YOLOv8m

| Model        | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
| ------------ | ------------ | --- | ---------- | ----------------------- |
| yolov8m.pt   | -            | 0.5 | 1745.2     | 50.1                    |
| yolov8m.onnx | FP16         | 1.2 | 1126.3     | 50.1                    |

*YOLOv8l and YOLOv8x were too slow to measure.*

### ⭐ TensorRT (GPU)

#### YOLOv8n

| Model          | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
| -------------- | ------------ | --- | ---------- | ----------------------- |
| yolov8n.pt     | -            | 36  | 21.9       | 37.1                    |
| yolov8n.engine | FP16         | 60  | 7.3        | 37.1                    |
| yolov8n.engine | INT8         | 63  | 5.8        | 33                      |

#### YOLOv8s

| Model          | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
| -------------- | ------------ | --- | ---------- | ----------------------- |
| yolov8s.pt     | -            | 27  | 33.1       | 44.7                    |
| yolov8s.engine | FP16         | 48  | 11.4       | 44.7                    |
| yolov8s.engine | INT8         | 57  | 8.2        | 41.2                    |

#### YOLOv8m

| Model          | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
| -------------- | ------------ | --- | ---------- | ----------------------- |
| yolov8m.pt     | -            | 14  | 66.5       | 50.1                    |
| yolov8m.engine | FP16         | 30  | 23.6       | 50                      |
| yolov8m.engine | INT8         | 38  | 17.1       | 46.2                    |

#### YOLOv8l

| Model          | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
| -------------- | ------------ | --- | ---------- | ----------------------- |
| yolov8l.pt     | -            | 9   | 103.2      | 52.9                    |
| yolov8l.engine | FP16         | 22  | 35.5       | 52.8                    |
| yolov8l.engine | INT8         | 31  | 22.4       | 50.1                    |

#### YOLOv8x

| Model          | Quantization | FPS | Speed (ms) | mAP<sup>val</sup> 50-95 |
| -------------- | ------------ | --- | ---------- | ----------------------- |
| yolov8x.pt     | -            | 6   | 160.2      | 54                      |
| yolov8x.engine | FP16         | 15  | 56.6       | 53.9                    |
| yolov8x.engine | INT8         | 24  | 33.9       | 51.1                    |

> [!NOTE]
> - FPS is measured while an object is being detected
> - Speed averages and mAP<sup>val</sup> values are for single-model, single-scale inference on the COCO val2017 dataset

> [!TIP]
> - You can download the ONNX and TensorRT files from the releases page

> [!CAUTION]
> - Optimizing and exporting models on your own device will give the best results

## ✏️ Prepare

1. Install CUDA

   🚀 CUDA official website

2. Install PyTorch

   🚀 PyTorch official website

3. Install TensorRT (only if using TensorRT)

   🚀 TensorRT official website

4. Clone the repository and install the Python requirements

   ```bash
   git clone https://github.com/the0807/YOLOv8-ONNX-TensorRT
   cd YOLOv8-ONNX-TensorRT
   pip install -r requirements.txt
   ```

5. Install or upgrade the ultralytics package

   ```bash
   # Install
   pip install ultralytics

   # Upgrade
   pip install -U ultralytics
   ```

6. Prepare your own dataset and PyTorch weights such as `yolov8n.pt`

7. (Optional) If you want to test with a YOLOv8 base model rather than a custom model, run the code below to prepare the COCO dataset

   ```bash
   cd datasets

   # It will take time to download
   python3 coco_download.py
   ```

## ⚡️ Optional (recommended for high speed)

### ⭐ Jetson

- Enable MAX Power Mode and Jetson Clocks

  ```bash
  # MAX Power Mode
  sudo nvpmodel -m 0

  # Enable Clocks (run again after every reboot)
  sudo jetson_clocks
  ```

- Install the Jetson Stats application

  ```bash
  sudo apt update
  sudo pip install jetson-stats
  sudo reboot
  jtop
  ```

## 📚 Usage

### ⭐ ONNX

1. Convert the PyTorch model to ONNX

   ```bash
   python3 export_onnx.py --model 'model/yolov8n.pt' --q fp16 --data='datasets/coco.yaml'
   ```

   Description of all arguments:

   - `--model` : **required** The PyTorch model you trained, such as `yolov8n.pt`
   - `--q` : Quantization method `[fp16]`
   - `--data` : Path to your data.yaml
   - `--batch` : Batch size of the exported model, i.e., the maximum number of images the exported model will process concurrently in predict mode
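
   The script itself is not shown here; as a rough sketch, assuming `export_onnx.py` wraps the Ultralytics export API (argument names below mirror the CLI flags, not necessarily the repository's exact code), the core call could look like:

   ```python
   # Hypothetical sketch of the core of export_onnx.py, assuming it wraps
   # the Ultralytics export API.
   from ultralytics import YOLO

   model = YOLO("model/yolov8n.pt")   # PyTorch weights to convert
   model.export(
       format="onnx",                 # writes model/yolov8n.onnx
       half=True,                     # --q fp16 -> FP16 weights
       data="datasets/coco.yaml",     # --data
       batch=1,                       # --batch
   )
   ```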

2. Run real-time camera inference

   ```bash
   python3 run_camera.py --model 'model/yolov8n.onnx' --q fp16
   ```

   Description of all arguments:

   - `--model` : The exported ONNX model, such as `yolov8n.onnx`
   - `--q` : Quantization method `[fp16]`
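
   As a sketch of what `run_camera.py` does (the real script may differ), the loop amounts to loading the exported model and running it on webcam frames with OpenCV:

   ```python
   # Hypothetical sketch: exported model + OpenCV webcam loop.
   import cv2
   from ultralytics import YOLO

   model = YOLO("model/yolov8n.onnx")  # also works with a .engine file
   cap = cv2.VideoCapture(0)           # default camera

   while cap.isOpened():
       ok, frame = cap.read()
       if not ok:
           break
       results = model(frame)                   # run detection on the frame
       cv2.imshow("YOLOv8", results[0].plot())  # draw boxes and display
       if cv2.waitKey(1) & 0xFF == ord("q"):
           break

   cap.release()
   cv2.destroyAllWindows()
   ```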

### ⭐ TensorRT

1. Convert the PyTorch model to a TensorRT engine

   ```bash
   python3 export_tensorrt.py --model 'model/yolov8n.pt' --q int8 --data='datasets/coco.yaml' --workspace 4 --batch 1
   ```

   Description of all arguments:

   - `--model` : **required** The PyTorch model you trained, such as `yolov8n.pt`
   - `--q` : Quantization method `[fp16, int8]`
   - `--data` : Path to your data.yaml
   - `--batch` : Batch size of the exported model, i.e., the maximum number of images the exported model will process concurrently in predict mode
   - `--workspace` : Maximum workspace size in GiB for TensorRT optimizations, balancing memory usage and performance
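
   Again as a sketch, assuming `export_tensorrt.py` wraps the Ultralytics export API (not the repository's exact code), the core call could look like:

   ```python
   # Hypothetical sketch of the core of export_tensorrt.py, assuming it
   # wraps the Ultralytics export API. INT8 export uses --data for calibration.
   from ultralytics import YOLO

   model = YOLO("model/yolov8n.pt")
   model.export(
       format="engine",               # writes model/yolov8n.engine
       int8=True,                     # --q int8 (use half=True for fp16)
       data="datasets/coco.yaml",     # calibration dataset for INT8
       workspace=4,                   # --workspace, in GiB
       batch=1,                       # --batch
   )
   ```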

2. Run real-time camera inference

   ```bash
   python3 run_camera.py --model 'model/yolov8n.engine' --q int8
   ```

   Description of all arguments:

   - `--model` : The model to run, such as `yolov8n.pt` or the exported `yolov8n.engine`
   - `--q` : Quantization method `[fp16, int8]`
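
   For reference, a rough way to reproduce the Speed and FPS figures in the tables above (a sketch, not necessarily how this repository measures them; the image path is hypothetical):

   ```python
   # Rough FPS / per-frame latency measurement over repeated inference.
   import time
   from ultralytics import YOLO

   model = YOLO("model/yolov8n.engine")
   model("datasets/sample.jpg")        # hypothetical image path; warm-up run

   n_frames = 100
   start = time.perf_counter()
   for _ in range(n_frames):
       model("datasets/sample.jpg")    # repeated inference on the same image
   elapsed = time.perf_counter() - start

   print(f"avg speed: {1000 * elapsed / n_frames:.1f} ms, "
         f"FPS: {n_frames / elapsed:.1f}")
   ```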

> [!IMPORTANT]
> - When exporting to TensorRT (INT8), a calibration step is performed using the validation data of the dataset. More than 1,000 validation images are recommended (at least 300 are needed) to minimize the loss of mAP.

> [!WARNING]
> - If `aborted` or `killed` appears, reduce `--batch` and `--workspace`

## 🧐 Validation

### ⭐ ONNX

```bash
python3 validation.py --model 'model/yolov8n.onnx' --q fp16 --data 'datasets/coco.yaml'
```

Description of all arguments:

- `--model` : **required** The exported ONNX model, such as `yolov8n.onnx`
- `--q` : Quantization method `[fp16]`
- `--data` : Path to the data.yaml for your validation set

### ⭐ TensorRT

```bash
python3 validation.py --model 'model/yolov8n.engine' --q int8 --data 'datasets/coco.yaml'
```

Description of all arguments:

- `--model` : **required** The exported TensorRT engine, such as `yolov8n.engine`
- `--q` : Quantization method `[fp16, int8]`
- `--data` : Path to the data.yaml for your validation set
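
As a sketch, assuming `validation.py` wraps the Ultralytics validation API (not the repository's exact code), reproducing the mAP numbers above amounts to:

```python
# Hypothetical sketch of the core of validation.py, assuming it wraps the
# Ultralytics validation API.
from ultralytics import YOLO

model = YOLO("model/yolov8n.engine")          # or the .onnx model
metrics = model.val(data="datasets/coco.yaml")
print(f"mAP50-95: {metrics.box.map:.3f}")     # single-model, single-scale mAP
```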