This repository guides beginners who do not have a background in parallel programming in C++ to learn CUDA and TensorRT from scratch.
This repository is still a work in progress (as of 2024/02/21). I will add more samples and more detailed descriptions in the future. Please feel free to contribute to this repository.
First, clone the repository:

```shell
git clone git@github.com:kalfazed/tensorrt_starter.git
```
After cloning the repository, modify the OpenCV, CUDA, cuDNN, and TensorRT versions and install directories in config/Makefile.config, located in the root directory of the repository. The recommended versions are opencv==4.x, cuda==11.6, cudnn==8.9, and TensorRT==8.6.1.6:
```makefile
# Please change the CUDA version if needed
# By default, the cuDNN library is located in /usr/local/cuda/lib64
CXX := g++
CUDA_VER := 11
# Please modify the OpenCV and TensorRT install directories
OPENCV_INSTALL_DIR := /usr/local/include/opencv4
TENSORRT_INSTALL_DIR := /home/kalfazed/packages/TensorRT-8.6.1.6
```
Besides, please also change ARCH in config/Makefile.config. This parameter is passed to nvcc, the compiler for CUDA programs, and should match the compute capability of your GPU.
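For reference, a minimal sketch of what this setting might look like in config/Makefile.config (the exact value format, e.g. `sm_86`, is an assumption; check the comments in your copy of the file):

```makefile
# ARCH must match your GPU's compute capability (sm_XX format assumed here).
# e.g. sm_86 for Ampere (RTX 30-series), sm_75 for Turing (RTX 20-series)
ARCH := sm_86
```

On recent drivers, `nvidia-smi --query-gpu=compute_cap --format=csv` reports the compute capability of the installed GPU.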
Inside each subfolder of each chapter, the basic directory structure is as follows (it differs for some chapters):

```
|- config
|  |- Makefile.config
|- src
|  |- cpp
|  |  |- xxx.c
|  |- python
|  |  |- yyy.py
|- Makefile
```
Please run `make` first; it will generate a binary named `trt-cuda` or `trt-infer`, depending on the chapter. Then run the binary directly, or use the `make run` command.
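Putting the steps above together, a typical session inside one chapter's subfolder might look like this (the path and binary name below are illustrative; use the actual chapter folder):

```shell
cd 2.1-dim_and_index   # hypothetical path; pick the chapter you want to build
make                   # builds trt-cuda (or trt-infer, depending on the chapter)
./trt-cuda             # run the binary directly...
make run               # ...or via the Makefile target
```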
- 2.1-dim_and_index
- 2.2-cpp_cuda_interactive
- 2.3-matmul-basic
- 2.4-error-handler
- 2.5-device-info
- 2.6-nsight-system-and-compute
- 2.7-matmul-shared-memory
- 2.8-bank-conflict
- 2.9-stream-and-event
- 2.10-bilinear-interpolation
- 2.11-bilinear-interpolation-template
- 2.12-affine-transformation
- 2.13-implicit-gemm-conv
- 2.14-pcd-voxelization
- 3.1-generate-onnx
- 3.2-export-onnx
- 3.3-read-and-parse-onnx
- 3.4-export-unsupported-node
- 3.5-onnxsurgeon
- 3.6-export-onnx-from-oss
- 3.7-trtexec-analysis
- 5.1-mnist-sample
- 5.2-load-model
- 5.3-infer-model
- 5.4-print-structure
- 5.5-build-model
- 5.6-build-sub-graph
- 5.7-custom-basic-trt-plugin
- 5.8-plugin-unit-test