Conversion to TRT failure of TensorRT 8.6.1.6 when converting CO-DETR model on GPU RTX 4090 #4280

Open
edwardnguyen1705 opened this issue Dec 12, 2024 · 1 comment

edwardnguyen1705 commented Dec 12, 2024

Description

I tried to convert the CO-DETR model to a TensorRT engine, but the conversion fails with the error below:

[12/12/2024-02:17:38] [E] Error[10]: Could not find any implementation for node {ForeignNode[/0/Cast_3.../0/backbone/Reshape_3 + /0/backbone/Transpose_3]}.

Environment

TensorRT Version: 8.6.1.6

NVIDIA GPU: NVIDIA GeForce RTX 4090

NVIDIA Driver Version: 555.42.06

CUDA Version: 12.0

CUDNN Version:

Operating System: Ubuntu 22.04.3 LTS

Python Version (if applicable):

Tensorflow Version (if applicable):

PyTorch Version (if applicable):

Baremetal or Container (if so, version):

Relevant Files

Model link: https://drive.google.com/file/d/1voa7liji1OJxDQ8tphnbnm6EM-MexI_v/view?usp=drive_link

Steps To Reproduce

  • Env preparation: Build docker image following TensorRT-Docker-Image
  • Start the container with docker run and open a shell inside it
  • PyTorch to ONNX: follow DeepStream-Yolo
  • ONNX to TRT: trtexec --onnx=co_dino_5scale_swin_large_16e_o365tococo_h1280w1280.onnx --saveEngine=co_dino_5scale_swin_large_16e_o365tococo_h1280w1280.engine --explicitBatch --minShapes=input:1x3x1280x1280 --optShapes=input:2x3x1280x1280 --maxShapes=input:4x3x1280x1280 --fp16 --memPoolSize=workspace:10000 --tacticSources=-cublasLt,+cublas --sparsity=enable --verbose

Commands or scripts:

Have you tried the latest release?: Not yet.

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt): If I follow the steps described in DeepStream-Yolo, the generated engine file works, but inference is slow. Therefore, I would like to build the engine with trtexec instead.

@lix19937

Can you run a test with --memPoolSize=workspace:10000 removed? And then, as a second test, reduce the range of the dynamic shapes?
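
One way to set up the two tests is sketched below in Python. It only assembles the trtexec argument list from the flags already used in the report; the helper name build_trtexec_args and the choice of a fixed batch of 1 as the "reduced" shape range are assumptions for illustration, not something confirmed in this thread.

```python
import shlex

# Model base name taken from the command in the report.
MODEL = "co_dino_5scale_swin_large_16e_o365tococo_h1280w1280"


def build_trtexec_args(batch_min=1, batch_opt=1, batch_max=1, workspace_mib=None):
    """Assemble the trtexec command from the report as an argument list.

    Hypothetical helper: by default it omits --memPoolSize and pins the batch
    dimension to 1, which is one possible reading of "reduce the dynamic shape".
    """
    if not (1 <= batch_min <= batch_opt <= batch_max):
        raise ValueError("expected 1 <= batch_min <= batch_opt <= batch_max")

    def shape(batch):
        # The input tensor name and the 3x1280x1280 size come from the original command.
        return f"input:{batch}x3x1280x1280"

    args = [
        "trtexec",
        f"--onnx={MODEL}.onnx",
        f"--saveEngine={MODEL}.engine",
        "--explicitBatch",
        f"--minShapes={shape(batch_min)}",
        f"--optShapes={shape(batch_opt)}",
        f"--maxShapes={shape(batch_max)}",
        "--fp16",
        "--tacticSources=-cublasLt,+cublas",
        "--sparsity=enable",
        "--verbose",
    ]
    if workspace_mib is not None:
        # Only added when explicitly requested, so the default run uses
        # TensorRT's own workspace limit.
        args.append(f"--memPoolSize=workspace:{workspace_mib}")
    return args


if __name__ == "__main__":
    # Test 1: original shape range (1, 2, 4), workspace flag removed.
    print(shlex.join(build_trtexec_args(1, 2, 4)))
    # Test 2: workspace flag removed and batch pinned to 1.
    print(shlex.join(build_trtexec_args()))
```

The printed commands can be pasted into the container shell, or the list can be passed to subprocess.run(args, check=True). Changing one thing per run makes it easier to tell whether the workspace limit or the shape range is what triggers the "Could not find any implementation" error.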
