This repository is the official implementation of the CVPR 2023 paper "Distilling Focal Knowledge from Imperfect Expert for 3D Object Detection".
Authors: Jia Zeng, Li Chen, Hanming Deng, Lewei Lu, Junchi Yan, Yu Qiao, Hongyang Li
Multi-camera 3D object detection has blossomed in recent years, and most state-of-the-art methods are built upon bird's-eye-view (BEV) representations. Despite their remarkable performance, these works suffer from low efficiency. Typically, knowledge distillation can be used for model compression; however, due to unclear 3D geometry reasoning, expert features usually contain noisy and confusing areas. In this work, we investigate how to distill knowledge from an imperfect expert. We propose FD3D, a Focal Distiller for 3D object detection. Specifically, a set of queries is leveraged to locate the instance-level areas for masked feature generation, intensifying the feature representation ability in these areas. Moreover, these queries search out representative fine-grained positions for refined distillation. We verify the effectiveness of our method by applying it to two popular detection models, BEVFormer and DETR3D. The results demonstrate that our method achieves improvements of 4.07 and 3.17 points, respectively, in terms of the NDS metric on the nuScenes benchmark.
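To make the core idea concrete, below is a minimal, hypothetical sketch of a masked (focal) feature-imitation loss in PyTorch. It is not the repository's implementation: the binary mask here is only a stand-in for the query-located instance areas described above, and all names are illustrative.

```python
import torch

def focal_feature_distill(student_feat, teacher_feat, instance_mask, eps=1e-6):
    """Illustrative masked feature-imitation loss (not the paper's code).

    student_feat, teacher_feat: (B, C, H, W); instance_mask: (B, 1, H, W) in [0, 1].
    """
    diff = (student_feat - teacher_feat) ** 2   # per-position squared error
    masked = diff * instance_mask               # supervise only query-located areas
    # normalize by the number of supervised positions (times channels)
    return masked.sum() / (instance_mask.sum() * student_feat.size(1) + eps)

# toy usage: random features and a sparse stand-in mask
s = torch.randn(2, 256, 100, 100)
t = torch.randn(2, 256, 100, 100)
m = (torch.rand(2, 1, 100, 100) > 0.9).float()
loss = focal_feature_distill(s, t, m)
print(loss.item())
```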
- Environment requirements (gcc-6 is installed from the conda channel omgarcia, following the BEVFormer guide):

```
torch==1.9.1+cu111
torchvision==0.10.1+cu111
torchaudio==0.9.1
mmcv-full==1.4.0
mmdet==2.14.0
mmsegmentation==0.14.1
mmdet3d==0.17.1
```
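After installation, the pinned versions can be sanity-checked from Python (a convenience snippet, not part of the repository):

```python
import torch, torchvision, mmcv, mmdet, mmseg, mmdet3d

# Expected: 1.9.1+cu111 / 0.10.1+cu111 / 1.4.0 / 2.14.0 / 0.14.1 / 0.17.1
print('torch:', torch.__version__, '| cuda:', torch.version.cuda)
print('torchvision:', torchvision.__version__)
print('mmcv-full:', mmcv.__version__)
print('mmdet:', mmdet.__version__)
print('mmsegmentation:', mmseg.__version__)
print('mmdet3d:', mmdet3d.__version__)
```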
- We recommend following the BEVFormer guide for environment configuration and dataset preparation.
- Clone the repo Birds-eye-view-Perception and move `nuScenes_playground/FocalDistiller` under `mmdetection3d`.
- Download the required weights (Google Drive / Baidu disk, code: 8888) and put them into the folder `FocalDistiller/ckpts`.
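Before training or evaluation, a downloaded checkpoint can be quickly inspected with plain PyTorch to confirm it is intact (a convenience snippet; the filename is one of the checkpoints listed in the table below):

```python
import torch

ckpt = torch.load('ckpts/bevformer_r101_dcn_24ep.pth', map_location='cpu')
# mmcv-style checkpoints store weights under 'state_dict' (plus optional 'meta')
state = ckpt.get('state_dict', ckpt)
print(len(state), 'tensors, e.g.', next(iter(state)))
```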
For training a teacher or student network, run the script `tools/dist_train.sh`. For instance:

```shell
# train BEVFormer (base version)
tools/dist_train.sh ./projects/configs/bevformer/bevformer_base.py 8
# train BEVFormer (small version)
tools/dist_train.sh ./projects/configs/bevformer/bevformer_small.py 8
```
For validating a teacher or student network, run the script `tools/test.py`. For instance:

```shell
# test BEVFormer (base version)
PYTHONPATH=".":$PYTHONPATH python -m torch.distributed.launch --nproc_per_node=1 tools/test.py projects/configs/bevformer/bevformer_base.py ckpts/bevformer_r101_dcn_24ep.pth --launcher pytorch --eval bbox
# test BEVFormer (small version)
PYTHONPATH=".":$PYTHONPATH python -m torch.distributed.launch --nproc_per_node=1 tools/test.py projects/configs/bevformer/bevformer_small.py ckpts/bevformer_small_ep24.pth --launcher pytorch --eval bbox
```
For knowledge transfer between the teacher and student networks, run the script `tools/dist_distill.sh`. For instance:

```shell
./tools/dist_distill.sh projects/configs/distiller/base_distill_small_with_pv-cwd_bev-l2-heatmap.py 8
```
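The distillation config pairs a frozen teacher with a trainable student; judging from the filename, it combines channel-wise distillation (CWD) on perspective-view features with a heatmap-weighted L2 loss on BEV features. Below is a rough, hypothetical sketch of what such an mmdet-style Python config could look like; all field names are illustrative, not the repository's actual schema.

```python
# Illustrative distiller config skeleton (hypothetical field names).
_base_ = ['../bevformer/bevformer_small.py']  # the student model

distiller = dict(
    teacher_cfg='projects/configs/bevformer/bevformer_base.py',
    teacher_ckpt='ckpts/bevformer_r101_dcn_24ep.pth',  # frozen expert weights
    distill_losses=[
        # channel-wise distribution matching on perspective-view features
        dict(type='ChannelWiseDivergence', tau=1.0, loss_weight=1.0),
        # heatmap-weighted L2 imitation on BEV features
        dict(type='MaskedL2Loss', loss_weight=1.0),
    ],
)
```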
For evaluating the student network after distillation, run the script `tools/testStu.py`. For instance:

```shell
PYTHONPATH=".":$PYTHONPATH python -m torch.distributed.launch --nproc_per_node=1 tools/testStu.py projects/configs/distiller/base_distill_small_with_pv-cwd_bev-l2-heatmap.py ckpts/base_distill_small_with_pv-cwd_bev-l2-heatmap_ep24.pth --launcher pytorch --eval bbox
```
Models and results under the main metrics are provided below.
| Method | Backbone | Image Res. | BEV Res. | NDS | mAP | GFLOPs | FPS | Config | Ckpt |
|---|---|---|---|---|---|---|---|---|---|
| BEVFormer-Base (T) | R101-DCN | 900x1600 | 200x200 | 51.76 | 41.66 | 1323.41 | 1.8 | config | weight |
| BEVFormer-Small (S) | R101-DCN | 450x800 | 100x100 | 46.83 | 35.09 | 416.46 | 5.9 | config | weight |
| + HeatmapDistiller | R101-DCN | 450x800 | 100x100 | 48.98 | 37.27 | 416.46 | 5.9 | config | weight |
- The FPS metric is measured on an RTX 2080Ti.
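FPS numbers of this kind are typically obtained by timing warmed-up forward passes with CUDA synchronization. A generic sketch (not the repository's benchmarking script; `batch` is assumed to be a dict of model inputs):

```python
import time
import torch

@torch.no_grad()
def measure_fps(model, batch, n_warmup=10, n_iters=50):
    model.eval()
    for _ in range(n_warmup):      # warm up kernels and caches
        model(**batch)
    torch.cuda.synchronize()       # ensure timing brackets all GPU work
    start = time.perf_counter()
    for _ in range(n_iters):
        model(**batch)
    torch.cuda.synchronize()
    return n_iters / (time.perf_counter() - start)
```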
- Codebase for knowledge distillation in BEV perception
- Release the implementation of FocalDistiller (BEVFormer)
- Release the implementation of FocalDistiller (DETR3D, BEVDepth, 2D detection models)
All assets and code are under the Apache 2.0 license unless specified otherwise.
If this project helps your research, please consider citing our paper with the following BibTeX:
```bibtex
@inproceedings{zeng2023distilling,
  title={Distilling Focal Knowledge from Imperfect Expert for 3D Object Detection},
  author={Zeng, Jia and Chen, Li and Deng, Hanming and Lu, Lewei and Yan, Junchi and Qiao, Yu and Li, Hongyang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={992--1001},
  year={2023}
}
```