An implementation of CenterNet on the VisDrone2019 dataset. The neck is replaced with an FPN that uses deconvolution (deconv) layers.
The entire project has fewer than 2000 lines of code.
- Python >= 3.6
- PyTorch >= 1.6
- opencv-python
- pycocotools
- numba
backbone | mAP/flip | AP50/flip | AP75/flip | inference time/flip | download |
---|---|---|---|---|---|
resnet18 | 24.70/26.26 | 49.22/51.56 | 21.33/23.10 | 0.017s/0.027s | google drive |
resnet50 | 28.13/29.46 | 53.91/55.67 | 25.36/26.75 | 0.026s/0.043s | google drive |
res2net50 | 29.93/31.05 | 56.46/58.01 | 27.47/28.58 | 0.035s/0.055s | google drive |
Inference time (pure network time) is measured on a single NVIDIA Titan V GPU.
The input image resolution is 1280×960.
In each column, the value after the slash is obtained with flip testing (--flip_test).
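As a rough illustration of flip testing: the usual approach is to run the network on the image and on its horizontal mirror, flip the second prediction back, and average the two. The sketch below uses plain Python lists and a hypothetical `toy_model` in place of the real network; how test.py combines the outputs in this project is an assumption, not taken from the source.

```python
from typing import Callable, List

Grid = List[List[float]]  # a single-channel 2D map, stored row by row


def hflip(grid: Grid) -> Grid:
    """Mirror a 2D map left to right."""
    return [row[::-1] for row in grid]


def flip_test(model: Callable[[Grid], Grid], image: Grid) -> Grid:
    """Average the prediction for the image with the re-flipped prediction for its mirror."""
    plain = model(image)
    mirrored = hflip(model(hflip(image)))  # flip the prediction back into place
    return [
        [(a + b) / 2 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(plain, mirrored)
    ]


def toy_model(image: Grid) -> Grid:
    """Hypothetical stand-in for the network: weights each pixel by its column index."""
    return [[value * (col + 1) for col, value in enumerate(row)] for row in image]


image = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
averaged = flip_test(toy_model, image)
print(averaged)
```

Because the network runs twice per image, flip testing roughly doubles the inference cost, which matches the timings in the table.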
The data directory should be organized as follows:
    data/
        visdrone/
            annotations/
            train/
            test/
            val/
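Before training, it can help to confirm that the folders are in place. The following is a minimal sketch, assuming the four sub-directories sit directly under `data/visdrone`; the helper name `missing_dirs` is hypothetical and not part of this project.

```python
from pathlib import Path
from typing import List

# Sub-directories named in the layout above.
EXPECTED_SUBDIRS = ("annotations", "train", "test", "val")


def missing_dirs(root: str = "data/visdrone") -> List[str]:
    """Return the expected sub-directories that do not exist under root."""
    base = Path(root)
    return [name for name in EXPECTED_SUBDIRS if not (base / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    print("layout OK" if not missing else f"missing: {missing}")
```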
The VisDrone2019 dataset in COCO format can be downloaded from Google Drive. You can also download the original dataset from http://aiskyeye.com and convert it yourself with the tools in src/tools.
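To give an idea of what the conversion involves, here is a minimal sketch that turns one VisDrone detection annotation line (`left,top,width,height,score,category,truncation,occlusion`) into a COCO-style annotation dict. It is not the code in src/tools: the function name is hypothetical, and dropping category 0 (ignored regions) is an assumption about how such regions are usually handled.

```python
from typing import Dict, Optional


def visdrone_line_to_coco(line: str, image_id: int, ann_id: int) -> Optional[Dict]:
    """Convert one VisDrone annotation line to a COCO annotation, or None if skipped."""
    # Some annotation files end each line with a trailing comma, so drop empty fields.
    fields = [f for f in line.strip().split(",") if f != ""]
    left, top, width, height, score, category = (int(v) for v in fields[:6])
    if category == 0:  # ignored region (assumption: not exported as an object)
        return None
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category,
        "bbox": [left, top, width, height],  # COCO boxes are [x, y, width, height]
        "area": width * height,
        "iscrowd": 0,
    }


print(visdrone_line_to_coco("406,119,265,70,1,4,0,0,", image_id=7, ann_id=3))
```

A full converter would also need to write the `images` and `categories` sections of the COCO JSON file, which this sketch leaves out.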
python main.py \
--arch resnet18 \
--min_overlap 0.3 \
--gpus 0,1 \
--num_epochs 100 \
--lr_step 60,80 \
--batch_size 4 \
--lr 0.15625e-4 \
--exp_id <save_dir>
More parameters are defined in src/opt.py.
Results (weights and logs) are saved to exp/default unless you specify --exp_id.
--arch supports resnet18, resnet34, resnet50, resnet101, resnet152, res2net50, and res2net101.
If you change batch_size, scale lr accordingly.
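One common way to adjust the learning rate is the linear scaling rule, where lr grows in proportion to the batch size. The sketch below assumes that rule and takes the example command above (`--batch_size 4`, `--lr 0.15625e-4`) as the reference point; the source does not state which scaling rule the author uses.

```python
# Reference values from the example training command.
BASE_BATCH_SIZE = 4
BASE_LR = 0.15625e-4  # equal to 1.5625e-5


def scaled_lr(batch_size: int) -> float:
    """Scale the learning rate linearly with the batch size (assumed rule)."""
    return BASE_LR * batch_size / BASE_BATCH_SIZE


for batch_size in (4, 8, 16):
    print(f"--batch_size {batch_size} --lr {scaled_lr(batch_size):.6e}")
```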
python test.py \
--arch resnet18 \
--gpus 0 \
--load_model <path/to/weight_name.pt> \
--flip_test
python demo.py \
--arch resnet18 \
--gpus 0 \
--load_model <path/to/weight_name.pt> \
--image <path/to/your_picture.jpg>
https://github.com/xingyizhou/CenterNet
https://github.com/yjh0410/CenterNet-Lite
https://github.com/tjiiv-cprg/visdrone-det-toolkit-python
https://blog.csdn.net/mary_0830/article/details/103361664