🌕 [ICCV 2021] Multitask AET with Orthogonal Tangent Regularity for Dark Object Detection. A self-supervised learning way for low-light image object detection.

cuiziteng/ICCV_MAET

(ICCV 2021) Multitask AET with Orthogonal Tangent Regularity for Dark Object Detection (paper) (supp) (Zhihu article, in Chinese)

Abstract

Dark environment becomes a challenge for computer vision algorithms owing to insufficient photons and undesirable noise. To enhance object detection in a dark environment, we propose a novel multitask auto encoding transformation (MAET) model which is able to explore the intrinsic pattern behind illumination translation. In a self-supervision manner, the MAET learns the intrinsic visual structure by encoding and decoding the realistic illumination-degrading transformation considering the physical noise model and image signal processing (ISP). Based on this representation, we achieve the object detection task by decoding the bounding box coordinates and classes. To avoid the over-entanglement of two tasks, our MAET disentangles the object and degrading features by imposing an orthogonal tangent regularity. This forms a parametric manifold along which multi-task predictions can be geometrically formulated by maximizing the orthogonality between the tangents along the outputs of respective tasks. Our framework can be implemented based on the mainstream object detection architecture and directly trained end-to-end using normal target detection datasets, such as VOC and COCO. We have achieved the state-of-the-art performance using synthetic and real-world datasets.
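
One way to read "maximizing the orthogonality between the tangents" is as pushing the cosine between two tangent vectors toward zero. The sketch below computes that quantity for two plain vectors. It is only an illustration: the function name `abs_cosine` and the example vectors are hypothetical, and this is not the exact loss used in the paper or in this repository.

```python
import math
from typing import Sequence


def abs_cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Absolute cosine of the angle between u and v (0 means orthogonal)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        # A zero vector has no direction; treat it as contributing no penalty.
        return 0.0
    return abs(dot) / norm


if __name__ == "__main__":
    # Hypothetical tangent vectors for the detection and degradation tasks.
    object_tangent = [1.0, 0.0, 2.0]
    degrade_tangent = [0.0, 3.0, 0.0]
    print(abs_cosine(object_tangent, degrade_tangent))
```

Minimizing a penalty of this kind drives the two directions apart, which matches the disentangling idea the abstract describes.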

When Human Vision Meets Machine Vision (compare with enhancement methods):

Physics-based low-light degrading transformation (unprocess -- degradation -- ISP):
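
The three stages named above (unprocess, degradation, ISP) can be sketched on a single channel. This is a toy version under stated assumptions, not the repository's implementation: the gamma value, the function name `degrade`, and all parameter names are hypothetical, and it leaves out white balance, colour correction, tone mapping and quantisation. The noise model (variance proportional to the signal plus a constant floor) follows the shot/read noise model of the unprocess work.

```python
import math
import random
from typing import List

GAMMA = 2.2  # assumed display gamma; a real ISP has more stages than this


def degrade(pixels: List[float], darkness: float,
            shot_noise: float = 0.0, read_noise: float = 0.0,
            seed: int = 0) -> List[float]:
    """Darken pixel values in [0, 1] in (approximately) linear sensor space."""
    rng = random.Random(seed)
    result = []
    for value in pixels:
        linear = value ** GAMMA              # unprocess: undo the gamma curve
        dark = darkness * linear             # degradation: fewer photons
        sigma = math.sqrt(shot_noise * dark + read_noise)
        noisy = dark + rng.gauss(0.0, sigma)  # shot + read noise
        noisy = min(1.0, max(0.0, noisy))     # keep the value in range
        result.append(noisy ** (1.0 / GAMMA))  # ISP: re-apply the gamma curve
    return result


if __name__ == "__main__":
    print(degrade([0.2, 0.5, 0.9], darkness=0.1,
                  shot_noise=0.01, read_noise=0.0001))
```

Working in linear space matters because photon counts scale linearly with light, while the stored image values do not.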

Environment

python 3.7
pytorch 1.6.0
mmcv 1.1.5 (for example, with CUDA 10.1 and torch 1.6.0: pip install mmcv-full==1.1.5 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html; for details see https://github.com/open-mmlab/mmcv)
matplotlib opencv-python Pillow tqdm scipy
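
A small script can confirm that the installed versions match the ones listed above. This is a minimal sketch: the helper names are hypothetical, and it relies only on the `__version__` attribute that `torch` and `mmcv` both expose.

```python
from typing import Dict, Optional

# Versions listed in the Environment section above.
REQUIRED = {"torch": "1.6.0", "mmcv": "1.1.5"}


def installed_version(module_name: str) -> Optional[str]:
    """Return module.__version__, or None if the module cannot be imported."""
    try:
        module = __import__(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def version_mismatches(installed: Dict[str, Optional[str]],
                       required: Dict[str, str] = REQUIRED) -> Dict[str, str]:
    """Map each package whose version differs from `required` to a message."""
    problems = {}
    for name, wanted in required.items():
        found = installed.get(name)
        if found is None:
            problems[name] = f"not installed (want {wanted})"
        # torch reports builds such as "1.6.0+cu101"; compare the part before "+".
        elif found.split("+")[0] != wanted:
            problems[name] = f"found {found} (want {wanted})"
    return problems


if __name__ == "__main__":
    found = {name: installed_version(name) for name in REQUIRED}
    for name, message in version_mismatches(found).items():
        print(f"{name}: {message}")
```

The script prints nothing when both packages match.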

Pre-trained Model

| dataset | model | size | logs |
|---|---|---|---|
| MAET-COCO (ours) | 80 class (google drive) (baiduyun, passwd:1234) | 489.10 MB | - |
| MAET-EXDark (ours) (77.7) | 20 class (google drive) (baiduyun, passwd:1234) | 470.26 MB | google drive |
| EXDark (76.8) | 20 class (google drive) (baiduyun, passwd:1234) | 470.26 MB | - |
| EXDark (MBLLEN) (76.3) | 20 class (google drive) (baiduyun, passwd:1234) | 470.26 MB | - |
| EXDark (Kind) (76.3) | 20 class (google drive) (baiduyun, passwd:1234) | 470.26 MB | - |
| EXDark (Zero-DCE) (76.9) | 20 class (google drive) (baiduyun, passwd:1234) | 470.26 MB | - |
| MAET-UG2-DarkFace (ours) (56.2) | 1 class (google drive) (baiduyun, passwd:1234) | 469.81 MB | - |

Pre-process

Step-1:

For the MS COCO dataset (used for pre-training): download the COCO 2017 dataset.

For the EXDark dataset (used for fine-tuning and evaluation): download EXDark (including the versions enhanced by MBLLEN, Zero-DCE and KinD) in VOC format from google drive or baiduyun, passwd:1234. The EXDark dataset should look like this:

EXDark
│      
│
└───JPEGImages
│   │───IMGS (original low-light images)
│   │───IMGS_Kind (images enhanced by [KinD, MM 2019])
│   │───IMGS_ZeroDCE (images enhanced by [Zero-DCE, CVPR 2020])
│   │───IMGS_MEBBLN (images enhanced by [MBLLEN, BMVC 2018])
│───Annotations   
│───main
│───label

For the UG2-DarkFace dataset (used for fine-tuning and evaluation): download UG2 in VOC format from google drive or baiduyun, passwd:1234. The UG2-DarkFace dataset should look like this:

UG2
│      
└───main
│───xml  
│───label
│───imgs
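
The two layouts above can be checked before training with a short script. This is a sketch under one assumption: the sub-paths below are read off the directory trees as drawn, so adjust them if your copy nests the folders differently. The function name `missing_entries` is hypothetical.

```python
from pathlib import Path
from typing import List

# Sub-paths read off the two directory trees above; the nesting is an assumption.
EXDARK_LAYOUT = [
    "JPEGImages/IMGS",
    "JPEGImages/IMGS_Kind",
    "JPEGImages/IMGS_ZeroDCE",
    "JPEGImages/IMGS_MEBBLN",
    "Annotations",
    "main",
    "label",
]
UG2_LAYOUT = ["main", "xml", "label", "imgs"]


def missing_entries(root: str, layout: List[str]) -> List[str]:
    """Return the expected sub-paths that do not exist under `root`."""
    base = Path(root)
    return [rel for rel in layout if not (base / rel).exists()]


if __name__ == "__main__":
    # "EXDark" and "UG2" are placeholder locations; use your own dataset paths.
    print("EXDark missing:", missing_entries("EXDark", EXDARK_LAYOUT))
    print("UG2 missing:", missing_entries("UG2", UG2_LAYOUT))
```

An empty list means every expected folder or file was found.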

Step-2: cd into "your_project_path" and run the set-up process (see mmdetection for details):

git clone git@github.com:cuiziteng/ICCV_MAET.git
cd "your project path"
pip install -r requirements/build.txt
pip install -v -e .  # or "python setup.py develop"

Step-3: Set the dataset paths: change line1 and line2 to your own COCO and EXDark paths, and line3 to your own UG2-DarkFace path.

Testing

Testing MAET-YOLOV3 on (low-light) COCO dataset

python tools/test.py configs/MAET_yolo/maet_yolo_coco_ort.py [COCO model path] --eval bbox --show-dir [save dir]

Testing MAET-YOLOV3 on EXDark dataset

python tools/test.py configs/MAET_yolo/maet_yolo_exdark.py  [EXDark model path] --eval mAP --show-dir [save dir]

Testing MAET-YOLOV3 on UG2-DarkFace dataset

python tools/test.py configs/MAET_yolo/maet_yolo_ug2.py [UG2-DarkFace model path] --eval mAP --show-dir [save dir]

Comparative Experiment
Testing YOLOV3 on the EXDark dataset enhanced by MBLLEN / KinD / Zero-DCE

# swap in yolo_kind.py or yolo_zero_dce.py (with the matching model) for the other enhancement methods
python tools/test.py configs/MAET_yolo/yolo_mbllen.py [MBLLEN model path] --eval mAP --show-dir [save dir]
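
The test commands above differ only in the config file and the `--eval` metric, so they can be generated from one table. The sketch below builds the argument list for each case. The config paths and metrics are taken from the commands above; the dictionary keys and the function name `build_test_command` are hypothetical.

```python
from typing import Dict, List, Tuple

# Config file and --eval metric for each test command listed above.
TEST_SETUPS: Dict[str, Tuple[str, str]] = {
    "coco": ("configs/MAET_yolo/maet_yolo_coco_ort.py", "bbox"),
    "exdark": ("configs/MAET_yolo/maet_yolo_exdark.py", "mAP"),
    "ug2": ("configs/MAET_yolo/maet_yolo_ug2.py", "mAP"),
    "exdark_mbllen": ("configs/MAET_yolo/yolo_mbllen.py", "mAP"),
    "exdark_kind": ("configs/MAET_yolo/yolo_kind.py", "mAP"),
    "exdark_zero_dce": ("configs/MAET_yolo/yolo_zero_dce.py", "mAP"),
}


def build_test_command(setup: str, checkpoint: str, show_dir: str) -> List[str]:
    """Return the tools/test.py argument list for one of the setups above."""
    config, metric = TEST_SETUPS[setup]
    return ["python", "tools/test.py", config, checkpoint,
            "--eval", metric, "--show-dir", show_dir]


if __name__ == "__main__":
    # Placeholder checkpoint and output paths; replace with your own.
    print(" ".join(build_test_command("exdark", "maet_exdark.pth", "results")))
```

Passing the returned list to `subprocess.run(..., check=True)` from the repository root would run the evaluation.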

Training

Step-1: Pre-train the MAET-COCO model (273 epochs on 4 GPUs). If you use a different number of GPUs, adjust the learning rate accordingly. Alternatively, download our pre-trained COCO model directly (google drive) (baiduyun, passwd:1234).

CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=[port number] bash ./tools/dist_train_maet.sh configs/MAET_yolo/maet_yolo_coco_ort.py 4
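
One common way to reset the learning rate for a different GPU count is the linear scaling rule: keep the per-GPU batch size fixed and scale the learning rate in proportion to the number of GPUs. The README does not say which rule the authors intend, so treat this as an assumption. The `base_lr` value in the example is a placeholder, not the value from the config file.

```python
def scaled_learning_rate(base_lr: float, base_gpus: int, new_gpus: int) -> float:
    """Scale a learning rate linearly with the number of GPUs.

    Assumes the per-GPU batch size stays the same, so the total batch
    size grows or shrinks in proportion to the GPU count.
    """
    if base_gpus <= 0 or new_gpus <= 0:
        raise ValueError("GPU counts must be positive")
    return base_lr * new_gpus / base_gpus


if __name__ == "__main__":
    base_lr = 0.001  # placeholder; read the real value from the config file
    for gpus in (1, 2, 8):
        print(gpus, scaled_learning_rate(base_lr, base_gpus=4, new_gpus=gpus))
```

For example, moving from 4 GPUs to 8 would double the learning rate under this rule, and moving to 1 GPU would divide it by four.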

Step-2 (EXDark): Fine-tune on the EXDark dataset (25 epochs on 1 GPU):

python tools/train.py configs/MAET_yolo/maet_yolo_exdark.py --gpu-ids [gpu id] --load-from [COCO model path]

Step-2 (UG2-DarkFace): Fine-tune on the UG2-DarkFace dataset (20 epochs on 1 GPU):

python tools/train.py configs/MAET_yolo/maet_yolo_ug2.py --gpu-ids [gpu id] --load-from [COCO model path]

Comparative Experiment
Fine-tune on the EXDark dataset enhanced by MBLLEN / KinD / Zero-DCE (25 epochs on 1 GPU), starting from a well-trained normal COCO model (608x608) for fairness

# swap in yolo_kind.py or yolo_zero_dce.py for the other enhancement methods
python tools/train.py configs/MAET_yolo/yolo_mbllen.py --gpu-ids [gpu id]

Baselines on the EXDark dataset (updated), using YOLO-V3 as the object detector:

| class | Bicycle | Boat | Bottle | Bus | Car | Cat | Chair | Cup | Dog | Motorbike | People | Table | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline | 79.8 | 75.3 | 78.1 | 92.3 | 83.0 | 68.0 | 69.0 | 79.0 | 78.0 | 77.3 | 81.5 | 55.5 | 76.4 |
| KIND (MM 2019) | 80.1 | 77.7 | 77.2 | 93.8 | 83.9 | 66.9 | 68.7 | 77.4 | 79.3 | 75.3 | 80.9 | 53.8 | 76.3 |
| MBLLEN (BMVC 2018) | 82.0 | 77.3 | 76.5 | 91.3 | 84.0 | 67.6 | 69.1 | 77.6 | 80.4 | 75.6 | 81.9 | 58.6 | 76.8 |
| Zero-DCE (CVPR 2020) | 84.1 | 77.6 | 78.3 | 93.1 | 83.7 | 70.3 | 69.8 | 77.6 | 77.4 | 76.3 | 81.0 | 53.6 | 76.9 |
| MAET (ICCV 2021) | 83.1 | 78.5 | 75.6 | 92.9 | 83.1 | 73.4 | 71.3 | 79.0 | 79.8 | 77.2 | 81.1 | 57.0 | 77.7 |
| DENet (ACCV 2022) | 80.4 | 79.7 | 77.9 | 91.2 | 82.7 | 72.8 | 69.9 | 80.1 | 77.2 | 76.7 | 82.0 | 57.2 | 77.3 |
| IAT-YOLO (BMVC 2022) | 79.8 | 76.9 | 78.6 | 92.5 | 83.8 | 73.6 | 72.4 | 78.6 | 79.0 | 79.0 | 81.1 | 57.7 | 77.8 |
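
The Total column appears to be the mean of the twelve per-class APs (that is, the mAP). The short check below recomputes it for two rows of the table; the numbers are copied from the table, and the variable and function names are only illustrative.

```python
from typing import Dict, List

CLASSES = ["Bicycle", "Boat", "Bottle", "Bus", "Car", "Cat",
           "Chair", "Cup", "Dog", "Motorbike", "People", "Table"]

# Per-class AP values copied from the table above.
PER_CLASS_AP: Dict[str, List[float]] = {
    "Baseline": [79.8, 75.3, 78.1, 92.3, 83.0, 68.0,
                 69.0, 79.0, 78.0, 77.3, 81.5, 55.5],
    "MAET (ICCV 2021)": [83.1, 78.5, 75.6, 92.9, 83.1, 73.4,
                         71.3, 79.0, 79.8, 77.2, 81.1, 57.0],
}


def mean_ap(per_class: List[float]) -> float:
    """Mean of the per-class average precisions."""
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    for method, aps in PER_CLASS_AP.items():
        assert len(aps) == len(CLASSES)
        print(f"{method}: {mean_ap(aps):.1f}")
```

The two means round to 76.4 and 77.7, matching the Total column for those rows.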

Citation

If our work helps your research, please cite our paper. Thanks.

@InProceedings{Cui_2021_ICCV,
    author    = {Cui, Ziteng and Qi, Guo-Jun and Gu, Lin and You, Shaodi and Zhang, Zenghui and Harada, Tatsuya},
    title     = {Multitask AET With Orthogonal Tangent Regularity for Dark Object Detection},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {2553-2562}
}

If you are also interested in low-light image enhancement and exposure correction, please refer to our BMVC 2022 project, Illumination Adaptive Transformer.

The code is largely borrowed from mmdetection and unprocess. Thanks to their authors for the wonderful work.
MMdetection: mmdetection (v2.7.0)
Unprocessing Images for Learned Raw Denoising: unprocess
