Most existing Siamese-based tracking methods perform classification and regression of the target object based on similarity maps. However, they either employ a single map from the last convolutional layer, which degrades localization accuracy in complex scenarios, or use multiple maps separately for decision making, which introduces intractable computation for aerial mobile platforms. Thus, in this work, we propose an efficient and effective hierarchical feature transformer (HiFT) for aerial tracking. Hierarchical similarity maps generated by multi-level convolutional layers are fed into the feature transformer to achieve interactive fusion of spatial cues (shallow layers) and semantic cues (deep layers). Consequently, not only is global contextual information captured, which facilitates the target search, but our end-to-end architecture with the transformer can also efficiently learn the interdependencies among multi-level features, thereby discovering a tracking-tailored feature space with strong discriminability. Comprehensive evaluations on four aerial benchmarks demonstrate the effectiveness of HiFT. Real-world tests on an aerial platform validate its practicability at real-time speed.
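To make the term "similarity map" concrete, the following is a minimal, single-channel sketch of the cross-correlation that Siamese trackers typically use to compare a template with a search region. It is not the HiFT implementation: the function names `cross_correlation` and `argmax_2d` are hypothetical, plain Python lists stand in for feature tensors, and the transformer-based fusion of multi-level maps is not shown.

```python
def cross_correlation(search, template):
    """Slide `template` over `search` (both 2-D lists of numbers) and
    return a 2-D list holding the dot product at every valid offset."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    score_map = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            score = sum(
                search[i + di][j + dj] * template[di][dj]
                for di in range(th)
                for dj in range(tw)
            )
            row.append(score)
        score_map.append(row)
    return score_map


def argmax_2d(score_map):
    """Return the (row, col) offset with the highest similarity score."""
    _, row, col = max(
        (value, i, j)
        for i, line in enumerate(score_map)
        for j, value in enumerate(line)
    )
    return row, col


# Toy example: the template pattern is embedded at offset (1, 2).
template = [[1, 2],
            [3, 4]]
search = [[0, 0, 0, 0],
          [0, 0, 1, 2],
          [0, 0, 3, 4],
          [0, 0, 0, 0]]
score_map = cross_correlation(search, template)
peak = argmax_2d(score_map)
```

In this toy case the peak of the map coincides with the location of the embedded pattern. A real tracker computes such maps on learned feature channels from several layers rather than on raw values.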
This figure shows the workflow of our tracker.
This code has been tested on Ubuntu 18.04, Python 3.8.3, PyTorch 1.6.0 (torchvision 0.7.0), and CUDA 10.2. Please install the required libraries before running the code:
pip install -r requirements.txt
Download the pretrained model, general_model (code: c99t) or general_model_google, and put it into the tools/snapshot directory.
Download the testing datasets and put them into the test_dataset directory. To test the tracker on a new dataset, please refer to pysot-toolkit to set up test_dataset.
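Before running the test script, it can help to confirm that the downloaded files are where the commands expect them. The sketch below is a hypothetical helper (`missing_paths` is not part of this repository). It assumes the paths are given relative to the directory from which test.py is launched, so adjust `root` and the path list to your own layout.

```python
from pathlib import Path


def missing_paths(required, root="."):
    """Return the entries of `required` that do not exist under `root`."""
    root = Path(root)
    return [path for path in required if not (root / path).exists()]


# Assumed layout, relative to the launch directory; adjust as needed.
REQUIRED = ["snapshot/general_model.pth", "test_dataset"]

# Example usage:
#     for path in missing_paths(REQUIRED):
#         print(f"missing: {path}")
```

An empty list means every expected file or directory was found.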
# --dataset: dataset_name, --snapshot: tracker_name
python test.py \
    --dataset UAV10fps \
    --snapshot snapshot/general_model.pth
The testing results will be saved in the results/dataset_name/tracker_name directory.
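As an illustration of the results/dataset_name/tracker_name layout, the sketch below builds the expected output directory for a given run. The helper `result_dir` is hypothetical, and it assumes that the tracker name is the snapshot file name without its extension (as suggested by `general_model.pth` and the `general_model` prefix used for evaluation). Check test.py if your setup differs.

```python
from pathlib import Path


def result_dir(dataset_name, snapshot_path, results_root="results"):
    """Build results_root/dataset_name/tracker_name for one test run.

    Assumption: tracker_name is the snapshot file name without extension.
    """
    tracker_name = Path(snapshot_path).stem
    return Path(results_root) / dataset_name / tracker_name


expected = result_dir("UAV10fps", "snapshot/general_model.pth")
```

Here `expected` points to `results/UAV10fps/general_model`.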
Download the datasets:
Note: train_dataset/dataset_name/readme.md lists the detailed steps for generating the training datasets.
To train the HiFT model, run train.py with the desired configs:
cd tools
python train.py
We provide the tracking results (code: tj12), also available as results_google, for UAV123@10fps, DTB70, UAV20L, and UAV123. To evaluate the tracker, please put those results into the results directory.
# --tracker_path: result path, --dataset: dataset_name, --tracker_prefix: tracker_name
python eval.py \
    --tracker_path ./results \
    --dataset UAV20 \
    --tracker_prefix 'general_model'
If you have any questions, please contact me.
Ziang Cao
Email: 1753419@tongji.edu.cn
Results on DTB70 and UAV20L
For more evaluations, please refer to our paper.
@INPROCEEDINGS{cao2021iccv,
author={Cao, Ziang and Fu, Changhong and Ye, Junjie and Li, Bowen and Li, Yiming},
booktitle={Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
title={{HiFT: Hierarchical Feature Transformer for Aerial Tracking}},
year={2021},
volume={},
number={},
pages={1-10}
}
The code is implemented based on pysot. We would like to express our sincere thanks to the contributors.