This repository contains the code for this article on building a custom object detection model from scratch for satellite imagery.
Information about the files and folders
- download_planet_image.py: for downloading the Planet image tiles. Requires the planet api-key and AOI geometry (in geojson).
- image_to_chip.py: converts the labeled satellite tiles into chips of smaller size.
- train.txt and test.txt: contain the paths to the training and test images.
- dataset: holds the full labeled dataset for training the model.
- ground_truth: contains the images with their ground truth bounding boxes.
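The chipping step can be pictured as sliding a fixed-size window over each tile. The following is a minimal sketch of that idea only, not the code in image_to_chip.py: the function name `chip_windows`, the `stride` parameter, and the choice to drop partial chips at the right and bottom edges are all assumptions made for illustration.

```python
def chip_windows(width, height, chip_size, stride=None):
    """Yield (left, upper, right, lower) pixel boxes that tile an image.

    Hypothetical helper: chips that would extend past the right or
    bottom edge are skipped rather than padded.
    """
    # Without an explicit stride, chips do not overlap.
    stride = stride or chip_size
    for upper in range(0, height - chip_size + 1, stride):
        for left in range(0, width - chip_size + 1, stride):
            yield (left, upper, left + chip_size, upper + chip_size)


if __name__ == "__main__":
    # Example: a 1024 x 1024 tile cut into 512 x 512 chips.
    for box in chip_windows(1024, 1024, 512):
        print(box)
```

Each box could then be passed to an image library's crop function. The bounding boxes of the labels would also have to be shifted into each chip's coordinate frame, which this sketch leaves out.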
The model was trained on this dataset using YOLOv3 and the darknet framework.
Run these commands to set up the darknet framework:
git clone https://github.com/pjreddie/darknet
cd darknet
make
Download the pre-trained darknet53 convolutional weights:
wget https://pjreddie.com/media/files/darknet53.conv.74
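Try running the detector to confirm the installation. This command also needs the yolov3.weights file (available at https://pjreddie.com/media/files/yolov3.weights) in the darknet folder.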
./darknet detect cfg/yolov3.cfg yolov3.weights data/dog.jpg
Darknet requires three files to know how and what to train:
- data/ship.names
- cfg/ship.data
- cfg/yolov3-ship.cfg
The .names file contains the names of the object categories you want to detect. Since we have only one object category (ship), the file contains a single line with that name. This name is shown over the bounding box in the output. For more than one category, each name goes on a separate line.
The .data file contains information about the training data. The details in this file are mostly self-explanatory. The names variable holds the path to the object names file you just defined, and backup is where the model checkpoints are stored during training. The train.txt and test.txt files contain the paths to the training and test images.
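As a rough illustration of the two files described above, the sketch below renders their contents as text. The key names (classes, train, valid, names, backup) follow the sample .data files shipped with darknet, such as cfg/coco.data. The helper names `render_names` and `render_data` and all paths in the example are assumptions, so adjust them to your own layout.

```python
def render_names(class_names):
    """Return the text of a .names file: one category name per line."""
    return "\n".join(class_names) + "\n"


def render_data(num_classes, train_list, valid_list, names_file, backup_dir):
    """Return the text of a .data file pointing darknet at the dataset."""
    lines = [
        f"classes = {num_classes}",
        f"train = {train_list}",    # text file listing training image paths
        f"valid = {valid_list}",    # text file listing test image paths
        f"names = {names_file}",    # the .names file rendered above
        f"backup = {backup_dir}",   # directory for training checkpoints
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example paths only; they are not taken from the repository.
    print(render_names(["ship"]), end="")
    print(render_data(1, "train.txt", "test.txt", "data/ship.names", "backup"), end="")
```

Writing the two returned strings to data/ship.names and cfg/ship.data gives darknet the inputs it expects for a single-class detector.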
The final step is to set up the .cfg file, which contains the information about the YOLO network architecture. For that, copy the cfg/yolov3.cfg file in the darknet folder, save the copy as cfg/yolov3-ship.cfg, and make the following changes:
- The variable batch defines the number of images used for one training step, while subdivisions is the number of mini-batches. For example, with batch=64 and subdivisions=4, one training step requires four mini-batches of 64/4=16 images each before the parameters are updated. Set these variables according to your CPU/GPU memory size.
- width and height represent the size of the input image; in our case, both are 512.
- YOLOv3 outputs boxes at 3 different resolutions, predicting 3 boxes per grid cell at each resolution. Each box is represented by five numbers (confidence, x, y, width, and height) plus one score per class. Therefore, the number of filters in the last layer before each YOLO layer is calculated by the formula
  filters = (classes + 5) * 3
  Since we have only 1 class, the number of filters becomes 18. Now replace each occurrence of classes=80 by classes=1 in the file (at lines 610, 696, and 783).
- Also, replace the filters=255 line by filters=18 in the convolutional layer just before each occurrence of the classes variable (at lines 603, 689, and 776).
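The arithmetic and the two search-and-replace edits above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `yolo_filters`, `images_per_mini_batch`, and `patch_cfg` are made up here, and the replacement relies on filters=255 appearing only in the layers that feed a YOLO layer, which holds for the stock yolov3.cfg but should be checked for other configs.

```python
import re


def yolo_filters(num_classes, boxes_per_scale=3):
    """Filters in the layer before each YOLO layer: (classes + 5) * 3."""
    return (num_classes + 5) * boxes_per_scale


def images_per_mini_batch(batch, subdivisions):
    """Images processed at once when a batch is split into mini-batches."""
    return batch // subdivisions


def patch_cfg(cfg_text, num_classes, old_classes=80):
    """Return cfg_text with the class count and matching filter count replaced."""
    old_filters = yolo_filters(old_classes)   # 255 for the 80 original classes
    new_filters = yolo_filters(num_classes)   # 18 for a single class
    # Match whole lines only, so values such as filters=256 are left alone.
    text = re.sub(
        rf"^classes[ \t]*=[ \t]*{old_classes}[ \t]*$",
        f"classes={num_classes}",
        cfg_text,
        flags=re.MULTILINE,
    )
    text = re.sub(
        rf"^filters[ \t]*=[ \t]*{old_filters}[ \t]*$",
        f"filters={new_filters}",
        text,
        flags=re.MULTILINE,
    )
    return text


if __name__ == "__main__":
    print(yolo_filters(1))               # (1 + 5) * 3
    print(images_per_mini_batch(64, 4))  # 64 / 4
```

To apply it, read cfg/yolov3.cfg as text, pass it through `patch_cfg(text, 1)`, and write the result to cfg/yolov3-ship.cfg. The batch, subdivisions, width, and height values still need to be set by hand.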
To train the model, start from the pre-trained darknet53 weights downloaded earlier:
./darknet detector train cfg/ship.data cfg/yolov3-ship.cfg darknet53.conv.74
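To test the trained model on an image, use the same .cfg file together with a checkpoint from the backup folder:
./darknet detector test cfg/ship.data cfg/yolov3-ship.cfg backup/backup_file.weights test_file.jpg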