While learning YOLO I went through a lot of blogs, GitHub code, and courses. I have tried to combine them here and show how to work with your own dataset.
I used Anaconda and Jupyter Notebook, with Darkflow to detect custom objects.
I work on Windows, so all of these tips should run well on Windows.
Python3, tensorflow 1.0, numpy, opencv 3. Links for installation below:
- Python 3.5 or 3.6, Anaconda
- Tensorflow. I recommend using the tensorflow GPU version. But if you don't have a GPU, just go ahead and install the CPU version.
GPUs can be more than 100x faster than a CPU for training and testing neural networks. Find more here
- Opencv
- Click this
- Download and extract the files somewhere locally
You can choose one of the following three ways to get started with darkflow. If you are using Python 3 on Windows you will need to install Microsoft Visual C++ 14.0. Here you can find the installation process, why it is required, and references, or you can try Stack Overflow.
- Just build the Cython extensions in place. NOTE: If installing this way you will have to use `./flow` in the cloned darkflow directory instead of `flow`, as darkflow is not installed globally.
  `python3 setup.py build_ext --inplace`
- Let pip install darkflow globally in dev mode (still globally accessible, but changes to the code immediately take effect)
  `pip install -e .`
- Install with pip globally
  `pip install .`
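After installing, a quick sanity check can save some debugging time. The snippet below is a minimal sketch that only checks whether the packages listed above can be found by the current Python environment; it does not import them or verify their versions. The module names (`numpy`, `cv2`, `tensorflow`, `darkflow`) are the usual import names, and `check_modules` and `report` are hypothetical helpers written for this example. Note that with the in-place build, darkflow is likely to be found only when you run this from inside the cloned darkflow directory.

```python
import importlib.util

# Usual import names for the dependencies listed above (assumption:
# OpenCV is imported as "cv2"; darkflow is importable only after install
# or when running from inside the cloned directory).
REQUIRED = ("numpy", "cv2", "tensorflow", "darkflow")


def check_modules(names=REQUIRED):
    """Return {module_name: True/False} without importing the modules."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def report(status):
    """Print one line per module saying whether it was found."""
    for name, found in status.items():
        print(f"{name:<12}{'found' if found else 'MISSING'}")


report(check_modules())
```

If something is reported as missing, make sure the notebook is using the same Anaconda environment in which you installed the packages.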
- Download the YOLOv2 608x608 weights file here
- Read more about YOLO (in darknet) and download weight files here. In case the weight file cannot be found, you can check here, which includes `yolo-full` and `yolo-tiny` of v1.0, `tiny-yolo-v1.1` of v1.1, and `yolo`, `tiny-yolo-voc` of v2. The owner of these weights is Trieu.
- NOTE: there are other weights files you can try if you like
- create a `weights` folder within the `darkflow-master` folder
- put the weights file in the `weights` folder
I trained the model on around 250 images. I recommend a much bigger dataset for better performance.
To make a dataset of objects around you:
- start taking photos of the objects that you want to detect.
- make sure you have pictures from different angles, in different poses, in different environments, etc.
- try to make the dataset as big as possible for better performance.
- To annotate images, download labelImg.
- Check this video to learn how to use labelImg.
- Github repo for labelImg can be found here
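Once the images are annotated, it helps to check how many boxes each class has. The sketch below assumes the annotations were saved by labelImg in Pascal VOC XML format (one `.xml` file per image, with one `<object><name>` entry per box) and that they live in a single folder such as `train/Annotations`. The helper names `count_labels` and `unknown_labels` are made up for this example.

```python
import glob
import os
import xml.etree.ElementTree as ET
from collections import Counter


def count_labels(annotation_dir):
    """Count <object><name> entries across all Pascal VOC XML files in a folder."""
    counts = Counter()
    for path in sorted(glob.glob(os.path.join(annotation_dir, "*.xml"))):
        root = ET.parse(path).getroot()
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name:
                counts[name.strip()] += 1
    return counts


def unknown_labels(counts, labels_path):
    """Return labels used in the annotations but not listed in labels.txt."""
    with open(labels_path) as f:
        known = {line.strip() for line in f if line.strip()}
    return sorted(set(counts) - known)


# Example usage (paths are assumptions, adjust them to your layout):
# counts = count_labels("train/Annotations")
# print(counts)
# print(unknown_labels(counts, "labels.txt"))
```

A very uneven count between classes, or a label with a typo that shows up in `unknown_labels`, is worth fixing before training.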
The steps below assume we want to use tiny YOLO and our dataset has 3 classes
- Create a copy of the configuration file `tiny-yolo-voc.cfg` and rename it according to your preference, for example `tiny-yolo-voc-3c.cfg`. (It is crucial that you leave the original `tiny-yolo-voc.cfg` file unchanged, see below for explanation.) Here `tiny-yolo-voc-3c.cfg` is for 3 classes; you can change the name as you wish.
- In `tiny-yolo-voc-3c.cfg`, change classes in the [region] layer (the last layer) to the number of classes you are going to train for. In our case, classes are set to 3.

      ...
      [region]
      anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52
      bias_match=1
      classes=3  ## 3 classes
      coords=4
      num=5
      softmax=1
      ...
- In `tiny-yolo-voc-3c.cfg`, change filters in the [convolutional] layer (the second to last layer) to num * (classes + 5). In our case, num is 5 and classes is 3, so 5 * (3 + 5) = 40 and filters is set to 40.

      ...
      [convolutional]
      size=1
      stride=1
      pad=1
      filters=40  ## 5 * (3 + 5) = 40
      activation=linear

      [region]
      anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52
      ...
- Change `labels.txt` to include the label(s) you want to train on (the number of labels should be the same as the number of classes you set in the `tiny-yolo-voc-3c.cfg` file). In my case, `labels.txt` will contain 3 labels.

      king
      ace
      ten
- Reference the `tiny-yolo-voc-3c.cfg` model when you train.

      python flow --model cfg/tiny-yolo-voc-3c.cfg --load weights/tiny-yolo-voc.weights --train --annotation train/Annotations --dataset train/Images --gpu 1.0 --epochs 300

  On Windows you need to type `python` at the beginning, otherwise it does not recognise the flow command. Next specify the model with `--model cfg/tiny-yolo-voc-3c.cfg` and the weights with `--load weights/tiny-yolo-voc.weights`. After that specify the paths for the annotations, `--annotation train/Annotations`, and the images, `--dataset train/Images`. Use `--gpu 1.0` to train on the GPU for speed; if you do not have a GPU, just leave this part out. You can specify the number of epochs; by default it is 1000. However, training can be stopped at any time. I recommend training until the loss is below 1.
- Why should I leave the original `tiny-yolo-voc.cfg` file unchanged?

  When darkflow sees you are loading `tiny-yolo-voc.weights` it will look for `tiny-yolo-voc.cfg` in your cfg/ folder and compare that configuration file to the new one you have set with `--model cfg/tiny-yolo-voc-3c.cfg`. In this case, every layer will have the exact same number of weights except for the last two, so it will load the weights into all layers up to the last two, because those two now contain a different number of weights.
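The two configuration edits described above (classes in the last [region] layer, filters in the [convolutional] layer just before it) can also be scripted. The following is a minimal sketch, assuming the cfg file has the same layout as `tiny-yolo-voc.cfg`: plain `key=value` lines grouped under `[section]` headers, with `num` defined in the [region] section. `filters_for` and `customize_cfg` are hypothetical helpers for this example, not part of darkflow.

```python
import re


def filters_for(num, classes):
    """Filters of the last convolutional layer: num * (classes + 5)."""
    return num * (classes + 5)


def customize_cfg(cfg_text, classes):
    """Return cfg text with classes and filters updated for a custom dataset."""
    lines = cfg_text.splitlines()
    # Line index of every section header such as [convolutional] or [region].
    headers = [i for i, line in enumerate(lines) if line.strip().startswith("[")]
    region = max(i for i in headers if lines[i].strip() == "[region]")
    conv = max(i for i in headers
               if i < region and lines[i].strip() == "[convolutional]")

    def section_end(start):
        later = [i for i in headers if i > start]
        return later[0] if later else len(lines)

    def get_key(start, key):
        for i in range(start + 1, section_end(start)):
            m = re.match(rf"\s*{key}\s*=\s*([^#\s]+)", lines[i])
            if m:
                return m.group(1)
        raise KeyError(key)

    def set_key(start, key, value):
        for i in range(start + 1, section_end(start)):
            if re.match(rf"\s*{key}\s*=", lines[i]):
                lines[i] = f"{key}={value}"
                return
        raise KeyError(key)

    num = int(get_key(region, "num"))
    set_key(region, "classes", classes)
    set_key(conv, "filters", filters_for(num, classes))
    return "\n".join(lines) + "\n"


# Example usage: read the original cfg and write a new file, so the
# original tiny-yolo-voc.cfg stays unchanged.
# with open("cfg/tiny-yolo-voc.cfg") as f:
#     new_cfg = customize_cfg(f.read(), classes=3)
# with open("cfg/tiny-yolo-voc-3c.cfg", "w") as f:
#     f.write(new_cfg)
```

Writing the result to a new file name keeps the original cfg intact, which is what darkflow needs in order to compare the two files when it loads the pretrained weights.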
Open the object-detection-with-YOLO.ipynb file. I have tried to add comments to make it easy to understand.
To detect objects in images:
- Go to the Object Detection from Image section.
- Replace the image name in the following line with your own image name:
  `img = cv2.imread('images/img_2386.jpg', cv2.IMREAD_COLOR)`
- If you have multiple objects in your image then you have to define all the `tl` (top left) and `br` (bottom right) points for the different objects and their labels.
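Defining `tl` and `br` by hand gets tedious with several objects, so a loop over the predictions is usually easier. The sketch below assumes the predictions come from darkflow's `return_predict`, which returns a list of dictionaries with `label`, `confidence`, `topleft` and `bottomright` keys. The helper names `to_boxes` and `draw_boxes`, and the 0.3 confidence cut-off, are choices made for this example and are not taken from the notebook.

```python
def to_boxes(results, min_confidence=0.3):
    """Convert darkflow-style predictions into (tl, br, label) tuples."""
    boxes = []
    for r in results:
        if r["confidence"] < min_confidence:
            continue  # skip weak detections
        tl = (r["topleft"]["x"], r["topleft"]["y"])
        br = (r["bottomright"]["x"], r["bottomright"]["y"])
        boxes.append((tl, br, r["label"]))
    return boxes


def draw_boxes(img, boxes, color=(0, 255, 0)):
    """Draw every box and its label on an OpenCV image (modified in place)."""
    import cv2  # imported here so to_boxes() works without OpenCV installed
    for tl, br, label in boxes:
        cv2.rectangle(img, tl, br, color, 2)
        cv2.putText(img, label, tl, cv2.FONT_HERSHEY_SIMPLEX, 1, color, 2)
    return img


# Example usage, assuming `tfnet` is an already built darkflow TFNet object
# and `img` was read with cv2.imread as in the notebook:
# results = tfnet.return_predict(img)
# img = draw_boxes(img, to_boxes(results))
```

With a model trained on little data, you may need a lower `min_confidence` to see any boxes at all.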
To detect objects in a video:
- Go to the Object Detection from Video section.
- Replace the video name in the following line with your own video name:
  `capture = cv2.VideoCapture('test2.mkv')`
- Run.
- Press `Q` to quit
To detect objects from a webcam just run the code from the Object Detection from Webcam section. If you have multiple webcams you may need to specify the correct number for your desired webcam. I only have my laptop's default webcam, which is why I have used 0. To change the number edit this line:
`capture = cv2.VideoCapture(0)`
- Press `Q` to quit
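The video and webcam cases differ only in what is passed to `cv2.VideoCapture`: a file name or a camera number. As a rough sketch of that shared loop (not the exact code from the notebook), the example below reads frames until the source ends or `Q` is pressed. `parse_source` and `run_capture` are hypothetical names, and `process_frame` stands in for whatever detection and drawing you apply to each frame.

```python
def parse_source(arg):
    """Return an int camera index for digit strings like "0", else the path as given."""
    return int(arg) if str(arg).isdigit() else arg


def run_capture(source, process_frame=lambda frame: frame):
    """Show frames from a video file or webcam until the stream ends or Q is pressed."""
    import cv2  # imported here so parse_source() works without OpenCV installed
    capture = cv2.VideoCapture(parse_source(source))
    try:
        while capture.isOpened():
            ok, frame = capture.read()
            if not ok:  # end of file, or the camera is unavailable
                break
            cv2.imshow("frame", process_frame(frame))
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        # Always free the camera or file and close the preview window.
        capture.release()
        cv2.destroyAllWindows()


# Example usage:
# run_capture("test2.mkv")  # video file
# run_capture(0)            # default webcam
```

Releasing the capture in a `finally` block means the webcam is freed even if the processing step raises an error.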
My webcam results are below.
My confidence factor is low because of the lack of data (about 250 images) and having no GPU. I had to stop training after 60 epochs. It took 9 hours and the loss was around 3.8. I was just trying to learn things, so that was enough for me.
- Real-time object detection and classification. Paper: version 1, version 2.
- Official YOLO website.
- I learned YOLO and how it works from Coursera. Siraj also has a nice tutorial on it.
- For a video walkthrough of the code and a deeper understanding, follow these videos. I followed Mark Jay a lot while making this project.