The data was acquired during the Schanzer Almfest in Ingolstadt in 2018 by IlassAG. As part of a practical course at the Data Mining and Analytics Chair of Prof. Günnemann at TUM, we were given the task of counting objects at checkout. We therefore annotated the data with bounding boxes and classes to train an object detection network.
You can find the dataset here. It consists of:
- dataset: contains the train and test datasets including the labels
  - the labels can be found in files.txt (OpenCV style): `<filename> <number of objects> <classid1> <x1> <y1> <w1> <h1> <classid2> <x2> <y2> <w2> <h2> ...`
- models: contains our pretrained TensorFlow models (see Preview.ipynb for an example usage)
- video_data_zipped: contains the raw videos from which the dataset was extracted
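The label format above can be read with a few lines of Python. The following is a minimal sketch, not the code used in this repository. It assumes that fields are separated by whitespace, that filenames contain no spaces, that all values are integers, and that `x`, `y` denote the top-left corner of a box in pixels. The names `Box` and `parse_label_line` and the sample line are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Box:
    class_id: int
    x: int  # assumed: left edge in pixels
    y: int  # assumed: top edge in pixels
    w: int
    h: int


def parse_label_line(line: str) -> Tuple[str, List[Box]]:
    """Parse '<filename> <n> <classid> <x> <y> <w> <h> ...' into (filename, boxes)."""
    parts = line.split()
    filename, count = parts[0], int(parts[1])
    values = [int(v) for v in parts[2:]]
    # Each object contributes exactly five values.
    if len(values) != 5 * count:
        raise ValueError(f"expected {5 * count} values, got {len(values)}")
    boxes = [Box(*values[i:i + 5]) for i in range(0, len(values), 5)]
    return filename, boxes


# Hypothetical example line, not taken from the dataset.
name, boxes = parse_label_line("img_0001.jpg 2 0 10 20 30 40 5 50 60 70 80")
print(name, [b.class_id for b in boxes])
```

If the actual file stores coordinates as floats, the `int(v)` conversion would need to be adapted.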
If you find this work useful, please consider citing our paper:
@misc{tum2019oktoberfest,
title={Oktoberfest Food Dataset},
author={Alexander Ziller and Julius Hansjakob and Vitalii Rusinov and Daniel Z\"ugner and Peter Vogel and Stephan G\"unnemann},
year={2019},
eprint={1912.05007},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Class Id | Class | Images | Annotations | average quantity |
---|---|---|---|---|
0 | Bier | 300 | 436 | 1.45 |
1 | Bier Mass | 200 | 299 | 1.50 |
2 | Weissbier | 229 | 298 | 1.30 |
3 | Cola | 165 | 210 | 1.27 |
4 | Wasser | 198 | 284 | 1.43 |
5 | Curry-Wurst | 120 | 159 | 1.32 |
6 | Weisswein | 81 | 105 | 1.30 |
7 | A-Schorle | 90 | 98 | 1.09 |
8 | Jaegermeister | 43 | 152 | 3.53 |
9 | Pommes | 110 | 126 | 1.15 |
10 | Burger | 105 | 122 | 1.16 |
11 | Williamsbirne | 50 | 121 | 2.42 |
12 | Alm-Breze | 100 | 114 | 1.14 |
13 | Brotzeitkorb | 65 | 72 | 1.11 |
14 | Kaesespaetzle | 92 | 100 | 1.09 |
Total | | 1110 | 2696 | 2.43 |
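The columns of the table appear to be related by average quantity = annotations / images, for example 436 / 300 ≈ 1.45 for Bier and 2696 / 1110 ≈ 2.43 in total. The per-class image counts add up to more than 1110 because a single image can contain several classes. As a rough sketch of how such statistics could be derived from per-image class ids (the function name `class_statistics` and the toy input are hypothetical):

```python
from collections import Counter


def class_statistics(images):
    """images: iterable of lists of class ids, one list per image.

    Returns {class_id: (images_with_class, annotations, average_quantity)}.
    """
    image_counts = Counter()
    annotation_counts = Counter()
    for class_ids in images:
        annotation_counts.update(class_ids)   # every box counts
        image_counts.update(set(class_ids))   # each image counts once per class
    return {
        c: (image_counts[c], annotation_counts[c],
            annotation_counts[c] / image_counts[c])
        for c in sorted(annotation_counts)
    }


# Toy input: three images with their class ids.
print(class_statistics([[0, 0, 1], [0], [1]]))
```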
For training object detection models we used TensorFlow's Object Detection API. We trained several different approaches and got the best results with a Single Shot Detector (SSD) with Feature Pyramid Networks (FPN). Our evaluation metric was the area under the precision-recall curve on a test set of 86 images. Since our goal was to count objects, we ignored the localization.
Approach | Backbone model | Area | Example precision@recall |
---|---|---|---|
SSD | Mobilenet | 0.86 | 0.85@0.70 |
SSD + FPN | Mobilenet | 0.98 | 0.97@0.97 |
RFCN | ResNet-101 | 0.965 | 0.90@0.95 |
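To illustrate what a precision-recall curve can look like when localization is ignored, here is a small sketch. It is an assumption about one plausible procedure, not the evaluation code from the notebooks: detections are ranked by score, and a detection counts as a true positive as long as the number of accepted detections for its (image, class) pair does not exceed the ground-truth count. The area is computed as a simple step-wise sum. All names and the toy data are hypothetical.

```python
from collections import defaultdict


def precision_recall_points(detections, gt_counts):
    """detections: list of (score, image_id, class_id).
    gt_counts: {(image_id, class_id): number of ground-truth objects}.
    Returns a list of (recall, precision) points, ordered by descending score.
    """
    total_gt = sum(gt_counts.values())
    if total_gt == 0:
        raise ValueError("no ground-truth objects")
    matched = defaultdict(int)
    tp = fp = 0
    points = []
    for _score, image_id, class_id in sorted(
            detections, key=lambda d: d[0], reverse=True):
        key = (image_id, class_id)
        # Count-based matching: box positions are not compared at all.
        if matched[key] < gt_counts.get(key, 0):
            matched[key] += 1
            tp += 1
        else:
            fp += 1
        points.append((tp / total_gt, tp / (tp + fp)))
    return points


def area_under_curve(points):
    """Step-wise area: sum of recall increments times current precision."""
    area, prev_recall = 0.0, 0.0
    for recall, precision in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


# Toy data: image "a" has two objects of class 0, image "b" one of class 1.
gt = {("a", 0): 2, ("b", 1): 1}
dets = [(0.9, "a", 0), (0.8, "a", 0), (0.7, "a", 0), (0.6, "b", 1)]
print(area_under_curve(precision_recall_points(dets, gt)))
```

Other integration schemes (for example interpolated precision) would give slightly different areas.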
The Evaluation folder contains Jupyter notebooks to evaluate the TensorFlow models.
With the Preview notebook one can try out the pretrained TensorFlow models on arbitrary images.
The CreateTFRecordFile notebook contains code to convert the dataset into the TFRecord file format so it can be used with the TensorFlow Object Detection API.
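One step such a conversion typically involves is changing the box representation: the TensorFlow Object Detection API expects box corners normalized to the range [0, 1], whereas the labels here are given as `x y w h`. The sketch below shows only this coordinate step in plain Python. It assumes that `x`, `y` are the top-left corner in pixels, and it is not taken from the notebook. The function name is hypothetical.

```python
def to_normalized_corners(x, y, w, h, image_width, image_height):
    """Convert a pixel box (x, y, w, h) to normalized (xmin, ymin, xmax, ymax)."""
    xmin = x / image_width
    ymin = y / image_height
    xmax = (x + w) / image_width
    ymax = (y + h) / image_height
    # Clamp to [0, 1] in case a box slightly exceeds the image border.
    return tuple(min(max(v, 0.0), 1.0) for v in (xmin, ymin, xmax, ymax))


# A 30x40 box at (10, 20) in a 100x200 image.
print(to_normalized_corners(10, 20, 30, 40, 100, 200))
```

Note that label maps for the Object Detection API conventionally start at id 1, so the 0-based class ids from the table above would likely need an offset.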
ShowAnnotations visualizes the bounding boxes of the dataset. Use 'n' for the next image, 'p' for the previous image and 'q' to quit.
This was done by Vitalii Rusinov and is further explained in his fork.
In addition, the labels in the PASCAL_VOC format are available in the PASCAL_VOC folder.
Online notebooks for training Faster R-CNN and RetinaNet models on the dataset using Google Colaboratory are available here
Alexander Ziller: Student of Robotics, Cognition & Intelligence (M.Sc.) at TUM
Julius Hansjakob: Student of Informatics (M.Sc.) at TUM
Vitalii Rusinov: Student of Informatics (M.Sc.) at TUM
We also want to credit Daniel Zügner for his efforts.