This page is about the code and data related to our ASE '18 paper Semantic Crash Bucketing.
Citation:
@inproceedings{vanTonderSCB2018,
author = {{van~Tonder}, Rijnard and Kotheimer, John and {Le~Goues}, Claire},
title = {Semantic Crash Bucketing},
booktitle = {International Conference on Automated Software Engineering},
series = {ASE '18},
year = {2018},
doi = {10.1145/3238147.3238200}
}
I have some patches, programs, and crashing inputs. Can I use this work to bucket the crashing inputs with patches?
Yes: See the bucket-by-patch directory to do this with your own programs and patches.
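The idea behind bucketing with patches can be sketched in a few lines: an input goes into a patch's bucket when the unpatched program crashes on it and the patched program does not. The sketch below is illustrative only and is not the interface of the bucket-by-patch directory. The function name, the bucket labels, and the `crashes` callback are all hypothetical; the callback stands in for whatever builds and runs your program with a given patch applied.

```python
from collections import defaultdict

NO_CRASH = "<no crash>"      # hypothetical label: input does not crash the unpatched program
UNBUCKETED = "<unbucketed>"  # hypothetical label: no supplied patch stops the crash


def bucket_by_patch(inputs, patches, crashes):
    """Group crashing inputs by the patches that stop them from crashing.

    `crashes(patch, crashing_input)` is an assumed callback. It should run the
    program with `patch` applied (None means unpatched) on `crashing_input`
    and return True if the program crashes.
    """
    patches = list(patches)  # iterated once per input
    buckets = defaultdict(list)
    for crashing_input in inputs:
        if not crashes(None, crashing_input):
            buckets[NO_CRASH].append(crashing_input)
            continue
        # A patch "claims" the input if the crash disappears once it is applied.
        fixed_by = [p for p in patches if not crashes(p, crashing_input)]
        for patch in fixed_by:
            buckets[patch].append(crashing_input)
        if not fixed_by:
            buckets[UNBUCKETED].append(crashing_input)
    return dict(buckets)
```

In practice `crashes` would wrap a build-and-run step for your own program; keeping it as a callback keeps the grouping logic independent of how each project is compiled.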
Our data set curates 21 unique bugs, and for each bug gives:
- the crashing input to trigger the bug
- the isolated developer bug fix
- an autogenerated patch that mimics the developer bug fix (the patch causes the input to not crash the program)
Browse the data using the table below:
Project | ID | Bug kind | Developer fix (ground truth) | Autogenerated patch | Crashing input | CVE | Ref |
---|---|---|---|---|---|---|---|
SQLite | 1 | Null-deref | p01.patch | p01.patch | 01.input | - | link |
SQLite | 2 | Null-deref | p02.patch | p02.patch | 02.input | - | - |
SQLite | 3 | Null-deref | p03.patch | p03.patch | 03.input | - | - |
SQLite | 4 | Null-deref | p04.patch | p04.patch | 04.input | - | - |
SQLite | 5 | Null-deref | p05.patch | p05.patch | 05.input | - | - |
SQLite | 6 | Null-deref | p06.patch | p06.patch | 06.input | - | - |
SQLite | 7 | Null-deref | p07.patch | p07.patch | 07.input | - | - |
SQLite | 8 | Null-deref | p08.patch | p08.patch | 08.input | - | - |
SQLite | 9 | Null-deref | p09.patch | p09.patch | 09.input | - | - |
SQLite | 10 | Null-deref | p10.patch | p10.patch | 10.input | - | - |
SQLite | 11 | Null-deref | p11.patch | p11.patch | 11.input | - | - |
SQLite | 12 | Null-deref | p12.patch | p12.patch | 12.input | - | - |
w3m | 13 | Null-deref | p01.patch | p01.patch | 01.input | CVE-2016-9438 | changelog, link |
w3m | 14 | Null-deref | p02.patch | p02.patch | 02.input | CVE-2016-9443 | link |
w3m | 15 | Null-deref | p03.patch | p03.patch | 03.input | - | link |
w3m | 16 | Null-deref | p04.patch | p04.patch | 04.input | CVE-2016-9631 | link |
php-v5 | 17 | Null-deref | p01.patch | p01.patch | 01.input | CVE-2016-6292 | link |
php-v7 | 18 | Null-deref | p01.patch | p01.patch | 01.input | CVE-2016-10162 | link |
R | 19 | Buffer overflow | p01.patch | p01.patch | 01.input | CVE-2016-8714 | - |
Conntrackd | 20 | Buffer overflow | p01.patch | p01.patch | 01.input | - | link |
libmad | 21 | Buffer overflow | p01.patch | p01.patch | 01.input | - | - |
If you want to run, compile, or apply patches, consider using the VM, described next.
The VM has all of the scripts pre-configured and all the dependencies pre-installed. Typical use is VirtualBox with 1 vCPU and at least 4GB RAM (6GB RAM recommended). See the README.md in the VM for more information.
Download the VM.
Instructions for building each project locally from source, without the VM.
Install these:
sudo apt-get install libc6-dbg gdb valgrind gfortran autoconf \
libtidy-dev libedit-dev libjpeg-turbo8-dev libreadline-dev \
libcurl4-gnutls-dev libmcrypt-dev libxslt-dev libbz2-dev \
tcl libxml2-dev libgdk-pixbuf2.0-dev libglib2.0-dev libnfnetlink-dev \
libnetfilter-conntrack-dev libnetfilter-conntrack3 libmnl-dev bison flex \
libnetfilter-cttimeout-dev libssl-dev libgc-dev gettext python-pip re2c \
libicu-dev liblzma-dev
sudo pip install requests
Run ./3-build.sh in each project directory in src/complete/ground-truth
git clone https://github.com/squaresLab/SemanticCrashBucketing.git && cd SemanticCrashBucketing
Disable userspace ASLR:
setarch $(uname -m) -R /bin/bash
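To confirm that the shell started by `setarch -R` really has address randomization turned off, you can inspect the process personality. This is a minimal sketch, assuming a Linux system that exposes `/proc/self/personality`; the helper names are hypothetical and not part of this repository.

```python
# ADDR_NO_RANDOMIZE is the personality(2) flag that `setarch -R` sets.
ADDR_NO_RANDOMIZE = 0x0040000


def aslr_disabled(personality_hex):
    """Return True if the ADDR_NO_RANDOMIZE bit is set in a hex personality string."""
    return bool(int(personality_hex.strip(), 16) & ADDR_NO_RANDOMIZE)


def current_process_aslr_disabled(path="/proc/self/personality"):
    """Check the current process; child processes inherit the personality of the shell."""
    with open(path) as f:
        return aslr_disabled(f.read())
```

Calling `current_process_aslr_disabled()` from a Python interpreter launched inside the `setarch` shell should return `True` if the flag was inherited.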
Start the patch server:
make
Set up PYTHONPATH:
cd src
export PYTHONPATH=$(pwd):$(pwd)/experiments
Run everything:
python master.py
When finished, run make clean
For each project under src/complete/<project>/ground-truth:
- GENERATED_T_HAT: generated approximate patches
- truth/patches: ground truth patches (i.e., developer fixes)
- truth/all: crashing input files to use for generating a patch
- {afl-tmin,bff-5,bff-1,hf,hfcov}/all/raw/*: crashing inputs to bucket, separated by crashes generated by each fuzzer configuration
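The layout above can be walked with a short script, for example to count how many crashing inputs each fuzzer configuration produced for a project. This is a sketch under the assumption that the directories are named exactly as listed; the function name is hypothetical and the script is not part of the repository.

```python
from pathlib import Path

# Fuzzer configuration directory names, as listed in the layout above.
FUZZER_CONFIGS = ("afl-tmin", "bff-5", "bff-1", "hf", "hfcov")


def crashing_inputs_by_config(ground_truth_dir):
    """Map each fuzzer configuration to the sorted files under <config>/all/raw/."""
    root = Path(ground_truth_dir)
    found = {}
    for config in FUZZER_CONFIGS:
        raw_dir = root / config / "all" / "raw"
        if raw_dir.is_dir():
            found[config] = sorted(p for p in raw_dir.glob("*") if p.is_file())
        else:
            # Missing directories are reported as empty rather than raising.
            found[config] = []
    return found
```

For a given project you would pass its `src/complete/<project>/ground-truth` directory and, for instance, print `len(paths)` for each configuration.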