squaresLab/SemanticCrashBucketing

This repository contains the code and data for our ASE '18 paper, Semantic Crash Bucketing.

Citation
@inproceedings{vanTonderSCB2018,
  author = {{van~Tonder}, Rijnard and Kotheimer, John and {Le~Goues}, Claire},
  title = {Semantic Crash Bucketing},
  booktitle = {International Conference on Automated Software Engineering},
  series = {ASE '18},
  year = {2018},
  doi = {10.1145/3238147.3238200}
}

FAQ

I have some patches, programs, and crashing inputs. Can I use this work to bucket the crashing inputs with patches?

Yes. See the bucket-by-patch directory for how to do this with your own programs and patches.
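As a rough illustration of what bucketing crashing inputs with patches means, the sketch below groups each input under the first patch that stops it from crashing. It is not the code in the bucket-by-patch directory: `bucket_by_patch` and the `still_crashes` callback are hypothetical names, and picking the first fixing patch is an assumption made to keep the example small.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional


def bucket_by_patch(
    inputs: Iterable[str],
    patches: Iterable[str],
    still_crashes: Callable[[str, str], bool],
) -> Dict[Optional[str], List[str]]:
    """Group crashing inputs by the first patch that stops them from crashing.

    `still_crashes(patch, crashing_input)` is a hypothetical callback. It is
    expected to apply `patch`, rebuild the program, run it on
    `crashing_input`, and return True if the program still crashes.
    """
    patch_list = list(patches)
    buckets: Dict[Optional[str], List[str]] = defaultdict(list)
    for crashing_input in inputs:
        fixing_patch = next(
            (p for p in patch_list if not still_crashes(p, crashing_input)),
            None,
        )
        # Inputs that no patch fixes are collected under the None key.
        buckets[fixing_patch].append(crashing_input)
    return dict(buckets)
```

In practice the callback would wrap whatever build and run commands your program needs; a lookup table is enough to try the function out.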

Data

Our data set contains 21 unique bugs. For each bug, it provides:

  • the crashing input to trigger the bug
  • the isolated developer bug fix
  • an autogenerated patch that mimics the developer bug fix (with the patch applied, the input no longer crashes the program)

Browse the data using the table below:

| Project | ID | Bug kind | Developer fix (ground truth) | Autogenerated patch | Crashing input | CVE | Ref |
|---|---|---|---|---|---|---|---|
| SQLite | 1 | Null-deref | p01.patch | p01.patch | 01.input | - | link |
| SQLite | 2 | Null-deref | p02.patch | p02.patch | 02.input | - | - |
| SQLite | 3 | Null-deref | p03.patch | p03.patch | 03.input | - | - |
| SQLite | 4 | Null-deref | p04.patch | p04.patch | 04.input | - | - |
| SQLite | 5 | Null-deref | p05.patch | p05.patch | 05.input | - | - |
| SQLite | 6 | Null-deref | p06.patch | p06.patch | 06.input | - | - |
| SQLite | 7 | Null-deref | p07.patch | p07.patch | 07.input | - | - |
| SQLite | 8 | Null-deref | p08.patch | p08.patch | 08.input | - | - |
| SQLite | 9 | Null-deref | p09.patch | p09.patch | 09.input | - | - |
| SQLite | 10 | Null-deref | p10.patch | p10.patch | 10.input | - | - |
| SQLite | 11 | Null-deref | p11.patch | p11.patch | 11.input | - | - |
| SQLite | 12 | Null-deref | p12.patch | p12.patch | 12.input | - | - |
| w3m | 13 | Null-deref | p01.patch | p01.patch | 01.input | CVE-2016-9438 | changelog, link |
| w3m | 14 | Null-deref | p02.patch | p02.patch | 02.input | CVE-2016-9443 | link |
| w3m | 15 | Null-deref | p03.patch | p03.patch | 03.input | - | link |
| w3m | 16 | Null-deref | p04.patch | p04.patch | 04.input | CVE-2016-9631 | link |
| php-v5 | 17 | Null-deref | p01.patch | p01.patch | 01.input | CVE-2016-6292 | link |
| php-v7 | 18 | Null-deref | p01.patch | p01.patch | 01.input | CVE-2016-10162 | link |
| R | 19 | Buffer overflow | p01.patch | p01.patch | 01.input | CVE-2016-8714 | - |
| Conntrackd | 20 | Buffer overflow | p01.patch | p01.patch | 01.input | - | link |
| libmad | 21 | Buffer overflow | p01.patch | p01.patch | 01.input | - | - |

If you want to run, compile, or apply patches, consider using the VM, described next.

VM

The VM has all of the scripts pre-configured and all the dependencies pre-installed. A typical setup is VirtualBox with 1 vCPU and at least 4 GB RAM (6 GB recommended). See the README.md in the VM for more information. Download the VM.

Running from source

The following instructions build each project locally from source, without the VM.

Project Dependencies

Install these:

sudo apt-get install libc6-dbg gdb valgrind gfortran autoconf \
libtidy-dev libedit-dev libjpeg-turbo8-dev libreadline-dev \
libcurl4-gnutls-dev libmcrypt-dev libxslt-dev libbz2-dev \
tcl libxml2-dev libgdk-pixbuf2.0-dev libglib2.0-dev libnfnetlink-dev \
libnetfilter-conntrack-dev libnetfilter-conntrack3 libmnl-dev bison flex \
libnetfilter-cttimeout-dev libssl-dev libgc-dev gettext python-pip re2c \
libicu-dev liblzma-dev

sudo pip install requests

Run ./3-build.sh in each project directory in src/complete/ground-truth

Environment setup

git clone https://github.com/squaresLab/SemanticCrashBucketing.git && cd SemanticCrashBucketing

Disable userspace ASLR:

setarch $(uname -m) -R /bin/bash
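If you drive individual runs from Python rather than from an interactive shell, the same `setarch` invocation can be prepended to a single command. This is a minimal sketch; `without_aslr` is a hypothetical helper name and is not part of this repository.

```python
import platform
from typing import List, Sequence


def without_aslr(command: Sequence[str]) -> List[str]:
    """Prefix `command` so it runs with address space randomization disabled.

    Mirrors `setarch $(uname -m) -R <command>`. `without_aslr` is a
    hypothetical helper, not something this repository provides.
    """
    # platform.machine() reports the same architecture string as `uname -m`.
    return ["setarch", platform.machine(), "-R", *command]
```

The returned list can be passed to `subprocess.run`, for example `subprocess.run(without_aslr(["/bin/bash"]))` to open the same shell as the command above.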

Start the patch server:

make

Set up PYTHONPATH:

cd src
export PYTHONPATH=$(pwd):$(pwd)/experiments
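For scripted runs, the same value can be computed in Python and passed to a child process instead of being exported in the shell. This is a sketch under the assumption that you launch the scripts yourself; `scb_pythonpath` is a hypothetical helper name.

```python
import os


def scb_pythonpath(src_dir: str) -> str:
    """Build the PYTHONPATH value `<src>:<src>/experiments` from the src directory.

    `scb_pythonpath` is a hypothetical helper; it reproduces the `export`
    line above using absolute paths.
    """
    src = os.path.abspath(src_dir)
    # os.pathsep is ":" on Linux, matching the shell command.
    return os.pathsep.join([src, os.path.join(src, "experiments")])
```

A child process can then be started with `env=dict(os.environ, PYTHONPATH=scb_pythonpath("src"))`.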

Run everything:

python master.py

When finished, run make clean.

Directory structure

For each project under src/complete/<project>/ground-truth:

  • GENERATED_T_HAT: generated approximate patches
  • truth/patches: ground truth patches (i.e., developer fixes)
  • truth/all: crashing input files to use for generating a patch
  • {afl-tmin,bff-5,bff-1,hf,hfcov}/all/raw/*: crashing inputs to bucket, grouped by the fuzzer configuration that generated them
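To get a feel for this layout, the sketch below lists the crashing inputs under one project's ground-truth directory, grouped by fuzzer configuration. The directory names come from the list above; the function name and the choice to report missing directories as empty lists are assumptions.

```python
from pathlib import Path
from typing import Dict, List

# Fuzzer configuration directory names, as listed above.
FUZZER_CONFIGS = ("afl-tmin", "bff-5", "bff-1", "hf", "hfcov")


def crashing_inputs_by_fuzzer(ground_truth_dir: Path) -> Dict[str, List[Path]]:
    """Map each fuzzer configuration to the sorted files in <config>/all/raw/."""
    found: Dict[str, List[Path]] = {}
    for config in FUZZER_CONFIGS:
        raw_dir = ground_truth_dir / config / "all" / "raw"
        if raw_dir.is_dir():
            found[config] = sorted(p for p in raw_dir.iterdir() if p.is_file())
        else:
            # Assumption: a missing directory is reported as an empty list.
            found[config] = []
    return found
```

Calling it with the path to a project's ground-truth directory returns one entry per configuration, which makes it easy to count how many crashing inputs each fuzzer produced.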