Chronosymbolic Learning

Artifact for the paper "Chronosymbolic Learning: Efficient CHC Solving with Symbolic Reasoning and Inductive Learning, Ziyan Luo and Xujie Si"; arXiv preprint

See ./experiment for some significant results and configurations
See ./examples for examples of how our tool works

Citing Chronosymbolic Learning

If you use our codebase, please consider citing our paper:

@InProceedings{10.1007/978-3-031-65112-0_1,
author="Luo, Ziyan
and Si, Xujie",
editor="Avni, Guy
and Giacobbe, Mirco
and Johnson, Taylor T.
and Katz, Guy
and Lukina, Anna
and Narodytska, Nina
and Schilling, Christian",
title="Chronosymbolic Learning: Efficient CHC Solving with Symbolic Reasoning and Inductive Learning",
booktitle="AI Verification",
year="2024",
publisher="Springer Nature Switzerland",
address="Cham",
pages="1--28",
abstract="Solving Constrained Horn Clauses (CHCs) is a fundamental challenge behind a wide range of verification and analysis tasks. To enhance CHC solving without the laborious task of manual heuristic creation and tuning, data-driven approaches demonstrate significant potential by extracting crucial patterns from a small set of data points. However, at present, symbolic methods generally surpass data-driven solvers in performance. In this work, we develop a simple but effective framework, Chronosymbolic Learning, which unifies symbolic information and numerical data to solve a CHC system efficiently. We also present a simple instance (The artifact is available on this link: https://github.com/Chronosymbolic/Chronosymbolic-Learning) of Chronosymbolic Learning with a data-driven learner and a BMC-styled reasoner (BMC represents Bounded Model Checking [9].). Despite its relative simplicity, experimental results show the efficacy and robustness of our tool. It outperforms state-of-the-art CHC solvers on a test suite of 288 arithmetic benchmarks, including some instances with non-linear arithmetic.",
isbn="978-3-031-65112-0"
}

Requirement (To set up our environment)

Python (3.7.0 or higher recommended, and Anaconda recommended to set up a new environment)

Install packages in requirements.txt: pip install -r requirements.txt
- Though not integrated in our artifact, you can also try to use scikit-learn-intelex to speed up CART DT if possible
May have to manually set up PYTHONPATH and PATH properly, PYTHONPATH=$Z3_BIN/python, PATH=$PATH:$Z3_BIN
If the C5.0/LIBSVM binary cannot be executed properly, may have to recompile them in your OS and specify the binary executable files in utils/dt/dt.py in class C5DT, C5_exec_path and utils/svm/svm.py in class LibSVMLearner, svm_exec_path

Our instance of Chronosymbolic Learning

Support SMT-LIB2 format (check-sat) and Datalog format (rule-query)
Have executable binaries of decision tree and SVM for Linux and MacOS, and can automatically adapt to the OS (Linux/Mac)
Control flow implemented in learner/run_agent.py run_Agent function
Hyperparameters in config.yml
Temp files generated when calling decision tree and SVM are in tmp/
Implemented some optimization for SMTLIB files generated by SeaHorn
Run: python test.py with the parameters below:
- Specify instance file name using -f FILE_NAME to run a single instance
- Specify the log file (which records how the tool solves the CHC system) using -l FILE_NAME
- Specify directory name using -r -f DIR_NAME to run a test suite (logs are automatically generated in log/DIR_NAME, see experiment/README.md to better understand the logs)
  - e.g. python test.py -f tests/safe/ -a -r -v -t 360 -o result/result.log
- Or specify a file list using -b -f FILELIST (run files specified in the file list whose format is one file path in each line)
  - e.g. python test.py -a -v -b -f tests/filtered.txt -a -t 360 -o result/result.log
- Increase log file verbosity using -v (not effective in output on screen)
- Adjust timeout using -t TIMEOUT, only effective in directory mode
- Specify the result summary log file using -o FILE_NAME; Export an additional result summary CSV FILE_NAME_prefix.csv (with success and timing statistics, and is_correct column indicates the satisfiability of the CHC system if successfully solved (is_successful=1); if is_successful=0, is_correct field is not meaningful) using -a; The summary is only available when running multiple instances (directory mode or file list mode)
- Start solving from the file index K in the folder -s K (K is the index starting from zero)
- If you want to run multiple instances, make sure to use different FILE_NAME-s in the config file to avoid clash (config.yml in default)
- More options see --help
After finishing running, the ./tmp directory can be deleted safely

To reproduce the major result: Chronosymbolic-single

Please refer to the configuration in ./experiment/result_summary.log and ./experiment/README.md (where settings for other minor experiments are also provided). Using the default config in config.yml should also be decent. Even fixed random seeds can cause minor randomness that may slightly affect the performance.

python test.py -f tests/safe -a -r -v -t 360 -o result/result_safe.log
python test.py -f tests/unsafe -a -r -v -t 360 -o result/result_unsafe.log

To run the baselines

Spacer and GSpacer

Configure z3-gspacer-branch, chmod +x z3 (pre-built binary of z3 with Spacer and GSpacer for Ubuntu)
Specify the path of pre-built z3 (with Spacer and GSpacer) binary in utils/run_spacer.py and utils/run_spacer_filtered.py, at line 5
Specify a target folder in utils/run_spacer.py or specify a file list in utils/run_spacer_filtered.py, at line 4
Use GSpacer as the solving engine: enable_global_guidance = 1; Use Spacer as the solving engine: enable_global_guidance = 0, at line 8 in utils/run_spacer.py and in utils/run_spacer_filtered.py
Check utils/run_spacer.py and utils/run_spacer_filtered.py line 4-20 for other settings
After configuration, run python utils/run_spacer.py

LinearArbitrary and FreqHorn

Refer to LinearArbitrary and FreqHorn.

A prebuilt docker image is available on Docker Hub.

The pre-built binary for FreqHorn is also provided here: freqhorn and expl.

For LinearArbitrary, you can also try our optimized data-driven learner implementation (set ClassAgent = Chronosymbolic to ClassAgent = DataDrivenLearner in test.py and run it in the same way as Chronosymbolic does)

Manually "guess" an inductive invariant (hard to scale up)

In test.py guess_manually function:

Modify s = 'v_0 == v_1' to indicate the inductive invariant
Modify db = load_horn_db_from_file(args.file_name, z3.main_ctx()) or pass the parameter in through command line to indicate SMTLIB2 file name

Benchmarks

CHC-COMP

FreqHorn

LinearArbitrary

Major Reference Tools

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Chronosymbolic Learning

Citing Chronosymbolic Learning

Requirement (To set up our environment)

Our instance of Chronosymbolic Learning

To reproduce the major result: Chronosymbolic-single

To run the baselines

Spacer and GSpacer

LinearArbitrary and FreqHorn

Manually "guess" an inductive invariant (hard to scale up)

Benchmarks

Major Reference Tools

About

Releases 4

Contributors 2

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 48 Commits
examples		examples
experiment		experiment
horndb		horndb
learner		learner
result		result
tests		tests
utils		utils
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
README.pdf		README.pdf
config.yml		config.yml
requirements.txt		requirements.txt
test.py		test.py

License

Chronosymbolic/Chronosymbolic-Learning

Folders and files

Latest commit

History

Repository files navigation

Chronosymbolic Learning

Citing Chronosymbolic Learning

Requirement (To set up our environment)

Our instance of Chronosymbolic Learning

To reproduce the major result: Chronosymbolic-single

To run the baselines

Spacer and GSpacer

LinearArbitrary and FreqHorn

Manually "guess" an inductive invariant (hard to scale up)

Benchmarks

Major Reference Tools

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 4

Contributors 2

Languages