NeuralConceptBinder

This is the official repository of the Neural Concept Binder article. It contains all source code required to reproduce the experiments of the paper.

How to Run:

Datasets

Here we provide information on how to download our novel dataset and the datasets used in the evaluations.

CLEVR-Sudoku

We provide our novel CLEVR-Sudoku dataset here. [Figure: an example of a CLEVR-Sudoku]

For the CLEVR-Easy dataset we refer to https://github.com/singhgautam/sysbinder.

We provide the CLEVR dataset that we used here.

We provide the datasets used to finetune NCB's hard binder (i.e., distill the concepts) at: CLEVR-Easy-1 and CLEVR-1.

These represent versions of the original datasets that contain single objects.

We provide the CLEVR-Hans classification dataset here. Please visit the CLEVR-Hans repository for instructions on how to generate your own dataset and to download the original CLEVR-Hans dataset.

Model Checkpoints

We provide checkpoints of all trained models of our experiments as well as parameter files for CLEVR-Easy and CLEVR.

Docker

We provide a Dockerfile to make reproduction easier. We further recommend building your own docker-compose file based on the Dockerfile. To run without a docker-compose file:

cd src/docker/
docker build -t neuralconceptbinder -f Dockerfile .
docker run -it -v /pathto/NeuralConceptBinder:/workspace/repositories/NeuralConceptBinder -v /pathto/CLEVR-4-1:/workspace/datasets/CLEVR-4-1 --name neuralconceptbinder --entrypoint='/bin/bash' --runtime nvidia neuralconceptbinder
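
The host paths in the run command above differ per machine, so it can help to keep them in variables. The following is a minimal sketch under that assumption: NCB_REPO and NCB_DATA are hypothetical variable names that are not part of the repository, and the script only prints the resulting command rather than starting a container.

```shell
#!/usr/bin/env bash
# Sketch: assemble the `docker run` call shown above with configurable host paths.
# NCB_REPO and NCB_DATA are hypothetical names introduced for this example.
set -eu

NCB_REPO="${NCB_REPO:-/pathto/NeuralConceptBinder}"
NCB_DATA="${NCB_DATA:-/pathto/CLEVR-4-1}"

build_run_cmd() {
  # Print the command on one line without executing it.
  printf '%s ' docker run -it \
    -v "${NCB_REPO}:/workspace/repositories/NeuralConceptBinder" \
    -v "${NCB_DATA}:/workspace/datasets/CLEVR-4-1" \
    --name neuralconceptbinder \
    --entrypoint=/bin/bash \
    --runtime nvidia \
    neuralconceptbinder
  printf '\n'
}

build_run_cmd
```

After checking the printed line, it can be copied into the terminal to start the container.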

Once the Docker container is running, please run these steps inside it:

cd pathto/NeuralConceptBinder/
pip install -e sysbinder

Evaluations

The folder scripts/ contains bash scripts for training all models and for the Q1 evaluations. Files for training the soft binder are in scripts/train/CLEVR-4/ and scripts/train/CLEVR-Easy/. To finetune the hard binder and obtain the retrieval corpus, see scripts/train/perform_block_clustering.sh.

For example, to train the soft binder on the CLEVR data via the sysbinder encoder on two GPUs (1,2) with seed 0, call:

./scripts/train/CLEVR-4/train_sysbind_orig_CLEVR.sh 1,2 0
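
The two positional arguments (GPU list, then seed) make it straightforward to queue several seeds. The sketch below assumes you want seeds 0 to 2 on the same GPUs; the seed list and the DRY_RUN switch are illustrative choices of this example, not options of the repository. By default it only prints the calls.

```shell
#!/usr/bin/env bash
# Sketch: call the soft-binder training script once per seed.
# SEEDS and DRY_RUN are hypothetical settings introduced for this example.
set -eu

GPUS="${GPUS:-1,2}"
SEEDS="${SEEDS:-0 1 2}"
DRY_RUN="${DRY_RUN:-1}"
TRAIN_SCRIPT="./scripts/train/CLEVR-4/train_sysbind_orig_CLEVR.sh"

train_all_seeds() {
  # SEEDS is intentionally unquoted so that it splits into one word per seed.
  for seed in ${SEEDS}; do
    if [ "${DRY_RUN}" = "1" ]; then
      echo "${TRAIN_SCRIPT} ${GPUS} ${seed}"
    else
      "${TRAIN_SCRIPT}" "${GPUS}" "${seed}"
    fi
  done
}

train_all_seeds
```

Setting DRY_RUN=0 runs the training script for each seed in sequence instead of printing the calls.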

For example, to distill the concepts from the soft binder's encodings into the hard binder's retrieval corpus, call:

./scripts/train/perform_block_clustering.sh 1 /workspace/datasets-local/CLEVR-4-1/ logs/sysbinder_seed_0/best_model.pt 4 16

This runs on GPU 1, for CLEVR, with the specified path to the checkpoint of the pretrained sysbinder model, with 4 categories within the data (this value is not actually relevant for this script but is required by the dataloader) and 16 blocks (corresponding to the number of blocks of the specified model checkpoint, e.g., best_model.pt).
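
Because the five arguments are positional, naming them can reduce mistakes. The sketch below mirrors the call above; the variable names are hypothetical labels chosen for this example, and the script only prints the command.

```shell
#!/usr/bin/env bash
# Sketch: give names to the positional arguments of perform_block_clustering.sh.
# All variable names are hypothetical labels, not identifiers from the repository.
set -eu

GPU="${GPU:-1}"                                              # GPU id
DATA_DIR="${DATA_DIR:-/workspace/datasets-local/CLEVR-4-1/}" # dataset path
CHECKPOINT="${CHECKPOINT:-logs/sysbinder_seed_0/best_model.pt}" # pretrained sysbinder checkpoint
NUM_CATEGORIES="${NUM_CATEGORIES:-4}"                        # only required by the dataloader
NUM_BLOCKS="${NUM_BLOCKS:-16}"                               # must match the checkpoint's block count

clustering_cmd() {
  # Print the call in the argument order used above.
  echo "./scripts/train/perform_block_clustering.sh ${GPU} ${DATA_DIR} ${CHECKPOINT} ${NUM_CATEGORIES} ${NUM_BLOCKS}"
}

clustering_cmd
```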

The scripts for Q1 evaluations are in scripts/eval/.

We provide a notebook for the different inspection procedures in inspection.ipynb.

clevr_puzzle/ contains the code to generate the CLEVR-Sudoku dataset and to run the corresponding evaluations (Q2).

We provide notebooks for GPT-4 based revision evaluations in revise_via_gpt4/ and notebooks for simulated human-based revision evaluations in revise_via_user/ in the context of Q3.

clevr_hans/ contains the code relevant for our evaluations on subsymbolic computations based on the CLEVR-Hans dataset in the context of Q4.

data_creation_scripts/ contains the json files used for creating the CLEVR-Easy-1 and CLEVR-1 datasets based on the CLEVR-Hans repository.

Citation

If you find this code useful in your research, please consider citing:

@article{stammer2024neural,
  title={Neural Concept Binder},
  author={Wolfgang Stammer and Antonia Wüst and David Steinmann and Kristian Kersting},
  journal={arXiv preprint arXiv:2406.09949},
  year={2024}
}
