ArtEmis: Affective Language for Visual Art

A codebase created and maintained by Panos Achlioptas.

Introduction

This work is based on the arXiv tech report, which has been provisionally accepted to CVPR 2021 for an oral presentation.

Citation

If you find this work useful in your research, please consider citing:

@article{achlioptas2021artemis,
    title={ArtEmis: Affective Language for Visual Art},
    author={Achlioptas, Panos and Ovsjanikov, Maks and Haydarov, Kilichbek and
            Elhoseiny, Mohamed and Guibas, Leonidas},
    journal = {CoRR},
    volume = {abs/2101.07396},
    year={2021}
}

Dataset

To get the most out of this repo, please download the data associated with ArtEmis by filling out this form.

Installation

This code has been tested with Python 3.6.9, PyTorch 1.3.1, and CUDA 10.0 on Ubuntu 16.04.

Assuming a (possibly virtual) environment with Python 3.x:

git clone https://github.com/optas/artemis.git
cd artemis
pip install -e .

This will install the repo with all its dependencies (listed in setup.py) and will enable you to do things like:

from artemis.models import xx

(provided the artemis repo is on your PYTHONPATH)
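
As a quick sanity check after installation, the sketch below tests whether a package can be located by the current interpreter without actually importing it. It uses only the standard library; the helper name `is_importable` is made up for this illustration and is not part of the repo.

```python
import importlib.util


def is_importable(package_name: str) -> bool:
    """Return True if a top-level package can be located on the current sys.path."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # "artemis" is the package name used in the import example above.
    print("artemis importable:", is_importable("artemis"))
```

If this prints False, the install step most likely ran in a different environment than the one you are using now.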

Playing with ArtEmis

Step-1 (important 📌)

Preprocess the provided annotations (spell-check, patch, tokenize, make train/val/test splits, etc.).

   artemis/scripts/preprocess_artemis_data.py

This script allows you to preprocess ArtEmis according to your needs. The default arguments do minimal preprocessing, so the output can be used to compare ArtEmis fairly with other datasets and to derive the most faithful statistics about ArtEmis's nature. This is what we used in our analysis and what you should use in "Step-2" below. With this in mind, run:

  python artemis/scripts/preprocess_artemis_data.py -save-out-dir <ADD_YOURS> -raw-artemis-data-csv <ADD_YOURS>

If you wish to train deep nets (speakers, emotion classifiers, etc.) exactly as we did in our paper, rerun this script with a single extra optional argument ("--preprocess-for-deep-nets True"). This does more aggressive filtering, and you should use its output for "Step-3" and "Step-4" below (please use a new save-out-dir to avoid overwriting the previous output).

  python artemis/scripts/preprocess_artemis_data.py -save-out-dir <ADD_YOURS> -raw-artemis-data-csv <ADD_YOURS> --preprocess-for-deep-nets True

(To understand the different hyper-parameters, please read the help messages provided by the script's argparse.)
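
Because the two preprocessing runs differ by one flag and must write to separate directories, it can help to build both command lines in one place. The sketch below only assembles the commands shown above; the directory names and the raw CSV path are placeholders, and `build_preprocess_cmd` is a hypothetical helper, not something the repo provides.

```python
import shlex
from typing import List

SCRIPT = "artemis/scripts/preprocess_artemis_data.py"


def build_preprocess_cmd(save_out_dir: str, raw_csv: str,
                         for_deep_nets: bool = False) -> List[str]:
    """Assemble the preprocessing command using only the flags shown in this README."""
    cmd = ["python", SCRIPT,
           "-save-out-dir", save_out_dir,
           "-raw-artemis-data-csv", raw_csv]
    if for_deep_nets:
        # The more aggressive filtering used for Step-3 and Step-4.
        cmd += ["--preprocess-for-deep-nets", "True"]
    return cmd


if __name__ == "__main__":
    raw_csv = "data/artemis_raw.csv"  # placeholder path, replace with yours
    # One output directory per variant, so neither run overwrites the other.
    for out_dir, deep in [("out/minimal", False), ("out/deep_nets", True)]:
        cmd = build_preprocess_cmd(out_dir, raw_csv, deep)
        print(" ".join(shlex.quote(part) for part in cmd))
        # To actually run it: subprocess.run(cmd, check=True)
```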

Step-2

Analyze & explore the dataset. 🔬

(Using the minimally preprocessed version of ArtEmis, which includes all 454,684 collected annotations.)

This is a great place to start 🏁. Run this notebook to do basic linguistic, emotion & art-oriented analysis of the ArtEmis dataset.
Please run this notebook to analyze ArtEmis in terms of its concreteness, subjectivity, sentiment, and parts of speech. Optionally, contrast these values with other common datasets like COCO.
Please run this notebook to extract the emotion histograms (empirical distributions) of each artwork. This is necessary for Step-3 (1).
Please run this notebook to analyze the extracted emotion histograms (previous step) per art genre and style.
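
To make "emotion histogram (empirical distribution)" concrete, here is a minimal sketch of the idea: count how often each emotion was chosen for an artwork and normalize the counts so they sum to one. The list of emotion labels and the (artwork, emotion) tuple layout are assumptions made for this illustration; the notebook works on the preprocessed CSV and may organize things differently.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed label set for illustration; check the preprocessed data for the actual one.
EMOTIONS = ["amusement", "awe", "contentment", "excitement",
            "anger", "disgust", "fear", "sadness", "something else"]


def emotion_histograms(annotations: Iterable[Tuple[str, str]]) -> Dict[str, List[float]]:
    """Map each artwork to the normalized frequency of every emotion in EMOTIONS."""
    counts = defaultdict(Counter)
    for artwork, emotion in annotations:
        counts[artwork][emotion] += 1

    histograms = {}
    for artwork, counter in counts.items():
        total = sum(counter.values())
        histograms[artwork] = [counter[e] / total for e in EMOTIONS]
    return histograms


if __name__ == "__main__":
    toy = [("painting-a", "awe"), ("painting-a", "awe"),
           ("painting-a", "fear"), ("painting-a", "sadness"),
           ("painting-b", "anger")]
    for artwork, hist in emotion_histograms(toy).items():
        print(artwork, hist)
```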

Step-3

Train and evaluate emotion-centric image & text classifiers. ♥️

(Using the preprocessed version of ArtEmis for deep-nets which includes 429,431 annotations. Training on a single GPU from scratch is a matter of minutes for these classifiers!)

Please run this notebook to train an image-to-emotion classifier.
Please run this notebook to train an LSTM-based utterance-to-emotion classifier. Or, this notebook to train a BERT-based one.

Step-4

Train & evaluate neural-speakers. 💣

To train our customized SAT model on ArtEmis (~2 hours to train on a single GPU!), run:

    python artemis/scripts/train_speaker.py -log-dir <ADD_YOURS> -data-dir <ADD_YOURS> -img-dir <ADD_YOURS>

    log-dir: where to save the output of the training process, models etc.
    data-dir: the input data directory, i.e., the one you passed as -save-out-dir when you ran
              preprocess_artemis_data.py; it should contain that script's output, e.g.,
              artemis_preprocessed.csv and vocabulary.pkl
    img-dir: the top folder containing the WikiArt image dataset in its "standard" format:
                img-dir/art_style/painting-xx.jpg

Note. The default optional arguments will create the same vanilla-speaker variant we used in our paper.
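
Before starting a two-hour training run, it may be worth confirming that img-dir really follows the img-dir/art_style/painting-xx.jpg layout described above. The following is a small sketch under that assumption; both helper names are hypothetical, and the (art_style, painting) pairs would come from your preprocessed annotations.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def wikiart_image_path(img_dir: str, art_style: str, painting: str) -> Path:
    """Build img-dir/art_style/painting.jpg, the layout described above."""
    return Path(img_dir) / art_style / (painting + ".jpg")


def missing_images(img_dir: str, items: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the (art_style, painting) pairs whose image file does not exist."""
    return [(style, painting) for style, painting in items
            if not wikiart_image_path(img_dir, style, painting).is_file()]


if __name__ == "__main__":
    # Placeholder values, replace with your own img-dir and annotation pairs.
    pairs = [("Impressionism", "some-painting")]
    print(missing_images("wikiart", pairs))
```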

To train the emotionally-grounded variant of SAT, add one parameter to the above call:

    python artemis/scripts/train_speaker.py -log-dir <ADD_YOURS> -data-dir <ADD_YOURS> -img-dir <ADD_YOURS>
                                            --use-emo-grounding True

To sample utterances from a trained speaker:

 python artemis/scripts/sample_speaker.py -arguments

For an explanation of the arguments, see the argparse help messages. It is worth noting that if you want to sample from an emotionally-grounded variant, you also need to provide a pretrained image2emotion classifier, which is used to extract the most likely emotion of each image as the grounding input to the speaker. See Step-3 (1) for how to train such a net.
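
"Extracting the most likely emotion" amounts to taking the argmax over the classifier's predicted distribution for an image. The sketch below shows only that step on plain Python lists; the label names and probabilities are invented for the example, and how the sampling script passes the result to the speaker is not shown here.

```python
from typing import Sequence


def most_likely_emotion(probs: Sequence[float], labels: Sequence[str]) -> str:
    """Return the label whose predicted probability is highest (argmax)."""
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and of equal length")
    best_index = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best_index]


if __name__ == "__main__":
    labels = ["awe", "fear", "sadness"]   # illustrative subset of emotions
    predicted = [0.1, 0.7, 0.2]           # made-up classifier output for one image
    print(most_likely_emotion(predicted, labels))
```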

To evaluate the sampled utterances of a speaker (e.g., per BLEU, emotional alignment, metaphors, etc.), use this notebook. As a bonus, you can see the neural attention placed on the different tokens/images.

Pretrained Models

Image-To-Emotion classifier (81MB)
LSTM-based Text-To-Emotion classifier (8MB)
SAT-Speaker (434MB)
SAT-Speaker-with-emotion-grounding (431MB)

Note: the above speaker links include the sampled captions for the test split. You can use them to evaluate the model without re-sampling it. Please also read the included README.txt.
Caveats: ArtEmis is a real-world dataset containing the opinions and sentiments of thousands of people. It is thus expected to contain text with biases, factual inaccuracies, and perhaps foul language. Please use it responsibly. The provided models are likely to be biased and/or inaccurate in ways reflected in the training data.

News

🍾 ArtEmis has already attracted some notable media coverage, e.g., at New-Scientist, HAI, MarkTechPost, KCBS-Radio, Communications of ACM, Synced Review, École Polytechnique.
📞 Important: more code will be added in the first week of April, namely for the ANP baseline and the comparisons of ArtEmis with other datasets. Please do a git pull at that time; the update will be seamless! During these first months, if you have ANY questions, feel free to send me an email at optas@stanford.edu.
🏆 If you are developing more models with ArtEmis and you want to incorporate them here, please talk to me or simply open a pull request.

MISC

You can make a pseudo "neural speaker" by copying training sentences to the test images according to nearest neighbors in a pretrained network's feature space, by running this 5 min. notebook.
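
The nearest-neighbor idea can be sketched in a few lines: for each test image, find the training image with the closest feature vector and reuse its sentence. The sketch assumes features are plain lists of floats and uses Euclidean distance; both are assumptions for illustration, and the notebook may use a different network, metric, or data layout.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equally sized feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nn_speaker(train: List[Tuple[Sequence[float], str]],
               test_features: List[Sequence[float]]) -> List[str]:
    """For each test feature, copy the utterance of its nearest training feature."""
    captions = []
    for feature in test_features:
        _, utterance = min(train, key=lambda item: euclidean(item[0], feature))
        captions.append(utterance)
    return captions


if __name__ == "__main__":
    # Toy 2-D "features" with made-up utterances.
    train = [([0.0, 0.0], "the calm water feels peaceful"),
             ([10.0, 10.0], "the dark sky looks threatening")]
    print(nn_speaker(train, [[1.0, 1.0], [9.0, 8.0]]))
```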

License

This code is released under the MIT License (see the LICENSE file for details). In simple words, if you copy/use parts of this code, please keep the copyright note in place.
