[RAID2024] AI-Generated Faces in the Real World: A Large-Scale Case Study of Twitter Profile Images

Jonas Ricker, Dennis Assenmacher, Thorsten Holz, Asja Fischer, and Erwin Quiring
International Symposium on Research in Attacks, Intrusions and Defenses (RAID), 2024

Setup

We tested our code with Python 3.10. Install all required packages (preferably in a virtual environment) using:

pip install -r requirements.txt
pip install -e .

Data and Checkpoint

The following can be downloaded from Zenodo:

raw images from FFHQ and TPDNE
images from FFHQ and TPDNE uploaded/downloaded to Twitter
trained checkpoint for our ResNet detector
IDs of the 7723 accounts we identified as using AI-generated profile images

We cannot provide the actual profile images due to Twitter's terms of service.

Detection Pipeline

To use our detection pipeline, you first need to download the necessary weights:

download the weights (anchors.npy, anchorsback.npy, blazeface.pth, blazefaceback.pth) for BlazeFace and save them in weights/blazeface
download the checkpoint for our trained classifier and save it in weights/resnet

Then, run

python detection_pipeline.py --input-dir path/to/images --det-weights weights/resnet/ffhq+pseudo_vs_tpdne_0.1.ckpt --alignment-reference-dir path/to/alignment/ref

Here, path/to/alignment/ref is a directory containing faces with the desired alignment, which are used as the reference. The results will be saved to results/detection_pipeline. Use -h to list the optional arguments.
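Before running the pipeline, it can help to confirm that all weight files are where the script expects them. The following is a minimal sketch, not part of the repository: the helper name missing_weights and its parameters are hypothetical, while the file names and directories are the ones listed above.

```python
from pathlib import Path

# File names as listed in the instructions above; the helper itself is hypothetical.
BLAZEFACE_FILES = ("anchors.npy", "anchorsback.npy", "blazeface.pth", "blazefaceback.pth")


def missing_weights(weights_root="weights", checkpoint="ffhq+pseudo_vs_tpdne_0.1.ckpt"):
    """Return the expected weight files that do not exist under weights_root."""
    root = Path(weights_root)
    expected = [root / "blazeface" / name for name in BLAZEFACE_FILES]
    expected.append(root / "resnet" / checkpoint)
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Print one line per missing file; no output means everything is in place.
    for path in missing_weights():
        print(f"missing: {path}")
```

If the script prints nothing, the BlazeFace weights and the classifier checkpoint are both present under weights/.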

Detector

To train your own real/fake detector, run

python train_detector.py --run-name my_run --real-dirs path/to/reals path/to/other/reals --fake-dirs path/to/fakes path/to/other/fakes

The script expects all image directories to have "train", "val", and "test" subdirectories. The trained checkpoint will be saved to results/train_detector. Use -h to learn more about optional arguments.
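Since training expects a fixed directory layout, a quick check of the image directories can catch a missing split early. This is a small sketch under that assumption only; the function missing_splits is hypothetical and not part of the repository.

```python
from pathlib import Path

# Subdirectory names stated in the text above.
SPLITS = ("train", "val", "test")


def missing_splits(image_dirs):
    """Map each image directory to the split subdirectories it lacks."""
    problems = {}
    for image_dir in image_dirs:
        absent = [split for split in SPLITS if not (Path(image_dir) / split).is_dir()]
        if absent:
            problems[str(image_dir)] = absent
    return problems
```

For example, missing_splits(["path/to/reals", "path/to/fakes"]) returns an empty dictionary when every directory contains all three subdirectories.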

Duplicate Detection

To identify clusters of duplicate images based on perceptual hashing, run

python detect_duplicates.py --input-dir path/to/images

The results will be saved to results/detect_duplicates. Use -h to learn more about optional arguments.
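To illustrate the general idea of clustering duplicates by perceptual hash, here is a minimal pure-Python sketch. It is an assumption-laden illustration, not the implementation in detect_duplicates.py: it uses a simple average hash on images that are already downscaled to a small grayscale grid, and the function names and the max_distance threshold are hypothetical. The hash function and threshold actually used by the script may differ.

```python
from itertools import combinations


def average_hash(pixels):
    """Hash a small grayscale image (list of rows): one bit per pixel, set if above the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def cluster_duplicates(hashes, max_distance=2):
    """Group names whose hashes differ by at most max_distance bits (union-find)."""
    parent = {name: name for name in hashes}

    def find(name):
        # Follow parent links to the cluster root, shortening the path on the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(hashes, 2):
        if hamming(hashes[a], hashes[b]) <= max_distance:
            parent[find(a)] = find(b)

    clusters = {}
    for name in hashes:
        clusters.setdefault(find(name), []).append(name)
    # Only clusters with at least two members are duplicates.
    return [sorted(group) for group in clusters.values() if len(group) > 1]
```

In practice, each image would first be loaded, converted to grayscale, and resized (for example to 8x8 pixels) with an image library before hashing. Because the hash depends only on whether each pixel is above the mean, small brightness changes or recompression tend to leave it nearly unchanged.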

Content Analysis

The notebook content_analysis.ipynb shows how we analyzed the tweet contents, using active English accounts as an example. Unfortunately, we cannot publish the actual tweets due to Twitter's terms of service.
