Determine whether a given video sequence has been manipulated or synthetically generated
PyTorch code for DF-Spot, a model ensemble that determines whether an input video or image is real or fake. To identify deepfakes, this work proposes an ensemble-based metric learning technique built on a siamese network architecture, in which four models are derived from a common base network. The method has been validated on publicly available datasets such as Celeb-DF (v2), FaceForensics++, and DFDC.
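To make the metric-learning idea concrete, here is a minimal sketch of a triplet-style objective, the kind commonly used when training siamese networks. It is an illustration only: the description above does not state the exact loss DF-Spot uses, and the function names, the margin value, and the toy embeddings are all assumptions. Plain Python is used so the sketch runs without dependencies.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss.

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin` (an illustrative value, not the project's).
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    return max(gap, 0.0)


# Toy 2-D embeddings, made up for illustration.
real_a = [0.0, 0.0]
real_b = [0.0, 1.0]
fake = [3.0, 4.0]

# Same-class pair is close and the other class is far: no penalty.
well_separated = triplet_loss(real_a, real_b, fake)
# Roles swapped: the "positive" is far and the "negative" is close.
poorly_separated = triplet_loss(real_a, fake, real_b)
print(well_separated, poorly_separated)
```

In PyTorch, the same objective is available as torch.nn.TripletMarginLoss, which operates on batches of embedding tensors.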
Set up the project on your local machine by following the instructions below. A demo is also available on Google Colab.
- Update system and install pip3
sudo apt update
sudo apt -y install python3-pip
- Python virtual environment (optional)
sudo apt install python3-venv
- Create a Python virtual environment (optional)
mkdir df_spot
cd df_spot
python3 -m venv df_spot_env
source df_spot_env/bin/activate
- Clone the repo
git clone https://github.com/chinmaynehate/DeepFake-Spot.git
- Install dependencies
cd DFSpot-Deepfake-Recognition
sudo chmod +x setup.sh
🔔 Note: The following command, which runs setup.sh, requires the -m parameter, which accepts dfdc, celeb, ffpp, or all. With -m dfdc, setup.sh downloads the models trained on the dfdc dataset. The models are currently hosted on Google Drive, which appears to limit how many files can be downloaded with the command-line utility gdown. If that limit has been reached, the download will fail; try running the script again after 24 hours, or download the models manually using the Google Drive link for the models in setup.sh.
Downloading the models manually is recommended.
./setup.sh -m <dataset>
For example, to download the models trained on the dfdc dataset, run:
./setup.sh -m dfdc
The other options are: celeb, ffpp or all
After the requirements, prerequisites and installation steps have been run, the directory structure of DFSpot-Deepfake-Recognition/ is as follows:
|-- assets # contains images & gifs for readme
|-- examples.sh # contains example for running spot_deepfakes.py
|-- models # contains twelve .pth files. These are downloaded using gdown and extracted in setup.sh
| |-- celeb_v2.pth
| |-- dfdc_v2st.pth
| |-- ffpp_v2.pth
|-- README.md
|-- requirements.txt
|-- sample_images # contains sample images from test set of ffpp, celebdf & dfdc dataset. Save the images that have to be tested in this folder
|-- sample_output_videos # contains sample output videos that are obtained after running the code
|-- sample_videos # contains all the sample videos downloaded using gdown and extracted in setup.sh. Save the video files that have to be tested in this folder
| |-- abc.mp4 # video whose authenticity has to be tested
| |-- pqr.mp4 # video whose authenticity has to be tested
|-- setup.sh # downloads all the models, sample_videos and installs dependencies
|-- src
| |-- architectures # contains definitions of models
| |-- blazeface # for face extraction
| |-- ensemble_model.ipynb
| |-- output # contains the annotated video files generated by running spot_deepfakes.py
| | |-- abc.avi # annotated video with frame-level predictions done by the ensemble of models for sample_videos/abc.mp4
| | |-- pqr.avi # annotated video with frame-level predictions done by the ensemble of models for sample_videos/pqr.mp4
| | |-- predictions.csv # final prediction class of abc.mp4 and pqr.mp4 i.e real or fake is stored as csv
| |-- spot_deepfakes.py # main()
| |-- utils # contains functions for extraction of faces from videos in sample_videos, loading models, ensemble of models and annotation
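Because the model download can fail partway when the Google Drive limit is hit, it can help to check which checkpoints actually landed in models/ before running detection. The sketch below is a hypothetical helper, not part of the repository. It groups .pth files by the text before the first underscore, on the assumption (inferred only from the three file names shown in the tree) that this prefix names the dataset.

```python
from collections import defaultdict
from pathlib import Path


def group_checkpoints(model_dir):
    """Map each filename prefix (text before the first underscore) to its .pth files."""
    groups = defaultdict(list)
    for path in sorted(Path(model_dir).glob("*.pth")):
        prefix = path.name.split("_", 1)[0]  # e.g. "dfdc" from "dfdc_v2st.pth"
        groups[prefix].append(path.name)
    return dict(groups)


if __name__ == "__main__":
    # "models" is the directory from the tree above; adjust the path to your checkout.
    for prefix, files in group_checkpoints("models").items():
        print(f"{prefix}: {len(files)} file(s): {', '.join(files)}")
```

If a dataset prefix is missing or lists fewer files than expected, rerun setup.sh for that dataset or download the models manually.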
- When setup.sh is executed, a few example videos from the test sets of DFDC, FFPP, and CelebDF (V2) are saved in the sample_videos/ folder. If you ran setup.sh with -m dfdc, pass dfdc as the --dataset argument; the code then looks for models trained on the dfdc dataset in the directory given by the --model_dir argument. The command to check for deepfakes in these videos using models trained on the dfdc dataset is:
python3 spot_deepfakes.py --media_type video --data_dir ../sample_videos/dfdc/fake/ --dataset dfdc --model TimmV2 TimmV2ST ViT ViTST --model_dir ../models/ --video_id 2 3 4 --annotate True --device 0 --output_dir output/
The predictions are stored in output/predictions.csv, and the videos with frame-level annotations of the predictions made by the individual models and by the ensemble are stored in the output/ folder.
- Say you have three videos (video1.mp4, video2.mp4 and video3.mp4) and you want to check their authenticity. Place these three videos in the sample_videos/ folder and run:
python3 spot_deepfakes.py --media_type video --data_dir ../sample_videos/ --dataset ffpp --model TimmV2 TimmV2ST ViT ViTST --model_dir ../models/ --video_id 0 1 2 --annotate True --device 0 --output_dir output/
The predictions are stored in output/predictions.csv, and the videos with frame-level annotations of the predictions made by the individual models and by the ensemble are stored in the output/ folder.
- When setup.sh runs during installation, a few sample images from the test sets of DFDC, FFPP and CelebDF (V2) are saved in sample_images/. To check the authenticity of these images, run:
python3 spot_deepfakes.py --media_type image --data_dir ../sample_images/ --dataset dfdc --model TimmV2 TimmV2ST ViT ViTST --model_dir ../models --device 0 --output_dir output/
- Say you have a few images and you need to check their authenticity. Place them in the sample_images/ folder and run the following command:
python3 spot_deepfakes.py --media_type image --data_dir ../sample_images/ --dataset dfdc --model TimmV2 TimmV2ST ViT ViTST --model_dir ../models --device 0 --output_dir output/
The predictions are stored in output/img_predictions.json
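The video and image commands above differ only in a handful of flags, so for batch runs it can be convenient to assemble them programmatically. The following is a minimal sketch: build_command is a hypothetical helper, not part of the repository, and it uses only the flags that appear in the commands above.

```python
import shlex


def build_command(media_type, data_dir, dataset, models, model_dir,
                  output_dir="output/", device=0, video_ids=None, annotate=None):
    """Assemble the spot_deepfakes.py argument list from the flags shown above."""
    cmd = ["python3", "spot_deepfakes.py",
           "--media_type", media_type,
           "--data_dir", data_dir,
           "--dataset", dataset,
           "--model", *models,
           "--model_dir", model_dir]
    # --video_id and --annotate appear only in the video examples, so they are optional here.
    if video_ids is not None:
        cmd += ["--video_id", *map(str, video_ids)]
    if annotate is not None:
        cmd += ["--annotate", str(annotate)]
    cmd += ["--device", str(device), "--output_dir", output_dir]
    return cmd


cmd = build_command("video", "../sample_videos/", "ffpp",
                    ["TimmV2", "TimmV2ST", "ViT", "ViTST"], "../models/",
                    video_ids=[0, 1, 2], annotate=True)
print(shlex.join(cmd))  # prints the command as a single shell-ready line
```

The resulting list can be passed to subprocess.run from the directory that contains spot_deepfakes.py.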
For more examples, please refer to examples.sh
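The usage steps above mention frame-level predictions from the individual models and from the ensemble, plus a final real-or-fake class per video. One simple way such a verdict could be formed is sketched below. This is an assumption for illustration: the averaging rule, the 0.5 threshold, the function names, and the scores are not taken from the project, whose actual fusion logic lives in utils and may differ.

```python
from statistics import mean


def ensemble_frame_scores(per_model_scores):
    """Average the per-frame fake scores of several models, frame by frame."""
    return [mean(frame) for frame in zip(*per_model_scores.values())]


def video_verdict(per_model_scores, threshold=0.5):
    """Label a video 'fake' if its mean ensemble score exceeds the threshold."""
    video_score = mean(ensemble_frame_scores(per_model_scores))
    label = "fake" if video_score > threshold else "real"
    return label, video_score


# Made-up scores in [0, 1] for three frames; higher means "more likely fake".
scores = {
    "TimmV2":   [0.75, 0.5, 1.0],
    "TimmV2ST": [0.25, 0.5, 1.0],
    "ViT":      [0.75, 1.0, 1.0],
    "ViTST":    [0.25, 1.0, 1.0],
}
label, score = video_verdict(scores)
print(label, score)
```

Averaging lets models that disagree on a single frame cancel out, while a consistent signal across frames still drives the video-level label.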
Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
- Deepware
- Image and Sound Processing Lab - Politecnico di Milano
- Triplet loss tutorial
- Facebook's DeepFake Detection Challenge (DFDC) dataset Paper
- FaceForensics++ Paper
- Celeb-DF (v2) Paper
- Other papers
Plain text:
C. Nehate, P. Dalia, S. Naik and A. Bhan, "Exposing DeepFakes using Siamese Training," 2022 IEEE India Council International Subsections Conference (INDISCON), 2022, pp. 1-6, doi: 10.1109/INDISCON54605.2022.9862825.
Bibtex:
@INPROCEEDINGS{9862825,
author={Nehate, Chinmay and Dalia, Parth and Naik, Saket and Bhan, Aditya},
booktitle={2022 IEEE India Council International Subsections Conference (INDISCON)},
title={Exposing DeepFakes using Siamese Training},
year={2022},
volume={},
number={},
pages={1-6},
doi={10.1109/INDISCON54605.2022.9862825}}