Next #766

Merged
merged 11 commits into from
Jul 25, 2023

59 changes: 34 additions & 25 deletions README.md
@@ -1,61 +1,70 @@
Take a video and replace the face in it with a face of your choice. You only need one image of the desired face. No dataset, no training.

You can watch some demos [here](https://drive.google.com/drive/folders/1KHv8n_rd3Lcr2v7jBq1yPSTWM554Gq8e?usp=sharing). A StableDiffusion extension is also available, [here](https://github.com/s0md3v/sd-webui-roop).
You can watch some demos [here](https://drive.google.com/drive/folders/1KHv8n_rd3Lcr2v7jBq1yPSTWM554Gq8e?usp=sharing).
A Stable Diffusion extension is also available, [here](https://github.com/s0md3v/sd-webui-roop).

![demo-gif](demo.gif)

## Disclaimer

This software is meant to be a productive contribution to the rapidly growing AI-generated media industry. It will help artists with tasks such as animating a custom character or using the character as a model for clothing etc.

The developers of this software are aware of its possible unethical applications and are committed to taking preventative measures against them. It has a built-in check which prevents the program from working on inappropriate media, including but not limited to nudity, graphic content, and sensitive material such as war footage. We will continue to develop this project in a positive direction while adhering to law and ethics. This project may be shut down or include watermarks on the output if requested by law.

Users of this software are expected to use it responsibly while abiding by local law. If the face of a real person is being used, users are advised to get consent from the person concerned and to clearly state that it is a deepfake when posting content online. The developers of this software will not be responsible for the actions of end users.

## How do I install it?
## How to install?

### Basic

It is more likely to work on your computer but it will also be very slow. You can follow instructions for the basic install [here](https://github.com/s0md3v/roop/wiki/1.-Installation).
It is more likely to work on your computer, but will be quite slow. Follow instructions for the basic installation [here](https://github.com/s0md3v/roop/wiki/1.-Installation).

### Acceleration

If you have a good GPU and are ready for solving any software issues you may face, you can enable GPU which is wayyy faster. To do this, first follow the basic install instructions given above and then follow GPU-specific instructions [here](https://github.com/s0md3v/roop/wiki/2.-Acceleration).
If you own a capable GPU and are prepared to address any software problems, you can enable GPU acceleration, which is significantly faster. Once you have finished the basic installation, follow the instructions for the acceleration installation [here](https://github.com/s0md3v/roop/wiki/2.-Acceleration).

## How to use?

## How do I use it?
### UI

Executing the `python run.py` command will launch this window:

![gui-demo](gui-demo.png)

Choose a face (an image with the desired face) and the target image/video (the image/video in which you want to replace the face), then click `Start`. Open a file explorer and navigate to the directory you selected for your output. You will find a directory named `<video_title>` where you can see the frames being swapped in real time. Once the processing is done, the output file is created. That's it.

Additional command line arguments are given below. To learn what they do, check [this guide](https://github.com/s0md3v/roop/wiki/Advanced-Options).
### CLI

Additional command line arguments are given below. To learn what they do, check the guide [here](https://github.com/s0md3v/roop/wiki/Advanced-Options).

```
options:
-h, --help show this help message and exit
-s SOURCE_PATH, --source SOURCE_PATH select an source image
-t TARGET_PATH, --target TARGET_PATH select an target image or video
-o OUTPUT_PATH, --output OUTPUT_PATH select output file or directory
--frame-processor FRAME_PROCESSOR [FRAME_PROCESSOR ...] frame processors (choices: face_swapper, face_enhancer, ...)
--keep-fps keep target fps
--keep-frames keep temporary frames
--skip-audio skip target audio
--many-faces process every face
--reference-face-position REFERENCE_FACE_POSITION position of the reference face
--reference-frame-number REFERENCE_FRAME_NUMBER number of the reference frame
--similar-face-distance SIMILAR_FACE_DISTANCE face distance used for recognition
--video-encoder {libx264,libx265,libvpx-vp9} adjust output video encoder
--video-quality [0-51] adjust output video quality
--max-memory MAX_MEMORY maximum amount of RAM in GB
--execution-provider {cpu} [{cpu} ...] available execution provider (choices: cpu, ...)
--execution-threads EXECUTION_THREADS number of execution threads
-v, --version show program's version number and exit
-h, --help show this help message and exit
-s SOURCE_PATH, --source SOURCE_PATH select an source image
-t TARGET_PATH, --target TARGET_PATH select an target image or video
-o OUTPUT_PATH, --output OUTPUT_PATH select output file or directory
--frame-processor FRAME_PROCESSOR [FRAME_PROCESSOR ...] frame processors (choices: face_swapper, face_enhancer, ...)
--keep-fps keep target fps
--keep-frames keep temporary frames
--skip-audio skip target audio
--many-faces process every face
--reference-face-position REFERENCE_FACE_POSITION position of the reference face
--reference-frame-number REFERENCE_FRAME_NUMBER number of the reference frame
--similar-face-distance SIMILAR_FACE_DISTANCE face distance used for recognition
--temp-frame-format {jpg,png} image format used for frame extraction
--temp-frame-quality [1-100] image quality used for frame extraction
--output-video-encoder {libx264,libx265,libvpx-vp9,h264_nvenc,hevc_nvenc} encoder used for the output video
--output-video-quality [1-100] quality used for the output video
--max-memory MAX_MEMORY maximum amount of RAM in GB
--execution-provider {cpu} [{cpu} ...] available execution provider (choices: cpu, ...)
--execution-threads EXECUTION_THREADS number of execution threads
-v, --version show program's version number and exit
```

Looking for a CLI mode? Using the `-s/--source` argument will make the program run in CLI mode.
Using the `-s/--source`, `-t/--target` and `-o/--output` arguments will run the program in headless mode.
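
As a rough illustration of headless mode, the sketch below assembles such a command line in Python. The file names and the `build_headless_command` helper are hypothetical; only `run.py` and the `-s`, `-t`, `-o` and `--keep-fps` options are taken from the option list above.

```python
import sys
from typing import List, Optional


def build_headless_command(source: str, target: str, output: str,
                           extra: Optional[List[str]] = None) -> List[str]:
    """Build an argv list that starts run.py with source, target and output set."""
    command = [sys.executable, 'run.py', '-s', source, '-t', target, '-o', output]
    command.extend(extra or [])
    return command


if __name__ == '__main__':
    # Placeholder file names; pass the list to subprocess.run() to start the job.
    print(build_headless_command('face.jpg', 'target.mp4', 'output.mp4', ['--keep-fps']))
```

Building the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.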

## Credits

- [henryruhs](https://github.com/henryruhs): for being an irreplaceable contributor to the project
- [ffmpeg](https://ffmpeg.org/): for making video related operations easy
- [deepinsight](https://github.com/deepinsight): for their [insightface](https://github.com/deepinsight/insightface) project which provided a well-made library and models.
Binary file modified gui-demo.png
27 changes: 18 additions & 9 deletions roop/core.py
@@ -44,8 +44,10 @@ def parse_args() -> None:
program.add_argument('--reference-face-position', help='position of the reference face', dest='reference_face_position', type=int, default=0)
program.add_argument('--reference-frame-number', help='number of the reference frame', dest='reference_frame_number', type=int, default=0)
program.add_argument('--similar-face-distance', help='face distance used for recognition', dest='similar_face_distance', type=float, default=0.85)
program.add_argument('--video-encoder', help='adjust output video encoder', dest='video_encoder', default='libx264', choices=['libx264', 'libx265', 'libvpx-vp9'])
program.add_argument('--video-quality', help='adjust output video quality', dest='video_quality', type=int, default=18, choices=range(52), metavar='[0-51]')
program.add_argument('--temp-frame-format', help='image format used for frame extraction', dest='temp_frame_format', default='png', choices=['jpg', 'png'])
program.add_argument('--temp-frame-quality', help='image quality used for frame extraction', dest='temp_frame_quality', type=int, default=0, choices=range(100), metavar='[1-100]')
program.add_argument('--output-video-encoder', help='encoder used for the output video', dest='output_video_encoder', default='libx264', choices=['libx264', 'libx265', 'libvpx-vp9', 'h264_nvenc', 'hevc_nvenc'])
program.add_argument('--output-video-quality', help='quality used for the output video', dest='output_video_quality', type=int, default=35, choices=range(100), metavar='[1-100]')
program.add_argument('--max-memory', help='maximum amount of RAM in GB', dest='max_memory', type=int)
program.add_argument('--execution-provider', help='available execution provider (choices: cpu, ...)', dest='execution_provider', default=['cpu'], choices=suggest_execution_providers(), nargs='+')
program.add_argument('--execution-threads', help='number of execution threads', dest='execution_threads', type=int, default=suggest_execution_threads())
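
The two new quality options pair `choices=range(100)` with the metavar `[1-100]`. The sketch below is a stand-alone check of what `argparse` accepts for that combination, not part of the PR. The `demo` parser and the `accepts` helper are hypothetical, and the metavar is left out because it only affects the help text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog='demo')
    # Same shape as the new quality options: int type, choices=range(100).
    parser.add_argument('--output-video-quality', dest='output_video_quality',
                        type=int, default=35, choices=range(100))
    return parser


def accepts(value: str) -> bool:
    """Return True if argparse accepts the value for --output-video-quality."""
    try:
        build_parser().parse_args(['--output-video-quality', value])
        return True
    except SystemExit:
        # argparse reports an invalid choice by exiting with status 2.
        return False


if __name__ == '__main__':
    for candidate in ('0', '35', '99', '100'):
        print(candidate, accepts(candidate))
```

Since `range(100)` covers 0 through 99, the accepted values differ from the displayed `[1-100]` at both ends.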
@@ -65,8 +67,10 @@ def parse_args() -> None:
roop.globals.reference_face_position = args.reference_face_position
roop.globals.reference_frame_number = args.reference_frame_number
roop.globals.similar_face_distance = args.similar_face_distance
roop.globals.video_encoder = args.video_encoder
roop.globals.video_quality = args.video_quality
roop.globals.temp_frame_format = args.temp_frame_format
roop.globals.temp_frame_quality = args.temp_frame_quality
roop.globals.output_video_encoder = args.output_video_encoder
roop.globals.output_video_quality = args.output_video_quality
roop.globals.max_memory = args.max_memory
roop.globals.execution_providers = decode_execution_providers(args.execution_provider)
roop.globals.execution_threads = args.execution_threads
@@ -151,7 +155,7 @@ def start() -> None:
# process image to videos
if predict_video(roop.globals.target_path):
destroy()
update_status('Creating temp resources...')
update_status('Creating temporary resources...')
create_temp(roop.globals.target_path)
# extract frames
if roop.globals.keep_fps:
@@ -163,10 +167,14 @@ def start() -> None:
extract_frames(roop.globals.target_path)
# process frame
temp_frame_paths = get_temp_frame_paths(roop.globals.target_path)
for frame_processor in get_frame_processors_modules(roop.globals.frame_processors):
update_status('Progressing...', frame_processor.NAME)
frame_processor.process_video(roop.globals.source_path, temp_frame_paths)
frame_processor.post_process()
if temp_frame_paths:
for frame_processor in get_frame_processors_modules(roop.globals.frame_processors):
update_status('Progressing...', frame_processor.NAME)
frame_processor.process_video(roop.globals.source_path, temp_frame_paths)
frame_processor.post_process()
else:
update_status('Frames not found...')
return
# create video
if roop.globals.keep_fps:
fps = detect_fps(roop.globals.target_path)
@@ -186,6 +194,7 @@ def start() -> None:
update_status('Restoring audio might cause issues as fps are not kept...')
restore_audio(roop.globals.target_path, roop.globals.output_path)
# clean temp
update_status('Cleaning temporary resources...')
clean_temp(roop.globals.target_path)
# validate video
if is_video(roop.globals.target_path):
6 changes: 4 additions & 2 deletions roop/globals.py
@@ -12,8 +12,10 @@
reference_face_position = None
reference_frame_number = None
similar_face_distance = None
video_encoder = None
video_quality = None
temp_frame_format = None
temp_frame_quality = None
output_video_encoder = None
output_video_quality = None
max_memory = None
execution_providers: List[str] = []
execution_threads = None
2 changes: 1 addition & 1 deletion roop/metadata.py
@@ -1,2 +1,2 @@
name = 'roop'
version = '1.2.0'
version = '1.3.0'
6 changes: 6 additions & 0 deletions roop/processors/frame/face_enhancer.py
@@ -60,6 +60,12 @@ def post_process() -> None:

def enhance_face(target_face: Face, temp_frame: Frame) -> Frame:
start_x, start_y, end_x, end_y = map(int, target_face['bbox'])
padding_x = int((end_x - start_x) * 0.5)
padding_y = int((end_y - start_y) * 0.5)
start_x = max(0, start_x - padding_x)
start_y = max(0, start_y - padding_y)
end_x = max(0, end_x + padding_x)
end_y = max(0, end_y + padding_y)
temp_face = temp_frame[start_y:end_y, start_x:end_x]
if temp_face.size:
with THREAD_SEMAPHORE:
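
The padding added to `enhance_face` grows the face box by half its width and height on each side. The hunk clamps all four edges with `max(0, ...)`; assuming the frame is a NumPy array, slicing past the end is silently truncated, so the crop still works. The sketch below is a hypothetical variant, not the PR's code, that clamps the far edges to the frame size explicitly so the returned coordinates stay valid for later use. `pad_bounding_box` and its parameters are illustrative names.

```python
from typing import Tuple

BoundingBox = Tuple[int, int, int, int]


def pad_bounding_box(bbox: BoundingBox, frame_height: int, frame_width: int,
                     ratio: float = 0.5) -> BoundingBox:
    """Grow a (start_x, start_y, end_x, end_y) box by ratio per side, kept inside the frame."""
    start_x, start_y, end_x, end_y = bbox
    padding_x = int((end_x - start_x) * ratio)
    padding_y = int((end_y - start_y) * ratio)
    # Near edges are clamped at 0, as in the diff.
    start_x = max(0, start_x - padding_x)
    start_y = max(0, start_y - padding_y)
    # Assumption: far edges are clamped to the frame size instead of max(0, ...).
    end_x = min(frame_width, end_x + padding_x)
    end_y = min(frame_height, end_y + padding_y)
    return start_x, start_y, end_x, end_y


if __name__ == '__main__':
    print(pad_bounding_box((100, 50, 200, 150), frame_height=480, frame_width=640))
```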
26 changes: 17 additions & 9 deletions roop/utilities.py
@@ -12,16 +12,16 @@

import roop.globals

TEMP_FILE = 'temp.mp4'
TEMP_DIRECTORY = 'temp'
TEMP_VIDEO_FILE = 'temp.mp4'

# monkey patch ssl for mac
if platform.system().lower() == 'darwin':
ssl._create_default_https_context = ssl._create_unverified_context


def run_ffmpeg(args: List[str]) -> bool:
commands = ['ffmpeg', '-hide_banner', '-hwaccel', 'auto', '-loglevel', roop.globals.log_level]
commands = ['ffmpeg', '-hide_banner', '-loglevel', roop.globals.log_level]
commands.extend(args)
try:
subprocess.check_output(commands, stderr=subprocess.STDOUT)
@@ -42,27 +42,35 @@ def detect_fps(target_path: str) -> float:
return 30


def extract_frames(target_path: str, fps: float = 30) -> None:
def extract_frames(target_path: str, fps: float = 30) -> bool:
temp_directory_path = get_temp_directory_path(target_path)
run_ffmpeg(['-i', target_path, '-pix_fmt', 'rgb24', '-vf', 'fps=' + str(fps), os.path.join(temp_directory_path, '%04d.png')])
temp_frame_quality = roop.globals.temp_frame_quality * 31 // 100
return run_ffmpeg(['-hwaccel', 'auto', '-i', target_path, '-q:v', str(temp_frame_quality), '-pix_fmt', 'rgb24', '-vf', 'fps=' + str(fps), os.path.join(temp_directory_path, '%04d.' + roop.globals.temp_frame_format)])


def create_video(target_path: str, fps: float = 30) -> None:
def create_video(target_path: str, fps: float = 30) -> bool:
temp_output_path = get_temp_output_path(target_path)
temp_directory_path = get_temp_directory_path(target_path)
run_ffmpeg(['-r', str(fps), '-i', os.path.join(temp_directory_path, '%04d.png'), '-c:v', roop.globals.video_encoder, '-crf', str(roop.globals.video_quality), '-pix_fmt', 'yuv420p', '-vf', 'colorspace=bt709:iall=bt601-6-625:fast=1', '-y', temp_output_path])
output_video_quality = (roop.globals.output_video_quality + 1) * 51 // 100
commands = ['-hwaccel', 'auto', '-r', str(fps), '-i', os.path.join(temp_directory_path, '%04d.' + roop.globals.temp_frame_format), '-c:v', roop.globals.output_video_encoder]
if roop.globals.output_video_encoder in ['libx264', 'libx265', 'libvpx']:
commands.extend(['-crf', str(output_video_quality)])
if roop.globals.output_video_encoder in ['h264_nvenc', 'hevc_nvenc']:
commands.extend(['-cq', str(output_video_quality)])
commands.extend(['-pix_fmt', 'yuv420p', '-vf', 'colorspace=bt709:iall=bt601-6-625:fast=1', '-y', temp_output_path])
return run_ffmpeg(commands)
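
To make the new quality arithmetic easier to follow, the sketch below repeats the two scaling expressions and the encoder checks from this hunk as stand-alone functions. The function names are hypothetical; the formulas and encoder lists are copied from the diff as written. The default of 35 maps to 18, the previous `--video-quality` default. In ffmpeg, a lower `-crf` means higher quality, so a larger option value gives a lower-quality output under this mapping. `libvpx-vp9`, the name offered by `--output-video-encoder`, matches neither list, so no quality flag is added for it.

```python
from typing import List


def to_frame_quality(temp_frame_quality: int) -> int:
    """Scale the 0-100 option to the -q:v value, using the diff's integer arithmetic."""
    return temp_frame_quality * 31 // 100


def to_video_quality(output_video_quality: int) -> int:
    """Scale the 0-100 option to the value passed to -crf or -cq."""
    return (output_video_quality + 1) * 51 // 100


def quality_arguments(encoder: str, output_video_quality: int) -> List[str]:
    """Pick the quality flag for an encoder, following the two membership checks in the diff."""
    value = str(to_video_quality(output_video_quality))
    if encoder in ['libx264', 'libx265', 'libvpx']:
        return ['-crf', value]
    if encoder in ['h264_nvenc', 'hevc_nvenc']:
        return ['-cq', value]
    return []


if __name__ == '__main__':
    for encoder in ('libx264', 'hevc_nvenc', 'libvpx-vp9'):
        print(encoder, quality_arguments(encoder, 35))
```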


def restore_audio(target_path: str, output_path: str) -> None:
temp_output_path = get_temp_output_path(target_path)
done = run_ffmpeg(['-i', temp_output_path, '-i', target_path, '-c:v', 'copy', '-map', '0:v:0', '-map', '1:a:0', '-y', output_path])
done = run_ffmpeg(['-hwaccel', 'auto', '-i', temp_output_path, '-i', target_path, '-c:v', 'copy', '-map', '0:v:0', '-map', '1:a:0', '-y', output_path])
if not done:
move_temp(target_path, output_path)


def get_temp_frame_paths(target_path: str) -> List[str]:
temp_directory_path = get_temp_directory_path(target_path)
return glob.glob((os.path.join(glob.escape(temp_directory_path), '*.png')))
return glob.glob((os.path.join(glob.escape(temp_directory_path), '*.' + roop.globals.temp_frame_format)))
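
The frame lookup now depends on the configured frame format, and `glob.escape` keeps a temp directory whose name contains glob metacharacters from being read as a pattern. A minimal sketch of that behaviour follows. `find_frames` and the directory name are hypothetical, and the `sorted` call is an addition here, since `glob.glob` does not guarantee an order.

```python
import glob
import os
import tempfile
from typing import List


def find_frames(directory: str, frame_format: str) -> List[str]:
    """List extracted frames with the given extension, sorted by file name."""
    pattern = os.path.join(glob.escape(directory), '*.' + frame_format)
    return sorted(glob.glob(pattern))


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as root:
        # Brackets are glob metacharacters, so this directory name needs escaping.
        directory = os.path.join(root, 'clip [1080p]')
        os.makedirs(directory)
        for name in ('0001.png', '0002.png', '0003.jpg'):
            open(os.path.join(directory, name), 'w').close()
        print([os.path.basename(path) for path in find_frames(directory, 'png')])
```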


def get_temp_directory_path(target_path: str) -> str:
@@ -73,7 +81,7 @@ def get_temp_directory_path(target_path: str) -> str:

def get_temp_output_path(target_path: str) -> str:
temp_directory_path = get_temp_directory_path(target_path)
return os.path.join(temp_directory_path, TEMP_FILE)
return os.path.join(temp_directory_path, TEMP_VIDEO_FILE)


def normalize_output_path(source_path: str, target_path: str, output_path: str) -> Optional[str]: