ENFUGUE Web UI v0.3.2
Installation and Running
A script is provided for Windows and Linux machines to install, update, and run ENFUGUE. Copy the relevant command below and answer the on-screen prompts to choose your installation type and install optional dependencies.
Windows
Access the command prompt from the start menu by searching for "command." Alternatively, hold the Windows key on your keyboard and press X, then press R or click Run, then type cmd and press Enter or click OK.
curl https://raw.githubusercontent.com/painebenjamin/app.enfugue.ai/main/enfugue.bat -o enfugue.bat
.\enfugue.bat
Linux
Access a command shell using your preferred method and execute the following.
curl https://raw.githubusercontent.com/painebenjamin/app.enfugue.ai/main/enfugue.sh -o enfugue.sh
chmod u+x enfugue.sh
./enfugue.sh
Both of these commands accept the same flags.
USAGE: enfugue.(bat|sh) [OPTIONS]
Options:
--help Display this help message.
--conda / --portable Automatically set installation type (do not prompt.)
--update / --no-update Automatically apply or skip updates (do not prompt.)
--mmpose / --no-mmpose Automatically install or skip installing MMPose (do not prompt.)
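For example, to install the portable distribution, apply updates, and skip MMPose without any prompts, you can combine the flags above (Linux shown; the Windows batch file accepts the same flags):
./enfugue.sh --portable --update --no-mmpose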
Windows/Linux Manual Installation
If you want to install without using the installation scripts, see this Wiki page.
MacOS
Automatic installers are coming! For now, please follow this manual installation method.
Download enfugue-server-0.3.2-macos-ventura-mps-x86_64.tar.gz, then double-click it to extract the package. When you run the application using the command below, your Mac will warn you about running downloaded packages, and you will have to perform an administrator override to allow it to run; you will be prompted to do this. To avoid the prompt, you can run an included script like so:
./enfugue-server/unquarantine.sh
This command finds all the files in the installation and removes the com.apple.quarantine xattr from each. This does not require administrator privileges. After doing this (or if you would rather grant the override,) run the server with:
./enfugue-server/enfugue.sh
Note: while the MacOS packages are compiled on x86 machines, they are tested and designed for the new M1/M2 ARM machines thanks to Rosetta, Apple's machine code translation system.
New Features
1. Multi-Server, Flexible Domain Routing
- ENFUGUE now supports running multiple servers at once, listening on different ports and optionally different hosts and protocols.
- By default, ENFUGUE now runs two servers: one on port 45554 over HTTPS (as it has been) and a second on port 45555 over HTTP.
- If accessing via https://app.enfugue.ai:45554 does not work for your networking setup, you can now connect to ENFUGUE using http://127.0.0.1:45555 or any other IP address/hostname that resolves to the machine running ENFUGUE.
- Configuration syntax remains the same; however, the host, domain, port, secure, cert, and key keys can now accept lists/arrays.
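As a minimal sketch of what this enables (the YAML layout and key nesting here are illustrative assumptions, not taken from these notes; consult the configuration documentation for the authoritative format), the two default servers could be declared with parallel lists:
server:
  host: [0.0.0.0, 0.0.0.0]       # assumed nesting; one entry per server
  domain: [app.enfugue.ai, null]
  port: [45554, 45555]
  secure: [true, false]          # first server HTTPS, second HTTP
  cert: [cert.pem, null]         # TLS files apply only to the HTTPS server
  key: [key.pem, null]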
2. SDXL Turbo + Reduced Overhead (~25% Faster Single-Image Generation)
- Fully integrated into all workflows; derivatives are also supported (turbo fine-tuned models like Dreamshaper Turbo, etc.)
- The round-trip time between the browser and Stable Diffusion has been reduced by about 90%.
- This translates to significant speed gains for single-image generation, with diminishing returns for multiple/large image generations.
3. Prompt Weighting Syntax Change
- The Compel library has been integrated, which provides much better translation of prompts to embeddings. This improves the control you can exert over your images using text alone.
- See here for syntax documentation.
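As a brief illustration of Compel's documented weighting syntax (the prompt itself is an invented example, not from these notes): appending + or - to a term raises or lowers its weight, and (term)1.4 sets an explicit weight.
a cat playing with a ball++ in the forest, (highly detailed)1.4, blurry--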
4. Stable Video Diffusion
- Standalone img2vid integration is available through a canvas shortcut (for ENFUGUE-generated images) or through the Extras menu (for external/other images.)
- As more workflows are enabled through SVD, it will become further integrated.
- This model is licensed under a non-commercial license. As such, output of this tool cannot be used for commercial purposes without first contacting Stability AI and acquiring a commercial license.
5. AnimateDiff V3, Sparse ControlNet, Frame Start Input
- AnimateDiff version 3 has been released and is now the default motion module when creating animations using Stable Diffusion 1.5.
- The authors of AnimateDiff additionally created Sparse ControlNet, a new kind of ControlNet that allows for placing images on keyframes, and letting the ControlNet help interpolate between the frames. Select Sparse RGB or Sparse Scribble as your ControlNet in a Control Unit to use it.
- To control the frame on which the image occurs, set the Starting Frame value in the Layer Options menu. This field is 1-indexed (i.e. the first frame is one.) This will be changed in the next update (see What's Next below.)
6. FreeInit/Iterative Denoising/Ablation
- FreeInit has been implemented, allowing for repeated denoising of animations.
- This can stabilize animations when using long windows or large multipliers that would otherwise result in incoherent videos.
- This has only been observed to work with DDIM sampling. If you find another scheduler that works, please let me know!
7. SDXL Inpainting Improvements
- Inpainting for SDXL has been improved, removing issues with color grading.
- Additionally, outpainting for SDXL has been improved. Expect more coherent and higher quality results when extending images beyond their borders.
8. IP Adapter Full, Face Isolate
- A new IP adapter model has been added for 1.5: Full Face.
- Additionally, an option has been added alongside the IP adapter scale that allows you to isolate the image to the face. This uses a face detection model to remove all image data except the face prior to sending it to the IP adapter.
9. New Interpolator Implementation
- The previous interpolator, which was implemented in TensorFlow, has been replaced by an identical one implemented in PyTorch. This is very valuable as it has removed the dependence on TensorFlow entirely.
- You should not notice any difference in behavior between the two implementations.
10. Caption Upsampler
- Turns prompts into more descriptive ones using an open-source LLM.
- Only H4 Zephyr 7B is available as the LLM at this time. The first time you use the upsampler, this model will be downloaded. It is approximately 10 GB in size and requires around 12 GB of VRAM to run.
11. Split/Unified Layout
Using a split layout allows you to zoom in to inpaint on one side while still seeing the whole output on the other.
- Current behavior is now termed "Unified Layout," where the input canvas and output samples occupy the same space and you can swap between them.
- Now you can also split the viewport in half, vertically or horizontally (adjustable,) with one side for the input canvas and the other for the output samples.
12. Real-time
- To go along with the above, you can now also enable real-time image generation.
- This will render an image any time a change is made to the global inputs or any layered input.
- Intermediate images and progress reporting are disabled when real-time is enabled.
- You can expect images in roughly one-second intervals when using Turbo or LCM at a reasonable size with no additional processing.
- All other tools are enabled. Be aware that the same rules for processing speed apply here, so the more kinds of inputs you add (ControlNets, IP adapters) and output processing (face fixing, upscaling,) the higher the latency will be.
13. Control Improvements
- The 'Space' key now functions to pan the canvas, similar to middle-mouse or ctrl-left-mouse.
- Touch events are now properly bound on the canvas, enabling touchscreen use.
- Scroll-based zooming with touchpads has been significantly slowed down to make it more controllable.
14. New Downloadable Models
Checkpoints
- PlaygroundAI's Playground V2 model
- Alex Izquierdo's OpenDallE
- Segmind's Vega
LoRA
- Segmind's VegaRT
- SDXL Offset LoRA (sd_xl_offset_example-lora_1.0.safetensors)
- Direct Preference Optimization (DPO) for 1.5 and XL
15. Other Changes
- "Use Tiled Diffusion/VAE" is now two separate inputs, "Use Tiled UNet" and "Use Tiled VAE." There are situations where you will hit CUDA out-of-memory errors during decoding, but not during inference. This will enable you to tile the decoding (just select 'Use Tiled VAE') without also having to tile the inference.
- Classifier-free guidance (guidance scale <= 1.0) was broken for SDXL; this has been fixed.
- Result menus were accidentally removed from the top menu bar; this has been fixed.
- An issue with reordering layers has been fixed.
- The prompt travel interface has been improved to perform better when frame counts are larger than 64.
- Images and videos have been given offset variables to allow you to finely position them within their frames.
- You are no longer prompted to keep state when clicking 'Edit Image' or 'Edit Video;' state is now always kept.
- You are now prompted in the front-end before you send an invocation with an image that does not have any assigned role.
- The guidance scale minimum has been restored from 1.0 to 0.0.
- The number of inference steps minimum has been reduced from 2 to 1.
- The default number of diffusion steps when using upscaling with denoising has been reduced from 100 to 40, and the default guidance scale has been reduced from 12 to 10.
Full Changelog: 0.3.1...0.3.2
What's Next
Planned for 0.3.3
1. Images/Video on the Timeline
The Prompt Travel interface will be expanded to allow images and video to be manually placed on the timeline.
2. Audio
Audio will additionally be added to the timeline, and will be an input for audio-reactive diffusion.
3. Additional Model Support
- IP Adapter + FaceID
- SVD ControlNet
- PIA (Personalized Image Animator)
Planned for 0.4.0
1. 3D/4D Diffusion
Support for Stable Zero123 and other 3D model generators.