stable-diffusion-webui/4.post-processing.md at a54f3596080cbf7792dcd874c961bb1b8b270e36

mirror of https://github.com/sd-webui/stable-diffusion-webui.git synced 2024-12-14 23:02:00 +03:00

pre-commit-ci[bot] a9bc7eae19 [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

2023-06-23 02:58:24 +00:00

4.8 KiB


title
Post Processing

Both versions of the Web UI include a set of image restorers and upscalers. They help users produce outputs with restored features, such as better faces, or at larger resolutions than Stable Diffusion can output natively.

GFPGAN

GFPGAN is designed to restore faces in Stable Diffusion outputs. If you have ever tried to generate images with people in them, you know why a face restorer comes in handy. GFPGAN uses its own GAN to detect and restore the faces of subjects within an image. It greatly enhances the details in human faces, while also fixing issues with asymmetry or awkward-looking eyes.

If you want to use GFPGAN to improve generated faces, you need to download its model separately if you are on Windows or installing manually on Linux. Download GFPGANv1.3.pth and put it into the /sygil-webui/models/gfpgan directory after you have set up the conda environment for the first time.
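As a quick sanity check after copying the file, a small script can confirm that the model sits where the paragraph above says it should. This is a minimal sketch and not part of the Web UI: the helper name `find_missing_models` and its `root` argument are hypothetical, and the path is simply the one quoted above, taken relative to the folder that contains your checkout.

```python
from pathlib import Path

# Expected location from the paragraph above, relative to the folder
# that contains the sygil-webui checkout (an assumption of this sketch).
GFPGAN_MODEL = Path("sygil-webui") / "models" / "gfpgan" / "GFPGANv1.3.pth"


def find_missing_models(root, expected=(GFPGAN_MODEL,)):
    """Return the expected model paths that do not exist under `root`."""
    root = Path(root)
    return [path for path in expected if not (root / path).is_file()]


if __name__ == "__main__":
    missing = find_missing_models(".")
    for path in missing:
        print(f"missing: {path}")
    if not missing:
        print("GFPGAN model found")
```

The `expected` tuple can be extended with the other model files named on this page if you want a single check for all of them.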

RealESRGAN

RealESRGAN is a 4X upscaler built into both versions of the Web UI. It uses its own GAN to upscale images while retaining their details. Two versions are available: RealESRGAN 4X and RealESRGAN 4X Anime. Despite the names, don't hesitate to try either version when upscaling an image to see which works best for a given output.

If you want to use RealESRGAN to upscale your images, you need to download its models separately if you are on Windows or installing manually on Linux. Download RealESRGAN_x4plus.pth and RealESRGAN_x4plus_anime_6B.pth. Put them into the sygil-webui/models/realesrgan directory after you have set up the conda environment for the first time.

GoBig (Gradio only currently)

GoBig is a 2X upscaler that uses RealESRGAN to upscale the image and then slices it into small parts. Each part is diffused further by SD to create more detail. This is great for adding and increasing detail, but it will change the composition. It might also fix issues such as malformed eyes. The settings are similar to those of Image2Image with regard to the strength and seed of the generation.
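The slicing step can be pictured with a short sketch. The code below is not GoBig's actual implementation. It only illustrates one common way to cut an upscaled image into parts. The function name `tile_boxes`, the 512-pixel tile size, and the 64-pixel overlap are all assumptions made for illustration. The source does not say whether GoBig overlaps its parts.

```python
def tile_boxes(width, height, tile=512, overlap=64):
    """Return (left, top, right, bottom) boxes that cover a width x height image.

    Tile size and overlap are illustrative defaults, not GoBig's real values.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    step = tile - overlap

    def starts(size):
        # An axis no longer than one tile needs a single tile at offset 0.
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, step))
        # Align the final tile with the far edge so every pixel is covered.
        positions.append(size - tile)
        return positions

    return [
        (left, top, min(left + tile, width), min(top + tile, height))
        for top in starts(height)
        for left in starts(width)
    ]


if __name__ == "__main__":
    # A 512x512 output upscaled 2X becomes 1024x1024.
    for box in tile_boxes(1024, 1024):
        print(box)
```

With these assumed defaults, a 1024x1024 image is cut into nine overlapping tiles. Each tile would then be passed through an Image2Image-style diffusion step and pasted back. Because every tile is re-diffused, details can change from the original.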

To use GoBig, you will need to download the RealESRGAN models as directed above.

Latent Diffusion Super Resolution - LDSR (Gradio only currently)

LDSR is a 4X upscaler that uses a Latent Diffusion model to upscale the image. This will accentuate the details of an image without changing the composition. It might introduce sharpening, but it is great for textures or compositions with plenty of details. However, it is slow and has high VRAM usage.

If you want to use LDSR to upscale your images, you need to download its models separately if you are on Windows or installing manually on Linux. Download the LDSR project.yaml and the model file last.ckpt. Rename last.ckpt to model.ckpt and place both files in the sygil-webui/models/ldsr directory after you have set up the conda environment for the first time.
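The copy-and-rename step above can be scripted. The following is a minimal sketch, assuming both files have already been downloaded into one folder under the names project.yaml and last.ckpt. The function `install_ldsr_files` and its two arguments are hypothetical and not part of the project.

```python
import shutil
from pathlib import Path


def install_ldsr_files(download_dir, ldsr_dir):
    """Copy project.yaml and last.ckpt into ldsr_dir, renaming the checkpoint."""
    download_dir, ldsr_dir = Path(download_dir), Path(ldsr_dir)
    ldsr_dir.mkdir(parents=True, exist_ok=True)
    # The config keeps its name; only the checkpoint is renamed to model.ckpt.
    shutil.copy2(download_dir / "project.yaml", ldsr_dir / "project.yaml")
    shutil.copy2(download_dir / "last.ckpt", ldsr_dir / "model.ckpt")
    # Report what ended up in the target directory.
    return sorted(p.name for p in ldsr_dir.iterdir())
```

For example, `install_ldsr_files(Path.home() / "Downloads", "sygil-webui/models/ldsr")` would populate the directory named above, assuming your downloads live in that folder. The files are copied, so the originals stay in place.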

GoLatent (Gradio only currently)

GoLatent is an 8X upscaler with high VRAM usage. It uses GoBig to add details and then uses a Latent Diffusion (LDSR) model to upscale the image. This results in fewer artifacts and less sharpening. Use the settings to adjust the GoBig stage, which contributes to the final result. Please note that this mode is considerably slower and uses significantly more VRAM.

To use GoLatent, you will need to download the appropriate LDSR models as described above.
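To make the scale factors quoted in the sections above concrete, the sketch below turns them into output sizes. The table `SCALE_FACTORS` and the helper `upscaled_size` are hypothetical names used only for this illustration. The factors themselves are the ones stated on this page.

```python
# Scale factors as stated in the sections above.
SCALE_FACTORS = {"GoBig": 2, "RealESRGAN": 4, "LDSR": 4, "GoLatent": 8}


def upscaled_size(width, height, upscaler):
    """Return the (width, height) produced by the named upscaler."""
    factor = SCALE_FACTORS[upscaler]
    return width * factor, height * factor


if __name__ == "__main__":
    for name in SCALE_FACTORS:
        print(name, upscaled_size(512, 512, name))
```

A 512x512 image run through GoLatent comes out at 4096x4096, which is 64 times as many pixels. That is consistent with the warnings about speed and VRAM. The 8X figure also matches a 2X GoBig pass followed by a 4X LDSR pass, although the page does not spell out that breakdown.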

Future Additions

Currently, these are the main enhancers and upscalers used in the project, but more may be implemented in the future. Stay tuned!
