---
title: Upscalers
---
<!--
This file is part of sygil-webui (https://github.com/Sygil-Dev/sygil-webui/).
Copyright 2022 Sygil-Dev team.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
-->
Both versions of the Web UI include a series of image restorers and upscalers. They help users create outputs with restored features, such as better faces, or at larger resolutions than Stable Diffusion can output natively.
## GFPGAN
---
![](../images/GFPGAN.png)
GFPGAN is designed to restore faces in Stable Diffusion outputs. If you have ever tried to generate images with people in them, you know why a face restorer comes in handy. GFPGAN uses its own GAN to detect and restore the faces of subjects within an image. It greatly enhances the details in human faces, while also fixing issues with asymmetry or awkward-looking eyes.
If you want to use GFPGAN to improve generated faces, you need to download the models for it separately if you are on Windows or installing manually on Linux.
Download [GFPGANv1.3.pth](https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth) and put it
into the `/sygil-webui/models/gfpgan` directory after you have set up the conda environment for the first time.
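
The manual download step can also be scripted. The following is a minimal sketch, not part of the project: the helper names `model_destination` and `download_model` are hypothetical, and the relative `sygil-webui/models/gfpgan` path in the note below assumes the script runs from the directory that contains your checkout.

```python
from pathlib import Path
from urllib.request import urlretrieve

# URL copied from the download link above.
GFPGAN_URL = "https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth"


def model_destination(url: str, models_dir: Path) -> Path:
    """Return the path the downloaded file would get inside models_dir."""
    filename = url.rsplit("/", 1)[-1]  # last URL segment, e.g. GFPGANv1.3.pth
    return models_dir / filename


def download_model(url: str, models_dir: Path) -> Path:
    """Download url into models_dir unless the file is already present."""
    target = model_destination(url, models_dir)
    if not target.exists():
        models_dir.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, str(target))  # performs the actual network download
    return target
```

Calling `download_model(GFPGAN_URL, Path("sygil-webui/models/gfpgan"))` would fetch the file once and skip the download on later runs.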
## RealESRGAN
---
![](../images/RealESRGAN.png)
RealESRGAN is a 4x upscaler built into both versions of the Web UI. It uses its own GAN to upscale images while retaining their details. Two versions of RealESRGAN can be used: `RealESRGAN 4X` and `RealESRGAN 4X Anime`. Despite the names, don't hesitate to try either version when upscaling an image to see which works best for a given output.
If you want to use RealESRGAN to upscale your images, you need to download the models for it separately if you are on Windows or installing manually on Linux.
Download [RealESRGAN_x4plus.pth](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth) and [RealESRGAN_x4plus_anime_6B.pth](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth).
Put them into the `sygil-webui/models/realesrgan` directory after you have set up the conda environment for the first time.
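
To confirm that both files ended up in the right place, a small check like the one below may help. It is only a sketch: `missing_models` is a hypothetical helper, not something the Web UI provides, and the file names are taken from the download links above.

```python
from pathlib import Path
from typing import List, Sequence

# File names taken from the two download links above.
REALESRGAN_FILES = ("RealESRGAN_x4plus.pth", "RealESRGAN_x4plus_anime_6B.pth")


def missing_models(models_dir: Path, filenames: Sequence[str] = REALESRGAN_FILES) -> List[str]:
    """Return the expected model files that are not present in models_dir."""
    return [name for name in filenames if not (models_dir / name).is_file()]
```

For example, `missing_models(Path("sygil-webui/models/realesrgan"))` returns an empty list once both models are in place.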
## GoBig (Gradio only currently)
---
GoBig is a 2X upscaler that uses RealESRGAN to upscale the image and then slices it into small parts. Each part is diffused further by SD to create more detail. This is great for adding and increasing detail, but it will change the composition; it might also fix issues such as eyes. The settings are similar to Image2Image with regard to the strength and seed of the generation.
To use GoBig, you will need to download the RealESRGAN models as directed above.
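
The slicing step that GoBig performs can be pictured with a short sketch. This is an illustration only: the 512-pixel tile size, the 64-pixel overlap, and the function names are assumptions made for the example, and the project's actual GoBig code may slice the image differently.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def tile_starts(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of tiles that cover `length` pixels with some overlap."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    if tile >= length:
        return [0]  # a single tile already covers this axis
    step = tile - overlap
    starts = list(range(0, length - tile, step))
    starts.append(length - tile)  # final tile sits flush with the far edge
    return starts


def tile_boxes(width: int, height: int, tile: int = 512, overlap: int = 64) -> List[Box]:
    """Crop boxes, row by row, that together cover a width x height image."""
    boxes = []
    for top in tile_starts(height, tile, overlap):
        for left in tile_starts(width, tile, overlap):
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes
```

Each box could then be cropped, diffused on its own, and pasted back. Overlapping the tiles is one common way to hide seams between neighbouring parts.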
## Latent Diffusion Super Resolution - LDSR (Gradio only currently)
---
LDSR is a 4X upscaler that uses a Latent Diffusion model to upscale the image. It accentuates the details of an image without changing the composition. It might introduce sharpening, but it is great for textures or compositions with plenty of detail. However, it is slower and uses more VRAM.
If you want to use LDSR to upscale your images, you need to download the models for it separately if you are on Windows or installing manually on Linux.
Download the LDSR [project.yaml](https://heibox.uni-heidelberg.de/f/31a76b13ea27482981b4/?dl=1) and [model last.ckpt](https://heibox.uni-heidelberg.de/f/578df07c8fc04ffbadf3/?dl=1). Rename `last.ckpt` to `model.ckpt` and place both in the `sygil-webui/models/ldsr` directory after you have set up the conda environment for the first time.
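
The rename step is easy to forget, so here is a minimal sketch of it. It assumes the checkpoint was saved as `last.ckpt` inside the `ldsr` directory; `install_ldsr_checkpoint` is a hypothetical helper name, not part of the project.

```python
from pathlib import Path


def install_ldsr_checkpoint(ldsr_dir: Path) -> Path:
    """Rename last.ckpt to model.ckpt inside ldsr_dir and return the new path."""
    source = ldsr_dir / "last.ckpt"
    target = ldsr_dir / "model.ckpt"
    if target.exists():
        return target  # already renamed, nothing to do
    if not source.exists():
        raise FileNotFoundError(f"expected {source} to exist")
    source.rename(target)
    return target
```

Running `install_ldsr_checkpoint(Path("sygil-webui/models/ldsr"))` a second time is harmless, because an existing `model.ckpt` is left untouched.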
## GoLatent (Gradio only currently)
---
GoLatent is an 8X upscaler with high VRAM usage. It uses GoBig to add details and then uses a Latent Diffusion (LDSR) model to upscale the image. This results in less artifacting and sharpening. Use the settings to configure the GoBig step, which contributes to the result. Please note that this mode is considerably slower and uses significantly more VRAM.
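
The 8X figure follows from chaining the two stages: GoBig's 2X followed by LDSR's 4X. The sketch below works through that arithmetic using the nominal factors stated on this page; the `SCALE_FACTORS` table and `upscaled_size` function are hypothetical names for illustration, and actual output sizes may depend on the implementation.

```python
from typing import Tuple

# Nominal scale factors as stated on this page.
SCALE_FACTORS = {"RealESRGAN": 4, "GoBig": 2, "LDSR": 4, "GoLatent": 8}


def upscaled_size(width: int, height: int, upscaler: str) -> Tuple[int, int]:
    """Nominal output resolution after applying the named upscaler once."""
    factor = SCALE_FACTORS[upscaler]
    return width * factor, height * factor


# GoLatent chains GoBig and LDSR, so the factors multiply: 2 * 4 = 8.
assert SCALE_FACTORS["GoBig"] * SCALE_FACTORS["LDSR"] == SCALE_FACTORS["GoLatent"]
```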
To use GoLatent, you will need to download the appropriate LDSR models as described above.

---
## Future Additions
Currently, these are the main enhancers and upscalers used in the project, but more may be implemented in the future. Stay tuned!