mirror of
https://github.com/sd-webui/stable-diffusion-webui.git
synced 2024-12-16 08:22:33 +03:00
de95a09593
205 lines
11 KiB
Markdown
### [MAIN REPO](https://github.com/hlky/stable-diffusion)

# This repo is for development; there may be bugs and new features

# Notice

- New LDSR settings have been added to Image Lab. To use them, please re-clone the LDSR repo (instructions below) to ensure you have the latest version.

## Feature request? Use [discussions](https://github.com/hlky/stable-diffusion-webui/discussions)

### Questions about **_[Upscalers](https://github.com/hlky/stable-diffusion-webui/wiki/Upscalers)_**?

### Questions about **_[Optimized mode](https://github.com/hlky/stable-diffusion-webui/wiki/Optimized-mode)_**?

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/altryne/sd-webui-colab/blob/main/Stable_Diffusion_WebUi_Altryne.ipynb)

## More documentation about features, troubleshooting, and common issues coming soon

### Want to help with documentation? Documented something? Use [Discussions](https://github.com/hlky/stable-diffusion-webui/discussions)
Features:

* Gradio GUI: Idiot-proof, fully featured frontend for both txt2img and img2img generation
* No more manually typing parameters; now all you have to do is write your prompt and adjust sliders
* :fire: :fire: Optimized support!! :fire: :fire:
* 🔥 NEW! [webui.cmd](https://github.com/hlky/stable-diffusion) updates with any changes in the environment.yaml file, so the environment will always be up to date as long as you get the new environment.yaml file 🔥
  :fire: No need to remove the environment, delete the src folder, and create it again; MUCH simpler! 🔥
* GFPGAN Face Correction 🔥: [Download the model](https://github.com/hlky/stable-diffusion-webui#gfpgan). Automatically corrects distorted faces with a built-in GFPGAN option; fixes them in less than half a second
* RealESRGAN Upscaling 🔥: [Download the models](https://github.com/hlky/stable-diffusion-webui#realesrgan). Boosts the resolution of images with a built-in RealESRGAN option
* :computer: ESRGAN/GFPGAN on CPU support :computer:
* Textual inversion 🔥: [info](https://textual-inversion.github.io/) - requires enabling; see [here](https://github.com/hlky/sd-enable-textual-inversion). The script works as usual without it enabled
* Advanced img2img editor :art: :fire: :art:
* :fire::fire: Mask and crop :fire::fire:
* Mask painting (NEW) 🖌️: Powerful tool for re-generating only specific parts of an image you want to change
* More k_diffusion samplers 🔥🔥: Far higher quality outputs than the default sampler, with less distortion and more accuracy
* txt2img samplers: "DDIM", "PLMS", 'k_dpm_2_a', 'k_dpm_2', 'k_euler_a', 'k_euler', 'k_heun', 'k_lms'
* img2img samplers: "DDIM", 'k_dpm_2_a', 'k_dpm_2', 'k_euler_a', 'k_euler', 'k_heun', 'k_lms'
* Loopback (NEW) ➿: Automatically feed the last generated sample back into img2img
* Prompt Weighting (NEW) 🏋️: Adjust the strength of different terms in your prompt
* :fire: GPU device selectable with `--gpu <id>` :fire:
* Memory Monitoring 🔥: Shows VRAM usage and generation time after outputting
* Word Seeds 🔥: Use words instead of seed numbers
* CFG: Classifier-free guidance scale, a feature for fine-tuning your output
* Launcher Automatic 👑🔥 shortcut to load the model, no more typing in Conda
* Lighter on VRAM: 512x512 img2img & txt2img tested working on 6 GB
* and ????
# Stable Diffusion web UI

A browser interface based on the Gradio library for Stable Diffusion.

![](images/txt2img.jpg)

![](images/img2img.jpg)

![](images/gfpgan.jpg)

![](images/esrgan.jpg)
### GFPGAN

If you want to use GFPGAN to improve generated faces, you need to install it separately.
Download [GFPGANv1.3.pth](https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth) and put it
into the `/stable-diffusion/src/gfpgan/experiments/pretrained_models` directory.

### RealESRGAN

Download [RealESRGAN_x4plus.pth](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth) and [RealESRGAN_x4plus_anime_6B.pth](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth).
Put them into the `stable-diffusion/src/realesrgan/experiments/pretrained_models` directory.
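As a quick sanity check after placing the model files, a short script like this can confirm they are where the web UI expects them (the paths follow the directories described above; run it from the repo root, and adjust the paths if your layout differs):

```python
from pathlib import Path

# Expected model locations relative to the stable-diffusion repo root
# (assumed layout from the install instructions above).
expected = [
    Path("src/gfpgan/experiments/pretrained_models/GFPGANv1.3.pth"),
    Path("src/realesrgan/experiments/pretrained_models/RealESRGAN_x4plus.pth"),
    Path("src/realesrgan/experiments/pretrained_models/RealESRGAN_x4plus_anime_6B.pth"),
]

# Report any model file that has not been downloaded and placed yet.
missing = [str(p) for p in expected if not p.exists()]
for p in missing:
    print("missing:", p)
```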
### LDSR

Quadruple your resolution using Latent Diffusion. To install:

- Git clone https://github.com/devilismyfriend/latent-diffusion into your stable-diffusion-main/src/ folder
- Rename the latent-diffusion-main folder to latent-diffusion
- If on Windows: run download_models.bat to download the required model files
- Otherwise, to install the model manually, download [project.yaml](https://heibox.uni-heidelberg.de/f/31a76b13ea27482981b4/?dl=1) and [last.ckpt](https://heibox.uni-heidelberg.de/f/578df07c8fc04ffbadf3/?dl=1), and rename last.ckpt to model.ckpt
- Place both under stable-diffusion-main/src/latent-diffusion/experiments/pretrained_models/
- Make sure you have both project.yaml and model.ckpt in that folder and path
- LDSR should be working now
### Web UI

When launching, you may get a very long warning message about some weights not being used. You can safely ignore it.
After a while, you will get a message like this:

```
Running on local URL: http://127.0.0.1:7860/
```

Open the URL in a browser, and you are good to go.
## Features

The script creates a web UI for Stable Diffusion's txt2img and img2img scripts. The following features have been added
that are not in the original script.

### GFPGAN

Lets you improve faces in pictures using the GFPGAN model. There is a checkbox in every tab to use GFPGAN at 100%, and
also a separate tab that just allows you to use GFPGAN on any picture, with a slider that controls how strong the effect is.

![](images/GFPGAN.png)
### RealESRGAN

Lets you double the resolution of generated images. There is a checkbox in every tab to use RealESRGAN, and you can choose between the regular upscaler and the anime version.
There is also a separate tab for using RealESRGAN on any picture.

![](images/RealESRGAN.png)

### Sampling method selection

txt2img samplers: "DDIM", "PLMS", 'k_dpm_2_a', 'k_dpm_2', 'k_euler_a', 'k_euler', 'k_heun', 'k_lms'

img2img samplers: "DDIM", 'k_dpm_2_a', 'k_dpm_2', 'k_euler_a', 'k_euler', 'k_heun', 'k_lms'

![](images/sampling.png)
### Prompt matrix

Separate multiple prompts using the `|` character, and the system will produce an image for every combination of them.
For example, with the prompt `a busy city street in a modern city|illustration|cinematic lighting`, there are four combinations possible (the first part of the prompt is always kept):

- `a busy city street in a modern city`
- `a busy city street in a modern city, illustration`
- `a busy city street in a modern city, cinematic lighting`
- `a busy city street in a modern city, illustration, cinematic lighting`

Four images will be produced, in this order, all with the same seed and each with the corresponding prompt:
![](images/prompt-matrix.png)

Another example, this time with 5 prompts and 16 variations:
![](images/prompt_matrix.jpg)
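The prompt-matrix expansion above can be sketched in a few lines of Python. This is a simplified re-implementation for illustration, not the script's actual code:

```python
from itertools import combinations

def prompt_matrix(prompt):
    # The first part is always kept; every subset of the remaining
    # parts is appended to it, smallest subsets first.
    first, *parts = [p.strip() for p in prompt.split("|")]
    prompts = []
    for n in range(len(parts) + 1):
        for combo in combinations(parts, n):
            prompts.append(", ".join([first, *combo]))
    return prompts

prompt_matrix("a busy city street in a modern city|illustration|cinematic lighting")
# → the four prompts listed above, in the same order
```

With three `|`-separated parts there are 2^2 = 4 subsets of the two optional modifiers, which is where the four combinations come from.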
### Prompt combinations

If you add the `@` symbol at the start of your prompt and write it like this:
`@(moba|rpg|rts) character (2d|3d) model`, it will produce 3 * 2 combinations of the prompt, all with the same seed:

- `moba character 2d model`
- `rpg character 2d model`
- `rts character 2d model`
- `moba character 3d model`
- `rpg character 3d model`
- `rts character 3d model`

If you use this feature, batch count will be ignored, because the number of pictures to produce
depends on your prompts, but batch size will still work (generating multiple pictures at the
same time for a small speed boost).
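The `@(a|b)` expansion is a Cartesian product over the bracketed groups. A simplified sketch for illustration (not the script's actual implementation; the iteration order may differ from the listing above):

```python
import itertools
import re

def expand_combinations(prompt):
    # Expand '@(a|b) ... (c|d)' into one prompt per combination.
    if not prompt.startswith("@"):
        return [prompt]
    template = prompt[1:]
    # Collect the options inside each (...) group.
    groups = re.findall(r"\(([^)]*)\)", template)
    options = [g.split("|") for g in groups]
    prompts = []
    for combo in itertools.product(*options):
        out = template
        # Substitute each group, left to right, with the chosen option.
        for choice in combo:
            out = re.sub(r"\([^)]*\)", choice, out, count=1)
        prompts.append(out)
    return prompts

expand_combinations("@(moba|rpg|rts) character (2d|3d) model")  # six prompts, one per combination
```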
### Flagging (Broken after UI changed to gradio.Blocks(); see [Flag button missing from new UI](https://github.com/hlky/stable-diffusion-webui/issues/50))

Click the Flag button under the output section, and generated images will be saved to the `log/images` directory, and generation parameters
will be appended to the csv file `log/log.csv` in the `/sd` directory.

> but every image is saved, why would I need this?

If you're like me, you experiment a lot with prompts and settings, and only a few images are worth saving. You can
just save them using right click in the browser, but then you won't be able to reproduce them later because you will not
know what exact prompt created the image. If you use the flag button, generation parameters will be written to the csv file,
and you can easily find the parameters for an image by searching for its filename.
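The csv logging described above amounts to appending one row per flagged image, along the lines of this sketch (the field names here are hypothetical examples, not the script's actual columns):

```python
import csv
import os

def log_generation(csv_path, params):
    # Append one row of generation parameters to the csv log;
    # write a header row the first time the file is created.
    write_header = not os.path.exists(csv_path)
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(params))
        if write_header:
            writer.writeheader()
        writer.writerow(params)

log_generation("log.csv", {"filename": "00001.png", "prompt": "a busy city street", "seed": 1000})
```

Searching the csv for an image's filename then recovers the exact prompt and seed that produced it.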
### Copy-paste generation parameters

A text output provides generation parameters in an easy-to-copy-paste form for easy sharing.

![](images/kopipe.png)

If you generate multiple pictures, the displayed seed will be the seed of the first one.
### Correct seeds for batches

If you use a seed of 1000 to generate two batches of two images each, the four generated images will have seeds `1000, 1001, 1002, 1003`.
Previous versions of the UI would produce `1000, x, 1001, x`, where x is an image that can't be generated by any seed.
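The seed assignment described above is simple consecutive numbering across all batches, as in this sketch (the function name is illustrative, not the script's):

```python
def batch_seeds(seed, batch_count, batch_size):
    # Every image across all batches gets the next consecutive seed,
    # so any single image can be reproduced from its own seed.
    return [seed + i for i in range(batch_count * batch_size)]

batch_seeds(1000, 2, 2)  # → [1000, 1001, 1002, 1003]
```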
### Resizing

There are three options for resizing input images in img2img mode:

- Just resize - simply resizes the source image to the target resolution, resulting in an incorrect aspect ratio
- Crop and resize - resize the source image preserving aspect ratio so that it occupies the entire target resolution, and crop the parts that stick out
- Resize and fill - resize the source image preserving aspect ratio so that it fits entirely inside the target resolution, and fill the empty space with rows/columns from the source image

Example:
![](images/resizing.jpg)
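The difference between the two aspect-preserving modes comes down to which scale factor is used before cropping or filling. A simplified illustration on bare dimensions (the actual script of course works on images, not tuples):

```python
def fit_dimensions(src_w, src_h, dst_w, dst_h, mode):
    # Intermediate resize dimensions for each img2img resize mode.
    if mode == "just_resize":
        return dst_w, dst_h  # ignores aspect ratio entirely
    scale_cover = max(dst_w / src_w, dst_h / src_h)  # crop-and-resize: cover target, crop overflow
    scale_fit = min(dst_w / src_w, dst_h / src_h)    # resize-and-fill: fit inside, fill borders
    scale = scale_cover if mode == "crop_and_resize" else scale_fit
    return round(src_w * scale), round(src_h * scale)

fit_dimensions(768, 512, 512, 512, "crop_and_resize")  # → (768, 512), then cropped to 512x512
fit_dimensions(768, 512, 512, 512, "resize_and_fill")  # → (512, 341), then borders filled to 512x512
```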
### Loading

Gradio's loading graphic has a very negative effect on the processing speed of the neural network.
My RTX 3090 makes images about 10% faster when the tab with Gradio is not active. By default, the UI
now hides the loading progress animation and replaces it with static "Loading..." text, which achieves
the same effect. Use the `--no-progressbar-hiding` commandline option to revert this and show loading animations.
### Prompt validation

Stable Diffusion has a limit on input text length. If your prompt is too long, you will get a
warning in the text output field showing which parts of your text were truncated and ignored by the model.
### Loopback

A checkbox for img2img that automatically feeds the output image back in as input for the next batch. Equivalent to
saving the output image and replacing the input image with it. The batch count setting controls how many iterations of
this you get.

Usually, when doing this, you would choose one of many images for the next iteration yourself, so the usefulness
of this feature may be questionable, but I've managed to get some very nice outputs with it that I wasn't able
to get otherwise.

Example: (cherry-picked result; original picture by anon)

![](images/loopback.jpg)
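In pseudocode, loopback is just repeated application of img2img, along these lines (the `generate` callable is a stand-in for one img2img call, not the script's actual API):

```python
def loopback(init_image, steps, generate):
    # Feed each output back in as the input for the next iteration;
    # `generate` stands in for a single img2img generation call.
    image = init_image
    outputs = []
    for _ in range(steps):
        image = generate(image)
        outputs.append(image)
    return outputs
```

Each iteration drifts further from the original input, which is why results are usually cherry-picked from the sequence.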
## Development Info

This dev repo has a different directory structure to simplify things, and a GitHub action is used to sync things to the right place in the [main repo](https://github.com/hlky/stable-diffusion). The config for this sync is in `.github/sync.yml`.

There is a helper script for local development to replicate the actions of this GitHub action.

- Run `python sync_local.py --dest MAIN_REPO_FOLDER` to replicate the effect of this sync.
- To copy changes back, you can run `python sync_local.py --dest MAIN_REPO_FOLDER --reverse`.

You can use the file `webui_playground.py`, which does not load the models, to iterate more rapidly on UI changes and then copy those changes into `webui.py`.