
### [MAIN REPO](https://github.com/hlky/stable-diffusion)

## This repo is for development, there may be bugs and new features

### :warning: temporary notice: [e71d82e](https://github.com/hlky/stable-diffusion-webui/commit/e71d82e3a6617c4db00e90a4378a0f14191b5b75) fixed optimized support and also requires the changes in the main repo. [e71d82e](https://github.com/hlky/stable-diffusion-webui/commit/e71d82e3a6617c4db00e90a4378a0f14191b5b75) is synced to main repo too so just pull from main repo :warning:

### Questions about **_[Upscalers](https://github.com/hlky/stable-diffusion-webui/wiki/Upscalers)_**?
### Questions about **_[Optimized mode](https://github.com/hlky/stable-diffusion-webui/wiki/Optimized-mode)_**?


## More documentation about features, troubleshooting, common issues very soon
### Want to help with documentation? Documented something? Use [Discussions](https://github.com/hlky/stable-diffusion-webui/discussions)

Features:

* Gradio GUI: Idiot-proof, fully featured frontend for both txt2img and img2img generation
* No more manually typing parameters, now all you have to do is write your prompt and adjust sliders
* :fire: :fire: Optimized support!! :fire: :fire:
* 🔥 NEW! [webui.cmd](https://github.com/hlky/stable-diffusion) updates with any changes in environment.yaml file so the environment will always be up to date as long as you get the new environment.yaml file 🔥
:fire: no need to remove environment, delete src folder and create again, MUCH simpler! 🔥
* GFPGAN Face Correction 🔥: [Download the model](https://github.com/hlky/stable-diffusion-webui#gfpgan) Automatically correct distorted faces with a built-in GFPGAN option, fixes them in less than half a second 
* RealESRGAN Upscaling 🔥: [Download the models](https://github.com/hlky/stable-diffusion-webui#realesrgan) Boosts the resolution of images with a built-in RealESRGAN option 
* :computer: esrgan/gfpgan on cpu support :computer:
* Textual inversion 🔥: [info](https://textual-inversion.github.io/) - requires enabling, see [here](https://github.com/hlky/sd-enable-textual-inversion), script works as usual without it enabled
* Advanced img2img editor :art: :fire: :art:
* :fire::fire: Mask and crop :fire::fire:
* Mask painting (NEW) 🖌️: Powerful tool for re-generating only specific parts of an image you want to change
* More k_diffusion samplers 🔥🔥 : Far greater quality outputs than the default sampler, less distortion and more accurate
* txt2img samplers: "DDIM", "PLMS", 'k_dpm_2_a', 'k_dpm_2', 'k_euler_a', 'k_euler', 'k_heun', 'k_lms'
* img2img samplers: "DDIM", 'k_dpm_2_a', 'k_dpm_2', 'k_euler_a', 'k_euler', 'k_heun', 'k_lms'
* Loopback (NEW) ➿: Automatically feed the last generated sample back into img2img
* Prompt Weighting (NEW) 🏋️: Adjust the strength of different terms in your prompt
* :fire: gpu device selectable with `--gpu <id>` :fire:
* Memory Monitoring 🔥: Shows VRAM usage and generation time after outputting.
* Word Seeds 🔥: Use words instead of seed numbers
* CFG: Classifier free guidance scale, a feature for fine-tuning your output
* Launcher Automatic 👑🔥 shortcut to load the model, no more typing in Conda
* Lighter on VRAM: 512x512 img2img & txt2img tested working on 6 GB
* and ????

# Stable Diffusion web UI
A browser interface based on Gradio library for Stable Diffusion.

The original script with the Gradio UI was written by a kind anonymous user. This is a modification.

![](images/txt2img.jpg)

![](images/img2img.jpg)

![](images/gfpgan.jpg)

![](images/esrgan.jpg)

### GFPGAN

If you want to use GFPGAN to improve generated faces, you need to install it separately.
Download [GFPGANv1.3.pth](https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth) and put it
into the `/stable-diffusion/src/gfpgan/experiments/pretrained_models` directory. 

### RealESRGAN
Download [RealESRGAN_x4plus.pth](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth) and [RealESRGAN_x4plus_anime_6B.pth](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth).
Put them into the `stable-diffusion/src/realesrgan/experiments/pretrained_models` directory. 

### Web UI

When launching, you may get a very long warning message related to some weights not being used. You may freely ignore it.
After a while, you will get a message like this:

```
Running on local URL:  http://127.0.0.1:7860/
```

Open the URL in browser, and you are good to go.

## Features
The script creates a web UI for Stable Diffusion's txt2img and img2img scripts. The following features are additions
that are not in the original script.

### GFPGAN
Lets you improve faces in pictures using the GFPGAN model. There is a checkbox in every tab to use GFPGAN at 100%, and
also a separate tab that just allows you to use GFPGAN on any picture, with a slider that controls how strong the effect is.

![](images/GFPGAN.png)

### RealESRGAN
Lets you double the resolution of generated images. There is a checkbox in every tab to use RealESRGAN, and you can choose between the regular upscaler and the anime version.
There is also a separate tab for using RealESRGAN on any picture.

![](images/RealESRGAN.png)

### Sampling method selection
- txt2img samplers: `DDIM`, `PLMS`, `k_dpm_2_a`, `k_dpm_2`, `k_euler_a`, `k_euler`, `k_heun`, `k_lms`
- img2img samplers: `DDIM`, `k_dpm_2_a`, `k_dpm_2`, `k_euler_a`, `k_euler`, `k_heun`, `k_lms`

![](images/sampling.png)

### Prompt matrix
Separate multiple prompts using the `|` character, and the system will produce an image for every combination of them.
For example, if you use the prompt `a busy city street in a modern city|illustration|cinematic lighting`, four combinations are possible (the first part of the prompt is always kept):

- `a busy city street in a modern city`
- `a busy city street in a modern city, illustration`
- `a busy city street in a modern city, cinematic lighting`
- `a busy city street in a modern city, illustration, cinematic lighting`

Four images will be produced, in this order, all with the same seed and each with its corresponding prompt:
![](images/prompt-matrix.png)

Another example, this time with 5 prompts and 16 variations:
![](images/prompt_matrix.jpg)
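
The expansion described above can be sketched in a few lines of Python. This is only an illustration of the combination logic, not the code the web UI actually runs. The function name `prompt_matrix` and the bitmask enumeration are assumptions made for this example, chosen because they reproduce the order listed above.

```python
def prompt_matrix(prompt: str) -> list[str]:
    """Expand a `|`-separated prompt into every combination of its optional parts.

    Hypothetical helper for illustration; the first part is always kept.
    """
    parts = [p.strip() for p in prompt.split("|")]
    base, optional = parts[0], parts[1:]

    prompts = []
    # Each bit of `mask` decides whether the matching optional part is included,
    # so n optional parts give 2**n combinations.
    for mask in range(2 ** len(optional)):
        chosen = [opt for i, opt in enumerate(optional) if mask & (1 << i)]
        prompts.append(", ".join([base] + chosen))
    return prompts


for p in prompt_matrix("a busy city street in a modern city|illustration|cinematic lighting"):
    print(p)
```

With one base part and four optional parts, the same function yields 2^4 = 16 prompts, which is where the 16 variations in the second example come from.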


### Prompt combinations 

If you add the `@` symbol at the start of your prompt and write alternatives in parentheses, like this:
`@(moba|rpg|rts) character (2d|3d) model`, it will produce 3 * 2 combinations of the prompt, all with the same seed:

- `moba character 2d model`
- `rpg character 2d model`
- `rts character 2d model`
- `moba character 3d model`
- `rpg character 3d model`
- `rts character 3d model`

If you use this feature, batch count will be ignored, because the number of pictures to produce
depends on your prompts, but batch size will still work (generating multiple pictures at the
same time for a small speed boost).
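
As a rough sketch of how such a template could be expanded, assuming groups are parenthesized and contain at least one `|`: the function name `expand_combinations` and the regular expression are hypothetical and written for this example. The web UI's own parser may treat edge cases differently.

```python
import re
from itertools import product

# A group is a parenthesized span containing at least one `|` (assumption for this sketch).
GROUP = re.compile(r"\(([^()]*\|[^()]*)\)")


def expand_combinations(prompt: str) -> list[str]:
    """Expand `@(a|b) text (x|y)` into one prompt per combination of choices."""
    if not prompt.startswith("@"):
        return [prompt]
    template = prompt[1:]
    groups = [m.group(1).split("|") for m in GROUP.finditer(template)]
    if not groups:
        return [template]

    results = []
    # itertools.product varies its last argument fastest; reversing the groups
    # makes the first group vary fastest, matching the order listed above.
    for combo in product(*reversed(groups)):
        choices = iter(reversed(combo))
        results.append(GROUP.sub(lambda _match: next(choices), template))
    return results


for p in expand_combinations("@(moba|rpg|rts) character (2d|3d) model"):
    print(p)
```

The number of results is the product of the group sizes, which is why batch count has no role here.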

### Flagging (Broken after UI changed to gradio.Blocks() see [Flag button missing from new UI](https://github.com/hlky/stable-diffusion-webui/issues/50))
Click the Flag button under the output section, and the generated images will be saved to the `log/images` directory, and the generation parameters
will be appended to the CSV file `log/log.csv` in the `/sd` directory.

> but every image is saved, why would I need this?

If you're like me, you experiment a lot with prompts and settings, and only a few images are worth saving. You can
just save them using right click in the browser, but then you won't be able to reproduce them later because you will not
know what exact prompt created the image. If you use the flag button, generation parameters will be written to the CSV file,
and you can easily find the parameters for an image by searching for its filename.
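
To make the idea concrete, here is a minimal sketch of appending to a CSV log and looking a file up again later. The column names (`filename`, `prompt`, `seed`) and both function names are assumptions for illustration; the real `log/log.csv` may use different columns.

```python
import csv
from pathlib import Path
from typing import Optional


def append_flag_log(log_path: str, filename: str, prompt: str, seed: int) -> None:
    """Append one row of generation parameters, writing a header for a new file."""
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    is_new = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["filename", "prompt", "seed"])  # hypothetical columns
        writer.writerow([filename, prompt, seed])


def find_parameters(log_path: str, filename: str) -> Optional[dict]:
    """Return the logged row for `filename`, or None if it was never flagged."""
    with Path(log_path).open(newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row["filename"] == filename:
                return row
    return None
```

Note that values read back from a CSV file are strings, so a seed would need converting with `int()` before reuse.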

### Copy-paste generation parameters
A text output provides generation parameters in an easy to copy-paste form for easy sharing.

![](images/kopipe.png)

If you generate multiple pictures, the displayed seed will be the seed of the first one.

### Correct seeds for batches
If you use a seed of 1000 to generate two batches of two images each, the four generated images will have seeds: `1000, 1001, 1002, 1003`.
Previous versions of the UI would produce `1000, x, 1001, x`, where x is an image that can't be generated by any seed.
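
The numbering above amounts to giving every image its own consecutive seed. A small sketch of that arithmetic follows; the function name `batch_seeds` is made up for this example and is not taken from the script.

```python
def batch_seeds(seed: int, batch_count: int, batch_size: int) -> list[list[int]]:
    """Return the seed of every image, grouped by batch.

    Image i of batch b gets seed + b * batch_size + i, so each image
    can be regenerated on its own from its seed.
    """
    return [
        [seed + b * batch_size + i for i in range(batch_size)]
        for b in range(batch_count)
    ]


print(batch_seeds(1000, batch_count=2, batch_size=2))
```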

### Resizing
There are three options for resizing input images in img2img mode:

- Just resize - simply resizes source image to target resolution, resulting in incorrect aspect ratio
- Crop and resize - resize source image preserving aspect ratio so that entirety of target resolution is occupied by it, and crop parts that stick out
- Resize and fill - resize source image preserving aspect ratio so that it entirely fits target resolution, and fill empty space by rows/columns from source image

Example:
![](images/resizing.jpg)
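
The difference between the three modes comes down to which scale factor is used. The sketch below only computes the scaled size, before any cropping or filling. The function name `scaled_size` and the mode strings are assumptions for this example, and the actual image operations in the script may differ.

```python
def scaled_size(src_w: int, src_h: int, dst_w: int, dst_h: int, mode: str) -> tuple[int, int]:
    """Return the size the source image is scaled to, before cropping or filling."""
    if mode == "just resize":
        # Stretch to the target, ignoring the aspect ratio.
        return dst_w, dst_h
    if mode == "crop and resize":
        # Larger factor: the image covers the whole target, and the overflow is cropped.
        scale = max(dst_w / src_w, dst_h / src_h)
    elif mode == "resize and fill":
        # Smaller factor: the image fits inside the target, and the gap is filled.
        scale = min(dst_w / src_w, dst_h / src_h)
    else:
        raise ValueError(f"unknown resize mode: {mode}")
    return round(src_w * scale), round(src_h * scale)


for mode in ("just resize", "crop and resize", "resize and fill"):
    print(mode, scaled_size(1024, 512, 512, 512, mode))
```

For a 1024x512 source and a 512x512 target, "crop and resize" keeps the image at 1024x512 and crops the sides, while "resize and fill" shrinks it to 512x256 and fills the remaining rows.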

### Loading
Gradio's loading graphic has a very negative effect on the processing speed of the neural network.
My RTX 3090 makes images about 10% faster when the tab with Gradio is not active. By default, the UI
now hides the loading progress animation and replaces it with static "Loading..." text, which achieves
the same effect. Use the `--no-progressbar-hiding` command-line option to revert this and show loading animations.

### Prompt validation
Stable Diffusion has a limit for input text length. If your prompt is too long, you will get a
warning in the text output field, showing which parts of your text were truncated and ignored by the model.
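
A toy sketch of the truncation report is shown below. It counts whitespace-separated words as a stand-in for real tokens, which is an assumption made to keep the example self-contained. The model counts tokenizer tokens, not words, and the function name `split_at_limit` and its `max_tokens` parameter are hypothetical.

```python
def split_at_limit(prompt: str, max_tokens: int) -> tuple[str, str]:
    """Split a prompt into the part that fits and the part that is ignored.

    Words stand in for tokens here; a real check would use the model's tokenizer.
    """
    words = prompt.split()
    kept = " ".join(words[:max_tokens])
    ignored = " ".join(words[max_tokens:])
    return kept, ignored


kept, ignored = split_at_limit("a very long prompt about a busy city street", max_tokens=5)
if ignored:
    print(f"Warning: too many input tokens; ignored: {ignored}")
```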

### Loopback
A checkbox for img2img that automatically feeds the output image back in as the input for the next batch. This is equivalent to
saving the output image and replacing the input image with it. The batch count setting controls how many iterations of
this you get.
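
In outline, loopback is a loop in which each output becomes the next input. The sketch below takes the generation step as a caller-supplied function, since the real img2img call is not shown here. The names `loopback` and `generate` are placeholders for this example.

```python
from typing import Callable, TypeVar

Image = TypeVar("Image")


def loopback(init_image: Image, generate: Callable[[Image], Image], batch_count: int) -> list[Image]:
    """Run `generate` batch_count times, feeding each output back in as the next input."""
    outputs = []
    current = init_image
    for _ in range(batch_count):
        current = generate(current)  # stand-in for one img2img pass
        outputs.append(current)
    return outputs


# Usage with a dummy step that just records how many passes were applied.
print(loopback("input", lambda img: img + "+pass", batch_count=3))
```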

Usually, when doing this, you would choose one of many images for the next iteration yourself, so the usefulness
of this feature may be questionable, but I've managed to get some very nice outputs with it that I wasn't able
to get otherwise.

Example: (cherrypicked result; original picture by anon)

![](images/loopback.jpg)