mirror of https://github.com/Sygil-Dev/sygil-webui.git synced 2024-12-15 14:31:44 +03:00

Stable Diffusion web UI

Alejandro Gil 42d6b5d7b1 more updates to 'webui_flet.py' (WIP) (#1710 ) # Description continued de-spagettification. moved gallery window to 'scripts/flet_gallery_window.py' started storing settings in page then tying control attributes to them. added 'import base64' and 'from io import BytesIO' to flet_utils.py # Checklist: - [x] I have changed the base branch to `dev` - [x] I have performed a self-review of my own code - [x] I have commented my code in hard-to-understand areas - [ ] I have made corresponding changes to the documentation		2022-12-08 12:31:39 -08:00
.github	Removed --frozen-lockfile from yarn command on the docs building workflow.	2022-11-09 15:04:48 -08:00
.streamlit	Improved the img2txt tab by having more tags.	2022-10-24 00:22:36 -07:00
backend	Added dalle-flow submodule as starting point for the new backend using jina.	2022-11-17 03:00:57 -07:00
blog	Improved the documentation site by adding extra information on several pages.	2022-11-07 20:27:41 -08:00
configs	Deleted userconfig_streamlit.yaml	2022-12-05 08:34:35 -05:00
data	Added extra tags to the medium.txt file for img2txt and improved the 1 click installer documentation by telling the user to move the install.bat or .sh to the main root before running it.	2022-12-01 19:23:45 -07:00
docs	Added extra tags to the medium.txt file for img2txt and improved the 1 click installer documentation by telling the user to move the install.bat or .sh to the main root before running it.	2022-12-01 19:23:45 -07:00
frontend	feat: Added new documentation site using docusaurus which should make easier for users to browse and search the documentation site.	2022-11-07 20:27:41 -08:00
images	Improved the documentation site by adding extra information on several pages.	2022-11-07 20:27:41 -08:00
installer	Improved the documentation site by adding extra information on several pages.	2022-11-07 20:27:41 -08:00
ldm	Added some missing files from the ldm folder.	2022-11-26 18:07:59 -07:00
models	Repo merge (#712 )	2022-09-06 23:50:14 +01:00
optimizedSD	Improved optimized mode speed and VRAM usage.	2022-12-03 09:33:43 -07:00
scripts	Merge branch 'Sygil-Dev:dev' into dev	2022-12-06 05:41:23 -06:00
src	Improved the documentation site by adding extra information on several pages.	2022-11-07 20:27:41 -08:00
webui/flet	more updates to 'webui_flet.py'	2022-12-08 14:17:58 -06:00
_config.yml	dark theme	2022-09-19 01:05:46 +03:00
.dockerignore	Update .dockerignore	2022-10-05 05:29:11 +01:00
.gitattributes	Repo merge (#712 )	2022-09-06 23:50:14 +01:00
.gitignore	added image for empty gallery	2022-12-06 11:23:59 -06:00
.gitmodules	Added dalle-flow submodule as starting point for the new backend using jina.	2022-11-17 03:00:57 -07:00
babel.config.js	feat: Added new documentation site using docusaurus which should make easier for users to browse and search the documentation site.	2022-11-07 20:27:41 -08:00
CONTRIBUTING.md	add contribution guide	2022-09-19 01:05:46 +03:00
daisi_app.py	Changed yaml.load() for yaml.safe_load() in daisi_app.py	2022-10-06 23:32:54 -07:00
Dockerfile	Update Dockerfile	2022-12-05 08:41:38 -05:00
Dockerfile_base	Updated Dockerfile_base	2022-12-04 15:05:56 -05:00
Dockerfile_runpod	Updated Dockerfile_runpod	2022-12-05 08:40:58 -05:00
docusaurus.config.js	Improved the documentation site by adding extra information on several pages.	2022-11-07 20:27:41 -08:00
entrypoint.sh	More renaming and changed to links related to the organization, docs and repo names.	2022-10-23 18:54:10 -07:00
environment.yaml	Added option to use cudnn as backend for pytorch, this should help fixing an issue with nvidia 16xx cards getting a black or green square instead of a proper image.	2022-12-03 05:58:18 -07:00
horde_bridge.cmd	More renaming and changed to links related to the organization, docs and repo names.	2022-10-23 18:54:10 -07:00
horde_bridge.sh	More renaming and changed to links related to the organization, docs and repo names.	2022-10-23 18:54:10 -07:00
LICENSE	Create LICENSE	2022-08-29 12:16:23 +01:00
package.json	patch: Fixed create-docusaurus not found when installing with conda, it has being moved to the package.json file.	2022-11-09 00:24:51 -08:00
README.md	Fixed broken links on the readme.md file as well as some typos.	2022-11-13 22:24:57 -08:00
requirements.txt	Updated streamlit-server-state to 0.15.0	2022-12-05 07:21:26 -07:00
runpod_entrypoint.sh	More renaming and changed to links related to the organization, docs and repo names.	2022-10-23 18:54:10 -07:00
setup.py	More renaming and changes to links related to the organization, docs an repo names.	2022-10-23 17:17:50 -07:00
sidebars.js	feat: Added new documentation site using docusaurus which should make easier for users to browse and search the documentation site.	2022-11-07 20:27:41 -08:00
Stable_Diffusion_v1_Model_Card.md	Repo merge (#712 )	2022-09-06 23:50:14 +01:00
streamlit_webview.py	Added a webview script that launches the streamlit server and then a small GUI, this should help for those that do not want to use the browser and want to run the UI natively.	2022-10-26 17:47:13 -07:00
Web_based_UI_for_Stable_Diffusion_colab.ipynb	fixed typo, tested	2022-11-06 18:50:43 -06:00
webui_legacy.cmd	Renamed the webui_streamlit.cmd to webui.cmd.	2022-10-26 20:24:47 -07:00
webui.cmd	Merge branch 'dev' into master	2022-11-03 09:46:25 +05:30
webui.sh	Merge branch 'dev' into master	2022-11-03 09:46:25 +05:30
yarn.lock	Added yarn.lock to repo for docusaurus.	2022-11-07 20:27:41 -08:00

README.md

Web-based UI for Stable Diffusion

Created by Sygil.Dev

Join us at Sygil.Dev's Discord Server

Installation instructions for:

Windows
Linux

Want to ask a question or request a feature?

Come to our Discord Server or use Discussions.

Documentation

Documentation is located here

Want to contribute?

Check the Contribution Guide

Sygil-Dev main devs:

Project Features:

Built-in image enhancers and upscalers, including GFPGAN and RealESRGAN
Generator Preview: See your image as it's being made
Run additional upscaling models on CPU to save VRAM
Textual inversion: Research Paper
K-Diffusion Samplers: A great collection of samplers to use, including:
- k_euler
- k_lms
- k_euler_a
- k_dpm_2
- k_dpm_2_a
- k_heun
- PLMS
- DDIM
Loopback: Automatically feed the last generated sample back into img2img
Prompt Weighting & Negative Prompts: Gain more control over your creations
Selectable GPU usage from Settings tab
Word Seeds: Use words instead of seed numbers
Automated Launcher: Activate conda and run Stable Diffusion with a single command
Lighter on VRAM: 512x512 Text2Image & Image2Image tested working on 4GB (with optimized mode enabled in Settings)
Prompt validation: If your prompt is too long, you will get a warning in the text output field
Sequential seeds for batches: If you use a seed of 1000 to generate two batches of two images each, the four generated images will have the seeds 1000, 1001, 1002, and 1003.
Prompt matrix: Separate multiple prompts using the | character, and the system will produce an image for every combination of them.
[Gradio] Advanced img2img editor with Mask and crop capabilities
[Gradio] Mask painting 🖌️: Powerful tool for re-generating only specific parts of an image you want to change (currently Gradio only)
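
The sequential-seed and prompt-matrix features above can be illustrated with a minimal sketch. This is not the project's own code: the function names `batch_seeds` and `prompt_matrix` are hypothetical, and the prompt-matrix expansion assumes that the first `|`-separated part is always kept while the remaining parts are optional.

```python
from itertools import combinations


def batch_seeds(start_seed: int, n_batches: int, batch_size: int) -> list:
    """Return one list of seeds per batch, counting up from start_seed."""
    return [
        [start_seed + batch * batch_size + i for i in range(batch_size)]
        for batch in range(n_batches)
    ]


def prompt_matrix(prompt: str) -> list:
    """Expand 'base | a | b' into the base prompt plus every subset of the options.

    Assumption (not confirmed by the README): the first part is always
    included and the remaining parts are optional additions.
    """
    base, *options = [part.strip() for part in prompt.split("|")]
    prompts = []
    for size in range(len(options) + 1):
        for combo in combinations(options, size):
            prompts.append(", ".join([base, *combo]))
    return prompts


if __name__ == "__main__":
    print(batch_seeds(1000, 2, 2))
    for p in prompt_matrix("a castle | at night | oil painting"):
        print(p)
```

With a seed of 1000, two batches, and a batch size of two, `batch_seeds` reproduces the seeds from the example above. Under the stated assumption, a prompt with two optional parts expands to four prompts.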

SD WebUI

An easy way to work with Stable Diffusion right from your browser.

Streamlit

Features:

Clean UI with an easy to use design, with support for widescreen displays
Dynamic live preview of your generations
Easily customizable defaults, right from the WebUI's Settings tab
An integrated gallery to show the generations for a prompt
Optimized VRAM usage for bigger generations or usage on lower end GPUs
Text to Video: Generate video clips from text prompts right from the WebUI (WIP)
Image to Text: Use CLIP Interrogator to interrogate an image and get a prompt that you can use to generate a similar image using Stable Diffusion.
Concepts Library: Run custom embeddings others have made via textual inversion.
Textual Inversion training: Train your own embeddings on any photo you want and use it on your prompt.
Currently in development: Stable Horde integration; ImgLab, batch inputs, and mask editor from Gradio

Prompt Weights & Negative Prompts:

To give a token (tag recognized by the AI) a specific or increased weight (emphasis), add :0.## to the prompt, where 0.## is a decimal that will specify the weight of all tokens before the colon. Ex: cat:0.30, dog:0.70 or guy riding a bicycle :0.7, incoming car :0.30

Negative prompts can be added by using ### , after which any tokens will be seen as negative. Ex: cat playing with string ### yarn will negate yarn from the generated image.

Negatives are a very powerful tool to get rid of contextually similar or related topics, but be careful when adding them, since the AI might see connections you can't and end up outputting gibberish.

Tip: Try using the same seed with different prompt configurations or weight values to see how the AI understands them. This can lead to prompts that are better tuned and less prone to error.
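
To make the syntax concrete, here is a minimal sketch of how a prompt written this way could be split into weighted parts and a negative part. It is not the WebUI's actual parser: the function name `parse_prompt`, the comma as a separator between weighted parts, and the default weight of 1.0 for parts without an explicit weight are all assumptions made for illustration.

```python
def parse_prompt(prompt: str, default_weight: float = 1.0):
    """Split a prompt into (text, weight) pairs and a negative prompt string."""
    # Everything after the first '###' is treated as the negative prompt.
    positive, _, negative = prompt.partition("###")

    weighted = []
    for chunk in positive.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        # A trailing ':<number>' weights all of the text before the colon.
        text, sep, tail = chunk.rpartition(":")
        weight = default_weight  # assumed default, not taken from the README
        if sep and text.strip():
            try:
                weight = float(tail)
                chunk = text.strip()
            except ValueError:
                pass  # the colon was part of the text, not a weight
        weighted.append((chunk, weight))
    return weighted, negative.strip()


if __name__ == "__main__":
    print(parse_prompt("guy riding a bicycle :0.7, incoming car :0.30"))
    print(parse_prompt("cat playing with string ### yarn"))
```

Printing the parsed structure for the same prompt with different weights is a quick way to check that a prompt is being read the way you intended before spending time on generation.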

Please see the Streamlit Documentation to learn more.

Gradio [Legacy]

Features:

Older UI that is functional and feature complete.
Has access to all upscaling models, including LDSR.
Dynamic prompt entry automatically changes your generation settings based on --params in a prompt.
Includes quick and easy ways to send generations to Image2Image or the Image Lab for upscaling.

Note: the Gradio interface is no longer being actively developed by Sygil.Dev and is only receiving bug fixes.

Please see the Gradio Documentation to learn more.

Image Upscalers

GFPGAN

Lets you improve faces in pictures using the GFPGAN model. There is a checkbox in every tab to use GFPGAN at 100%, and also a separate tab that just allows you to use GFPGAN on any picture, with a slider that controls how strong the effect is.

If you want to use GFPGAN to improve generated faces, you need to install it separately. Download GFPGANv1.4.pth and put it into the sygil-webui/models/gfpgan directory.

RealESRGAN

Lets you double the resolution of generated images. There is a checkbox in every tab to use RealESRGAN, and you can choose between the regular upscaler and the anime version. There is also a separate tab for using RealESRGAN on any picture.

Download RealESRGAN_x4plus.pth and RealESRGAN_x4plus_anime_6B.pth. Put them into the sygil-webui/models/realesrgan directory.

LDSR

Download the LDSR project.yaml and model last.ckpt. Rename last.ckpt to model.ckpt and place both files under sygil-webui/models/ldsr/
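
The file placements described above for GFPGAN, RealESRGAN, and LDSR can be checked with a short script before launching the UI. This is only a sketch: the helper name `missing_model_files` is hypothetical, and it assumes the repository was cloned into a directory named `sygil-webui` relative to where the script is run.

```python
from pathlib import Path

# Expected files per models/ subdirectory, taken from the instructions above.
EXPECTED_FILES = {
    "gfpgan": ["GFPGANv1.4.pth"],
    "realesrgan": ["RealESRGAN_x4plus.pth", "RealESRGAN_x4plus_anime_6B.pth"],
    "ldsr": ["project.yaml", "model.ckpt"],
}


def missing_model_files(repo_root):
    """Return the paths of expected upscaler files that are not present."""
    models_dir = Path(repo_root) / "models"
    return [
        models_dir / subdir / name
        for subdir, names in EXPECTED_FILES.items()
        for name in names
        if not (models_dir / subdir / name).is_file()
    ]


if __name__ == "__main__":
    # "sygil-webui" is an assumed checkout location; adjust as needed.
    for path in missing_model_files("sygil-webui"):
        print(f"missing: {path}")
```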

GoBig and GoLatent (currently on the Gradio version only)

More powerful upscalers that use a separate Latent Diffusion model to upscale images more cleanly.

Please see the Post-Processing Documentation to learn more.

Original Information From The Stable Diffusion Repo:

Stable Diffusion

Stable Diffusion was made possible thanks to a collaboration with Stability AI and Runway and builds upon our previous work:

High-Resolution Image Synthesis with Latent Diffusion Models Robin Rombach*, Andreas Blattmann*, Dominik Lorenz, Patrick Esser, Björn Ommer

CVPR '22 Oral

which is available on GitHub. PDF at arXiv. Please also visit our Project page.

Stable Diffusion is a latent text-to-image diffusion model. Thanks to a generous compute donation from Stability AI and support from LAION, we were able to train a Latent Diffusion Model on 512x512 images from a subset of the LAION-5B database. Similar to Google's Imagen, this model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM. See this section below and the model card.

Stable Diffusion v1

Stable Diffusion v1 refers to a specific configuration of the model architecture that uses a downsampling-factor 8 autoencoder with an 860M UNet and CLIP ViT-L/14 text encoder for the diffusion model. The model was pretrained on 256x256 images and then finetuned on 512x512 images.
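
As a worked example of what a downsampling factor of 8 means, the sketch below computes the size of the latent tensor the UNet operates on for a given image size. The function name `latent_shape` is hypothetical, and the default of 4 latent channels is an assumption that is not stated in the text above.

```python
def latent_shape(height: int, width: int,
                 downsampling_factor: int = 8,
                 latent_channels: int = 4) -> tuple:
    """Return (channels, height, width) of the latent for an image size.

    latent_channels=4 is an assumption; only the factor of 8 comes from
    the description above.
    """
    if height % downsampling_factor or width % downsampling_factor:
        raise ValueError("image dimensions must be multiples of the downsampling factor")
    return (
        latent_channels,
        height // downsampling_factor,
        width // downsampling_factor,
    )


if __name__ == "__main__":
    print(latent_shape(512, 512))  # latent for the 512x512 finetuning resolution
    print(latent_shape(256, 256))  # latent for the 256x256 pretraining resolution
```

Under these assumptions, a 512x512 image corresponds to a 64x64 latent, so the diffusion model works on 64 times fewer spatial positions than the pixel image has.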

Note: Stable Diffusion v1 is a general text-to-image diffusion model and therefore mirrors biases and (mis-)conceptions that are present in its training data. Details on the training procedure and data, as well as the intended use of the model can be found in the corresponding model card.

Comments

Our code base for the diffusion models builds heavily on OpenAI's ADM codebase and https://github.com/lucidrains/denoising-diffusion-pytorch. Thanks for open-sourcing!
The implementation of the transformer encoder is from x-transformers by lucidrains.

BibTeX

@misc{rombach2021highresolution,
      title={High-Resolution Image Synthesis with Latent Diffusion Models}, 
      author={Robin Rombach and Andreas Blattmann and Dominik Lorenz and Patrick Esser and Björn Ommer},
      year={2021},
      eprint={2112.10752},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}