mirror of
https://github.com/sd-webui/stable-diffusion-webui.git
synced 2025-01-05 20:28:01 +03:00
a1e2ee2587
changed: "!streamlit run scripts/webui_streamlit.py --theme.base dark --server.headless true true 2>&1 | tee -a /content/log.txt" to: "!streamlit run scripts/webui_streamlit.py --theme.base dark --server.headless true 2>&1 | tee -a /content/log.txt" avoids 'unknown argument error'
444 lines
18 KiB
Plaintext
{
|
|
"nbformat": 4,
|
|
"nbformat_minor": 0,
|
|
"metadata": {
|
|
"colab": {
|
|
"private_outputs": true,
|
|
"provenance": [],
|
|
"collapsed_sections": [
|
|
"5-Bx4AsEoPU-",
|
|
"xMWVQOg0G1Pj"
|
|
]
|
|
},
|
|
"kernelspec": {
|
|
"name": "python3",
|
|
"display_name": "Python 3"
|
|
},
|
|
"language_info": {
|
|
"name": "python"
|
|
},
|
|
"accelerator": "GPU"
|
|
},
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"source": [
|
|
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Sygil-Dev/sygil-webui/blob/dev/Web_based_UI_for_Stable_Diffusion_colab.ipynb)"
|
|
],
|
|
"metadata": {
|
|
"id": "S5RoIM-5IPZJ"
|
|
}
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"source": [
|
|
"# README"
|
|
],
|
|
"metadata": {
|
|
"id": "5-Bx4AsEoPU-"
|
|
}
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"source": [
|
|
"###<center>Web-based UI for Stable Diffusion</center>\n",
|
|
"\n",
|
|
"## Created by [Sygil-Dev](https://github.com/Sygil-Dev)\n",
|
|
"\n",
|
|
"## [Visit Sygil-Dev's Discord Server](https://discord.gg/gyXNe4NySY) [![Discord Server](https://user-images.githubusercontent.com/5977640/190528254-9b5b4423-47ee-4f24-b4f9-fd13fba37518.png)](https://discord.gg/gyXNe4NySY)\n",
|
|
"\n",
|
|
"## Installation instructions for:\n",
|
|
"\n",
|
|
"- **[Windows](https://sygil-dev.github.io/sygil-webui/docs/1.windows-installation.html)** \n",
|
|
"- **[Linux](https://sygil-dev.github.io/sygil-webui/docs/2.linux-installation.html)**\n",
|
|
"\n",
|
|
"### Want to ask a question or request a feature?\n",
|
|
"\n",
|
|
"Come to our [Discord Server](https://discord.gg/gyXNe4NySY) or use [Discussions](https://github.com/Sygil-Dev/sygil-webui/discussions).\n",
|
|
"\n",
|
|
"## Documentation\n",
|
|
"\n",
|
|
"[Documentation is located here](https://sygil-dev.github.io/sygil-webui/)\n",
|
|
"\n",
|
|
"## Want to contribute?\n",
|
|
"\n",
|
|
"Check the [Contribution Guide](CONTRIBUTING.md)\n",
|
|
"\n",
|
|
"[Sygil-Dev](https://github.com/Sygil-Dev) main devs:\n",
|
|
"\n",
|
|
"* ![hlky's avatar](https://avatars.githubusercontent.com/u/106811348?s=40&v=4) [hlky](https://github.com/hlky)\n",
|
|
"* ![ZeroCool940711's avatar](https://avatars.githubusercontent.com/u/5977640?s=40&v=4)[ZeroCool940711](https://github.com/ZeroCool940711)\n",
|
|
"* ![codedealer's avatar](https://avatars.githubusercontent.com/u/4258136?s=40&v=4)[codedealer](https://github.com/codedealer)\n",
|
|
"\n",
|
|
"### Project Features:\n",
|
|
"\n",
|
|
"* Two great Web UI's to choose from: Streamlit or Gradio\n",
|
|
"\n",
|
|
"* No more manually typing parameters, now all you have to do is write your prompt and adjust sliders\n",
|
|
"\n",
|
|
"* Built-in image enhancers and upscalers, including GFPGAN and realESRGAN\n",
|
|
"\n",
|
|
"* Run additional upscaling models on CPU to save VRAM\n",
|
|
"\n",
|
|
"* Textual inversion 🔥: [info](https://textual-inversion.github.io/) - requires enabling, see [here](https://github.com/hlky/sd-enable-textual-inversion), script works as usual without it enabled\n",
|
|
"\n",
|
|
"* Advanced img2img editor with Mask and crop capabilities\n",
|
|
"\n",
|
|
"* Mask painting 🖌️: Powerful tool for re-generating only specific parts of an image you want to change (currently Gradio only)\n",
|
|
"\n",
|
|
"* More diffusion samplers 🔥🔥: A great collection of samplers to use, including:\n",
|
|
" \n",
|
|
" - `k_euler` (Default)\n",
|
|
" - `k_lms`\n",
|
|
" - `k_euler_a`\n",
|
|
" - `k_dpm_2`\n",
|
|
" - `k_dpm_2_a`\n",
|
|
" - `k_heun`\n",
|
|
" - `PLMS`\n",
|
|
" - `DDIM`\n",
|
|
"\n",
|
|
"* Loopback ➿: Automatically feed the last generated sample back into img2img\n",
|
|
"\n",
|
|
"* Prompt Weighting 🏋️: Adjust the strength of different terms in your prompt\n",
|
|
"\n",
|
|
"* Selectable GPU usage with `--gpu <id>`\n",
|
|
"\n",
|
|
"* Memory Monitoring 🔥: Shows VRAM usage and generation time after outputting\n",
|
|
"\n",
|
|
"* Word Seeds 🔥: Use words instead of seed numbers\n",
|
|
"\n",
|
|
"* CFG: Classifier free guidance scale, a feature for fine-tuning your output\n",
|
|
"\n",
|
|
"* Automatic Launcher: Activate conda and run Stable Diffusion with a single command\n",
|
|
"\n",
|
|
"* Lighter on VRAM: 512x512 Text2Image & Image2Image tested working on 4GB\n",
|
|
"\n",
|
|
"* Prompt validation: If your prompt is too long, you will get a warning in the text output field\n",
|
|
"\n",
|
|
"* Copy-paste generation parameters: A text output provides generation parameters in an easy to copy-paste form for easy sharing.\n",
|
|
"\n",
|
|
"* Correct seeds for batches: If you use a seed of 1000 to generate two batches of two images each, four generated images will have seeds: `1000, 1001, 1002, 1003`.\n",
|
|
"\n",
|
|
"* Prompt matrix: Separate multiple prompts using the `|` character, and the system will produce an image for every combination of them.\n",
|
|
"\n",
|
|
"* Loopback for Image2Image: A checkbox for img2img allowing to automatically feed output image as input for the next batch. Equivalent to saving output image, and replacing input image with it.\n",
|
|
"\n",
|
|
"# Stable Diffusion Web UI\n",
|
|
"\n",
|
|
"A fully-integrated and easy way to work with Stable Diffusion right from a browser window.\n",
|
|
"\n",
|
|
"## Streamlit\n",
|
|
"\n",
|
|
"![](images/streamlit/streamlit-t2i.png)\n",
|
|
"\n",
|
|
"**Features:**\n",
|
|
"\n",
|
|
"- Clean UI with an easy to use design, with support for widescreen displays.\n",
|
|
"- Dynamic live preview of your generations\n",
|
|
"- Easily customizable presets right from the WebUI (Coming Soon!)\n",
|
|
"- An integrated gallery to show the generations for a prompt or session (Coming soon!)\n",
|
|
"- Better optimization VRAM usage optimization, less errors for bigger generations.\n",
|
|
"- Text2Video - Generate video clips from text prompts right from the WEb UI (WIP)\n",
|
|
"- Concepts Library - Run custom embeddings others have made via textual inversion.\n",
|
|
"- Actively being developed with new features being added and planned - Stay Tuned!\n",
|
|
"- Streamlit is now the new primary UI for the project moving forward.\n",
|
|
"- *Currently in active development and still missing some of the features present in the Gradio Interface.*\n",
|
|
"\n",
|
|
"Please see the [Streamlit Documentation](docs/4.streamlit-interface.md) to learn more.\n",
|
|
"\n",
|
|
"## Gradio\n",
|
|
"\n",
|
|
"![](images/gradio/gradio-t2i.png)\n",
|
|
"\n",
|
|
"**Features:**\n",
|
|
"\n",
|
|
"- Older UI design that is fully functional and feature complete.\n",
|
|
"- Has access to all upscaling models, including LSDR.\n",
|
|
"- Dynamic prompt entry automatically changes your generation settings based on `--params` in a prompt.\n",
|
|
"- Includes quick and easy ways to send generations to Image2Image or the Image Lab for upscaling.\n",
|
|
"- *Note, the Gradio interface is no longer being actively developed and is only receiving bug fixes.*\n",
|
|
"\n",
|
|
"Please see the [Gradio Documentation](docs/5.gradio-interface.md) to learn more.\n",
|
|
"\n",
|
|
"## Image Upscalers\n",
|
|
"\n",
|
|
"---\n",
|
|
"\n",
|
|
"### GFPGAN\n",
|
|
"\n",
|
|
"![](images/GFPGAN.png)\n",
|
|
"\n",
|
|
"Lets you improve faces in pictures using the GFPGAN model. There is a checkbox in every tab to use GFPGAN at 100%, and also a separate tab that just allows you to use GFPGAN on any picture, with a slider that controls how strong the effect is.\n",
|
|
"\n",
|
|
"If you want to use GFPGAN to improve generated faces, you need to install it separately.\n",
|
|
"Download [GFPGANv1.4.pth](https://github.com/TencentARC/GFPGAN/releases/download/v1.3.4/GFPGANv1.4.pth) and put it\n",
|
|
"into the `/sygil-webui/models/gfpgan` directory. \n",
|
|
"\n",
|
|
"### RealESRGAN\n",
|
|
"\n",
|
|
"![](images/RealESRGAN.png)\n",
|
|
"\n",
|
|
"Lets you double the resolution of generated images. There is a checkbox in every tab to use RealESRGAN, and you can choose between the regular upscaler and the anime version.\n",
|
|
"There is also a separate tab for using RealESRGAN on any picture.\n",
|
|
"\n",
|
|
"Download [RealESRGAN_x4plus.pth](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth) and [RealESRGAN_x4plus_anime_6B.pth](https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth).\n",
|
|
"Put them into the `sygil-webui/models/realesrgan` directory. \n",
|
|
"\n",
|
|
"\n",
|
|
"\n",
|
|
"### LSDR\n",
|
|
"\n",
|
|
"Download **LDSR** [project.yaml](https://heibox.uni-heidelberg.de/f/31a76b13ea27482981b4/?dl=1) and [model last.cpkt](https://heibox.uni-heidelberg.de/f/578df07c8fc04ffbadf3/?dl=1). Rename last.ckpt to model.ckpt and place both under `sygil-webui/models/ldsr/`\n",
|
|
"\n",
|
|
"### GoBig, and GoLatent *(Currently on the Gradio version Only)*\n",
|
|
"\n",
|
|
"More powerful upscalers that uses a seperate Latent Diffusion model to more cleanly upscale images.\n",
|
|
"\n",
|
|
"\n",
|
|
"\n",
|
|
"Please see the [Image Enhancers Documentation](docs/6.image_enhancers.md) to learn more.\n",
|
|
"\n",
|
|
"-----\n",
|
|
"\n",
|
|
"### *Original Information From The Stable Diffusion Repo*\n",
|
|
"\n",
|
|
"# Stable Diffusion\n",
|
|
"\n",
|
|
"*Stable Diffusion was made possible thanks to a collaboration with [Stability AI](https://stability.ai/) and [Runway](https://runwayml.com/) and builds upon our previous work:*\n",
|
|
"\n",
|
|
"[**High-Resolution Image Synthesis with Latent Diffusion Models**](https://ommer-lab.com/research/latent-diffusion-models/)<br/>\n",
|
|
"[Robin Rombach](https://github.com/rromb)\\*,\n",
|
|
"[Andreas Blattmann](https://github.com/ablattmann)\\*,\n",
|
|
"[Dominik Lorenz](https://github.com/qp-qp)\\,\n",
|
|
"[Patrick Esser](https://github.com/pesser),\n",
|
|
"[Björn Ommer](https://hci.iwr.uni-heidelberg.de/Staff/bommer)<br/>\n",
|
|
"\n",
|
|
"**CVPR '22 Oral**\n",
|
|
"\n",
|
|
"which is available on [GitHub](https://github.com/CompVis/latent-diffusion). PDF at [arXiv](https://arxiv.org/abs/2112.10752). Please also visit our [Project page](https://ommer-lab.com/research/latent-diffusion-models/).\n",
|
|
"\n",
|
|
"[Stable Diffusion](#stable-diffusion-v1) is a latent text-to-image diffusion\n",
|
|
"model.\n",
|
|
"Thanks to a generous compute donation from [Stability AI](https://stability.ai/) and support from [LAION](https://laion.ai/), we were able to train a Latent Diffusion Model on 512x512 images from a subset of the [LAION-5B](https://laion.ai/blog/laion-5b/) database. \n",
|
|
"Similar to Google's [Imagen](https://arxiv.org/abs/2205.11487), \n",
|
|
"this model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts.\n",
|
|
"With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM.\n",
|
|
"See [this section](#stable-diffusion-v1) below and the [model card](https://huggingface.co/CompVis/stable-diffusion).\n",
|
|
"\n",
|
|
"## Stable Diffusion v1\n",
|
|
"\n",
|
|
"Stable Diffusion v1 refers to a specific configuration of the model\n",
|
|
"architecture that uses a downsampling-factor 8 autoencoder with an 860M UNet\n",
|
|
"and CLIP ViT-L/14 text encoder for the diffusion model. The model was pretrained on 256x256 images and \n",
|
|
"then finetuned on 512x512 images.\n",
|
|
"\n",
|
|
"*Note: Stable Diffusion v1 is a general text-to-image diffusion model and therefore mirrors biases and (mis-)conceptions that are present\n",
|
|
"in its training data. \n",
|
|
"Details on the training procedure and data, as well as the intended use of the model can be found in the corresponding [model card](https://huggingface.co/CompVis/stable-diffusion).\n",
|
|
"\n",
|
|
"## Comments\n",
|
|
"\n",
|
|
"- Our codebase for the diffusion models builds heavily on [OpenAI's ADM codebase](https://github.com/openai/guided-diffusion)\n",
|
|
" and [https://github.com/lucidrains/denoising-diffusion-pytorch](https://github.com/lucidrains/denoising-diffusion-pytorch). \n",
|
|
" Thanks for open-sourcing!\n",
|
|
"\n",
|
|
"- The implementation of the transformer encoder is from [x-transformers](https://github.com/lucidrains/x-transformers) by [lucidrains](https://github.com/lucidrains?tab=repositories). \n",
|
|
"\n",
|
|
"## BibTeX\n",
|
|
"\n",
|
|
"```\n",
|
|
"@misc{rombach2021highresolution,\n",
|
|
" title={High-Resolution Image Synthesis with Latent Diffusion Models}, \n",
|
|
" author={Robin Rombach and Andreas Blattmann and Dominik Lorenz and Patrick Esser and Björn Ommer},\n",
|
|
" year={2021},\n",
|
|
" eprint={2112.10752},\n",
|
|
" archivePrefix={arXiv},\n",
|
|
" primaryClass={cs.CV}\n",
|
|
"}\n",
|
|
"\n",
|
|
"```"
|
|
],
|
|
"metadata": {
|
|
"id": "z4kQYMPQn4d-"
|
|
}
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"source": [
|
|
"# Setup"
|
|
],
|
|
"metadata": {
|
|
"id": "IZjJSr-WPNxB"
|
|
}
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"metadata": {
|
|
"id": "eq0-E5mjSpmP"
|
|
},
|
|
"source": [
|
|
"!nvidia-smi -L"
|
|
],
|
|
"execution_count": null,
|
|
"outputs": []
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"source": [
|
|
"!pip install condacolab\n",
|
|
"import condacolab\n",
|
|
"condacolab.install_from_url(\"https://github.com/conda-forge/miniforge/releases/download/4.14.0-0/Mambaforge-4.14.0-0-Linux-x86_64.sh\")\n",
|
|
"\n",
|
|
"import condacolab\n",
|
|
"condacolab.check()\n",
|
|
"\n",
|
|
"# The runtime will crash after this, its normal as we are forcing a restart of the runtime from code. Just hit \"Run All\" again."
|
|
],
|
|
"metadata": {
|
|
"id": "cDu33xkdJ5mD"
|
|
},
|
|
"execution_count": null,
|
|
"outputs": []
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"source": [
|
|
"!git clone https://github.com/Sygil-Dev/sygil-webui.git\n",
|
|
"%cd /content/sygil-webui/\n",
|
|
"!git checkout dev\n",
|
|
"!git pull\n",
|
|
"!wget -O arial.ttf https://github.com/matomo-org/travis-scripts/blob/master/fonts/Arial.ttf?raw=true"
|
|
],
|
|
"metadata": {
|
|
"id": "pZHGf03Vp305"
|
|
},
|
|
"execution_count": null,
|
|
"outputs": []
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"source": [
|
|
"!mamba install cudatoolkit=11.3 git numpy=1.22.3 pip=20.3 python=3.8.5 pytorch=1.11.0 scikit-image=0.19.2 torchvision=0.12.0 -y"
|
|
],
|
|
"metadata": {
|
|
"id": "dmN2igp5Yk3z"
|
|
},
|
|
"execution_count": null,
|
|
"outputs": []
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"source": [
|
|
"#@title Install dependencies.\n",
|
|
"!python --version\n",
|
|
"!pip install -r requirements.txt"
|
|
],
|
|
"metadata": {
|
|
"id": "vXX0OaR8KyLQ"
|
|
},
|
|
"execution_count": null,
|
|
"outputs": []
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"source": [
|
|
"!npm install localtunnel"
|
|
],
|
|
"metadata": {
|
|
"id": "FHyVuT5aSM2G"
|
|
},
|
|
"execution_count": null,
|
|
"outputs": []
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"source": [
|
|
"#Launch the WebUI"
|
|
],
|
|
"metadata": {
|
|
"id": "csi6cj6gQZmC"
|
|
}
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"source": [
|
|
"#@title Mount Google Drive\n",
|
|
"import os\n",
|
|
"mount_google_drive = True #@param {type:\"boolean\"}\n",
|
|
"save_outputs_to_drive = True #@param {type:\"boolean\"}\n",
|
|
"\n",
|
|
"if mount_google_drive:\n",
|
|
" # Mount google drive to store your outputs.\n",
|
|
" from google.colab import drive\n",
|
|
" drive.mount('/content/drive/', force_remount=True)\n",
|
|
"\n",
|
|
"if save_outputs_to_drive:\n",
|
|
" os.makedirs(\"/content/drive/MyDrive/sygil-webui/outputs\", exist_ok=True)\n",
|
|
" os.symlink(\"/content/drive/MyDrive/sygil-webui/outputs\", \"/content/sygil-webui/outputs\", target_is_directory=True)\n"
|
|
],
|
|
"metadata": {
|
|
"id": "pcSWo9Zkzbsf"
|
|
},
|
|
"execution_count": null,
|
|
"outputs": []
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"source": [
|
|
"#@title Enter Huggingface token\n",
|
|
"!git config --global credential.helper store\n",
|
|
"!huggingface-cli login"
|
|
],
|
|
"metadata": {
|
|
"id": "IsbG7fvIrKwg"
|
|
},
|
|
"execution_count": null,
|
|
"outputs": []
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"source": [
|
|
"#@title <-- Press play on the music player to keep the tab alive (Uses only 13MB of data)\n",
|
|
"%%html\n",
|
|
"<b>Press play on the music player to keep the tab alive, then start your generation below (Uses only 13MB of data)</b><br/>\n",
|
|
"<audio src=\"https://henk.tech/colabkobold/silence.m4a\" controls>"
|
|
],
|
|
"metadata": {
|
|
"id": "-WknaU7uu_q6"
|
|
},
|
|
"execution_count": null,
|
|
"outputs": []
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"source": [
|
|
"JS to prevent idle timeout:\n",
|
|
"\n",
|
|
"Press F12 OR CTRL + SHIFT + I OR right click on this website -> inspect. Then click on the console tab and paste in the following code.\n",
|
|
"\n",
|
|
"function ClickConnect(){\n",
|
|
"console.log(\"Working\");\n",
|
|
"document.querySelector(\"colab-toolbar-button#connect\").click()\n",
|
|
"}\n",
|
|
"setInterval(ClickConnect,60000)"
|
|
],
|
|
"metadata": {
|
|
"id": "pjIjiCuJysJI"
|
|
}
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"source": [
|
|
"#@title Open port 8501 and start Streamlit server. Open link in 'link.txt' file in file pane on left.\n",
|
|
"!npx localtunnel --port 8501 &>/content/link.txt &\n",
|
|
"!streamlit run scripts/webui_streamlit.py --theme.base dark --server.headless true 2>&1 | tee -a /content/log.txt"
|
|
],
|
|
"metadata": {
|
|
"id": "5whXm2nfSZ39"
|
|
},
|
|
"execution_count": null,
|
|
"outputs": []
|
|
}
|
|
]
|
|
}
|
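The launch cell above redirects localtunnel's output to `/content/link.txt` and asks you to open the link from that file. As a minimal sketch, the link could also be read from Python instead of the file pane. The helper names `find_tunnel_url` and `read_tunnel_url` are hypothetical, and the code only assumes that the captured output contains an http(s) URL somewhere in its text.

```python
import re
from pathlib import Path
from typing import Optional

# Matches the first http(s) URL in a block of text.
URL_PATTERN = re.compile(r"https?://\S+")


def find_tunnel_url(text: str) -> Optional[str]:
    """Return the first URL found in the captured tunnel output, or None."""
    match = URL_PATTERN.search(text)
    return match.group(0) if match else None


def read_tunnel_url(link_file: str = "/content/link.txt") -> Optional[str]:
    """Read the file written by the launch cell and extract the tunnel URL."""
    path = Path(link_file)
    if not path.exists():
        return None  # the launch cell has not written anything yet
    return find_tunnel_url(path.read_text())
```

Because the launch cell blocks while Streamlit is running, a helper like this would have to be called from somewhere other than a later cell in the same runtime, for example from a background thread started before the launch cell. That usage is an assumption and not part of the notebook.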