Made a handful of UI tweaks:
- changed literal 'random' default seed to a blank (more intuitive I think, also a blank previously behaved the same as '0')
- moved toggles into a Gradio CheckboxGroup (somewhat subjective, but saves a little vertical space in the UI, and makes it easier to adjust toggles in code)
- changed default CFG scale to 7.5 for txt2img and 5.0 for img2img to match the official scripts (the waifu-diffusion fork this ultimately borrows from changed them to 7.0 for some reason)
- raised some of the default limits:
  - Steps 150 -> 250 (the official command line version crashes at exactly 251, so this seems like a reasonable limit)
  - ~~Batch count 16 -> 40~~ Got changed to 250 before I committed anyway
  - CFG scale 15.0 -> 30.0 (above 15 doesn't seem to affect k-diffusion much, but significantly impacts DDIM and PLMS up to about 50; maybe the limit should be higher?)
- inverted toggle names for clarity (both default on):
  - 'Skip grid' -> 'Save grid'
  - 'Skip save individual images' -> 'Save individual images'
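
For the seed change, here is a minimal sketch of how a blank seed field could be resolved, assuming blank now means "pick a random seed". The name `resolve_seed` and the `MAX_SEED` bound are made up for illustration and are not taken from the actual script.

```python
import random

# Assumed upper bound for a seed; the real script may use a different range.
MAX_SEED = 2**32 - 1


def resolve_seed(seed_text):
    """Turn the text of the seed box into an integer seed.

    A blank (or missing) value yields a random seed; anything else is
    parsed as an integer, so '0' stays a fixed, reproducible seed.
    """
    text = (seed_text or "").strip()
    if text == "":
        return random.randint(0, MAX_SEED)
    return int(text)
```

With this, `resolve_seed("42")` gives 42 every time, while `resolve_seed("")` gives a different seed on most calls.
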
Also:
- added separate --outdir_txt2img and --outdir_img2img command line args, which take priority over --outdir
- fixed flagging; some var names were only partially updated previously. Note that CSV indices were changed, so old log files will need to be deleted, renamed, etc.
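
The priority of the new output directory args could look roughly like this. Only the three flag names come from the change itself; `build_parser`, `pick_outdir`, and the fallback path are hypothetical.

```python
import argparse


def build_parser():
    """Parser with only the three output-directory flags from this change."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--outdir", type=str, default=None)
    parser.add_argument("--outdir_txt2img", type=str, default=None)
    parser.add_argument("--outdir_img2img", type=str, default=None)
    return parser


def pick_outdir(args, mode):
    """Return the output dir for mode ('txt2img' or 'img2img').

    The mode-specific flag wins, then --outdir, then a fallback.
    The fallback path below is an assumption for this sketch.
    """
    specific = getattr(args, f"outdir_{mode}")
    fallback = f"outputs/{mode}-samples"
    return specific or args.outdir or fallback
```

For example, with `--outdir out --outdir_txt2img t2i`, txt2img images go to `t2i` and img2img images go to `out`.
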
GFPGAN requires images in BGR color space. Using the wrong color space leads to color-shift of the face after it's put through GFPGAN. To fix, convert the color space before sending to GFPGAN and again when it's returned.
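
A minimal sketch of the round trip, using NumPy channel reversal. `restorer` is a hypothetical callable standing in for the actual GFPGAN call, which is not shown here; it is assumed to take and return an HxWx3 BGR array.

```python
import numpy as np


def swap_rb(image):
    """Reverse the channel order of an HxWx3 array (RGB -> BGR or BGR -> RGB)."""
    return np.ascontiguousarray(image[:, :, ::-1])


def restore_faces(rgb_image, restorer):
    """Convert to BGR, run the restorer, and convert the result back to RGB.

    `restorer` is a placeholder for the GFPGAN call (assumption).
    """
    bgr_in = swap_rb(rgb_image)
    bgr_out = restorer(bgr_in)
    return swap_rb(bgr_out)
```

With OpenCV available, `cv2.cvtColor(image, cv2.COLOR_RGB2BGR)` and `cv2.COLOR_BGR2RGB` do the same conversion.
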
Enabled only if textual-inversion is installed, i.e.:
> config (configs/stable-diffusion/v1-inference.yaml) is updated for textual inversion
> ldm/modules/embedding_manager.py exists (copied from textual-inversion repo)
> ldm/data/personalized.py exists (copied from textual-inversion repo)
> ldm/data/personalized_style.py exists (copied from textual-inversion repo)
> ldm/models/diffusion/ddpm.py is replaced with textual-inversion version
> ldm/util.py is replaced with textual-inversion version
Without textual-inversion installed, the script will function as normal.
Please note:
Once the textual-inversion changes have been applied, do not remove personalization_config from the config .yaml without also restoring the original versions of the changed files.
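
As a rough illustration of the detection, here is a sketch that checks only for the three files copied from the textual-inversion repo. It does not inspect the config or the replaced `ddpm.py` and `util.py`, and the function name is hypothetical, so the real check may differ.

```python
from pathlib import Path

# Files listed above as copied from the textual-inversion repo.
REQUIRED_FILES = (
    "ldm/modules/embedding_manager.py",
    "ldm/data/personalized.py",
    "ldm/data/personalized_style.py",
)


def textual_inversion_available(repo_root="."):
    """Return True only if every copied textual-inversion file is present."""
    root = Path(repo_root)
    return all((root / rel).is_file() for rel in REQUIRED_FILES)
```

The script could then enable the feature when this returns True and otherwise run as usual.
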
adds support for doing stuff like `a forest under night sky: by Studio Ghibli:1.8 in the style of Starry Night:2.3`
each : is treated as a split in the prompt, so the above turns into 3 prompts:
a forest under night sky 19.6%
by Studio Ghibli 35.3%
in the style of Starry Night 45.1%
the prompts are added together with torch.add using their weight as alpha
if a weight is negative it has the effect of subtracting (same result as torch.sub)
by default all values are normalized to try to add up to 1.0
If you want more control you can disable normalization, but values far below 0 or far above 1 will cause artifacts
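
The splitting and normalization can be sketched as below. This is not the actual parser: the regex, the function names, and the rule that a segment without a number gets weight 1.0 are assumptions (the last one is what makes the example come out at 1.0 : 1.8 : 2.3). `combine` uses plain lists in place of tensors; each step computes `acc + w * c`, which is what `torch.add(acc, c, alpha=w)` does.

```python
import re

# One segment: text up to a ':', optionally followed by a numeric weight.
_SEGMENT = re.compile(r"([^:]+)(?::(-?\d+(?:\.\d+)?)?)?")


def split_weighted_subprompts(prompt, normalize=True):
    """Split 'text:weight text:weight ...' into (text, weight) pairs.

    Segments without a number get weight 1.0 (assumption). With
    normalize=True the weights are scaled so that they sum to 1.0.
    """
    parts = []
    for match in _SEGMENT.finditer(prompt):
        text = match.group(1).strip()
        if not text:
            continue
        weight = float(match.group(2)) if match.group(2) else 1.0
        parts.append((text, weight))
    if normalize:
        total = sum(w for _, w in parts)
        if total != 0:  # leave weights alone if they cancel out
            parts = [(t, w / total) for t, w in parts]
    return parts


def combine(conditionings, weights):
    """Weighted sum of equal-length vectors; a negative weight subtracts."""
    acc = [0.0] * len(conditionings[0])
    for cond, w in zip(conditionings, weights):
        acc = [a + w * x for a, x in zip(acc, cond)]
    return acc
```

For the example prompt above, the raw weights are 1.0, 1.8 and 2.3, which sum to 5.1, so normalization divides each weight by 5.1.
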
all images in batches now have proper seeds, not just the first one
added code to remove bad characters from filenames
added code to flag output which writes it to csv and saves images
renamed some fields in UI for clarity
added GFPGAN as an option for img2img
added GFPGAN as a tab
added autodetection for row counts for grids, enabled by default
removed Fixed Code sampling because no one can figure out what it does; maybe someone will be upset by removal and will tell me
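
For the filename cleanup mentioned above, a minimal sketch follows. The set of "bad" characters, the replacement, and the length cap are all assumptions; the actual code may strip a different set.

```python
import re

# Characters commonly rejected by Windows and awkward elsewhere (assumed set).
_BAD_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_filename(name, replacement="_", max_length=128):
    """Replace problematic characters so a prompt can be used as a filename."""
    cleaned = _BAD_CHARS.sub(replacement, name).strip(" .")
    # Fall back to a fixed name if nothing usable is left.
    return cleaned[:max_length] or "untitled"
```

This matters mostly for prompts that use the `:` weighting syntax, since `:` is not allowed in Windows filenames.
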