The max value for a 32-bit unsigned int is 2**32 - 1, not 2**32 (and
randint's range is inclusive of both of its arguments).
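The fix can be sketched like this (a minimal example; `random_seed` is a hypothetical helper name, not necessarily the one in the script):

```python
import random

# randint(a, b) is inclusive on BOTH ends, so the upper bound for a
# 32-bit unsigned int must be 2**32 - 1, not 2**32.
MAX_UINT32 = 2**32 - 1


def random_seed() -> int:
    """Return a seed in the full valid range of a 32-bit unsigned int."""
    return random.randint(0, MAX_UINT32)


seed = random_seed()
assert 0 <= seed <= MAX_UINT32
```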
Python's hash() function is not deterministic across different
invocations of the interpreter, so a given string seed would produce
different results after restarting the script. Seed a Random instance
with the passed string instead.
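A minimal sketch of the approach (`seed_everything` is a hypothetical helper name): `hash()` is salted per interpreter run via PYTHONHASHSEED, but `random.Random(seed_str)` hashes the string internally in a deterministic way, so the same string always yields the same sequence.

```python
import random


def seed_everything(seed_str: str) -> random.Random:
    # random.Random(seed_str) derives the seed from the string
    # deterministically (unlike hash(), which is salted per process),
    # so the same string gives the same sequence after every restart.
    return random.Random(seed_str)


rng1 = seed_everything("my favorite seed")
rng2 = seed_everything("my favorite seed")
assert rng1.randint(0, 2**32 - 1) == rng2.randint(0, 2**32 - 1)
```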
This may open the option to read data from images dragged into the tool later. Activated with `--save_metadata`.
Properties (example output from the ImageMagick `identify -verbose` command):
SD:cfg_scale: 7.5
SD:GFPGAN: False
SD:height: 512
SD:normalize_prompt_weights: True
SD:prompt: a beautiful matte painting of a cottage on a magical fantasy island, unreal engine, barometric projection, rectilinear
SD:seed: 247624793
SD:steps: 50
SD:width: 512
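Writing those properties can be sketched with Pillow's PngInfo (a minimal example; `save_with_metadata` and the options dict are hypothetical names, but the `SD:` key prefix matches the output above):

```python
from PIL import Image, PngImagePlugin


def save_with_metadata(image: Image.Image, path: str, opts: dict) -> None:
    # Embed each generation parameter as a PNG tEXt chunk; ImageMagick's
    # 'identify -verbose' lists these under Properties as shown above.
    info = PngImagePlugin.PngInfo()
    for key, value in opts.items():
        info.add_text(f"SD:{key}", str(value))
    image.save(path, pnginfo=info)
```

Reading them back later is just `Image.open(path).text`, which is what makes drag-and-drop parameter recovery possible.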
If loading Arial fails, try loading a font commonly installed on Linux
distros.
This means it continues to work on Windows, and will also just work on
most Linux machines (DejaVu Sans is widely available and often installed
by default).
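The fallback order can be sketched like this (`load_font` is a hypothetical helper name; the exact font file names tried may differ from the script's):

```python
from PIL import ImageFont


def load_font(size: int = 12):
    # Try Arial first (present on Windows and macOS), then DejaVu Sans
    # (widely available on Linux distros). If neither font file can be
    # found, fall back to PIL's built-in bitmap font.
    for name in ("arial.ttf", "DejaVuSans.ttf"):
        try:
            return ImageFont.truetype(name, size)
        except OSError:
            continue
    return ImageFont.load_default()
```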
Made a handful of UI tweaks:
- changed literal 'random' default seed to a blank (more intuitive I think, also a blank previously behaved the same as '0')
- moved toggles into a Gradio CheckboxGroup (somewhat subjective, but saves a little vertical space in the UI, and makes it easier to adjust toggles in code)
- changed default CFG scales to 7.5 and 5.0 to match the official txt2img and img2img scripts (the waifu-diffusion fork this ultimately borrows from changed them to 7.0 for some reason)
- raised some of the default limits somewhat:
  - Steps from 150 -> 250 (the official command line version crashes at exactly 251, so that seems like a reasonable limit)
  - ~~Batch count 16 -> 40~~ Got changed to 250 before I committed anyway
  - CFG scale 15.0 -> 30.0 (above 15 doesn't seem to affect k-diffusion much, but significantly impacts DDIM and PLMS up to about 50; maybe it should be higher?)
- inverted toggle names for clarity (both default on):
  - 'Skip grid' -> 'Save grid'
  - 'Skip save individual images' -> 'Save individual images'
Also:
- added separate --outdir_txt2img and --outdir_img2img command line args, which take priority over --outdir
- fixed flagging; some var names were only partially updated previously. Note that CSV indices were changed, so old log files will need to be deleted/renamed/etc.
GFPGAN requires images in BGR color space. Using the wrong color space leads to a color shift in the face after it's put through GFPGAN. To fix, convert the color space before sending the image to GFPGAN and again when the result is returned.
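For a standard HxWxC array, swapping RGB and BGR is just a reversal of the channel axis; the same flip converts in both directions (a minimal sketch, `rgb_to_bgr` is a hypothetical helper name):

```python
import numpy as np


def rgb_to_bgr(image: np.ndarray) -> np.ndarray:
    # Reverse the last (channel) axis: RGB -> BGR. Since the flip is its
    # own inverse, applying it again to GFPGAN's output restores RGB.
    return image[:, :, ::-1]
```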
Enabled only if textual-inversion exists, i.e.:
> config (configs/stable-diffusion/v1-inference.yaml) is updated for textual inversion
> ldm/modules/embedding_manager.py exists (copied from textual-inversion repo)
> ldm/data/personalized.py exists (copied from textual-inversion repo)
> ldm/data/personalized_style.py exists (copied from textual-inversion repo)
> ldm/models/diffusion/ddpm.py is replaced with textual-inversion version
> ldm/util.py is replaced with textual-inversion version
Without textual-inversion installed, the script will function as normal.
Please note:
Once textual-inversion changes have been applied, do not remove personalization_config from the config .yaml without also replacing the changed files with their original versions.
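A simplified sketch of the availability check (hypothetical names; it only tests for the copied-in files, not the replaced ddpm.py/util.py or the config change):

```python
import os

# Files copied in from the textual-inversion repo; if any are missing,
# fall back to the normal (non-textual-inversion) code path.
TI_FILES = [
    "ldm/modules/embedding_manager.py",
    "ldm/data/personalized.py",
    "ldm/data/personalized_style.py",
]


def textual_inversion_available(repo_root: str = ".") -> bool:
    """Return True only if every textual-inversion file is present."""
    return all(os.path.exists(os.path.join(repo_root, p)) for p in TI_FILES)
```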
adds support for doing stuff like `a forest under night sky: by Studio Ghibli:1.8 in the style of Starry Night:2.3`
the `:` characters are treated as splits in the prompt, so the above turns into 3 prompts:
a forest under night sky 19.6%
by Studio Ghibli 35.2%
in the style of Starry Night 45.0%
the prompts are added together with torch.add using their weight as alpha
if a weight is negative it has the effect of subtracting (same result as torch.sub)
by default all values are normalized to try to add up to 1.0
If you want more control you can disable normalization, but weights far outside the 0 to 1 range will cause artifacts.
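The split-on-colon parsing and normalization described above can be sketched like this (a minimal illustration, not the script's exact parser; a number right after a `:` is the weight of the sub-prompt before it, and unweighted sub-prompts default to 1.0):

```python
import re


def parse_weighted_prompt(text: str):
    # Split on ':'. If the segment after a ':' starts with a number, that
    # number is the weight of the preceding sub-prompt and the remainder
    # begins the next sub-prompt; otherwise the weight defaults to 1.0.
    parts = text.split(":")
    prompts, weights = [], []
    current = parts[0]
    for part in parts[1:]:
        m = re.match(r"\s*(-?\d+(?:\.\d+)?)\s*(.*)", part)
        if m:
            prompts.append(current.strip())
            weights.append(float(m.group(1)))
            current = m.group(2)
        else:
            prompts.append(current.strip())
            weights.append(1.0)
            current = part
    if current.strip():
        prompts.append(current.strip())
        weights.append(1.0)
    return prompts, weights


def normalize_weights(weights):
    # Scale the weights so they sum to 1.0 (the default behavior).
    total = sum(weights)
    return [w / total for w in weights]
```

For the example prompt this yields weights 1.0, 1.8, and 2.3 (total 5.1), which normalize to roughly the percentages listed above; the weighted conditionings are then combined with torch.add using each normalized weight as alpha.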