mirror of
https://github.com/Sygil-Dev/sygil-webui.git
synced 2024-12-14 22:13:41 +03:00
Merge remote-tracking branch 'origin/dev' into dev
This commit is contained in:
commit
8239c328fd
@ -92,6 +92,58 @@ The Gradio Image Lab is a central location to access image enhancers and upscale
Please see the [Image Enhancers](6.image_enhancers.md) section to learn more about how to use these tools.
## Scene2Image
---
![](../images/gradio/gradio-s2i.png)
Gradio Scene2Image allows you to define layers of images in a markdown-like syntax.

> Would it be possible to have a layers system where we could have
> foreground, mid, and background objects which relate to one another and
> share the style? So we could say generate a landscape, on another layer
> generate a castle, and on another layer generate a crowd of people.

You write a multi-line prompt that looks like markdown, where each section declares one layer.
It is hierarchical, so each layer can have its own child layers.
In the frontend you can find brief documentation for the syntax, examples, and a reference for the various arguments.
Here is a summary:

Markdown headings, e.g. '# layer0', define layers.
The content of a section defines the arguments for image generation.
Arguments are given on lines of the form 'arg:value' or 'arg=value'.

Layers are hierarchical, i.e. each layer can contain further layers.
The number of '#' characters increases in the headings of child layers.

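A minimal sketch of this syntax (layer names and prompt text are illustrative; both argument separators are shown):

```
# layer0
prompt: a wide mountain landscape

## layer1
prompt=a medieval castle on a hill
```
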
Child layers are blended together via their image masks, like layers in an image editor.
By default, alpha compositing is used for blending.
Other blend modes from [ImageChops](https://pillow.readthedocs.io/en/stable/reference/ImageChops.html) can also be used.

Sections with a "prompt" and child layers invoke Image2Image; without child layers they invoke Text2Image.
The result of blending the child layers becomes the input image for Image2Image.

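For instance, the distinction could look like this (structure and prompts are illustrative):

```
# scene
prompt: a fantasy landscape with a castle and a crowd of people

## background
prompt: a wide mountain landscape at sunset

## castle
prompt: a medieval castle on a hill
```

Here `background` and `castle` have no children, so each invokes Text2Image; their blended result is then fed to Image2Image for `scene`.
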
Sections without a "prompt" are plain images, useful for mask selection, image composition, etc.
Images can be initialized with "color", resized with "resize", and positioned with "pos".
Rotation and the rotation center are set with "rotation" and "center".

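A sketch using these image arguments (the value formats shown are assumptions; consult the frontend reference for the exact syntax):

```
# canvas
color: #1a1a2e
resize: 512, 512

## balloon
prompt: a red balloon
pos: 256, 128
rotation: 15
center: 256, 128
```
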
Masks can be selected automatically by color, by the color at given pixels of the image, or by estimated depth.

You can choose between two depth estimation models; see the frontend reference for the argument names.
[Monocular depth estimation](https://huggingface.co/spaces/atsantiago/Monocular_Depth_Filter) can be selected as depth model `0`.
[MiDaS depth estimation](https://huggingface.co/spaces/pytorch/MiDaS), used by default, can be selected as depth model `1`.

Depth estimation can also be used for traditional 3D reconstruction.
With `transform3d=True`, the pixels of an image can be rendered from another perspective or with a different field of view.
For this you specify the pose and field of view that correspond to the input image, as well as the desired output pose and field of view.
A pose describes the camera position and orientation as an x,y,z,rotate_x,rotate_y,rotate_z tuple, with the angles describing rotations around the axes in degrees.
The camera coordinate system is the pinhole camera described and pictured in the [OpenCV "Camera Calibration and 3D Reconstruction" documentation](https://docs.opencv.org/4.x/d9/d0c/group__calib3d.html).

When `transform3d_from_pose`, the camera pose at which the input image was taken, is not specified, the target pose `transform3d_to_pose` is interpreted in the input camera's coordinate system:
walking forward one depth unit in the input image corresponds to the position `0,0,1`,
walking to the right corresponds to `1,0,0`,
and going downwards corresponds to `0,1,0`.

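A hedged sketch of a 3D transform, assuming pose values are given as the comma-separated tuple described above (the layer structure is illustrative, and the field-of-view argument names are not covered in this excerpt, so they are omitted):

```
# reprojected
transform3d: True
transform3d_from_pose: 0, 0, 0, 0, 0, 0
transform3d_to_pose: 0, 0, 1, 0, 30, 0

## photo
prompt: a narrow alley in an old town
```

The target pose here moves the camera one depth unit forward and rotates it 30 degrees around the y axis relative to the input view.
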
## Gradio Optional Customizations

---

@ -47,6 +47,7 @@ mkdir -p $MODEL_DIR
MODEL_FILES=(
'model.ckpt models/ldm/stable-diffusion-v1 https://www.googleapis.com/storage/v1/b/aai-blog-files/o/sd-v1-4.ckpt?alt=media fe4efff1e174c627256e44ec2991ba279b3816e364b49f9be2abc0b3ff3f8556'
'GFPGANv1.3.pth src/gfpgan/experiments/pretrained_models https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth c953a88f2727c85c3d9ae72e2bd4846bbaf59fe6972ad94130e23e7017524a70'
'GFPGANv1.4.pth src/gfpgan/experiments/pretrained_models https://github.com/TencentARC/GFPGAN/releases/download/v1.3.4/GFPGANv1.4.pth e2cd4703ab14f4d01fd1383a8a8b266f9a5833dacee8e6a79d3bf21a1b6be5ad'
'RealESRGAN_x4plus.pth src/realesrgan/experiments/pretrained_models https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth 4fa0d38905f75ac06eb49a7951b426670021be3018265fd191d2125df9d682f1'
'RealESRGAN_x4plus_anime_6B.pth src/realesrgan/experiments/pretrained_models https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth f872d837d3c90ed2e05227bed711af5671a6fd1c9f7d7e91c911a61f155e99da'
'project.yaml src/latent-diffusion/experiments/pretrained_models https://heibox.uni-heidelberg.de/f/31a76b13ea27482981b4/?dl=1 9d6ad53c5dafeb07200fb712db14b813b527edd262bc80ea136777bdb41be2ba'
BIN
images/gradio/gradio-s2i.png
Normal file