Image Synthesis

Image synthesis is the core feature of DiffSynth Studio. It can generate images at very high resolutions, up to 4096*4096 in the Stable Diffusion example.

OmniGen

OmniGen is a text-image-to-image model: it synthesizes an image according to a text prompt and several given reference images.

Sample images: reference image 1 (image_man), reference image 2 (image_woman), synthesized image (image_merged).

Example: FLUX

Example scripts: flux_text_to_image.py and flux_text_to_image_low_vram.py (low VRAM).

The original version of FLUX doesn't support classifier-free guidance; however, we believe this guidance mechanism is important for synthesizing beautiful images. You can enable it with the parameter cfg_scale. The extra guidance scale introduced by FLUX is set separately with the parameter embedded_guidance.

Sample images: 1024*1024 (original), 1024*1024 (classifier-free guidance), 2048*2048 (highres-fix).
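
For readers unfamiliar with classifier-free guidance, the following is a minimal sketch of the standard blending rule that a scale such as cfg_scale usually controls. It is not taken from DiffSynth Studio's code: the function name apply_cfg and the toy vectors are hypothetical, and the sketch assumes the conventional formulation in which the model is evaluated once with the prompt and once without it. It does not cover embedded_guidance.

```python
def apply_cfg(noise_cond, noise_uncond, cfg_scale):
    """Blend conditional and unconditional predictions element-wise.

    Conventional classifier-free guidance (an assumption about what
    cfg_scale controls, not DiffSynth Studio's actual implementation):
        guided = uncond + cfg_scale * (cond - uncond)
    """
    return [u + cfg_scale * (c - u) for c, u in zip(noise_cond, noise_uncond)]


# Toy stand-ins for the two model predictions (hypothetical values).
cond = [0.5, -1.0, 2.0]     # prediction with the text prompt
uncond = [0.0, -0.5, 1.0]   # prediction with an empty prompt

print(apply_cfg(cond, uncond, 1.0))  # scale 1.0 reproduces the conditional prediction
print(apply_cfg(cond, uncond, 2.0))  # larger scales push further toward the prompt
```

Under this formulation a scale of 1.0 is equivalent to no guidance, while larger values amplify the difference between the prompted and unprompted predictions.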

Example: Stable Diffusion

Example script: sd_text_to_image.py

LoRA Training: ../train/stable_diffusion/

Sample images: 512*512, 1024*1024, 2048*2048, 4096*4096.

Example: Stable Diffusion XL

Example script: sdxl_text_to_image.py

LoRA Training: ../train/stable_diffusion_xl/

Sample images: 1024*1024, 2048*2048.

Example: Stable Diffusion 3

Example script: sd3_text_to_image.py

LoRA Training: ../train/stable_diffusion_3/

Sample images: 1024*1024, 2048*2048.

Example: Kolors

Example script: kolors_text_to_image.py

LoRA Training: ../train/kolors/

Sample images: 1024*1024, 2048*2048.

Kolors also supports models trained for SD-XL, such as ControlNets and LoRAs. See kolors_with_sdxl_models.py.

LoRA: https://civitai.com/models/73305/zyd232s-ink-style

Sample images: base model, with LoRA (alpha=0.5), with LoRA (alpha=1.0), with LoRA (alpha=1.5).
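
To illustrate what the alpha values in the comparison above typically mean, here is a minimal sketch of merging a LoRA into a weight matrix. It assumes the conventional rule in which alpha multiplies the low-rank update before it is added to the base weight. The helper names (matmul, merge_lora) and the tiny matrices are hypothetical, not DiffSynth Studio's API.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight, lora_up, lora_down, alpha):
    """Return weight + alpha * (lora_up @ lora_down).

    This is the conventional LoRA merge rule, assumed here for
    illustration; the actual loader may differ in details.
    """
    delta = matmul(lora_up, lora_down)
    return [
        [w + alpha * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 base weight and a rank-1 LoRA update (hypothetical values).
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_up = [[1.0], [2.0]]      # shape 2x1
lora_down = [[0.5, -0.5]]     # shape 1x2

for alpha in (0.0, 0.5, 1.0, 1.5):
    print(alpha, merge_lora(weight, lora_up, lora_down, alpha))
```

With alpha=0.0 the base weights are unchanged, which corresponds to the base model image. Values above 1.0 exaggerate the LoRA's effect beyond the strength it was trained at.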

ControlNet: https://huggingface.co/xinsir/controlnet-union-sdxl-1.0

Sample images: reference image, depth image (controlnet_input), and two results with ControlNet (image_depth_1, image_depth_2).

Example: Hunyuan-DiT

Example script: hunyuan_dit_text_to_image.py

LoRA Training: ../train/hunyuan_dit/

Sample images: 1024*1024, 2048*2048.

Example: Stable Diffusion XL Turbo

Example script: sdxl_turbo.py

We highly recommend using this model in the WebUI.

Sample images: "black car" (black_car), "red car" (black_car_to_red_car).