support omnigen

2026-03-18 22:08:13 +00:00 · 2024-11-11 18:39:40 +08:00
parent 344cbd3286
commit bd028e4c66
9 changed files with 1470 additions and 3 deletions
--- a/examples/image_synthesis/README.md
+++ b/examples/image_synthesis/README.md
@@ -2,6 +2,14 @@

 Image synthesis is the base feature of DiffSynth Studio. We can generate images with very high resolution.

+### OmniGen
+
+OmniGen is a text-image-to-image model: it synthesizes an image from a text prompt and several given reference images.
+
+|Reference image 1|Reference image 2|Synthesized image|
+|-|-|-|
+|![image_man](https://github.com/user-attachments/assets/35d00493-625b-45d1-ad2b-b558ea09fe36)|![image_woman](https://github.com/user-attachments/assets/abebf69b-3563-4b3b-91c2-c48ff74a29ea)|![image_merged](https://github.com/user-attachments/assets/2979d5a9-f355-4bec-a91d-1e824d9fc8f6)|
+
 ### Example: FLUX

 Example script: [`flux_text_to_image.py`](./flux_text_to_image.py) and [`flux_text_to_image_low_vram.py`](./flux_text_to_image_low_vram.py)(low VRAM).
--- a/examples/image_synthesis/omnigen_text_to_image.py
+++ b/examples/image_synthesis/omnigen_text_to_image.py
@@ -0,0 +1,25 @@
+import torch
+from diffsynth import ModelManager, OmnigenImagePipeline
+
+
+model_manager = ModelManager(torch_dtype=torch.bfloat16, model_id_list=["OmniGen-v1"])
+pipe = OmnigenImagePipeline.from_model_manager(model_manager)
+
+image_man = pipe(
+    prompt="A portrait of a man.",
+    cfg_scale=2.5, num_inference_steps=50, seed=0
+)
+image_man.save("image_man.jpg")
+
+image_woman = pipe(
+    prompt="A portrait of an Asian woman with a white t-shirt.",
+    cfg_scale=2.5, num_inference_steps=50, seed=1
+)
+image_woman.save("image_woman.jpg")
+
+image_merged = pipe(
+    prompt="a man and a woman. The man is the man in <img><|image_1|></img>. The woman is the woman in <img><|image_2|></img>.",
+    reference_images=[image_man, image_woman],
+    cfg_scale=2.5, image_cfg_scale=2.5, num_inference_steps=50, seed=2
+)
+image_merged.save("image_merged.jpg")
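
The merged-image call above refers to reference images through placeholders of the form `<img><|image_N|></img>` embedded in the prompt. The sketch below builds and checks such a prompt before it is passed to the pipeline. It assumes the placeholders are 1-based and map in order onto the `reference_images` list, which matches the example but is not confirmed elsewhere in this diff. `image_tag` and `check_prompt` are hypothetical helpers, not part of DiffSynth Studio.

```python
import re
from typing import List

# Placeholder format copied from the example prompt: <img><|image_N|></img>
PLACEHOLDER = re.compile(r"<img><\|image_(\d+)\|></img>")


def image_tag(index: int) -> str:
    """Return the prompt placeholder for a reference image (assumed 1-based)."""
    if index < 1:
        raise ValueError("image indices are assumed to start at 1")
    return f"<img><|image_{index}|></img>"


def check_prompt(prompt: str, num_reference_images: int) -> List[int]:
    """Hypothetical sanity check, not a DiffSynth API.

    Returns the placeholder indices in order of appearance and raises
    ValueError if any index has no matching reference image.
    """
    indices = [int(n) for n in PLACEHOLDER.findall(prompt)]
    for i in indices:
        if not 1 <= i <= num_reference_images:
            raise ValueError(
                f"placeholder image_{i} has no matching reference image "
                f"({num_reference_images} provided)"
            )
    return indices


# Rebuild the prompt used for image_merged in the script above.
prompt = (
    "a man and a woman. "
    f"The man is the man in {image_tag(1)}. "
    f"The woman is the woman in {image_tag(2)}."
)
used = check_prompt(prompt, num_reference_images=2)
```

Running a check like this before calling `pipe(prompt=..., reference_images=[...])` catches a placeholder count that does not match the list length early, instead of after the model has been loaded.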