mirror of https://github.com/modelscope/DiffSynth-Studio.git
synced 2026-04-21 11:46:58 +00:00
update docs
This commit is contained in:
@@ -20,6 +20,7 @@ Diffusion Templates is a controllable generation plugin framework for Diffusion
 * Aesthetic Alignment: [DiffSynth-Studio/Template-KleinBase4B-Aesthetic](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Aesthetic)
 * Inpainting: [DiffSynth-Studio/Template-KleinBase4B-Inpaint](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Inpaint)
 * Content Reference: [DiffSynth-Studio/Template-KleinBase4B-ContentRef](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-ContentRef)
 * Age Control: [DiffSynth-Studio/Template-KleinBase4B-Age](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Age)
 * Panda Meme (Easter Egg Model): [DiffSynth-Studio/Template-KleinBase4B-PandaMeme](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-PandaMeme)
 * Datasets: [Collection](https://modelscope.cn/collections/DiffSynth-Studio/ImagePulseV2--shujuji)
+* [DiffSynth-Studio/ImagePulseV2-Edit-Inpaint](https://modelscope.cn/datasets/DiffSynth-Studio/ImagePulseV2-Edit-Inpaint)
@@ -29,7 +29,7 @@ image = pipe(
 image.save("image.png")
 ```

-The Template model [DiffSynth-Studio/F2KB4B-Template-Brightness](https://modelscope.cn/models/DiffSynth-Studio/F2KB4B-Template-Brightness) can control image brightness during generation. Via `TemplatePipeline`, it can be loaded from ModelScope (`ModelConfig(model_id="xxx/xxx")`) or from a local path (`ModelConfig(path="xxx")`). Passing `scale=0.8` increases image brightness. Note that in the code, the input parameters for `pipe` must be moved to `template_pipeline`, and `template_inputs` must be added.
+The Template model [DiffSynth-Studio/Template-KleinBase4B-Brightness](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Brightness) can control image brightness during generation. Via `TemplatePipeline`, it can be loaded from ModelScope (`ModelConfig(model_id="xxx/xxx")`) or from a local path (`ModelConfig(path="xxx")`). Passing `scale=0.8` increases image brightness. Note that in the code, the input parameters for `pipe` must be moved to `template_pipeline`, and `template_inputs` must be added.

 ```python
 # Load Template model
@@ -37,7 +37,7 @@ template_pipeline = TemplatePipeline.from_pretrained(
     torch_dtype=torch.bfloat16,
     device="cuda",
     model_configs=[
-        ModelConfig(model_id="DiffSynth-Studio/F2KB4B-Template-Brightness")
+        ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-Brightness")
     ],
 )
 # Generate an image
@@ -53,7 +53,7 @@ image.save("image_0.8.png")

 ## CFG Enhancement for Template Models

-Template models can enable CFG (Classifier-Free Guidance) to make their control effects more pronounced. For example, with the model [DiffSynth-Studio/F2KB4B-Template-Brightness](https://modelscope.cn/models/DiffSynth-Studio/F2KB4B-Template-Brightness), adding `negative_template_inputs` to the `TemplatePipeline` input parameters and setting its scale to 0.5 makes the model contrast the two sides and generate images with more noticeable brightness variation.
+Template models can enable CFG (Classifier-Free Guidance) to make their control effects more pronounced. For example, with the model [DiffSynth-Studio/Template-KleinBase4B-Brightness](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Brightness), adding `negative_template_inputs` to the `TemplatePipeline` input parameters and setting its scale to 0.5 makes the model contrast the two sides and generate images with more noticeable brightness variation.

 ```python
 # Generate an image with CFG
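The contrast described above is the standard classifier-free guidance combination: the prediction conditioned on `negative_template_inputs` is extrapolated toward the one conditioned on `template_inputs`. A minimal numeric sketch (plain Python, independent of DiffSynth-Studio; the function name and scalar stand-ins are illustrative):

```python
# Minimal sketch of the classifier-free guidance (CFG) combination.
# `positive` stands for the prediction conditioned on template_inputs,
# `negative` for the one conditioned on negative_template_inputs.
def cfg_combine(positive: float, negative: float, guidance_scale: float) -> float:
    # Extrapolate from the negative prediction toward the positive one;
    # guidance_scale > 1 amplifies the difference between the two sides.
    return negative + guidance_scale * (positive - negative)

print(cfg_combine(positive=1.0, negative=0.5, guidance_scale=2.0))  # 1.5
```

With `guidance_scale = 1` the negative side has no effect; larger values exaggerate whatever the template controls, which is why the brightness change becomes more noticeable.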
@@ -77,7 +77,7 @@ template_pipeline = TemplatePipeline.from_pretrained(
     torch_dtype=torch.bfloat16,
     device="cuda",
     model_configs=[
-        ModelConfig(model_id="DiffSynth-Studio/F2KB4B-Template-Brightness")
+        ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-Brightness")
     ],
     lazy_loading=True,
 )
@@ -85,7 +85,7 @@ template_pipeline = TemplatePipeline.from_pretrained(

 The base model's Pipeline and the Template Pipeline are completely independent; VRAM management can be enabled for each on demand.

-When a Template model's outputs contain LoRA in the Template Cache, you need to enable VRAM management for the base model's Pipeline or enable LoRA hot loading (using the code below); otherwise the LoRA weights will be stacked.
+When a Template model's outputs contain LoRA in the Template Cache, you need to enable VRAM management for the base model's Pipeline or enable LoRA hot loading (using the code below); otherwise the LoRA weights will be fused repeatedly.

 ```python
 pipe.dit = pipe.enable_lora_hot_loading(pipe.dit)
@@ -100,6 +100,7 @@ After enabling VRAM management for the base model's Pipeline and lazy loading fo
 ```python
 from diffsynth.diffusion.template import TemplatePipeline
 from diffsynth.pipelines.flux2_image import Flux2ImagePipeline, ModelConfig
+from modelscope import dataset_snapshot_download
 import torch
 from PIL import Image

@@ -137,6 +138,8 @@ template = TemplatePipeline.from_pretrained(
         ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-Sharpness"),
         ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-Inpaint"),
         ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-Aesthetic"),
         ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-ContentRef"),
         ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-Age"),
         ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-PandaMeme"),
     ],
 )
@@ -154,7 +157,7 @@ image = template(
     template_inputs = [
         {
             "model_id": 3,
-            "image": Image.open("data/assets/image_lowres_100.jpg"),
+            "image": Image.open("data/examples/templates/image_lowres_100.jpg"),
             "prompt": "A cat is sitting on a stone.",
         },
         {
@@ -165,7 +168,7 @@ image = template(
     negative_template_inputs = [
         {
             "model_id": 3,
-            "image": Image.open("data/assets/image_lowres_100.jpg"),
+            "image": Image.open("data/examples/templates/image_lowres_100.jpg"),
             "prompt": "",
         },
         {
@@ -193,7 +196,7 @@ image = template(
     template_inputs = [
         {
             "model_id": 1,
-            "image": Image.open("data/assets/image_depth.jpg"),
+            "image": Image.open("data/examples/templates/image_depth.jpg"),
             "prompt": "A cat is sitting on a stone, bathed in bright sunshine.",
         },
         {
@@ -210,7 +213,7 @@ image = template(
     negative_template_inputs = [
         {
             "model_id": 1,
-            "image": Image.open("data/assets/image_depth.jpg"),
+            "image": Image.open("data/examples/templates/image_depth.jpg"),
             "prompt": "",
         },
         {
@@ -244,12 +247,12 @@ image = template(
     template_inputs = [
         {
             "model_id": 1,
-            "image": Image.open("data/assets/image_depth.jpg"),
+            "image": Image.open("data/examples/templates/image_depth.jpg"),
             "prompt": "A cat is sitting on a stone. Colored ink painting.",
         },
         {
             "model_id": 2,
-            "image": Image.open("data/assets/image_reference.jpg"),
+            "image": Image.open("data/examples/templates/image_reference.jpg"),
             "prompt": "Convert the image style to colored ink painting.",
         },
         {
@@ -262,12 +265,12 @@ image = template(
     negative_template_inputs = [
         {
             "model_id": 1,
-            "image": Image.open("data/assets/image_depth.jpg"),
+            "image": Image.open("data/examples/templates/image_depth.jpg"),
             "prompt": "",
         },
         {
             "model_id": 2,
-            "image": Image.open("data/assets/image_reference.jpg"),
+            "image": Image.open("data/examples/templates/image_reference.jpg"),
             "prompt": "",
         },
     ],
@@ -295,13 +298,13 @@ image = template(
         },
         {
             "model_id": 2,
-            "image": Image.open("data/assets/image_reference.jpg"),
+            "image": Image.open("data/examples/templates/image_reference.jpg"),
             "prompt": "Convert the image style to flat anime style.",
         },
         {
             "model_id": 6,
-            "image": Image.open("data/assets/image_reference.jpg"),
-            "mask": Image.open("data/assets/image_mask_1.jpg"),
+            "image": Image.open("data/examples/templates/image_reference.jpg"),
+            "mask": Image.open("data/examples/templates/image_mask_1.jpg"),
             "force_inpaint": True,
         },
     ],
@@ -312,13 +315,13 @@ image = template(
         },
         {
             "model_id": 2,
-            "image": Image.open("data/assets/image_reference.jpg"),
+            "image": Image.open("data/examples/templates/image_reference.jpg"),
             "prompt": "",
         },
         {
             "model_id": 6,
-            "image": Image.open("data/assets/image_reference.jpg"),
-            "mask": Image.open("data/assets/image_mask_1.jpg"),
+            "image": Image.open("data/examples/templates/image_reference.jpg"),
+            "mask": Image.open("data/examples/templates/image_mask_1.jpg"),
         },
     ],
 )

@@ -101,7 +101,7 @@ TEMPLATE_MODEL_PATH = None

 To train Template models with DiffSynth-Studio, build a training dataset whose `metadata.json` contains `template_inputs` fields. These fields pass through `TEMPLATE_DATA_PROCESSOR`, which computes the actual inputs for the Template model's methods.

-For example, the brightness control model [DiffSynth-Studio/F2KB4B-Template-Brightness](https://modelscope.cn/models/DiffSynth-Studio/F2KB4B-Template-Brightness) takes `scale` as input:
+For example, the brightness control model [DiffSynth-Studio/Template-KleinBase4B-Brightness](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Brightness) takes `scale` as input:

 ```json
 [
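The `metadata.json` excerpt above is truncated by the diff. For reference, a minimal entry might look like the following sketch; the `image` and `prompt` field names are assumptions for illustration, while `template_inputs` and `scale` come from the text above:

```json
[
    {
        "image": "image_1.jpg",
        "prompt": "A cat is sitting on a stone.",
        "template_inputs": {"scale": 0.8}
    }
]
```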
@@ -20,6 +20,7 @@ Diffusion Templates is the controllable generation plugin framework for Diffusion models in DiffSynth-Studio
 * Aesthetic Alignment: [DiffSynth-Studio/Template-KleinBase4B-Aesthetic](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Aesthetic)
 * Inpainting: [DiffSynth-Studio/Template-KleinBase4B-Inpaint](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Inpaint)
 * Content Reference: [DiffSynth-Studio/Template-KleinBase4B-ContentRef](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-ContentRef)
 * Age Control: [DiffSynth-Studio/Template-KleinBase4B-Age](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Age)
 * Panda Meme (Easter Egg Model): [DiffSynth-Studio/Template-KleinBase4B-PandaMeme](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-PandaMeme)
 * Datasets: [Collection](https://modelscope.cn/collections/DiffSynth-Studio/ImagePulseV2--shujuji)
+* [DiffSynth-Studio/ImagePulseV2-Edit-Inpaint](https://modelscope.cn/datasets/DiffSynth-Studio/ImagePulseV2-Edit-Inpaint)
@@ -29,7 +29,7 @@ image = pipe(
 image.save("image.png")
 ```

-The Template model [DiffSynth-Studio/F2KB4B-Template-Brightness](https://modelscope.cn/models/DiffSynth-Studio/F2KB4B-Template-Brightness) can control the brightness of generated images. Via `TemplatePipeline`, it can be loaded from the ModelScope model hub (`ModelConfig(model_id="xxx/xxx")`) or from a local path (`ModelConfig(path="xxx")`). Passing `scale=0.8` increases image brightness. Note that in the code, the input parameters for `pipe` must be moved to `template_pipeline`, and `template_inputs` must be added.
+The Template model [DiffSynth-Studio/Template-KleinBase4B-Brightness](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Brightness) can control the brightness of generated images. Via `TemplatePipeline`, it can be loaded from the ModelScope model hub (`ModelConfig(model_id="xxx/xxx")`) or from a local path (`ModelConfig(path="xxx")`). Passing `scale=0.8` increases image brightness. Note that in the code, the input parameters for `pipe` must be moved to `template_pipeline`, and `template_inputs` must be added.

 ```python
 # Load Template model
@@ -37,7 +37,7 @@ template_pipeline = TemplatePipeline.from_pretrained(
     torch_dtype=torch.bfloat16,
     device="cuda",
     model_configs=[
-        ModelConfig(model_id="DiffSynth-Studio/F2KB4B-Template-Brightness")
+        ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-Brightness")
     ],
 )
 # Generate an image
@@ -53,7 +53,7 @@ image.save("image_0.8.png")

 ## CFG Enhancement for Template Models

-Template models can enable CFG (Classifier-Free Guidance) to make their control effects more pronounced. For example, for the model [DiffSynth-Studio/F2KB4B-Template-Brightness](https://modelscope.cn/models/DiffSynth-Studio/F2KB4B-Template-Brightness), adding `negative_template_inputs` to the input parameters of `TemplatePipeline` and setting its scale to 0.5 makes the model contrast the difference between the two sides and generate images with a more noticeable brightness change.
+Template models can enable CFG (Classifier-Free Guidance) to make their control effects more pronounced. For example, for the model [DiffSynth-Studio/Template-KleinBase4B-Brightness](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Brightness), adding `negative_template_inputs` to the input parameters of `TemplatePipeline` and setting its scale to 0.5 makes the model contrast the difference between the two sides and generate images with a more noticeable brightness change.

 ```python
 # Generate an image with CFG
@@ -77,7 +77,7 @@ template_pipeline = TemplatePipeline.from_pretrained(
     torch_dtype=torch.bfloat16,
     device="cuda",
     model_configs=[
-        ModelConfig(model_id="DiffSynth-Studio/F2KB4B-Template-Brightness")
+        ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-Brightness")
     ],
     lazy_loading=True,
 )
@@ -100,6 +100,7 @@ pipe.dit = pipe.enable_lora_hot_loading(pipe.dit)
 ```python
 from diffsynth.diffusion.template import TemplatePipeline
 from diffsynth.pipelines.flux2_image import Flux2ImagePipeline, ModelConfig
+from modelscope import dataset_snapshot_download
 import torch
 from PIL import Image

@@ -137,6 +138,8 @@ template = TemplatePipeline.from_pretrained(
         ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-Sharpness"),
         ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-Inpaint"),
         ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-Aesthetic"),
         ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-ContentRef"),
         ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-Age"),
         ModelConfig(model_id="DiffSynth-Studio/Template-KleinBase4B-PandaMeme"),
     ],
 )
@@ -154,7 +157,7 @@ image = template(
     template_inputs = [
         {
             "model_id": 3,
-            "image": Image.open("data/assets/image_lowres_100.jpg"),
+            "image": Image.open("data/examples/templates/image_lowres_100.jpg"),
             "prompt": "A cat is sitting on a stone.",
         },
         {
@@ -165,7 +168,7 @@ image = template(
     negative_template_inputs = [
         {
             "model_id": 3,
-            "image": Image.open("data/assets/image_lowres_100.jpg"),
+            "image": Image.open("data/examples/templates/image_lowres_100.jpg"),
             "prompt": "",
         },
         {
@@ -193,7 +196,7 @@ image = template(
     template_inputs = [
         {
             "model_id": 1,
-            "image": Image.open("data/assets/image_depth.jpg"),
+            "image": Image.open("data/examples/templates/image_depth.jpg"),
             "prompt": "A cat is sitting on a stone, bathed in bright sunshine.",
         },
         {
@@ -210,7 +213,7 @@ image = template(
     negative_template_inputs = [
         {
             "model_id": 1,
-            "image": Image.open("data/assets/image_depth.jpg"),
+            "image": Image.open("data/examples/templates/image_depth.jpg"),
             "prompt": "",
         },
         {
@@ -244,12 +247,12 @@ image = template(
     template_inputs = [
         {
             "model_id": 1,
-            "image": Image.open("data/assets/image_depth.jpg"),
+            "image": Image.open("data/examples/templates/image_depth.jpg"),
             "prompt": "A cat is sitting on a stone. Colored ink painting.",
         },
         {
             "model_id": 2,
-            "image": Image.open("data/assets/image_reference.jpg"),
+            "image": Image.open("data/examples/templates/image_reference.jpg"),
             "prompt": "Convert the image style to colored ink painting.",
         },
         {
@@ -262,12 +265,12 @@ image = template(
     negative_template_inputs = [
         {
             "model_id": 1,
-            "image": Image.open("data/assets/image_depth.jpg"),
+            "image": Image.open("data/examples/templates/image_depth.jpg"),
             "prompt": "",
         },
         {
             "model_id": 2,
-            "image": Image.open("data/assets/image_reference.jpg"),
+            "image": Image.open("data/examples/templates/image_reference.jpg"),
             "prompt": "",
         },
     ],
@@ -295,13 +298,13 @@ image = template(
         },
         {
             "model_id": 2,
-            "image": Image.open("data/assets/image_reference.jpg"),
+            "image": Image.open("data/examples/templates/image_reference.jpg"),
             "prompt": "Convert the image style to flat anime style.",
         },
         {
             "model_id": 6,
-            "image": Image.open("data/assets/image_reference.jpg"),
-            "mask": Image.open("data/assets/image_mask_1.jpg"),
+            "image": Image.open("data/examples/templates/image_reference.jpg"),
+            "mask": Image.open("data/examples/templates/image_mask_1.jpg"),
             "force_inpaint": True,
         },
     ],
@@ -312,13 +315,13 @@ image = template(
         },
         {
             "model_id": 2,
-            "image": Image.open("data/assets/image_reference.jpg"),
+            "image": Image.open("data/examples/templates/image_reference.jpg"),
             "prompt": "",
         },
         {
             "model_id": 6,
-            "image": Image.open("data/assets/image_reference.jpg"),
-            "mask": Image.open("data/assets/image_mask_1.jpg"),
+            "image": Image.open("data/examples/templates/image_reference.jpg"),
+            "mask": Image.open("data/examples/templates/image_mask_1.jpg"),
         },
     ],
 )

@@ -101,7 +101,7 @@ TEMPLATE_MODEL_PATH = None

 To train Template models with DiffSynth-Studio, build a training dataset whose `metadata.json` contains a `template_inputs` field. The `template_inputs` in `metadata.json` is not passed directly to the Template model's `process_inputs`; instead, it is the input to `TEMPLATE_DATA_PROCESSOR`, which computes the parameters that are actually passed to the Template model's `process_inputs`.

-For example, the brightness control model [DiffSynth-Studio/F2KB4B-Template-Brightness](https://modelscope.cn/models/DiffSynth-Studio/F2KB4B-Template-Brightness) takes `scale`, the brightness value of the image, as its input. `scale` can be written directly into `metadata.json`, in which case `TEMPLATE_DATA_PROCESSOR` only needs to pass the parameter through:
+For example, the brightness control model [DiffSynth-Studio/Template-KleinBase4B-Brightness](https://modelscope.cn/models/DiffSynth-Studio/Template-KleinBase4B-Brightness) takes `scale`, the brightness value of the image, as its input. `scale` can be written directly into `metadata.json`, in which case `TEMPLATE_DATA_PROCESSOR` only needs to pass the parameter through:

 ```json
 [
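The pass-through case described above can be sketched in plain Python. The function shape below is an assumption for illustration, not the actual `TEMPLATE_DATA_PROCESSOR` signature:

```python
# Hypothetical pass-through data processor: metadata.json already stores
# the final `scale` value, so the processor forwards it unchanged.
def brightness_data_processor(template_inputs: dict) -> dict:
    # Keep only the field the Template model's process_inputs expects.
    return {"scale": template_inputs["scale"]}

print(brightness_data_processor({"scale": 0.8}))  # {'scale': 0.8}
```

A processor is only more involved when `metadata.json` stores raw data (e.g. an image path) from which the model input must first be computed.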