theluyuan/DiffSynth-Studio

mirror of https://github.com/modelscope/DiffSynth-Studio.git synced 2026-03-19 23:08:13 +00:00

Qianyi Zhao 9166a6742c Update introduction.md

2024-11-05 05:31:54 -06:00

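Welcome to the Magical World of Diffusion

Welcome to the magical world of Diffusion. This is DiffSynth-Studio, an open-source Diffusion engine. Through this open-source project, we hope to build a unified, interconnected, and innovative ecosystem of Diffusion models.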

Unified

Open-source Diffusion models currently come in a wide variety of architectures. Text-to-image models alone include Stable Diffusion, Kolors, FLUX, and others.

FLUX	Stable Diffusion 3	Kolors	Hunyuan-DiT	Stable Diffusion	Stable Diffusion XL

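We designed a unified framework and implemented general-purpose enhancement modules on top of it, such as regional prompt control.

We also provide one-stop training scripts.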

	FLUX.1-dev	Kolors	Stable Diffusion 3	Hunyuan-DiT
Without LoRA
With LoRA

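Interconnected

Unlike language models, Diffusion models are surrounded by ecosystem models, including LoRA, ControlNet, and IP-Adapter. These models are developed, trained, and open-sourced by different developers, and we provide one-stop inference support for them. For example, starting from Stable Diffusion XL, you can freely combine these ecosystem models to assemble a rich set of capabilities.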

Base model generation	Regenerate with ControlNet to preserve image structure	Stack a LoRA to make the image flatter	Stack IP-Adapter to convert to ink-wash style

You can even go one step further and add AnimateDiff to build a video restyling pipeline.
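To illustrate why ecosystem models such as LoRA can be stacked on a single base model, here is a minimal sketch of the standard LoRA weight update, W' = W + scale * (up @ down), in plain Python. It is not DiffSynth-Studio's implementation: the function names `matmul` and `apply_lora`, the toy 2x2 matrices, and the scale values are all hypothetical and chosen only for illustration.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product of a (m x k) and b (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight: Matrix, lora_up: Matrix, lora_down: Matrix, scale: float = 1.0) -> Matrix:
    """Return weight + scale * (lora_up @ lora_down) without mutating the input."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Hypothetical 2x2 base weight (a real model has one such matrix per layer).
base = [[1.0, 0.0], [0.0, 1.0]]

# Two hypothetical rank-1 LoRAs: each "up" is 2x1 and each "down" is 1x2.
style_up, style_down = [[1.0], [0.0]], [[0.0, 2.0]]
flat_up, flat_down = [[0.0], [1.0]], [[4.0, 0.0]]

# Stack both LoRAs on the same base weight, each with its own scale.
merged = apply_lora(base, style_up, style_down, scale=0.5)
merged = apply_lora(merged, flat_up, flat_down, scale=0.25)

print(merged)  # [[1.0, 1.0], [1.0, 1.0]]
```

Because each LoRA contributes only an additive low-rank delta, several of them can be applied in sequence with independent scales while the original base weight stays untouched. Real pipelines perform the same update per layer on tensors rather than on nested lists.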

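Innovative

DiffSynth-Studio integrates many open-source models, a miracle that belongs to the open-source community. We are committed to driving algorithmic innovation with a strong engineering foundation, and we have so far released several innovative generation techniques.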

ExVideo: an extended training technique for video generation models
- Project page: https://ecnu-cilab.github.io/ExVideoProjectPage/
- Technical report: https://arxiv.org/abs/2406.14130
- Model (ExVideo-CogVideoX)
  - HuggingFace: https://huggingface.co/ECNU-CILab/ExVideo-CogVideoX-LoRA-129f-v1
  - ModelScope: https://modelscope.cn/models/ECNU-CILab/ExVideo-CogVideoX-LoRA-129f-v1
- Model (ExVideo-SVD)
  - HuggingFace: https://huggingface.co/ECNU-CILab/ExVideo-SVD-128f-v1
  - ModelScope: https://modelscope.cn/models/ECNU-CILab/ExVideo-SVD-128f-v1
Diffutoon: an anime-style video rendering solution
- Project page: https://ecnu-cilab.github.io/DiffutoonProjectPage/
- Technical report: https://arxiv.org/abs/2401.16224
- Sample code: https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/Diffutoon
FastBlend: a video deflickering algorithm
- Standalone repository: https://github.com/Artiprocher/sd-webui-fastblend
- Video demos
- Technical report: https://arxiv.org/abs/2311.09265
DiffSynth: the predecessor of DiffSynth-Studio
- Project page: https://ecnu-cilab.github.io/DiffSynth.github.io/
- Early code: https://github.com/alibaba/EasyNLP/tree/master/diffusion/DiffSynth
- Technical report: https://arxiv.org/abs/2308.03463