From ac67acd235ff36b655f753db7680def60c672fa0 Mon Sep 17 00:00:00 2001
From: Artiprocher
Date: Mon, 4 Nov 2024 15:49:41 +0800
Subject: [PATCH] update docs

---
 docs/source/index.rst                    |  8 ++-
 docs/source/introduction/introduction.md | 87 ++++++++++++++++++++++++
 2 files changed, 94 insertions(+), 1 deletion(-)
 create mode 100644 docs/source/introduction/introduction.md

diff --git a/docs/source/index.rst b/docs/source/index.rst
index 3d182dd..6aa64f1 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -6,7 +6,13 @@
 DiffSynth-Studio Documentation
 ==============================
 
-Welcome to DiffSynth-Studio. We aim to build an open-source, interconnected ecosystem for Diffusion models. Here you can experience the magical appeal of AIGC (AI Generated Content) technology!
+Welcome to the magical world of Diffusion. This is DiffSynth-Studio, an open-source Diffusion engine. Through this open-source project, we hope to build a unified, interconnected, and innovative ecosystem for Diffusion models!
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Introduction
+
+   introduction/introduction.md
 
 .. toctree::
    :maxdepth: 1
diff --git a/docs/source/introduction/introduction.md b/docs/source/introduction/introduction.md
new file mode 100644
index 0000000..da006cd
--- /dev/null
+++ b/docs/source/introduction/introduction.md
@@ -0,0 +1,87 @@
+# Welcome to the Magical World of Diffusion
+
+Welcome to the magical world of Diffusion. This is DiffSynth-Studio, an open-source Diffusion engine. Through this open-source project, we hope to build a unified, interconnected, and innovative ecosystem for Diffusion models!
+
+## Unified
+
+Open-source Diffusion models today come in a wide variety of architectures. Text-to-image models alone include Stable Diffusion, Kolors, FLUX, and others.
+
+|FLUX|Stable Diffusion 3|
+|-|-|
+|![image_1024_cfg](https://github.com/user-attachments/assets/6af5b106-0673-4e58-9213-cd9157eef4c0)|![image_1024](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/4df346db-6f91-420a-b4c1-26e205376098)|
+
+|Kolors|Hunyuan-DiT|
+|-|-|
+|![image_1024](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/53ef6f41-da11-4701-8665-9f64392607bf)|![image_1024](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/60b022c8-df3f-4541-95ab-bf39f2fa8bb5)|
+
+|Stable Diffusion|Stable Diffusion XL|
+|-|-|
+|![1024](https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/6fc84611-8da6-4a1f-8fee-9a34eba3b4a5)|![1024](https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/67687748-e738-438c-aee5-96096f09ac90)|
+
+We built a unified framework and implemented general-purpose enhancement modules, such as a high-resolution fix (highres-fix) that applies globally across models.
+
+|FLUX.1-dev (1024*1024)|FLUX.1-dev (2048*2048, highres-fix)|
+|-|-|
+|![image_1024_cfg](https://github.com/user-attachments/assets/984561e9-553d-4952-9443-79ce144f379f)|![image_2048_highres](https://github.com/user-attachments/assets/2e92b2f8-c177-454f-84f6-f6f5d3aaeeff)|
+
+The framework also provides regional prompt control.
+
+
+
+It also includes one-stop training scripts.
+
+||FLUX.1-dev|Kolors|Stable Diffusion 3|Hunyuan-DiT|
+|-|-|-|-|-|
+|Without LoRA|![image_without_lora](https://github.com/user-attachments/assets/df62cef6-d54f-4e3d-a602-5dd290079d49)|![image_without_lora](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/9d79ed7a-e8cf-4d98-800a-f182809db318)|![image_without_lora](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/ddb834a5-6366-412b-93dc-6d957230d66e)|![image_without_lora](https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/1aa21de5-a992-4b66-b14f-caa44e08876e)|
+|With LoRA|![image_with_lora](https://github.com/user-attachments/assets/4fd39890-0291-4d19-8a88-d70d0ae18533)|![image_with_lora](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/02f62323-6ee5-4788-97a1-549732dbe4f0)|![image_with_lora](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/8e7b2888-d874-4da4-a75b-11b6b214b9bf)|![image_with_lora](https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/83a0a41a-691f-4610-8e7b-d8e17c50a282)|
+
+## Interconnected
+
+Unlike language models, Diffusion models come with ecosystem models, including LoRA, ControlNet, and IP-Adapter. These models are developed, trained, and open-sourced by different developers, and we provide one-stop inference support for them. For example, with Stable Diffusion XL as the base model, you can freely combine these ecosystem models to assemble a rich set of features.
+
+|Generated by the base model|Regenerated with ControlNet to preserve the image structure|
+|-|-|
+|![image](https://github.com/user-attachments/assets/cc094e8f-ff6a-4f9e-ba05-7a5c2e0e609f)|![image_controlnet](https://github.com/user-attachments/assets/d50d173e-e81a-4d7e-93e3-b2787d69953e)|
+
+|Adding a LoRA on top for a flatter look|Adding IP-Adapter to convert to ink-wash painting style|
+|-|-|
+|![image_lora](https://github.com/user-attachments/assets/c599b2f8-8351-4be5-a6ae-8380889cb9d8)|![image_ipadapter](https://github.com/user-attachments/assets/e5924aef-03b0-4462-811f-a60e2523fd7f)|
+
+You can even go one step further and add AnimateDiff to build a video stylization pipeline.
+
+
+
+## Innovative
+
+DiffSynth-Studio integrates many open-source models; this is a miracle that belongs to the open-source community. We are committed to driving algorithmic innovation with a strong engineering foundation, and so far we have released several innovative generation techniques.
+
+* ExVideo: an extended training technique for video generation models
+    * Project page: https://ecnu-cilab.github.io/ExVideoProjectPage/
+    * Technical report: https://arxiv.org/abs/2406.14130
+    * Model (ExVideo-CogVideoX)
+        * HuggingFace: https://huggingface.co/ECNU-CILab/ExVideo-CogVideoX-LoRA-129f-v1
+        * ModelScope: https://modelscope.cn/models/ECNU-CILab/ExVideo-CogVideoX-LoRA-129f-v1
+    * Model (ExVideo-SVD)
+        * HuggingFace: https://huggingface.co/ECNU-CILab/ExVideo-SVD-128f-v1
+        * ModelScope: https://modelscope.cn/models/ECNU-CILab/ExVideo-SVD-128f-v1
+* Diffutoon: an anime-style video rendering solution
+    * Project page: https://ecnu-cilab.github.io/DiffutoonProjectPage/
+    * Technical report: https://arxiv.org/abs/2401.16224
+    * Sample code: https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/Diffutoon
+* FastBlend: a video deflickering algorithm
+    * Standalone repository: https://github.com/Artiprocher/sd-webui-fastblend
+    * Video demos
+        * https://www.bilibili.com/video/BV1d94y1W7PE
+        * https://www.bilibili.com/video/BV1Lw411m71p
+        * https://www.bilibili.com/video/BV1RB4y1Z7LF
+    * Technical report: https://arxiv.org/abs/2311.09265
+* DiffSynth: the predecessor of DiffSynth-Studio
+    * Project page: https://ecnu-cilab.github.io/DiffSynth.github.io/
+    * Early code: https://github.com/alibaba/EasyNLP/tree/master/diffusion/DiffSynth
+    * Technical report: https://arxiv.org/abs/2308.03463