diff --git a/examples/wanvideo/README.md b/examples/wanvideo/README.md
index cece7d3..261bb33 100644
--- a/examples/wanvideo/README.md
+++ b/examples/wanvideo/README.md
@@ -4,6 +4,10 @@
 Wan 2.1 is a collection of video synthesis models open-sourced by Alibaba.
+**DiffSynth-Studio has adopted a new inference and training framework. To use the previous version, please click [here](https://github.com/modelscope/DiffSynth-Studio/tree/3edf3583b1f08944cee837b94d9f84d669c2729c).**
+
+## Installation
+
 Before using this model, please install DiffSynth-Studio from **source code**.
 ```shell
@@ -12,6 +16,8 @@ cd DiffSynth-Studio
 pip install -e .
 ```
+## Overview
+
 | Model ID | Extra Parameters | Inference | Full Training | Full Training Validation | LoRA Training | LoRA Training Validation |
 |-|-|-|-|-|-|-|
 |[Wan-AI/Wan2.1-T2V-1.3B](https://modelscope.cn/models/Wan-AI/Wan2.1-T2V-1.3B)||[code](./model_inference/Wan2.1-T2V-1.3B.py)|[code](./model_training/full/Wan2.1-T2V-1.3B.sh)|[code](./model_training/validate_full/Wan2.1-T2V-1.3B.py)|[code](./model_training/lora/Wan2.1-T2V-1.3B.sh)|[code](./model_training/validate_lora/Wan2.1-T2V-1.3B.py)|
@@ -387,3 +393,25 @@ Note that full fine-tuning of the 14B model requires 8 GPUs, each with at least
 The default video resolution in the training script is `480*832*81`. Increasing the resolution may cause out-of-memory errors. To reduce VRAM usage, add the parameter `--use_gradient_checkpointing_offload`.
+
+## Showcase
+
+1.3B text-to-video:
+
+https://github.com/user-attachments/assets/124397be-cd6a-4f29-a87c-e4c695aaabb8
+
+Put sunglasses on the dog (1.3B video-to-video):
+
+https://github.com/user-attachments/assets/272808d7-fbeb-4747-a6df-14a0860c75fb
+
+14B text-to-video:
+
+https://github.com/user-attachments/assets/3908bc64-d451-485a-8b61-28f6d32dd92f
+
+14B image-to-video:
+
+https://github.com/user-attachments/assets/c0bdd5ca-292f-45ed-b9bc-afe193156e75
+
+LoRA training:
+
+https://github.com/user-attachments/assets/9bd8e30b-97e8-44f9-bb6f-da004ba376a9
diff --git a/examples/wanvideo/README_zh.md b/examples/wanvideo/README_zh.md
index fc91990..e9cbd4a 100644
--- a/examples/wanvideo/README_zh.md
+++ b/examples/wanvideo/README_zh.md
@@ -4,6 +4,10 @@
 Wan 2.1 是由阿里巴巴通义实验室开源的一系列视频生成模型。
+**DiffSynth-Studio 启用了新的推理和训练框架,如需使用旧版本,请点击[这里](https://github.com/modelscope/DiffSynth-Studio/tree/3edf3583b1f08944cee837b94d9f84d669c2729c)。**
+
+## 安装
+
 在使用本系列模型之前,请通过源码安装 DiffSynth-Studio。
 ```shell
@@ -12,6 +16,8 @@ cd DiffSynth-Studio
 pip install -e .
 ```
+## 模型总览
+
 |模型 ID|额外参数|推理|全量训练|全量训练后验证|LoRA 训练|LoRA 训练后验证|
 |-|-|-|-|-|-|-|
 |[Wan-AI/Wan2.1-T2V-1.3B](https://modelscope.cn/models/Wan-AI/Wan2.1-T2V-1.3B)||[code](./model_inference/Wan2.1-T2V-1.3B.py)|[code](./model_training/full/Wan2.1-T2V-1.3B.sh)|[code](./model_training/validate_full/Wan2.1-T2V-1.3B.py)|[code](./model_training/lora/Wan2.1-T2V-1.3B.sh)|[code](./model_training/validate_lora/Wan2.1-T2V-1.3B.py)|
@@ -390,3 +396,25 @@ model_configs=[
 训练脚本的默认视频尺寸为 `480*832*81`,提升分辨率将可能导致显存不足,请添加参数 `--use_gradient_checkpointing_offload` 降低显存占用。
+
+## 案例展示
+
+1.3B 文生视频:
+
+https://github.com/user-attachments/assets/124397be-cd6a-4f29-a87c-e4c695aaabb8
+
+给狗狗戴上墨镜(1.3B 视频生视频):
+
+https://github.com/user-attachments/assets/272808d7-fbeb-4747-a6df-14a0860c75fb
+
+14B 文生视频:
+
+https://github.com/user-attachments/assets/3908bc64-d451-485a-8b61-28f6d32dd92f
+
+14B 图生视频:
+
+https://github.com/user-attachments/assets/c0bdd5ca-292f-45ed-b9bc-afe193156e75
+
+LoRA 训练:
+
+https://github.com/user-attachments/assets/9bd8e30b-97e8-44f9-bb6f-da004ba376a9