Merge pull request #292 from modelscope/dev

hunyuanvideo examples
2026-04-08 08:58:20 +00:00 · 2024-12-19 13:29:51 +08:00
parent 03d3a26f6f 00a610e5ad
commit aa23356420
2 changed files with 5 additions and 1 deletion
--- a/README.md
+++ b/README.md
@@ -17,6 +17,7 @@ DiffSynth Studio is a Diffusion engine. We have restructured architectures inclu

 Until now, DiffSynth Studio has supported the following models:

+* [HunyuanVideo](https://github.com/Tencent/HunyuanVideo)
 * [CogVideoX](https://huggingface.co/THUDM/CogVideoX-5b)
 * [FLUX](https://huggingface.co/black-forest-labs/FLUX.1-dev)
 * [ExVideo](https://huggingface.co/ECNU-CILab/ExVideo-SVD-128f-v1)
@@ -34,6 +35,9 @@ Until now, DiffSynth Studio has supported the following models:

 ## News

+
+- **December 19, 2024** We implement advanced VRAM management for HunyuanVideo, making it possible to generate videos at a resolution of 129x720x1280 using 24GB of VRAM, or at 129x512x384 resolution with just 6GB of VRAM. Please refer to [./examples/HunyuanVideo/](./examples/HunyuanVideo/) for more details.
+
 - **December 18, 2024** We propose ArtAug, an approach designed to improve text-to-image synthesis models through synthesis-understanding interactions. We have trained an ArtAug enhancement module for FLUX.1-dev in the format of LoRA. This model integrates the aesthetic understanding of Qwen2-VL-72B into FLUX.1-dev, leading to an improvement in the quality of generated images.
  - Paper: https://arxiv.org/abs/2412.12888
  - Examples: https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/ArtAug
--- a/examples/HunyuanVideo/README.md
+++ b/examples/HunyuanVideo/README.md
@@ -1,6 +1,6 @@
 # HunyuanVideo

-[HunyuanVideo](https://www.modelscope.cn/models/AI-ModelScope/HunyuanVideo) is a video generation model trained by Tencent. We provide advanced VRAM management for this model, including three stages:
+[HunyuanVideo](https://github.com/Tencent/HunyuanVideo) is a video generation model trained by Tencent. We provide advanced VRAM management for this model, including three stages:

 |VRAM required|Example script|Frames|Resolution|Note|
 |-|-|-|-|-|