StepVideo
StepVideo is a state-of-the-art (SoTA) pre-trained text-to-video model with 30 billion parameters. It can generate videos of up to 204 frames.
- Model: https://modelscope.cn/models/stepfun-ai/stepvideo-t2v/summary
- GitHub: https://github.com/stepfun-ai/Step-Video-T2V
- Technical report: https://arxiv.org/abs/2502.10248
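To put the 30-billion-parameter figure in perspective, the sketch below estimates how much memory the weights alone would occupy at a few common storage precisions. The precisions listed and the helper name `weight_memory_gib` are illustrative assumptions, not taken from the StepVideo documentation. The estimate ignores activations and any additional model components.

```python
def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Return the memory needed to store the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# "30 billion parameters", as stated in the description above.
NUM_PARAMS = 30_000_000_000

# Bytes per parameter for common storage precisions.
# Assumption: which precision is actually used depends on how the model is loaded.
PRECISIONS = {"float32": 4, "bfloat16": 2, "int8": 1}

estimates = {
    name: weight_memory_gib(NUM_PARAMS, size) for name, size in PRECISIONS.items()
}

for name, gib in estimates.items():
    print(f"{name}: {gib:.1f} GiB for weights only")
```

Real memory use during generation will be higher than these figures, so treat them as a lower bound when choosing hardware.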
Examples
See ./stepvideo_text_to_video.py.
https://github.com/user-attachments/assets/5954fdaa-a3cf-45a3-bd35-886e3cc4581b