update readme

Artiprocher
2025-07-22 20:02:21 +08:00
parent ff95c56884
commit ebeda32215
5 changed files with 462 additions and 42 deletions


@@ -0,0 +1,39 @@
# CogVideoX
### Example: Text-to-Video using CogVideoX-5B (Experimental)
See [cogvideo_text_to_video.py](cogvideo_text_to_video.py).
First, we generate a video with the prompt "an astronaut riding a horse on Mars".
https://github.com/user-attachments/assets/4c91c1cd-e4a0-471a-bd8d-24d761262941
Then, we convert the astronaut into a robot.
https://github.com/user-attachments/assets/225a00a4-2bc8-4740-8e86-a64b460a29ec
Next, we upscale the video using the model itself.
https://github.com/user-attachments/assets/c02cb30c-de60-473c-8242-32c67b3155ad
Finally, we make the video smoother by interpolating frames.
https://github.com/user-attachments/assets/f0e465b4-45df-4435-ab10-7a084ca2b0a0
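For reference, the first two steps might look like the following with DiffSynth-Studio's `ModelManager` and `CogVideoPipeline`. This is a minimal sketch: the model paths and the `denoising_strength` editing parameter are assumptions, so treat [cogvideo_text_to_video.py](cogvideo_text_to_video.py) as the authoritative version.

```python
# A sketch of steps 1-2, assuming DiffSynth-Studio's ModelManager /
# CogVideoPipeline interface; model paths below are placeholders.
import torch
from diffsynth import ModelManager, CogVideoPipeline, save_video

model_manager = ModelManager(torch_dtype=torch.bfloat16, device="cuda")
model_manager.load_models([
    "models/CogVideo/CogVideoX-5b/text_encoder",
    "models/CogVideo/CogVideoX-5b/transformer",
    "models/CogVideo/CogVideoX-5b/vae/diffusion_pytorch_model.safetensors",
])
pipe = CogVideoPipeline.from_model_manager(model_manager)

# Step 1: generate the initial video from a text prompt.
torch.manual_seed(0)
video = pipe(prompt="an astronaut riding a horse on Mars", num_inference_steps=50)
save_video(video, "video_1.mp4", fps=8)

# Step 2: edit the video by re-denoising it under a new prompt;
# denoising_strength < 1 preserves the original motion and layout.
video = pipe(
    prompt="a robot riding a horse on Mars",
    input_video=video, denoising_strength=0.7, num_inference_steps=50,
)
save_video(video, "video_2.mp4", fps=8)
```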
Here is another example.
First, we generate a video with the prompt "a dog is running".
https://github.com/user-attachments/assets/e3696297-99f5-4d0c-a5ca-1d1566db85b4
Then, we add a blue collar to the dog.
https://github.com/user-attachments/assets/7ff22be7-4390-4d33-ae6c-53f6f056e18d
Next, we upscale the video using the model itself.
https://github.com/user-attachments/assets/a909c32c-0b7d-495c-a53c-d23a99a3d3e9
Finally, we make the video smoother by interpolating frames.
https://github.com/user-attachments/assets/ea37c150-97a0-4858-8003-0c2e5eef3331
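The upscaling and interpolation steps reuse the same pipeline. Here is a hedged sketch continuing the code above; the target resolution is a placeholder, and the RIFE import path and class name are assumptions rather than confirmed API, so check the example script for the exact utility.

```python
# Step 3 (upscaling): re-denoise the video at a higher resolution with a
# moderate denoising_strength, so the model sharpens rather than re-imagines it.
video = pipe(
    prompt="a robot riding a horse on Mars",
    input_video=video,
    height=960, width=1440,        # placeholder target resolution
    denoising_strength=0.5, num_inference_steps=50,
)
save_video(video, "video_upscaled.mp4", fps=8)

# Step 4 (frame interpolation): insert intermediate frames to double the
# frame rate. The import path and class name below are hypothetical; see the
# example script for the RIFE-based utility actually shipped with the repo.
from diffsynth.extensions.RIFE import RIFEInterpolater  # hypothetical name
interpolater = RIFEInterpolater.from_model_manager(model_manager)
video = interpolater.interpolate(video)
save_video(video, "video_smooth.mp4", fps=16)
```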


@@ -50,11 +50,10 @@ image.save("image.jpg")
|[FLUX.1-dev-Controlnet-Upscaler](https://www.modelscope.cn/models/jasperai/Flux.1-dev-Controlnet-Upscaler)|`controlnet_inputs`|[code](./model_inference/FLUX.1-dev-Controlnet-Upscaler.py)|[code](./model_inference_low_vram/FLUX.1-dev-Controlnet-Upscaler.py)|[code](./model_training/full/FLUX.1-dev-Controlnet-Upscaler.sh)|[code](./model_training/validate_full/FLUX.1-dev-Controlnet-Upscaler.py)|[code](./model_training/lora/FLUX.1-dev-Controlnet-Upscaler.sh)|[code](./model_training/validate_lora/FLUX.1-dev-Controlnet-Upscaler.py)|
|[FLUX.1-dev-IP-Adapter](https://www.modelscope.cn/models/InstantX/FLUX.1-dev-IP-Adapter)|`ipadapter_images`, `ipadapter_scale`|[code](./model_inference/FLUX.1-dev-IP-Adapter.py)|[code](./model_inference_low_vram/FLUX.1-dev-IP-Adapter.py)|[code](./model_training/full/FLUX.1-dev-IP-Adapter.sh)|[code](./model_training/validate_full/FLUX.1-dev-IP-Adapter.py)|[code](./model_training/lora/FLUX.1-dev-IP-Adapter.sh)|[code](./model_training/validate_lora/FLUX.1-dev-IP-Adapter.py)|
|[FLUX.1-dev-InfiniteYou](https://www.modelscope.cn/models/ByteDance/InfiniteYou)|`infinityou_id_image`, `infinityou_guidance`, `controlnet_inputs`|[code](./model_inference/FLUX.1-dev-InfiniteYou.py)|[code](./model_inference_low_vram/FLUX.1-dev-InfiniteYou.py)|[code](./model_training/full/FLUX.1-dev-InfiniteYou.sh)|[code](./model_training/validate_full/FLUX.1-dev-InfiniteYou.py)|[code](./model_training/lora/FLUX.1-dev-InfiniteYou.sh)|[code](./model_training/validate_lora/FLUX.1-dev-InfiniteYou.py)|
|[FLUX.1-dev-EliGen](https://www.modelscope.cn/models/DiffSynth-Studio/Eligen)|`eligen_entity_prompts`, `eligen_entity_masks`, `eligen_enable_on_negative`, `eligen_enable_inpaint`|[code](./model_inference/FLUX.1-dev-EliGen.py)|[code](./model_inference_low_vram/FLUX.1-dev-EliGen.py)|||||
|[FLUX.1-dev-EliGen](https://www.modelscope.cn/models/DiffSynth-Studio/Eligen)|`eligen_entity_prompts`, `eligen_entity_masks`, `eligen_enable_on_negative`, `eligen_enable_inpaint`|[code](./model_inference/FLUX.1-dev-EliGen.py)|[code](./model_inference_low_vram/FLUX.1-dev-EliGen.py)|-|-|||
|[FLUX.1-dev-LoRA-Encoder](https://www.modelscope.cn/models/DiffSynth-Studio/LoRA-Encoder-FLUX.1-Dev)|`lora_encoder_inputs`, `lora_encoder_scale`|[code](./model_inference/FLUX.1-dev-LoRA-Encoder.py)|[code](./model_inference_low_vram/FLUX.1-dev-LoRA-Encoder.py)|[code](./model_training/full/FLUX.1-dev-LoRA-Encoder.sh)|[code](./model_training/validate_full/FLUX.1-dev-LoRA-Encoder.py)|-|-|
|[Step1X-Edit](https://www.modelscope.cn/models/stepfun-ai/Step1X-Edit)|`step1x_reference_image`|[code](./model_inference/Step1X-Edit.py)|[code](./model_inference_low_vram/Step1X-Edit.py)|[code](./model_training/full/Step1X-Edit.sh)|[code](./model_training/validate_full/Step1X-Edit.py)|[code](./model_training/lora/Step1X-Edit.sh)|[code](./model_training/validate_lora/Step1X-Edit.py)|
|[FLEX.2-preview](https://www.modelscope.cn/models/ostris/Flex.2-preview)|`flex_inpaint_image`, `flex_inpaint_mask`, `flex_control_image`, `flex_control_strength`, `flex_control_stop`|[code](./model_inference/FLEX.2-preview.py)|[code](./model_inference_low_vram/FLEX.2-preview.py)|[code](./model_training/full/FLEX.2-preview.sh)|[code](./model_training/validate_full/FLEX.2-preview.py)|[code](./model_training/lora/FLEX.2-preview.sh)|[code](./model_training/validate_lora/FLEX.2-preview.py)|
|[Nexus-Gen](https://www.modelscope.cn/models/DiffSynth-Studio/Nexus-GenV2)||||||||
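The extra parameters in the second column are keyword arguments passed to the pipeline call. Below is a minimal sketch using the IP-Adapter row, where `pipe` stands for a FLUX image pipeline constructed as in the linked inference scripts; pipeline construction is elided, the argument names come from the table, and everything else is a placeholder.

```python
from PIL import Image

# `pipe` is a FLUX image pipeline built as in the inference examples linked
# above; only the call with the table's extra parameters is shown here.
style_image = Image.open("style_reference.jpg")  # placeholder reference image
image = pipe(
    prompt="a cat sitting on a sofa",
    ipadapter_images=[style_image],  # extra parameters from the table row
    ipadapter_scale=0.7,
)
image.save("image.jpg")
```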
## Model Inference


@@ -1,45 +1,5 @@
# Text to Video
In DiffSynth Studio, we can use several video models to generate videos from text prompts.
### Example: Text-to-Video using CogVideoX-5B (Experimental)
See [cogvideo_text_to_video.py](cogvideo_text_to_video.py).
First, we generate a video with the prompt "an astronaut riding a horse on Mars".
https://github.com/user-attachments/assets/4c91c1cd-e4a0-471a-bd8d-24d761262941
Then, we convert the astronaut into a robot.
https://github.com/user-attachments/assets/225a00a4-2bc8-4740-8e86-a64b460a29ec
Next, we upscale the video using the model itself.
https://github.com/user-attachments/assets/c02cb30c-de60-473c-8242-32c67b3155ad
Finally, we make the video smoother by interpolating frames.
https://github.com/user-attachments/assets/f0e465b4-45df-4435-ab10-7a084ca2b0a0
Here is another example.
First, we generate a video with the prompt "a dog is running".
https://github.com/user-attachments/assets/e3696297-99f5-4d0c-a5ca-1d1566db85b4
Then, we add a blue collar to the dog.
https://github.com/user-attachments/assets/7ff22be7-4390-4d33-ae6c-53f6f056e18d
Next, we upscale the video using the model itself.
https://github.com/user-attachments/assets/a909c32c-0b7d-495c-a53c-d23a99a3d3e9
Finally, we make the video smoother by interpolating frames.
https://github.com/user-attachments/assets/ea37c150-97a0-4858-8003-0c2e5eef3331
### Example: Text-to-Video using AnimateDiff
Generate a video using a Stable Diffusion model and an AnimateDiff model. This pipeline can go beyond the usual limit on the number of frames! See [sd_text_to_video.py](./sd_text_to_video.py), and the sketch below.
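A minimal sketch of this pipeline, assuming DiffSynth-Studio's `ModelManager`/`SDVideoPipeline` interface; the model paths, frame count, and sampling parameters are placeholders, so treat [sd_text_to_video.py](./sd_text_to_video.py) as the working script.

```python
# A sketch of AnimateDiff text-to-video; model paths are placeholders.
import torch
from diffsynth import ModelManager, SDVideoPipeline, save_video

model_manager = ModelManager(torch_dtype=torch.float16, device="cuda")
model_manager.load_models([
    "models/stable_diffusion/v1-5-pruned-emaonly.safetensors",  # SD 1.5 checkpoint
    "models/AnimateDiff/mm_sd_v15_v2.ckpt",                     # motion module
])
pipe = SDVideoPipeline.from_model_manager(model_manager)

torch.manual_seed(0)
video = pipe(
    prompt="a sunset over the sea, best quality",
    negative_prompt="worst quality, low quality",
    num_frames=64,   # beyond AnimateDiff's usual 16-frame training window
    height=512, width=512, num_inference_steps=25, cfg_scale=7.5,
)
save_video(video, "video.mp4", fps=8)
```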