Add readthedocs for diffsynth-studio

* add conf docs * add conf docs * add index * add index * update ref * test root * add en * test relative * redirect relative * add document * test_document * test_document
2026-03-22 08:40:47 +00:00 · 2026-02-10 19:51:04 +08:00
parent f6d85f3c2e
commit b3b63fef3e
68 changed files with 777 additions and 267 deletions
--- a/docs/en/Model_Details/FLUX.md
+++ b/docs/en/Model_Details/FLUX.md
@@ -14,7 +14,7 @@ cd DiffSynth-Studio
 pip install -e .
 ```

-For more information about installation, please refer to [Install Dependencies](/docs/en/Pipeline_Usage/Setup.md).
+For more information about installation, please refer to [Install Dependencies](../Pipeline_Usage/Setup.md).

 ## Quick Start

@@ -98,14 +98,14 @@ graph LR;

 Special Training Scripts:

-* Differential LoRA Training: [doc](/docs/en/Training/Differential_LoRA.md), [code](/examples/flux/model_training/special/differential_training/)
-* FP8 Precision Training: [doc](/docs/en/Training/FP8_Precision.md), [code](/examples/flux/model_training/special/fp8_training/)
-* Two-stage Split Training: [doc](/docs/en/Training/Split_Training.md), [code](/examples/flux/model_training/special/split_training/)
-* End-to-end Direct Distillation: [doc](/docs/en/Training/Direct_Distill.md), [code](/examples/flux/model_training/lora/FLUX.1-dev-Distill-LoRA.sh)
+* Differential LoRA Training: [doc](../Training/Differential_LoRA.md), [code](/examples/flux/model_training/special/differential_training/)
+* FP8 Precision Training: [doc](../Training/FP8_Precision.md), [code](/examples/flux/model_training/special/fp8_training/)
+* Two-stage Split Training: [doc](../Training/Split_Training.md), [code](/examples/flux/model_training/special/split_training/)
+* End-to-end Direct Distillation: [doc](../Training/Direct_Distill.md), [code](/examples/flux/model_training/lora/FLUX.1-dev-Distill-LoRA.sh)

 ## Model Inference

-Models are loaded via `FluxImagePipeline.from_pretrained`, see [Loading Models](/docs/en/Pipeline_Usage/Model_Inference.md#loading-models).
+Models are loaded via `FluxImagePipeline.from_pretrained`, see [Loading Models](../Pipeline_Usage/Model_Inference.md#loading-models).

 Input parameters for `FluxImagePipeline` inference include:

@@ -143,7 +143,7 @@ Input parameters for `FluxImagePipeline` inference include:
 * `flex_control_stop`: Flex model control stop timestep.
 * `nexus_gen_reference_image`: Nexus-Gen model reference image.

-If VRAM is insufficient, please enable [VRAM Management](/docs/en/Pipeline_Usage/VRAM_management.md). We provide recommended low VRAM configurations for each model in the example code, see the table in the "Model Overview" section above.
+If VRAM is insufficient, please enable [VRAM Management](../Pipeline_Usage/VRAM_management.md). We provide recommended low VRAM configurations for each model in the example code, see the table in the "Model Overview" section above.

 ## Model Training

@@ -198,4 +198,4 @@ We have built a sample image dataset for your testing. You can download this dat
 modelscope download --dataset DiffSynth-Studio/example_image_dataset --local_dir ./data/example_image_dataset
 ```

-We have written recommended training scripts for each model, please refer to the table in the "Model Overview" section above. For how to write model training scripts, please refer to [Model Training](/docs/en/Pipeline_Usage/Model_Training.md); for more advanced training algorithms, please refer to [Training Framework Detailed Explanation](/docs/Training/).
+We have written recommended training scripts for each model, please refer to the table in the "Model Overview" section above. For how to write model training scripts, please refer to [Model Training](../Pipeline_Usage/Model_Training.md); for more advanced training algorithms, please refer to [Training Framework Detailed Explanation](/docs/Training/).
--- a/docs/en/Model_Details/FLUX2.md
+++ b/docs/en/Model_Details/FLUX2.md
@@ -21,7 +21,7 @@ cd DiffSynth-Studio
 pip install -e .
 ```

-For more information about installation, please refer to [Install Dependencies](/docs/en/Pipeline_Usage/Setup.md).
+For more information about installation, please refer to [Install Dependencies](../Pipeline_Usage/Setup.md).

 ## Quick Start

@@ -69,14 +69,14 @@ image.save("image.jpg")

 Special Training Scripts:

-* Differential LoRA Training: [doc](/docs/en/Training/Differential_LoRA.md)
-* FP8 Precision Training: [doc](/docs/en/Training/FP8_Precision.md)
-* Two-stage Split Training: [doc](/docs/en/Training/Split_Training.md)
-* End-to-end Direct Distillation: [doc](/docs/en/Training/Direct_Distill.md)
+* Differential LoRA Training: [doc](../Training/Differential_LoRA.md)
+* FP8 Precision Training: [doc](../Training/FP8_Precision.md)
+* Two-stage Split Training: [doc](../Training/Split_Training.md)
+* End-to-end Direct Distillation: [doc](../Training/Direct_Distill.md)

 ## Model Inference

-Models are loaded via `Flux2ImagePipeline.from_pretrained`, see [Loading Models](/docs/en/Pipeline_Usage/Model_Inference.md#loading-models).
+Models are loaded via `Flux2ImagePipeline.from_pretrained`, see [Loading Models](../Pipeline_Usage/Model_Inference.md#loading-models).

 Input parameters for `Flux2ImagePipeline` inference include:

@@ -95,7 +95,7 @@ Input parameters for `Flux2ImagePipeline` inference include:
 * `tile_stride`: Tile stride during VAE encoding/decoding stages, default is 64, only effective when `tiled=True`, must be less than or equal to `tile_size`.
 * `progress_bar_cmd`: Progress bar, default is `tqdm.tqdm`. Can be disabled by setting to `lambda x:x`.

-If VRAM is insufficient, please enable [VRAM Management](/docs/en/Pipeline_Usage/VRAM_management.md). We provide recommended low VRAM configurations for each model in the example code, see the table in the "Model Overview" section above.
+If VRAM is insufficient, please enable [VRAM Management](../Pipeline_Usage/VRAM_management.md). We provide recommended low VRAM configurations for each model in the example code, see the table in the "Model Overview" section above.

 ## Model Training

@@ -148,4 +148,4 @@ We have built a sample image dataset for your testing. You can download this dat
 modelscope download --dataset DiffSynth-Studio/example_image_dataset --local_dir ./data/example_image_dataset
 ```

-We have written recommended training scripts for each model, please refer to the table in the "Model Overview" section above. For how to write model training scripts, please refer to [Model Training](/docs/en/Pipeline_Usage/Model_Training.md); for more advanced training algorithms, please refer to [Training Framework Detailed Explanation](/docs/Training/).
+We have written recommended training scripts for each model, please refer to the table in the "Model Overview" section above. For how to write model training scripts, please refer to [Model Training](../Pipeline_Usage/Model_Training.md); for more advanced training algorithms, please refer to [Training Framework Detailed Explanation](/docs/Training/).
--- a/docs/en/Model_Details/LTX-2.md
+++ b/docs/en/Model_Details/LTX-2.md
@@ -12,7 +12,7 @@ cd DiffSynth-Studio
 pip install -e .
 ```

-For more information about installation, please refer to [Installation Dependencies](/docs/en/Pipeline_Usage/Setup.md).
+For more information about installation, please refer to [Installation Dependencies](../Pipeline_Usage/Setup.md).

 ## Quick Start

@@ -83,7 +83,7 @@ write_video_audio_ltx2(

 ## Model Inference

-Models are loaded through `LTX2AudioVideoPipeline.from_pretrained`, see [Loading Models](/docs/en/Pipeline_Usage/Model_Inference.md#loading-models) for details.
+Models are loaded through `LTX2AudioVideoPipeline.from_pretrained`, see [Loading Models](../Pipeline_Usage/Model_Inference.md#loading-models) for details.

 Input parameters for `LTX2AudioVideoPipeline` inference include:

@@ -109,7 +109,7 @@ Input parameters for `LTX2AudioVideoPipeline` inference include:
 * `use_distilled_pipeline`: Whether to use distilled pipeline, default is `False`.
 * `progress_bar_cmd`: Progress bar, default is `tqdm.tqdm`. Can be set to `lambda x:x` to hide the progress bar.

-If VRAM is insufficient, please enable [VRAM Management](/docs/en/Pipeline_Usage/VRAM_management.md). We provide recommended low VRAM configurations for each model in the example code, see the table in the previous "Supported Inference Scripts" section.
+If VRAM is insufficient, please enable [VRAM Management](../Pipeline_Usage/VRAM_management.md). We provide recommended low VRAM configurations for each model in the example code, see the table in the previous "Supported Inference Scripts" section.

 ## Model Training

--- a/docs/en/Model_Details/Overview.md
+++ b/docs/en/Model_Details/Overview.md
@@ -2,7 +2,7 @@

 ## Qwen-Image

-Documentation: [./Qwen-Image.md](/docs/en/Model_Details/Qwen-Image.md)
+Documentation: [./Qwen-Image.md](../Model_Details/Qwen-Image.md)

 <details>

@@ -85,7 +85,7 @@ graph LR;

 ## FLUX Series

-Documentation: [./FLUX.md](/docs/en/Model_Details/FLUX.md)
+Documentation: [./FLUX.md](../Model_Details/FLUX.md)

 <details>

@@ -166,7 +166,7 @@ graph LR;

 ## Wan Series

-Documentation: [./Wan.md](/docs/en/Model_Details/Wan.md)
+Documentation: [./Wan.md](../Model_Details/Wan.md)

 <details>

@@ -286,6 +286,6 @@ graph LR;
 | [PAI/Wan2.2-Fun-A14B-Control](https://modelscope.cn/models/PAI/Wan2.2-Fun-A14B-Control) | `control_video`, `reference_image` | [code](/examples/wanvideo/model_inference/Wan2.2-Fun-A14B-Control.py) | [code](/examples/wanvideo/model_training/full/Wan2.2-Fun-A14B-Control.sh) | [code](/examples/wanvideo/model_training/validate_full/Wan2.2-Fun-A14B-Control.py) | [code](/examples/wanvideo/model_training/lora/Wan2.2-Fun-A14B-Control.sh) | [code](/examples/wanvideo/model_training/validate_lora/Wan2.2-Fun-A14B-Control.py) |
 | [PAI/Wan2.2-Fun-A14B-Control-Camera](https://modelscope.cn/models/PAI/Wan2.2-Fun-A14B-Control-Camera) | `control_camera_video`, `input_image` | [code](/examples/wanvideo/model_inference/Wan2.2-Fun-A14B-Control-Camera.py) | [code](/examples/wanvideo/model_training/full/Wan2.2-Fun-A14B-Control-Camera.sh) | [code](/examples/wanvideo/model_training/validate_full/Wan2.2-Fun-A14B-Control-Camera.py) | [code](/examples/wanvideo/model_training/lora/Wan2.2-Fun-A14B-Control-Camera.sh) | [code](/examples/wanvideo/model_training/validate_lora/Wan2.2-Fun-A14B-Control-Camera.py) |

-* FP8 Precision Training: [doc](/docs/en/Training/FP8_Precision.md), [code](/examples/wanvideo/model_training/special/fp8_training/)
-* Two-stage Split Training: [doc](/docs/en/Training/Split_Training.md), [code](/examples/wanvideo/model_training/special/split_training/)
-* End-to-end Direct Distillation: [doc](/docs/en/Training/Direct_Distill.md), [code](/examples/wanvideo/model_training/special/direct_distill/)
+* FP8 Precision Training: [doc](../Training/FP8_Precision.md), [code](/examples/wanvideo/model_training/special/fp8_training/)
+* Two-stage Split Training: [doc](../Training/Split_Training.md), [code](/examples/wanvideo/model_training/special/split_training/)
+* End-to-end Direct Distillation: [doc](../Training/Direct_Distill.md), [code](/examples/wanvideo/model_training/special/direct_distill/)
--- a/docs/en/Model_Details/Qwen-Image.md
+++ b/docs/en/Model_Details/Qwen-Image.md
@@ -14,7 +14,7 @@ cd DiffSynth-Studio
 pip install -e .
 ```

-For more information about installation, please refer to [Install Dependencies](/docs/en/Pipeline_Usage/Setup.md).
+For more information about installation, please refer to [Install Dependencies](../Pipeline_Usage/Setup.md).

 ## Quick Start

@@ -102,10 +102,10 @@ graph LR;

 Special Training Scripts:

-* Differential LoRA Training: [doc](/docs/en/Training/Differential_LoRA.md), [code](/examples/qwen_image/model_training/special/differential_training/)
-* FP8 Precision Training: [doc](/docs/en/Training/FP8_Precision.md), [code](/examples/qwen_image/model_training/special/fp8_training/)
-* Two-stage Split Training: [doc](/docs/en/Training/Split_Training.md), [code](/examples/qwen_image/model_training/special/split_training/)
-* End-to-end Direct Distillation: [doc](/docs/en/Training/Direct_Distill.md), [code](/examples/qwen_image/model_training/lora/Qwen-Image-Distill-LoRA.sh)
+* Differential LoRA Training: [doc](../Training/Differential_LoRA.md), [code](/examples/qwen_image/model_training/special/differential_training/)
+* FP8 Precision Training: [doc](../Training/FP8_Precision.md), [code](/examples/qwen_image/model_training/special/fp8_training/)
+* Two-stage Split Training: [doc](../Training/Split_Training.md), [code](/examples/qwen_image/model_training/special/split_training/)
+* End-to-end Direct Distillation: [doc](../Training/Direct_Distill.md), [code](/examples/qwen_image/model_training/lora/Qwen-Image-Distill-LoRA.sh)

 DeepSpeed ZeRO Stage 3 Training: The Qwen-Image series models support DeepSpeed ZeRO Stage 3 training, which partitions the model across multiple GPUs. Taking full parameter training of the Qwen-Image model as an example, the following modifications are required:

@@ -114,7 +114,7 @@ DeepSpeed ZeRO Stage 3 Training: The Qwen-Image series models support DeepSpeed

 ## Model Inference

-Models are loaded via `QwenImagePipeline.from_pretrained`, see [Loading Models](/docs/en/Pipeline_Usage/Model_Inference.md#loading-models).
+Models are loaded via `QwenImagePipeline.from_pretrained`, see [Loading Models](../Pipeline_Usage/Model_Inference.md#loading-models).

 Input parameters for `QwenImagePipeline` inference include:

@@ -145,7 +145,7 @@ Input parameters for `QwenImagePipeline` inference include:
 * `tile_stride`: Tile stride during VAE encoding/decoding stages, default is 64, only effective when `tiled=True`, must be less than or equal to `tile_size`.
 * `progress_bar_cmd`: Progress bar, default is `tqdm.tqdm`. Can be disabled by setting to `lambda x:x`.

-If VRAM is insufficient, please enable [VRAM Management](/docs/en/Pipeline_Usage/VRAM_management.md). We provide recommended low VRAM configurations for each model in the example code, see the table in the "Model Overview" section above.
+If VRAM is insufficient, please enable [VRAM Management](../Pipeline_Usage/VRAM_management.md). We provide recommended low VRAM configurations for each model in the example code, see the table in the "Model Overview" section above.

 ## Model Training

@@ -199,4 +199,4 @@ We have built a sample image dataset for your testing. You can download this dat
 modelscope download --dataset DiffSynth-Studio/example_image_dataset --local_dir ./data/example_image_dataset
 ```

-We have written recommended training scripts for each model, please refer to the table in the "Model Overview" section above. For how to write model training scripts, please refer to [Model Training](/docs/en/Pipeline_Usage/Model_Training.md); for more advanced training algorithms, please refer to [Training Framework Detailed Explanation](/docs/Training/).
+We have written recommended training scripts for each model, please refer to the table in the "Model Overview" section above. For how to write model training scripts, please refer to [Model Training](../Pipeline_Usage/Model_Training.md); for more advanced training algorithms, please refer to [Training Framework Detailed Explanation](/docs/Training/).
--- a/docs/en/Model_Details/Wan.md
+++ b/docs/en/Model_Details/Wan.md
@@ -14,7 +14,7 @@ cd DiffSynth-Studio
 pip install -e .
 ```

-For more information about installation, please refer to [Install Dependencies](/docs/en/Pipeline_Usage/Setup.md).
+For more information about installation, please refer to [Install Dependencies](../Pipeline_Usage/Setup.md).

 ## Quick Start

@@ -138,9 +138,9 @@ graph LR;
 | [PAI/Wan2.2-Fun-A14B-Control](https://modelscope.cn/models/PAI/Wan2.2-Fun-A14B-Control) | `control_video`, `reference_image` | [code](/examples/wanvideo/model_inference/Wan2.2-Fun-A14B-Control.py) | [code](/examples/wanvideo/model_training/full/Wan2.2-Fun-A14B-Control.sh) | [code](/examples/wanvideo/model_training/validate_full/Wan2.2-Fun-A14B-Control.py) | [code](/examples/wanvideo/model_training/lora/Wan2.2-Fun-A14B-Control.sh) | [code](/examples/wanvideo/model_training/validate_lora/Wan2.2-Fun-A14B-Control.py) |
 | [PAI/Wan2.2-Fun-A14B-Control-Camera](https://modelscope.cn/models/PAI/Wan2.2-Fun-A14B-Control-Camera) | `control_camera_video`, `input_image` | [code](/examples/wanvideo/model_inference/Wan2.2-Fun-A14B-Control-Camera.py) | [code](/examples/wanvideo/model_training/full/Wan2.2-Fun-A14B-Control-Camera.sh) | [code](/examples/wanvideo/model_training/validate_full/Wan2.2-Fun-A14B-Control-Camera.py) | [code](/examples/wanvideo/model_training/lora/Wan2.2-Fun-A14B-Control-Camera.sh) | [code](/examples/wanvideo/model_training/validate_lora/Wan2.2-Fun-A14B-Control-Camera.py) |

-* FP8 Precision Training: [doc](/docs/en/Training/FP8_Precision.md), [code](/examples/wanvideo/model_training/special/fp8_training/)
-* Two-stage Split Training: [doc](/docs/en/Training/Split_Training.md), [code](/examples/wanvideo/model_training/special/split_training/)
-* End-to-end Direct Distillation: [doc](/docs/en/Training/Direct_Distill.md), [code](/examples/wanvideo/model_training/special/direct_distill/)
+* FP8 Precision Training: [doc](../Training/FP8_Precision.md), [code](/examples/wanvideo/model_training/special/fp8_training/)
+* Two-stage Split Training: [doc](../Training/Split_Training.md), [code](/examples/wanvideo/model_training/special/split_training/)
+* End-to-end Direct Distillation: [doc](../Training/Direct_Distill.md), [code](/examples/wanvideo/model_training/special/direct_distill/)

 DeepSpeed ZeRO Stage 3 Training: The Wan series models support DeepSpeed ZeRO Stage 3 training, which partitions the model across multiple GPUs. Taking full parameter training of the Wan2.1-T2V-14B model as an example, the following modifications are required:

@@ -149,7 +149,7 @@ DeepSpeed ZeRO Stage 3 Training: The Wan series models support DeepSpeed ZeRO St

 ## Model Inference

-Models are loaded via `WanVideoPipeline.from_pretrained`, see [Loading Models](/docs/en/Pipeline_Usage/Model_Inference.md#loading-models).
+Models are loaded via `WanVideoPipeline.from_pretrained`, see [Loading Models](../Pipeline_Usage/Model_Inference.md#loading-models).

 Input parameters for `WanVideoPipeline` inference include:

@@ -199,7 +199,7 @@ Input parameters for `WanVideoPipeline` inference include:
 * `tea_cache_model_id`: Model ID used by TeaCache.
 * `progress_bar_cmd`: Progress bar, default is `tqdm.tqdm`. Can be disabled by setting to `lambda x:x`.

-If VRAM is insufficient, please enable [VRAM Management](/docs/en/Pipeline_Usage/VRAM_management.md). We provide recommended low VRAM configurations for each model in the example code, see the table in the "Model Overview" section above.
+If VRAM is insufficient, please enable [VRAM Management](../Pipeline_Usage/VRAM_management.md). We provide recommended low VRAM configurations for each model in the example code, see the table in the "Model Overview" section above.

 ## Model Training

@@ -254,4 +254,4 @@ We have built a sample video dataset for your testing. You can download this dat
 modelscope download --dataset DiffSynth-Studio/example_video_dataset --local_dir ./data/example_video_dataset
 ```

-We have written recommended training scripts for each model, please refer to the table in the "Model Overview" section above. For how to write model training scripts, please refer to [Model Training](/docs/en/Pipeline_Usage/Model_Training.md); for more advanced training algorithms, please refer to [Training Framework Detailed Explanation](/docs/Training/).
+We have written recommended training scripts for each model, please refer to the table in the "Model Overview" section above. For how to write model training scripts, please refer to [Model Training](../Pipeline_Usage/Model_Training.md); for more advanced training algorithms, please refer to [Training Framework Detailed Explanation](/docs/Training/).
--- a/docs/en/Model_Details/Z-Image.md
+++ b/docs/en/Model_Details/Z-Image.md
@@ -12,7 +12,7 @@ cd DiffSynth-Studio
 pip install -e .
 ```

-For more information about installation, please refer to [Install Dependencies](/docs/en/Pipeline_Usage/Setup.md).
+For more information about installation, please refer to [Install Dependencies](../Pipeline_Usage/Setup.md).

 ## Quick Start

@@ -61,12 +61,12 @@ image.save("image.jpg")

 Special Training Scripts:

-* Differential LoRA Training: [doc](/docs/en/Training/Differential_LoRA.md), [code](/examples/z_image/model_training/special/differential_training/)
+* Differential LoRA Training: [doc](../Training/Differential_LoRA.md), [code](/examples/z_image/model_training/special/differential_training/)
 * Trajectory Imitation Distillation Training (Experimental Feature): [code](/examples/z_image/model_training/special/trajectory_imitation/)

 ## Model Inference

-Models are loaded via `ZImagePipeline.from_pretrained`, see [Loading Models](/docs/en/Pipeline_Usage/Model_Inference.md#loading-models).
+Models are loaded via `ZImagePipeline.from_pretrained`, see [Loading Models](../Pipeline_Usage/Model_Inference.md#loading-models).

 Input parameters for `ZImagePipeline` inference include:

@@ -84,7 +84,7 @@ Input parameters for `ZImagePipeline` inference include:
 * `edit_image`: Edit images for image editing models, supporting multiple images.
 * `positive_only_lora`: LoRA weights used only in positive prompts.

-If VRAM is insufficient, please enable [VRAM Management](/docs/en/Pipeline_Usage/VRAM_management.md). We provide recommended low VRAM configurations for each model in the example code, see the table in the "Model Overview" section above.
+If VRAM is insufficient, please enable [VRAM Management](../Pipeline_Usage/VRAM_management.md). We provide recommended low VRAM configurations for each model in the example code, see the table in the "Model Overview" section above.

 ## Model Training

@@ -137,7 +137,7 @@ We have built a sample image dataset for your testing. You can download this dat
 modelscope download --dataset DiffSynth-Studio/example_image_dataset --local_dir ./data/example_image_dataset
 ```

-We have written recommended training scripts for each model, please refer to the table in the "Model Overview" section above. For how to write model training scripts, please refer to [Model Training](/docs/en/Pipeline_Usage/Model_Training.md); for more advanced training algorithms, please refer to [Training Framework Detailed Explanation](/docs/Training/).
+We have written recommended training scripts for each model, please refer to the table in the "Model Overview" section above. For how to write model training scripts, please refer to [Model Training](../Pipeline_Usage/Model_Training.md); for more advanced training algorithms, please refer to [Training Framework Detailed Explanation](/docs/Training/).

 Training Tips: