update docs

2026-04-08 17:18:21 +00:00 · 2024-09-11 16:37:46 +08:00
parent 7f6e35fe35
commit 41f58e2d41
20 changed files with 637 additions and 570 deletions
--- a/docs/source/GetStarted/A_simple_example.md
+++ b/docs/source/GetStarted/A_simple_example.md
@@ -1,87 +0,0 @@
-
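-# Text-to-Image Example with FLUX
-
-This example shows how to run a text-to-image task with the FLUX.1 model. The script covers downloading the required models, configuring the pipeline, and generating images with classifier-free guidance enabled and disabled.
-
-For other models supported by DiffSynth, see [模型.md](模型.md).
-
-## Preparation
-
-First, make sure the required models are downloaded and configured: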
-
-```python
-import torch
-from diffsynth import ModelManager, FluxImagePipeline, download_models
-
-# Download the FLUX.1-dev model files
-download_models(["FLUX.1-dev"])
-```
-
-For details on downloading models, see [下载模型.md](下载模型.md).
-
-## Loading the models
-
-Initialize the model manager with your device and data type:
-
-```python
-model_manager = ModelManager(torch_dtype=torch.bfloat16, device="cuda")
-model_manager.load_models([
-    "models/FLUX/FLUX.1-dev/text_encoder/model.safetensors",
-    "models/FLUX/FLUX.1-dev/text_encoder_2",
-    "models/FLUX/FLUX.1-dev/ae.safetensors",
-    "models/FLUX/FLUX.1-dev/flux1-dev.safetensors"
-])
-```
-
-For details on loading models, see [ModelManager.md](ModelManager.md).
-
-## Creating the pipeline
-
-Create a `FluxImagePipeline` instance from the loaded model manager:
-
-```python
-pipe = FluxImagePipeline.from_model_manager(model_manager)
-```
-
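-For details on using pipelines, see [Pipeline.md](Pipeline.md).
-
-## Text-to-image
-
-Generate an image from a short prompt. The examples below generate images without and with classifier-free guidance.
-
-### Basic text-to-image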
-
-```python
-prompt = "A cute little turtle"
-negative_prompt = ""
-
-torch.manual_seed(6)
-image = pipe(
-    prompt=prompt,
-    num_inference_steps=30, embedded_guidance=3.5
-)
-image.save("image_1024.jpg")
-```
-
-### Generation with classifier-free guidance
-```python
-torch.manual_seed(6)
-image = pipe(
-    prompt=prompt, negative_prompt=negative_prompt,
-    num_inference_steps=30, cfg_scale=2.0, embedded_guidance=3.5
-)
-image.save("image_1024_cfg.jpg")
-```
-
-### High-resolution fix
-
-```python
-torch.manual_seed(7)
-image = pipe(
-    prompt=prompt,
-    num_inference_steps=30, embedded_guidance=3.5,
-    input_image=image.resize((2048, 2048)), height=2048, width=2048, denoising_strength=0.6, tiled=True
-)
-image.save("image_2048_highres.jpg")
-```
-
--- a/docs/source/GetStarted/Download_models.md
+++ b/docs/source/GetStarted/Download_models.md
@@ -1,20 +0,0 @@
-# Downloading Models
-
-Download preset models. For the available model IDs, see the [config file](/diffsynth/configs/model_config.py).
-
-```python
-from diffsynth import download_models
-
-download_models(["FLUX.1-dev", "Kolors"])
-```
-
-To download a model that is not a preset, pick it from either of two sources: [ModelScope](https://modelscope.cn/models) or [HuggingFace](https://huggingface.co/models).
-
-```python
-from diffsynth.models.downloader import download_from_huggingface, download_from_modelscope
-
-# From Modelscope (recommended)
-download_from_modelscope("Kwai-Kolors/Kolors", "vae/diffusion_pytorch_model.fp16.bin", "models/kolors/Kolors/vae")
-# From Huggingface
-download_from_huggingface("Kwai-Kolors/Kolors", "vae/diffusion_pytorch_model.fp16.safetensors", "models/kolors/Kolors/vae")
-```
--- a/docs/source/GetStarted/Fine-Tuning.md
+++ b/docs/source/GetStarted/Fine-Tuning.md
@@ -1,431 +0,0 @@
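-# Fine-Tuning
-
-We implemented a training framework for text-to-image diffusion models that makes it easy to train LoRA models. The provided scripts have the following features:
-
-* **Comprehensive and user-friendly**: The framework supports multi-GPU and multi-node setups, works with DeepSpeed acceleration, and includes gradient checkpointing for models with high memory requirements.
-* **Concise code, accessible to researchers**: We avoid large blocks of complicated code. General-purpose modules live in `diffsynth/trainers/text_to_image.py`, while each model-specific training script contains only the minimal code tied to the model architecture.
-* **Modular design, flexible for developers**: Built on the general-purpose PyTorch Lightning framework, the training framework is functionally decoupled, so developers can add extra training techniques by modifying our scripts.
-
-Example images from LoRA fine-tuning. The prompt is "一只小狗蹦蹦跳跳，周围是姹紫嫣红的鲜花，远处是山脉" (for Chinese models) or "a dog is jumping, flowers around the dog, the background is mountains and clouds" (for English models).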
-
-||Kolors|Stable Diffusion 3|Hunyuan-DiT|
-|-|-|-|-|
-|Without LoRA|![image_without_lora](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/9d79ed7a-e8cf-4d98-800a-f182809db318)|![image_without_lora](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/ddb834a5-6366-412b-93dc-6d957230d66e)|![image_without_lora](https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/1aa21de5-a992-4b66-b14f-caa44e08876e)|
-|With LoRA|![image_with_lora](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/02f62323-6ee5-4788-97a1-549732dbe4f0)|![image_with_lora](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/8e7b2888-d874-4da4-a75b-11b6b214b9bf)|![image_with_lora](https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/83a0a41a-691f-4610-8e7b-d8e17c50a282)|
-
-## Install the required packages
-
-```bash
-pip install peft lightning
-```
-
-## Prepare your data
-
-We provide an [example dataset](https://modelscope.cn/datasets/buptwq/lora-stable-diffusion-finetune/files). Organize your training dataset as follows:
-
-
-```
-data/dog/
-└── train
-    ├── 00.jpg
-    ├── 01.jpg
-    ├── 02.jpg
-    ├── 03.jpg
-    ├── 04.jpg
-    └── metadata.csv
-```
-
-`metadata.csv`:
-
-```
-file_name,text
-00.jpg,a dog
-01.jpg,a dog
-02.jpg,a dog
-03.jpg,a dog
-04.jpg,a dog
-```
-
-Note that if the model is a Chinese model (for example, Hunyuan-DiT or Kolors), we recommend using Chinese captions in the dataset. For example:
-
-```
-file_name,text
-00.jpg,一只小狗
-01.jpg,一只小狗
-02.jpg,一只小狗
-03.jpg,一只小狗
-04.jpg,一只小狗
-```
-
-## Train a LoRA model
-
-Options:
-
-```
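-  --lora_target_modules LORA_TARGET_MODULES
-                        Layers that the LoRA modules are attached to.
-  --dataset_path DATASET_PATH
-                        Path to the dataset.
-  --output_path OUTPUT_PATH
-                        Path where the model is saved.
-  --steps_per_epoch STEPS_PER_EPOCH
-                        Number of steps per epoch.
-  --height HEIGHT       Image height.
-  --width WIDTH         Image width.
-  --center_crop         Whether to center-crop input images to the target resolution. If not set, images are randomly cropped. Images are resized to the target resolution before cropping.
-  --random_flip         Whether to randomly flip images horizontally.
-  --batch_size BATCH_SIZE
-                        Batch size (per device) for the training dataloader.
-  --dataloader_num_workers DATALOADER_NUM_WORKERS
-                        Number of subprocesses used for data loading. 0 means data is loaded in the main process.
-  --precision {32,16,16-mixed}
-                        Training precision.
-  --learning_rate LEARNING_RATE
-                        Learning rate.
-  --lora_rank LORA_RANK
-                        Dimension (rank) of the LoRA update matrices.
-  --lora_alpha LORA_ALPHA
-                        Weight of the LoRA update matrices.
-  --use_gradient_checkpointing
-                        Whether to use gradient checkpointing.
-  --accumulate_grad_batches ACCUMULATE_GRAD_BATCHES
-                        Number of batches for gradient accumulation.
-  --training_strategy {auto,deepspeed_stage_1,deepspeed_stage_2,deepspeed_stage_3}
-                        Training strategy.
-  --max_epochs MAX_EPOCHS
-                        Number of training epochs.
-  --modelscope_model_id MODELSCOPE_MODEL_ID
-                        Model ID on ModelScope (https://www.modelscope.cn/). If a model ID is provided, the model is uploaded to ModelScope automatically.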
-
-```
-
-### Kolors
-
-The following files are used to build Kolors. You can download Kolors from [HuggingFace](https://huggingface.co/Kwai-Kolors/Kolors) or [ModelScope](https://modelscope.cn/models/Kwai-Kolors/Kolors). Because of a precision overflow issue, an additional VAE model is required (from [HuggingFace](https://huggingface.co/madebyollin/sdxl-vae-fp16-fix) or [ModelScope](https://modelscope.cn/models/AI-ModelScope/sdxl-vae-fp16-fix)). You can download these files with the following code:
-
-
-```python
-from diffsynth import download_models
-
-download_models(["Kolors", "SDXL-vae-fp16-fix"])
-```
-
-```
-models
-├── kolors
-│   └── Kolors
-│       ├── text_encoder
-│       │   ├── config.json
-│       │   ├── pytorch_model-00001-of-00007.bin
-│       │   ├── pytorch_model-00002-of-00007.bin
-│       │   ├── pytorch_model-00003-of-00007.bin
-│       │   ├── pytorch_model-00004-of-00007.bin
-│       │   ├── pytorch_model-00005-of-00007.bin
-│       │   ├── pytorch_model-00006-of-00007.bin
-│       │   ├── pytorch_model-00007-of-00007.bin
-│       │   └── pytorch_model.bin.index.json
-│       ├── unet
-│       │   └── diffusion_pytorch_model.safetensors
-│       └── vae
-│           └── diffusion_pytorch_model.safetensors
-└── sdxl-vae-fp16-fix
-    └── diffusion_pytorch_model.safetensors
-```
-
-Launch the training task with the following command:
-
-```
-CUDA_VISIBLE_DEVICES="0" python examples/train/kolors/train_kolors_lora.py \
-  --pretrained_unet_path models/kolors/Kolors/unet/diffusion_pytorch_model.safetensors \
-  --pretrained_text_encoder_path models/kolors/Kolors/text_encoder \
-  --pretrained_fp16_vae_path models/sdxl-vae-fp16-fix/diffusion_pytorch_model.safetensors \
-  --dataset_path data/dog \
-  --output_path ./models \
-  --max_epochs 1 \
-  --steps_per_epoch 500 \
-  --height 1024 \
-  --width 1024 \
-  --center_crop \
-  --precision "16-mixed" \
-  --learning_rate 1e-4 \
-  --lora_rank 4 \
-  --lora_alpha 4 \
-  --use_gradient_checkpointing
-```
-
-For more information about the parameters, run `python examples/train/kolors/train_kolors_lora.py -h`.
-
-After training, load the LoRA with `model_manager.load_lora` for inference.
-
-
-
-```python
-from diffsynth import ModelManager, SD3ImagePipeline
-import torch
-
-model_manager = ModelManager(torch_dtype=torch.float16, device="cuda",
-                             file_path_list=["models/stable_diffusion_3/sd3_medium_incl_clips.safetensors"])
-model_manager.load_lora("models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt", lora_alpha=1.0)
-pipe = SD3ImagePipeline.from_model_manager(model_manager)
-
-torch.manual_seed(0)
-image = pipe(
-    prompt="a dog is jumping, flowers around the dog, the background is mountains and clouds", 
-    negative_prompt="bad quality, poor quality, doll, disfigured, jpg, toy, bad anatomy, missing limbs, missing fingers, 3d, cgi, extra tails",
-    cfg_scale=7.5,
-    num_inference_steps=100, width=1024, height=1024,
-)
-image.save("image_with_lora.jpg")
-```
-
-
-### Stable Diffusion 3
-
-The training script needs only one file. You can use [`sd3_medium_incl_clips.safetensors`](https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/sd3_medium_incl_clips.safetensors) (without the T5 encoder) or [`sd3_medium_incl_clips_t5xxlfp16.safetensors`](https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/sd3_medium_incl_clips_t5xxlfp16.safetensors) (with the T5 encoder). Download these files with the following code:
-
-
-```python
-from diffsynth import download_models
-
-download_models(["StableDiffusion3", "StableDiffusion3_without_T5"])
-```
-
-```
-models/stable_diffusion_3/
-├── Put Stable Diffusion 3 checkpoints here.txt
-├── sd3_medium_incl_clips.safetensors
-└── sd3_medium_incl_clips_t5xxlfp16.safetensors
-```
-
-Launch the training task with the following command:
-
-```
-CUDA_VISIBLE_DEVICES="0" python examples/train/stable_diffusion_3/train_sd3_lora.py \
-  --pretrained_path models/stable_diffusion_3/sd3_medium_incl_clips.safetensors \
-  --dataset_path data/dog \
-  --output_path ./models \
-  --max_epochs 1 \
-  --steps_per_epoch 500 \
-  --height 1024 \
-  --width 1024 \
-  --center_crop \
-  --precision "16-mixed" \
-  --learning_rate 1e-4 \
-  --lora_rank 4 \
-  --lora_alpha 4 \
-  --use_gradient_checkpointing
-```
-
-For more information about the parameters, run `python examples/train/stable_diffusion_3/train_sd3_lora.py -h`.
-
-After training, load the LoRA with `model_manager.load_lora` for inference.
-
-```python
-from diffsynth import ModelManager, SD3ImagePipeline
-import torch
-
-model_manager = ModelManager(torch_dtype=torch.float16, device="cuda",
-                             file_path_list=["models/stable_diffusion_3/sd3_medium_incl_clips.safetensors"])
-model_manager.load_lora("models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt", lora_alpha=1.0)
-pipe = SD3ImagePipeline.from_model_manager(model_manager)
-
-torch.manual_seed(0)
-image = pipe(
-    prompt="a dog is jumping, flowers around the dog, the background is mountains and clouds", 
-    negative_prompt="bad quality, poor quality, doll, disfigured, jpg, toy, bad anatomy, missing limbs, missing fingers, 3d, cgi, extra tails",
-    cfg_scale=7.5,
-    num_inference_steps=100, width=1024, height=1024,
-)
-image.save("image_with_lora.jpg")
-```
-
-### Hunyuan-DiT
-
-Building Hunyuan-DiT requires four files. You can download them from [HuggingFace](https://huggingface.co/Tencent-Hunyuan/HunyuanDiT) or [ModelScope](https://www.modelscope.cn/models/modelscope/HunyuanDiT/summary), or with the following code:
-
-
-```python
-from diffsynth import download_models
-
-download_models(["HunyuanDiT"])
-```
-
-```
-models/HunyuanDiT/
-├── Put Hunyuan DiT checkpoints here.txt
-└── t2i
-    ├── clip_text_encoder
-    │   └── pytorch_model.bin
-    ├── model
-    │   └── pytorch_model_ema.pt
-    ├── mt5
-    │   └── pytorch_model.bin
-    └── sdxl-vae-fp16-fix
-        └── diffusion_pytorch_model.bin
-```
-
-Launch the training task using the following command:
-
-```
-CUDA_VISIBLE_DEVICES="0" python examples/train/hunyuan_dit/train_hunyuan_dit_lora.py \
-  --pretrained_path models/HunyuanDiT/t2i \
-  --dataset_path data/dog \
-  --output_path ./models \
-  --max_epochs 1 \
-  --steps_per_epoch 500 \
-  --height 1024 \
-  --width 1024 \
-  --center_crop \
-  --precision "16-mixed" \
-  --learning_rate 1e-4 \
-  --lora_rank 4 \
-  --lora_alpha 4 \
-  --use_gradient_checkpointing
-```
-
-For more information about the parameters, run `python examples/train/hunyuan_dit/train_hunyuan_dit_lora.py -h`.
-
-After training, load the LoRA with `model_manager.load_lora` for inference.
-
-
-```python
-from diffsynth import ModelManager, HunyuanDiTImagePipeline
-import torch
-
-model_manager = ModelManager(torch_dtype=torch.float16, device="cuda",
-                             file_path_list=[
-                                 "models/HunyuanDiT/t2i/clip_text_encoder/pytorch_model.bin",
-                                 "models/HunyuanDiT/t2i/model/pytorch_model_ema.pt",
-                                 "models/HunyuanDiT/t2i/mt5/pytorch_model.bin",
-                                 "models/HunyuanDiT/t2i/sdxl-vae-fp16-fix/diffusion_pytorch_model.bin"
-                             ])
-model_manager.load_lora("models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt", lora_alpha=1.0)
-pipe = HunyuanDiTImagePipeline.from_model_manager(model_manager)
-
-torch.manual_seed(0)
-image = pipe(
-    prompt="一只小狗蹦蹦跳跳，周围是姹紫嫣红的鲜花，远处是山脉", 
-    negative_prompt="",
-    cfg_scale=7.5,
-    num_inference_steps=100, width=1024, height=1024,
-)
-image.save("image_with_lora.jpg")
-```
-
-### Stable Diffusion
-
-The training script needs only one file. We support the mainstream checkpoints on [CivitAI](https://civitai.com/). By default, we use the base Stable Diffusion v1.5. You can download it from [HuggingFace](https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors) or [ModelScope](https://www.modelscope.cn/models/AI-ModelScope/stable-diffusion-v1-5/resolve/master/v1-5-pruned-emaonly.safetensors), or with the following code:
-
-```python
-from diffsynth import download_models
-
-download_models(["StableDiffusion_v15"])
-```
-
-```
-models/stable_diffusion
-├── Put Stable Diffusion checkpoints here.txt
-└── v1-5-pruned-emaonly.safetensors
-```
-
-Launch the training task using the following command:
-
-```
-CUDA_VISIBLE_DEVICES="0" python examples/train/stable_diffusion/train_sd_lora.py \
-  --pretrained_path models/stable_diffusion/v1-5-pruned-emaonly.safetensors \
-  --dataset_path data/dog \
-  --output_path ./models \
-  --max_epochs 1 \
-  --steps_per_epoch 500 \
-  --height 512 \
-  --width 512 \
-  --center_crop \
-  --precision "16-mixed" \
-  --learning_rate 1e-4 \
-  --lora_rank 4 \
-  --lora_alpha 4 \
-  --use_gradient_checkpointing
-```
-
-For more information about the parameters, run `python examples/train/stable_diffusion/train_sd_lora.py -h`.
-
-After training, load the LoRA with `model_manager.load_lora` for inference.
-
-
-
-```python
-from diffsynth import ModelManager, SDImagePipeline
-import torch
-
-model_manager = ModelManager(torch_dtype=torch.float16, device="cuda",
-                             file_path_list=["models/stable_diffusion/v1-5-pruned-emaonly.safetensors"])
-model_manager.load_lora("models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt", lora_alpha=1.0)
-pipe = SDImagePipeline.from_model_manager(model_manager)
-
-torch.manual_seed(0)
-image = pipe(
-    prompt="a dog is jumping, flowers around the dog, the background is mountains and clouds", 
-    negative_prompt="bad quality, poor quality, doll, disfigured, jpg, toy, bad anatomy, missing limbs, missing fingers, 3d, cgi, extra tails",
-    cfg_scale=7.5,
-    num_inference_steps=100, width=512, height=512,
-)
-image.save("image_with_lora.jpg")
-```
-
-### Stable Diffusion XL
-
-The training script needs only one file. We support the mainstream checkpoints on [CivitAI](https://civitai.com/). By default, we use the base Stable Diffusion XL. You can download it from [HuggingFace](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors) or [ModelScope](https://www.modelscope.cn/models/AI-ModelScope/stable-diffusion-xl-base-1.0/resolve/master/sd_xl_base_1.0.safetensors), or with the following code:
-
-```python
-from diffsynth import download_models
-
-download_models(["StableDiffusionXL_v1"])
-```
-
-```
-models/stable_diffusion_xl
-├── Put Stable Diffusion XL checkpoints here.txt
-└── sd_xl_base_1.0.safetensors
-```
-
-We observed that Stable Diffusion XL is not float16-safe, so we recommend training in float32.
-
-```
-CUDA_VISIBLE_DEVICES="0" python examples/train/stable_diffusion_xl/train_sdxl_lora.py \
-  --pretrained_path models/stable_diffusion_xl/sd_xl_base_1.0.safetensors \
-  --dataset_path data/dog \
-  --output_path ./models \
-  --max_epochs 1 \
-  --steps_per_epoch 500 \
-  --height 1024 \
-  --width 1024 \
-  --center_crop \
-  --precision "32" \
-  --learning_rate 1e-4 \
-  --lora_rank 4 \
-  --lora_alpha 4 \
-  --use_gradient_checkpointing
-```
-
-For more information about the parameters, run `python examples/train/stable_diffusion_xl/train_sdxl_lora.py -h`.
-
-After training, load the LoRA with `model_manager.load_lora` for inference.
-
-```python
-from diffsynth import ModelManager, SDXLImagePipeline
-import torch
-
-model_manager = ModelManager(torch_dtype=torch.float16, device="cuda",
-                             file_path_list=["models/stable_diffusion_xl/sd_xl_base_1.0.safetensors"])
-model_manager.load_lora("models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt", lora_alpha=1.0)
-pipe = SDXLImagePipeline.from_model_manager(model_manager)
-
-torch.manual_seed(0)
-image = pipe(
-    prompt="a dog is jumping, flowers around the dog, the background is mountains and clouds", 
-    negative_prompt="bad quality, poor quality, doll, disfigured, jpg, toy, bad anatomy, missing limbs, missing fingers, 3d, cgi, extra tails",
-    cfg_scale=7.5,
-    num_inference_steps=100, width=1024, height=1024,
-)
-image.save("image_with_lora.jpg")
-```
--- a/docs/source/GetStarted/WebUI.md
+++ b/docs/source/GetStarted/WebUI.md
--- a/docs/source/finetune/overview.md
+++ b/docs/source/finetune/overview.md
@@ -0,0 +1,98 @@
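+# Training Framework
+
+We implemented a training framework for text-to-image diffusion models that makes it easy to train LoRA models. The provided scripts have the following features:
+
+* **Comprehensive**: The framework supports multi-GPU and multi-node setups, works with DeepSpeed acceleration, and includes gradient checkpointing for models with high memory requirements.
+* **Concise code**: We avoid large blocks of complicated code. General-purpose modules live in `diffsynth/trainers/text_to_image.py`, while each model-specific training script contains only the minimal code tied to the model architecture, which keeps the scripts approachable for academic researchers.
+* **Modular design**: Built on the general-purpose PyTorch Lightning framework, the training framework is functionally decoupled, so developers can add extra training techniques by modifying our scripts.
+
+Example images from LoRA fine-tuning. The prompt is "一只小狗蹦蹦跳跳，周围是姹紫嫣红的鲜花，远处是山脉" (for Chinese models) or "a dog is jumping, flowers around the dog, the background is mountains and clouds" (for English models).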
+
+||FLUX.1-dev|Kolors|Stable Diffusion 3|Hunyuan-DiT|
+|-|-|-|-|-|
+|Without LoRA|![image_without_lora](https://github.com/user-attachments/assets/df62cef6-d54f-4e3d-a602-5dd290079d49)|![image_without_lora](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/9d79ed7a-e8cf-4d98-800a-f182809db318)|![image_without_lora](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/ddb834a5-6366-412b-93dc-6d957230d66e)|![image_without_lora](https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/1aa21de5-a992-4b66-b14f-caa44e08876e)|
+|With LoRA|![image_with_lora](https://github.com/user-attachments/assets/4fd39890-0291-4d19-8a88-d70d0ae18533)|![image_with_lora](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/02f62323-6ee5-4788-97a1-549732dbe4f0)|![image_with_lora](https://github.com/modelscope/DiffSynth-Studio/assets/35051019/8e7b2888-d874-4da4-a75b-11b6b214b9bf)|![image_with_lora](https://github.com/Artiprocher/DiffSynth-Studio/assets/35051019/83a0a41a-691f-4610-8e7b-d8e17c50a282)|
+
+## Install additional packages
+
+```
+pip install peft lightning
+```
+
+## Prepare the dataset
+
+We provide an [example dataset](https://modelscope.cn/datasets/buptwq/lora-stable-diffusion-finetune/files). Organize your training dataset as follows:
+
+```
+data/dog/
+└── train
+    ├── 00.jpg
+    ├── 01.jpg
+    ├── 02.jpg
+    ├── 03.jpg
+    ├── 04.jpg
+    └── metadata.csv
+```
+
+`metadata.csv`:
+
+```
+file_name,text
+00.jpg,a dog
+01.jpg,a dog
+02.jpg,a dog
+03.jpg,a dog
+04.jpg,a dog
+```
+
+Note that if the model is a Chinese model (for example, Hunyuan-DiT or Kolors), we recommend using Chinese captions in the dataset. For example:
+
+```
+file_name,text
+00.jpg,一只小狗
+01.jpg,一只小狗
+02.jpg,一只小狗
+03.jpg,一只小狗
+04.jpg,一只小狗
+```
+
+## Train a LoRA model
+
+General options:
+
+```
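+  --lora_target_modules LORA_TARGET_MODULES
+                        Layers that the LoRA modules are attached to.
+  --dataset_path DATASET_PATH
+                        Path to the dataset.
+  --output_path OUTPUT_PATH
+                        Path where the model is saved.
+  --steps_per_epoch STEPS_PER_EPOCH
+                        Number of steps per epoch.
+  --height HEIGHT       Image height.
+  --width WIDTH         Image width.
+  --center_crop         Whether to center-crop input images to the target resolution. If not set, images are randomly cropped. Images are resized to the target resolution before cropping.
+  --random_flip         Whether to randomly flip images horizontally.
+  --batch_size BATCH_SIZE
+                        Batch size (per device) for the training dataloader.
+  --dataloader_num_workers DATALOADER_NUM_WORKERS
+                        Number of subprocesses used for data loading. 0 means data is loaded in the main process.
+  --precision {32,16,16-mixed}
+                        Training precision.
+  --learning_rate LEARNING_RATE
+                        Learning rate.
+  --lora_rank LORA_RANK
+                        Dimension (rank) of the LoRA update matrices.
+  --lora_alpha LORA_ALPHA
+                        Weight of the LoRA update matrices.
+  --use_gradient_checkpointing
+                        Whether to use gradient checkpointing.
+  --accumulate_grad_batches ACCUMULATE_GRAD_BATCHES
+                        Number of batches for gradient accumulation.
+  --training_strategy {auto,deepspeed_stage_1,deepspeed_stage_2,deepspeed_stage_3}
+                        Training strategy.
+  --max_epochs MAX_EPOCHS
+                        Number of training epochs.
+  --modelscope_model_id MODELSCOPE_MODEL_ID
+                        Model ID on ModelScope (https://www.modelscope.cn/). If a model ID is provided, the model is uploaded to ModelScope automatically.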
+```
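The `data/dog/train` layout and the two-column `metadata.csv` described in `overview.md` can be produced with a short script. The following is a minimal sketch and not part of DiffSynth: the helper name `write_metadata`, the accepted image suffixes, and the choice to give every image the same caption are assumptions made for illustration.

```python
import csv
from pathlib import Path

# Assumption: these suffixes cover the images in the training folder.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def write_metadata(train_dir, caption):
    """Write metadata.csv in train_dir, pairing every image file with `caption`.

    Hypothetical helper for illustration; returns the path of the CSV file.
    """
    train_dir = Path(train_dir)
    # Sort file names so the CSV rows are deterministic.
    images = sorted(
        p.name for p in train_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    out = train_dir / "metadata.csv"
    # newline="" lets the csv module control line endings itself.
    with out.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["file_name", "text"])  # header used in the docs
        for name in images:
            writer.writerow([name, caption])
    return out
```

Calling `write_metadata("data/dog/train", "a dog")` would write one row per image; for a Chinese model such as Kolors or Hunyuan-DiT, the caption could be `"一只小狗"` instead.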
--- a/docs/source/finetune/train_flux_lora.md
+++ b/docs/source/finetune/train_flux_lora.md
@@ -0,0 +1,71 @@
+# Training a FLUX LoRA
+
+The following files are used to build the FLUX model. You can download them from [HuggingFace](https://huggingface.co/black-forest-labs/FLUX.1-dev) or [ModelScope](https://www.modelscope.cn/models/ai-modelscope/flux.1-dev), or with the following code:
+
+```python
+from diffsynth import download_models
+
+download_models(["FLUX.1-dev"])
+```
+
+```
+models/FLUX/
+└── FLUX.1-dev
+    ├── ae.safetensors
+    ├── flux1-dev.safetensors
+    ├── text_encoder
+    │   └── model.safetensors
+    └── text_encoder_2
+        ├── config.json
+        ├── model-00001-of-00002.safetensors
+        ├── model-00002-of-00002.safetensors
+        └── model.safetensors.index.json
+```
+
+Launch the training task with the following command:
+
+```
+CUDA_VISIBLE_DEVICES="0" python examples/train/flux/train_flux_lora.py \
+  --pretrained_text_encoder_path models/FLUX/FLUX.1-dev/text_encoder/model.safetensors \
+  --pretrained_text_encoder_2_path models/FLUX/FLUX.1-dev/text_encoder_2 \
+  --pretrained_dit_path models/FLUX/FLUX.1-dev/flux1-dev.safetensors \
+  --pretrained_vae_path models/FLUX/FLUX.1-dev/ae.safetensors \
+  --dataset_path data/dog \
+  --output_path ./models \
+  --max_epochs 1 \
+  --steps_per_epoch 500 \
+  --height 1024 \
+  --width 1024 \
+  --center_crop \
+  --precision "bf16" \
+  --learning_rate 1e-4 \
+  --lora_rank 4 \
+  --lora_alpha 4 \
+  --use_gradient_checkpointing
+```
+
+For more information about the parameters, run `python examples/train/flux/train_flux_lora.py -h`.
+
+After training, load the LoRA with `model_manager.load_lora` for inference.
+
+```python
+from diffsynth import ModelManager, FluxImagePipeline
+import torch
+
+model_manager = ModelManager(torch_dtype=torch.float16, device="cuda",
+                             file_path_list=[
+                                 "models/FLUX/FLUX.1-dev/text_encoder/model.safetensors",
+                                 "models/FLUX/FLUX.1-dev/text_encoder_2",
+                                 "models/FLUX/FLUX.1-dev/ae.safetensors",
+                                 "models/FLUX/FLUX.1-dev/flux1-dev.safetensors"
+                             ])
+model_manager.load_lora("models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt", lora_alpha=1.0)
+pipe = FluxImagePipeline.from_model_manager(model_manager)
+
+torch.manual_seed(0)
+image = pipe(
+    prompt="a dog is jumping, flowers around the dog, the background is mountains and clouds",
+    num_inference_steps=30, embedded_guidance=3.5
+)
+image.save("image_with_lora.jpg")
+```
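The inference snippets hardcode `models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt`. Lightning's default logger increments the `version_N` directory on each run, so after several training runs the newest checkpoint is likely to live elsewhere. The sketch below looks it up; the helper name `latest_checkpoint` is hypothetical, and the directory layout is assumed to match the path shown in the docs.

```python
import re
from pathlib import Path


def latest_checkpoint(output_path):
    """Return the newest .ckpt under <output_path>/lightning_logs, or None.

    Hypothetical helper. It assumes the layout
    lightning_logs/version_<N>/checkpoints/*.ckpt and prefers the highest N,
    breaking ties by file modification time.
    """
    logs = Path(output_path) / "lightning_logs"
    best_key, best_path = None, None
    for ckpt in logs.glob("version_*/checkpoints/*.ckpt"):
        match = re.fullmatch(r"version_(\d+)", ckpt.parent.parent.name)
        if match is None:
            continue  # skip directories that do not follow version_<N>
        key = (int(match.group(1)), ckpt.stat().st_mtime)
        if best_key is None or key > best_key:
            best_key, best_path = key, ckpt
    return best_path
```

The result could then be passed to `model_manager.load_lora(str(path), lora_alpha=1.0)` in place of the hardcoded path, after checking that it is not `None`.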
--- a/docs/source/finetune/train_hunyuan_dit_lora.md
+++ b/docs/source/finetune/train_hunyuan_dit_lora.md
@@ -0,0 +1,72 @@
+# Training a Hunyuan-DiT LoRA
+
+Building Hunyuan-DiT requires four files. You can download them from [HuggingFace](https://huggingface.co/Tencent-Hunyuan/HunyuanDiT) or [ModelScope](https://www.modelscope.cn/models/modelscope/HunyuanDiT/summary), or with the following code:
+
+
+```python
+from diffsynth import download_models
+
+download_models(["HunyuanDiT"])
+```
+
+```
+models/HunyuanDiT/
+├── Put Hunyuan DiT checkpoints here.txt
+└── t2i
+    ├── clip_text_encoder
+    │   └── pytorch_model.bin
+    ├── model
+    │   └── pytorch_model_ema.pt
+    ├── mt5
+    │   └── pytorch_model.bin
+    └── sdxl-vae-fp16-fix
+        └── diffusion_pytorch_model.bin
+```
+
+Launch the training task with the following command:
+
+```
+CUDA_VISIBLE_DEVICES="0" python examples/train/hunyuan_dit/train_hunyuan_dit_lora.py \
+  --pretrained_path models/HunyuanDiT/t2i \
+  --dataset_path data/dog \
+  --output_path ./models \
+  --max_epochs 1 \
+  --steps_per_epoch 500 \
+  --height 1024 \
+  --width 1024 \
+  --center_crop \
+  --precision "16-mixed" \
+  --learning_rate 1e-4 \
+  --lora_rank 4 \
+  --lora_alpha 4 \
+  --use_gradient_checkpointing
+```
+
+For more information about the parameters, run `python examples/train/hunyuan_dit/train_hunyuan_dit_lora.py -h`.
+
+After training, load the LoRA with `model_manager.load_lora` for inference.
+
+
+```python
+from diffsynth import ModelManager, HunyuanDiTImagePipeline
+import torch
+
+model_manager = ModelManager(torch_dtype=torch.float16, device="cuda",
+                             file_path_list=[
+                                 "models/HunyuanDiT/t2i/clip_text_encoder/pytorch_model.bin",
+                                 "models/HunyuanDiT/t2i/model/pytorch_model_ema.pt",
+                                 "models/HunyuanDiT/t2i/mt5/pytorch_model.bin",
+                                 "models/HunyuanDiT/t2i/sdxl-vae-fp16-fix/diffusion_pytorch_model.bin"
+                             ])
+model_manager.load_lora("models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt", lora_alpha=1.0)
+pipe = HunyuanDiTImagePipeline.from_model_manager(model_manager)
+
+torch.manual_seed(0)
+image = pipe(
+    prompt="一只小狗蹦蹦跳跳，周围是姹紫嫣红的鲜花，远处是山脉", 
+    negative_prompt="",
+    cfg_scale=7.5,
+    num_inference_steps=100, width=1024, height=1024,
+)
+image.save("image_with_lora.jpg")
+```
--- a/docs/source/finetune/train_kolors_lora.md
+++ b/docs/source/finetune/train_kolors_lora.md
@@ -0,0 +1,78 @@
+# Training a Kolors LoRA
+
+The following files are used to build Kolors. You can download Kolors from [HuggingFace](https://huggingface.co/Kwai-Kolors/Kolors) or [ModelScope](https://modelscope.cn/models/Kwai-Kolors/Kolors). Because of a precision overflow issue, an additional VAE model is required (from [HuggingFace](https://huggingface.co/madebyollin/sdxl-vae-fp16-fix) or [ModelScope](https://modelscope.cn/models/AI-ModelScope/sdxl-vae-fp16-fix)). You can download these files with the following code:
+
+
+```python
+from diffsynth import download_models
+
+download_models(["Kolors", "SDXL-vae-fp16-fix"])
+```
+
+```
+models
+├── kolors
+│   └── Kolors
+│       ├── text_encoder
+│       │   ├── config.json
+│       │   ├── pytorch_model-00001-of-00007.bin
+│       │   ├── pytorch_model-00002-of-00007.bin
+│       │   ├── pytorch_model-00003-of-00007.bin
+│       │   ├── pytorch_model-00004-of-00007.bin
+│       │   ├── pytorch_model-00005-of-00007.bin
+│       │   ├── pytorch_model-00006-of-00007.bin
+│       │   ├── pytorch_model-00007-of-00007.bin
+│       │   └── pytorch_model.bin.index.json
+│       ├── unet
+│       │   └── diffusion_pytorch_model.safetensors
+│       └── vae
+│           └── diffusion_pytorch_model.safetensors
+└── sdxl-vae-fp16-fix
+    └── diffusion_pytorch_model.safetensors
+```
+
+Launch the training task with the following command:
+
+```
+CUDA_VISIBLE_DEVICES="0" python examples/train/kolors/train_kolors_lora.py \
+  --pretrained_unet_path models/kolors/Kolors/unet/diffusion_pytorch_model.safetensors \
+  --pretrained_text_encoder_path models/kolors/Kolors/text_encoder \
+  --pretrained_fp16_vae_path models/sdxl-vae-fp16-fix/diffusion_pytorch_model.safetensors \
+  --dataset_path data/dog \
+  --output_path ./models \
+  --max_epochs 1 \
+  --steps_per_epoch 500 \
+  --height 1024 \
+  --width 1024 \
+  --center_crop \
+  --precision "16-mixed" \
+  --learning_rate 1e-4 \
+  --lora_rank 4 \
+  --lora_alpha 4 \
+  --use_gradient_checkpointing
+```
+
+For more information about the parameters, run `python examples/train/kolors/train_kolors_lora.py -h`.
+
+After training, load the LoRA with `model_manager.load_lora` for inference.
+
+
+
+```python
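+from diffsynth import ModelManager, SDXLImagePipeline
+import torch
+
+# Assumption: Kolors is loaded through SDXLImagePipeline from the files downloaded above.
+model_manager = ModelManager(torch_dtype=torch.float16, device="cuda",
+                             file_path_list=[
+                                 "models/kolors/Kolors/text_encoder",
+                                 "models/kolors/Kolors/unet/diffusion_pytorch_model.safetensors",
+                                 "models/kolors/Kolors/vae/diffusion_pytorch_model.safetensors"
+                             ])
+model_manager.load_lora("models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt", lora_alpha=1.0)
+pipe = SDXLImagePipeline.from_model_manager(model_manager)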
+
+torch.manual_seed(0)
+image = pipe(
+    prompt="a dog is jumping, flowers around the dog, the background is mountains and clouds", 
+    negative_prompt="bad quality, poor quality, doll, disfigured, jpg, toy, bad anatomy, missing limbs, missing fingers, 3d, cgi, extra tails",
+    cfg_scale=7.5,
+    num_inference_steps=100, width=1024, height=1024,
+)
+image.save("image_with_lora.jpg")
+```
--- a/docs/source/finetune/train_sd3_lora.md
+++ b/docs/source/finetune/train_sd3_lora.md
@@ -0,0 +1,59 @@
+# Training a Stable Diffusion 3 LoRA
+
+The training script needs only one file. You can use [`sd3_medium_incl_clips.safetensors`](https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/sd3_medium_incl_clips.safetensors) (without the T5 encoder) or [`sd3_medium_incl_clips_t5xxlfp16.safetensors`](https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/sd3_medium_incl_clips_t5xxlfp16.safetensors) (with the T5 encoder). Download these files with the following code:
+
+
+```python
+from diffsynth import download_models
+
+download_models(["StableDiffusion3", "StableDiffusion3_without_T5"])
+```
+
+```
+models/stable_diffusion_3/
+├── Put Stable Diffusion 3 checkpoints here.txt
+├── sd3_medium_incl_clips.safetensors
+└── sd3_medium_incl_clips_t5xxlfp16.safetensors
+```
+
+Launch the training task with the following command:
+
+```
+CUDA_VISIBLE_DEVICES="0" python examples/train/stable_diffusion_3/train_sd3_lora.py \
+  --pretrained_path models/stable_diffusion_3/sd3_medium_incl_clips.safetensors \
+  --dataset_path data/dog \
+  --output_path ./models \
+  --max_epochs 1 \
+  --steps_per_epoch 500 \
+  --height 1024 \
+  --width 1024 \
+  --center_crop \
+  --precision "16-mixed" \
+  --learning_rate 1e-4 \
+  --lora_rank 4 \
+  --lora_alpha 4 \
+  --use_gradient_checkpointing
+```
+
+For more information about the parameters, run `python examples/train/stable_diffusion_3/train_sd3_lora.py -h`.
+
+After training, load the LoRA with `model_manager.load_lora` for inference.
+
+```python
+from diffsynth import ModelManager, SD3ImagePipeline
+import torch
+
+model_manager = ModelManager(torch_dtype=torch.float16, device="cuda",
+                             file_path_list=["models/stable_diffusion_3/sd3_medium_incl_clips.safetensors"])
+model_manager.load_lora("models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt", lora_alpha=1.0)
+pipe = SD3ImagePipeline.from_model_manager(model_manager)
+
+torch.manual_seed(0)
+image = pipe(
+    prompt="a dog is jumping, flowers around the dog, the background is mountains and clouds", 
+    negative_prompt="bad quality, poor quality, doll, disfigured, jpg, toy, bad anatomy, missing limbs, missing fingers, 3d, cgi, extra tails",
+    cfg_scale=7.5,
+    num_inference_steps=100, width=1024, height=1024,
+)
+image.save("image_with_lora.jpg")
+```
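The training commands in this commit all pass `--lora_rank 4 --lora_alpha 4`. In the common LoRA formulation (the one used by the `peft` package), the learned update to a weight matrix is the product of two low-rank matrices scaled by `alpha / rank`, so these values give a scale of 1.0. The sketch below shows that arithmetic in plain Python; it is an illustration under that assumption, not the DiffSynth implementation, and the names `matmul` and `lora_delta` are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(B, A, rank, alpha):
    """Return the LoRA weight update (alpha / rank) * (B @ A).

    B has shape (out_features, rank) and A has shape (rank, in_features).
    Assumes the standard alpha / rank scaling.
    """
    scale = alpha / rank
    return [[scale * value for value in row] for row in matmul(B, A)]


# A rank-1 example: B is 2x1, A is 1x2, so the update is a 2x2 matrix.
B = [[1.0], [2.0]]
A = [[3.0, 4.0]]
delta = lora_delta(B, A, rank=1, alpha=2)
```

Under this scaling, doubling `--lora_alpha` while keeping `--lora_rank` fixed doubles the size of the update. The `lora_alpha=1.0` argument of `model_manager.load_lora` is a separate, inference-time setting.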
--- a/docs/source/finetune/train_sd_lora.md
+++ b/docs/source/finetune/train_sd_lora.md
@@ -0,0 +1,59 @@
+# Training a Stable Diffusion LoRA
+
+The training script needs only one file. We support the mainstream checkpoints on [CivitAI](https://civitai.com/). By default, we use the base Stable Diffusion v1.5. You can download it from [HuggingFace](https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors) or [ModelScope](https://www.modelscope.cn/models/AI-ModelScope/stable-diffusion-v1-5/resolve/master/v1-5-pruned-emaonly.safetensors), or with the following code:
+
+```python
+from diffsynth import download_models
+
+download_models(["StableDiffusion_v15"])
+```
+
+```
+models/stable_diffusion
+├── Put Stable Diffusion checkpoints here.txt
+└── v1-5-pruned-emaonly.safetensors
+```
+
+Use the following command to launch the training job:
+
+```
+CUDA_VISIBLE_DEVICES="0" python examples/train/stable_diffusion/train_sd_lora.py \
+  --pretrained_path models/stable_diffusion/v1-5-pruned-emaonly.safetensors \
+  --dataset_path data/dog \
+  --output_path ./models \
+  --max_epochs 1 \
+  --steps_per_epoch 500 \
+  --height 512 \
+  --width 512 \
+  --center_crop \
+  --precision "16-mixed" \
+  --learning_rate 1e-4 \
+  --lora_rank 4 \
+  --lora_alpha 4 \
+  --use_gradient_checkpointing
+```
+
+For details on the available arguments, run `python examples/train/stable_diffusion/train_sd_lora.py -h`.
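+
+To give a sense of what the `--lora_rank 4` argument in the command above controls: in the standard LoRA formulation, each adapted weight matrix of shape `d_out x d_in` gets two low-rank factors, `B` (`d_out x rank`) and `A` (`rank x d_in`), and only these are trained. The sketch below counts their parameters. It assumes the standard formulation, and the 768 x 768 layer shape is hypothetical rather than taken from the Stable Diffusion architecture or from the training script.
+
+```python
+def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
+    """Parameters in the low-rank factors B (d_out x rank) and A (rank x d_in)."""
+    return rank * (d_out + d_in)
+
+
+def full_params(d_out: int, d_in: int) -> int:
+    """Parameters in the full d_out x d_in weight matrix."""
+    return d_out * d_in
+
+
+# Hypothetical 768 x 768 projection layer; the shape is illustrative only.
+d_out, d_in, rank = 768, 768, 4
+print(lora_trainable_params(d_out, d_in, rank))  # 6144
+print(full_params(d_out, d_in))                  # 589824
+```
+
+A larger rank increases the number of trainable parameters linearly.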
+
+After training completes, use `model_manager.load_lora` to load the LoRA for inference.
+
+
+
+```python
+from diffsynth import ModelManager, SDImagePipeline
+import torch
+
+# Load the base checkpoint, then apply the trained LoRA weights on top of it.
+model_manager = ModelManager(torch_dtype=torch.float16, device="cuda",
+                             file_path_list=["models/stable_diffusion/v1-5-pruned-emaonly.safetensors"])
+model_manager.load_lora("models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt", lora_alpha=1.0)
+pipe = SDImagePipeline.from_model_manager(model_manager)
+
+# Fix the random seed so the result is reproducible.
+torch.manual_seed(0)
+image = pipe(
+    prompt="a dog is jumping, flowers around the dog, the background is mountains and clouds", 
+    negative_prompt="bad quality, poor quality, doll, disfigured, jpg, toy, bad anatomy, missing limbs, missing fingers, 3d, cgi, extra tails",
+    cfg_scale=7.5,
+    num_inference_steps=100, width=512, height=512,
+)
+image.save("image_with_lora.jpg")
+```
--- a/docs/source/finetune/train_sdxl_lora.md
+++ b/docs/source/finetune/train_sdxl_lora.md
@@ -0,0 +1,57 @@
+# Training a Stable Diffusion XL LoRA
+
+The training script needs only a single checkpoint file. We support the mainstream checkpoints from [CivitAI](https://civitai.com/). By default, we use the base Stable Diffusion XL, which you can download from [HuggingFace](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors) or [ModelScope](https://www.modelscope.cn/models/AI-ModelScope/stable-diffusion-xl-base-1.0/resolve/master/sd_xl_base_1.0.safetensors). You can also download this file with the following code:
+
+```python
+from diffsynth import download_models
+
+download_models(["StableDiffusionXL_v1"])
+```
+
+```
+models/stable_diffusion_xl
+├── Put Stable Diffusion XL checkpoints here.txt
+└── sd_xl_base_1.0.safetensors
+```
+
+We have observed numerical overflow in Stable Diffusion XL at float16 precision, so we recommend training in float32. Use the following command to launch the training job:
+
+```
+CUDA_VISIBLE_DEVICES="0" python examples/train/stable_diffusion_xl/train_sdxl_lora.py \
+  --pretrained_path models/stable_diffusion_xl/sd_xl_base_1.0.safetensors \
+  --dataset_path data/dog \
+  --output_path ./models \
+  --max_epochs 1 \
+  --steps_per_epoch 500 \
+  --height 1024 \
+  --width 1024 \
+  --center_crop \
+  --precision "32" \
+  --learning_rate 1e-4 \
+  --lora_rank 4 \
+  --lora_alpha 4 \
+  --use_gradient_checkpointing
+```
+
+For details on the available arguments, run `python examples/train/stable_diffusion_xl/train_sdxl_lora.py -h`.
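+
+The float16 overflow mentioned above comes from the format's limited range: the largest finite half-precision value is 65504. The sketch below only illustrates that limit using Python's standard `struct` module. It does not reproduce the overflow inside Stable Diffusion XL, and `fits_in_float16` is a hypothetical helper name.
+
+```python
+import math
+import struct
+
+FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value
+
+
+def fits_in_float16(value: float) -> bool:
+    """Return True if `value` is finite and can be packed as a half-precision float."""
+    if not math.isfinite(value):
+        return False
+    try:
+        # The "e" format code is IEEE 754 binary16; out-of-range values raise OverflowError.
+        struct.pack("<e", value)
+        return True
+    except OverflowError:
+        return False
+
+
+print(fits_in_float16(FP16_MAX))  # True
+print(fits_in_float16(70000.0))   # False
+```
+
+Any intermediate value beyond this range cannot be represented in float16, which is why the command above uses `--precision "32"`.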
+
+After training completes, use `model_manager.load_lora` to load the LoRA for inference.
+
+```python
+from diffsynth import ModelManager, SDXLImagePipeline
+import torch
+
+# Load the base checkpoint, then apply the trained LoRA weights on top of it.
+model_manager = ModelManager(torch_dtype=torch.float16, device="cuda",
+                             file_path_list=["models/stable_diffusion_xl/sd_xl_base_1.0.safetensors"])
+model_manager.load_lora("models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt", lora_alpha=1.0)
+pipe = SDXLImagePipeline.from_model_manager(model_manager)
+
+# Fix the random seed so the result is reproducible.
+torch.manual_seed(0)
+image = pipe(
+    prompt="a dog is jumping, flowers around the dog, the background is mountains and clouds", 
+    negative_prompt="bad quality, poor quality, doll, disfigured, jpg, toy, bad anatomy, missing limbs, missing fingers, 3d, cgi, extra tails",
+    cfg_scale=7.5,
+    num_inference_steps=100, width=1024, height=1024,
+)
+image.save("image_with_lora.jpg")
+```
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -6,28 +6,30 @@
 DiffSynth-Studio Documentation
 ==============================

-Add your content using ``reStructuredText`` syntax. See the
-`reStructuredText <https://www.sphinx-doc.org/en/master/usage/restructuredtext/index.html>`_
-documentation for details.
-
+Welcome to DiffSynth-Studio! We aim to build an open-source, interconnected ecosystem for Diffusion models. Here you can experience the magic of AIGC (AI-Generated Content) technology.

 .. toctree::
   :maxdepth: 1
-   :caption: Contents:
-
-   GetStarted/A_simple_example.md
-   GetStarted/Download_models.md
-   GetStarted/ModelManager.md
-   GetStarted/Models.md
-   GetStarted/Pipelines.md
-   GetStarted/PromptProcessing.md
-   GetStarted/Schedulers.md
-   GetStarted/Fine-tuning.md
-   GetStarted/Extensions.md
-   GetStarted/WebUI.md
-
+   :caption: Quick Start

+   tutorial/ASimpleExample.md
+   tutorial/Installation.md
+   tutorial/DownloadModels.md
+   tutorial/Models.md
+   tutorial/Pipelines.md
+   tutorial/PromptProcessing.md
+   tutorial/Extensions.md
+   tutorial/Schedulers.md

 .. toctree::
   :maxdepth: 1
-   :caption: API Docs
+   :caption: Fine-tuning
+
+   finetune/overview.md
+   finetune/train_flux_lora.md
+   finetune/train_kolors_lora.md
+   finetune/train_sd3_lora.md
+   finetune/train_hunyuan_dit_lora.md
+   finetune/train_sdxl_lora.md
+   finetune/train_sd_lora.md
+   
--- a/docs/source/tutorial/ASimpleExample.md
+++ b/docs/source/tutorial/ASimpleExample.md
@@ -0,0 +1,81 @@
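+# Quick Start
+
+This document walks through a short piece of code to show how to get started creating with DiffSynth-Studio.
+
+## Installation
+
+Use the following commands to clone DiffSynth-Studio from GitHub and install it. For more information, see [Installation](./Installation.md).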
+
+```shell
+git clone https://github.com/modelscope/DiffSynth-Studio.git
+cd DiffSynth-Studio
+pip install -e .
+```
+
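+## Downloading Models
+
+DiffSynth-Studio ships with preset download links for a number of mainstream Diffusion models. You can download the preset model files directly with the `download_models` function.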
+
+```python
+from diffsynth import download_models
+
+download_models(["FLUX.1-dev"])
+```
+
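+We support downloading models from [ModelScope](https://www.modelscope.cn/) and [HuggingFace](https://huggingface.co/), as well as models that are not preset. See [Downloading Models](./DownloadModels.md).
+
+## Loading Models
+
+In DiffSynth-Studio, models are managed by a unified `ModelManager`. Taking FLUX.1-dev as an example, the model consists of two text encoders, a DiT, and a VAE, and is loaded as follows: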
+
+```python
+import torch
+from diffsynth import ModelManager
+
+model_manager = ModelManager(torch_dtype=torch.bfloat16, device="cuda")
+model_manager.load_models([
+    "models/FLUX/FLUX.1-dev/text_encoder/model.safetensors",
+    "models/FLUX/FLUX.1-dev/text_encoder_2",
+    "models/FLUX/FLUX.1-dev/ae.safetensors",
+    "models/FLUX/FLUX.1-dev/flux1-dev.safetensors"
+])
+```
+
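+You can put the paths of all the models you want to load into this list. For model weight files in formats such as `.safetensors`, `ModelManager` automatically determines the model type after loading. For models stored as folders, `ModelManager` tries to parse the `config.json` file inside and call the corresponding module from third-party libraries such as `transformers`. For the models supported by DiffSynth-Studio, see [Supported Models](./Models.md).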
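+
+Because the list mixes weight files and folders, it can help to confirm that every path exists before loading. The following is a minimal sketch using only the standard library. `missing_model_paths` is a hypothetical helper, not part of DiffSynth-Studio, and it checks only for existence, not for file integrity.
+
+```python
+from pathlib import Path
+from typing import Iterable, List
+
+
+def missing_model_paths(paths: Iterable[str]) -> List[str]:
+    """Return the entries of `paths` that exist neither as a file nor as a folder."""
+    return [p for p in paths if not Path(p).exists()]
+
+
+# The same four paths that are passed to `load_models` above.
+model_paths = [
+    "models/FLUX/FLUX.1-dev/text_encoder/model.safetensors",
+    "models/FLUX/FLUX.1-dev/text_encoder_2",
+    "models/FLUX/FLUX.1-dev/ae.safetensors",
+    "models/FLUX/FLUX.1-dev/flux1-dev.safetensors",
+]
+missing = missing_model_paths(model_paths)
+if missing:
+    print("Missing model paths:", missing)
+```
+
+If the list is empty, the same `model_paths` can be handed to `model_manager.load_models`.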
+
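+## Building a Pipeline
+
+DiffSynth-Studio provides multiple inference `Pipeline`s, which can obtain the models they need directly from `ModelManager` and initialize themselves. For example, the text-to-image `Pipeline` for FLUX.1-dev can be built like this: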
+
+```python
+from diffsynth import FluxImagePipeline
+pipe = FluxImagePipeline.from_model_manager(model_manager)
+```
+
+For more `Pipeline`s for image and video generation, see [Inference Pipelines](./Pipelines.md).
+
+## Generate!
+
+Write your prompt, hand it to DiffSynth-Studio, and start the generation task!
+
+```python
+import torch
+from diffsynth import ModelManager, FluxImagePipeline
+
+model_manager = ModelManager(torch_dtype=torch.bfloat16, device="cuda")
+model_manager.load_models([
+    "models/FLUX/FLUX.1-dev/text_encoder/model.safetensors",
+    "models/FLUX/FLUX.1-dev/text_encoder_2",
+    "models/FLUX/FLUX.1-dev/ae.safetensors",
+    "models/FLUX/FLUX.1-dev/flux1-dev.safetensors"
+])
+pipe = FluxImagePipeline.from_model_manager(model_manager)
+
+torch.manual_seed(0)
+image = pipe(
+    prompt="In a forest, a wooden plank sign reading DiffSynth",
+    height=576, width=1024
+)
+image.save("image.jpg")
+```
+
+![image](https://github.com/user-attachments/assets/15a52a2b-2f18-46fe-810c-cb3ad2853919)
--- a/docs/source/tutorial/DownloadModels.md
+++ b/docs/source/tutorial/DownloadModels.md
@@ -0,0 +1,30 @@
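+# Downloading Models
+
+DiffSynth-Studio ships with preset download links for a number of mainstream Diffusion models, so you can easily download and use them.
+
+## Downloading Preset Models
+
+You can download preset model files directly with the `download_models` function. The available model IDs are listed in the [config file](/diffsynth/configs/model_config.py).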
+
+```python
+from diffsynth import download_models
+
+download_models(["FLUX.1-dev"])
+```
+
+For VSCode users, once Pylance or another Python language service is enabled, typing `""` in the code displays all supported model IDs.
+
+![image](https://github.com/user-attachments/assets/2bbfec32-e015-45a7-98d9-57af13200b7c)
+
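+## Downloading Non-Preset Models
+
+You can choose models from two download sources, [ModelScope](https://modelscope.cn/models) and [HuggingFace](https://huggingface.co/models). Of course, you can also download the models you need manually with a browser or other tools.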
+
+```python
+from diffsynth.models.downloader import download_from_huggingface, download_from_modelscope
+
+# From ModelScope (recommended)
+download_from_modelscope("Kwai-Kolors/Kolors", "vae/diffusion_pytorch_model.fp16.bin", "models/kolors/Kolors/vae")
+# From HuggingFace
+download_from_huggingface("Kwai-Kolors/Kolors", "vae/diffusion_pytorch_model.fp16.safetensors", "models/kolors/Kolors/vae")
+```
--- a/docs/source/GetStarted/Extensions.md
+++ b/docs/source/GetStarted/Extensions.md
--- a/docs/source/GetStarted/Installation.md
+++ b/docs/source/GetStarted/Installation.md
@@ -1,5 +1,7 @@
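 # Installation

+DiffSynth-Studio can currently be installed either by cloning from GitHub or via pip. We recommend cloning from GitHub to get the latest features.
+
 ## Download from Source

 1. Clone the source repository: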
--- a/docs/source/GetStarted/Models.md
+++ b/docs/source/GetStarted/Models.md
@@ -2,6 +2,7 @@

 The models DiffSynth Studio supports so far are listed below:

+* [CogVideo](https://huggingface.co/THUDM/CogVideoX-5b)
 * [FLUX](https://huggingface.co/black-forest-labs/FLUX.1-dev)
 * [ExVideo](https://huggingface.co/ECNU-CILab/ExVideo-SVD-128f-v1)
 * [Kolors](https://huggingface.co/Kwai-Kolors/Kolors)
--- a/docs/source/GetStarted/Pipelines.md
+++ b/docs/source/GetStarted/Pipelines.md
@@ -1,27 +1,22 @@
-# Pipelines
+# Pipelines

-So far, the following table lists our pipelines and the models supported by each pipeline.
+DiffSynth-Studio includes multiple pipelines, divided into two categories: image generation and video generation.

-## Image Pipelines
-
-Pipelines for generating images from text descriptions. Each pipeline relies on specific encoder and decoder models.
+## Image Generation Pipelines

 | Pipeline                   | Models                                                     |
 |----------------------------|----------------------------------------------------------------|
-| HunyuanDiTImagePipeline     | text_encoder: HunyuanDiTCLIPTextEncoder<br>text_encoder_t5: HunyuanDiTT5TextEncoder<br>dit: HunyuanDiT<br>vae_decoder: SDVAEDecoder<br>vae_encoder: SDVAEEncoder |
 | SDImagePipeline             | text_encoder: SDTextEncoder<br>unet: SDUNet<br>vae_decoder: SDVAEDecoder<br>vae_encoder: SDVAEEncoder<br>controlnet: MultiControlNetManager<br>ipadapter_image_encoder: IpAdapterCLIPImageEmbedder<br>ipadapter: SDIpAdapter |
-| SD3ImagePipeline            | text_encoder_1: SD3TextEncoder1<br>text_encoder_2: SD3TextEncoder2<br>text_encoder_3: SD3TextEncoder3<br>dit: SD3DiT<br>vae_decoder: SD3VAEDecoder<br>vae_encoder: SD3VAEEncoder |
 | SDXLImagePipeline           | text_encoder: SDXLTextEncoder<br>text_encoder_2: SDXLTextEncoder2<br>text_encoder_kolors: ChatGLMModel<br>unet: SDXLUNet<br>vae_decoder: SDXLVAEDecoder<br>vae_encoder: SDXLVAEEncoder<br>controlnet: MultiControlNetManager<br>ipadapter_image_encoder: IpAdapterXLCLIPImageEmbedder<br>ipadapter: SDXLIpAdapter |
+| SD3ImagePipeline            | text_encoder_1: SD3TextEncoder1<br>text_encoder_2: SD3TextEncoder2<br>text_encoder_3: SD3TextEncoder3<br>dit: SD3DiT<br>vae_decoder: SD3VAEDecoder<br>vae_encoder: SD3VAEEncoder |
+| HunyuanDiTImagePipeline     | text_encoder: HunyuanDiTCLIPTextEncoder<br>text_encoder_t5: HunyuanDiTT5TextEncoder<br>dit: HunyuanDiT<br>vae_decoder: SDVAEDecoder<br>vae_encoder: SDVAEEncoder |
+| FluxImagePipeline     | text_encoder_1: FluxTextEncoder1<br>text_encoder_2: FluxTextEncoder2<br>dit: FluxDiT<br>vae_decoder: FluxVAEDecoder<br>vae_encoder: FluxVAEEncoder |

-## Video Pipelines
-
-Pipelines for generating videos from text descriptions. In addition to the models required for image generation, they include models for handling motion modules.
+## Video Generation Pipelines

 | Pipeline                   | Models                                                     |
 |----------------------------|----------------------------------------------------------------|
 | SDVideoPipeline            | text_encoder: SDTextEncoder<br>unet: SDUNet<br>vae_decoder: SDVAEDecoder<br>vae_encoder: SDVAEEncoder<br>controlnet: MultiControlNetManager<br>ipadapter_image_encoder: IpAdapterCLIPImageEmbedder<br>ipadapter: SDIpAdapter<br>motion_modules: SDMotionModel |
 | SDXLVideoPipeline          | text_encoder: SDXLTextEncoder<br>text_encoder_2: SDXLTextEncoder2<br>text_encoder_kolors: ChatGLMModel<br>unet: SDXLUNet<br>vae_decoder: SDXLVAEDecoder<br>vae_encoder: SDXLVAEEncoder<br>ipadapter_image_encoder: IpAdapterXLCLIPImageEmbedder<br>ipadapter: SDXLIpAdapter<br>motion_modules: SDXLMotionModel |
 | SVDVideoPipeline           | image_encoder: SVDImageEncoder<br>unet: SVDUNet<br>vae_encoder: SVDVAEEncoder<br>vae_decoder: SVDVAEDecoder |
-
-
-
+| CogVideoPipeline           | text_encoder: FluxTextEncoder2<br>dit: CogDiT<br>vae_encoder: CogVAEEncoder<br>vae_decoder: CogVAEDecoder |
--- a/docs/source/GetStarted/PromptProcessing.md
+++ b/docs/source/GetStarted/PromptProcessing.md
@@ -1,4 +1,4 @@
-# 提示词（Prompt）处理
+# Prompt Processing

 DiffSynth has built-in prompt processing, which is divided into:

--- a/docs/source/GetStarted/Schedulers.md
+++ b/docs/source/GetStarted/Schedulers.md