Mirror of https://github.com/modelscope/DiffSynth-Studio.git (synced 2026-03-21 08:08:13 +00:00)
support sd35-lora
@@ -256,6 +256,72 @@ image = pipe(
image.save("image_with_lora.jpg")
```

### Stable Diffusion 3.5 Series

Download the text encoders and the DiT model file with the following code:

```python
from diffsynth import download_models

download_models(["StableDiffusion3.5-large"])
```

After the download, the files should be laid out as follows:

```
models/stable_diffusion_3
├── Put Stable Diffusion 3 checkpoints here.txt
├── sd3.5_large.safetensors
└── text_encoders
    ├── clip_g.safetensors
    ├── clip_l.safetensors
    └── t5xxl_fp16.safetensors
```
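
Before starting a long training run, it can help to confirm that all four model files from the listing above are actually on disk. The following is a minimal sketch, not part of DiffSynth-Studio: the helper name `find_missing_files` and the constant `EXPECTED_FILES` are hypothetical, and the file names are copied from the directory tree.

```python
from pathlib import Path

# Relative paths copied from the directory listing above.
EXPECTED_FILES = [
    "sd3.5_large.safetensors",
    "text_encoders/clip_g.safetensors",
    "text_encoders/clip_l.safetensors",
    "text_encoders/t5xxl_fp16.safetensors",
]


def find_missing_files(model_dir="models/stable_diffusion_3"):
    """Return the expected files that are not present under model_dir.

    Hypothetical helper for illustration; it only checks that the files
    exist, not that they are complete or uncorrupted.
    """
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = find_missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All Stable Diffusion 3.5 files are in place.")
```

If the list is non-empty, re-running the download step is the simplest fix.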

Launch the training task using the following command:

```
CUDA_VISIBLE_DEVICES="0" python examples/train/stable_diffusion_3/train_sd3_lora.py \
  --pretrained_path models/stable_diffusion_3/text_encoders/clip_g.safetensors,models/stable_diffusion_3/text_encoders/clip_l.safetensors,models/stable_diffusion_3/text_encoders/t5xxl_fp16.safetensors,models/stable_diffusion_3/sd3.5_large.safetensors \
  --dataset_path data/dog \
  --output_path ./models \
  --max_epochs 1 \
  --steps_per_epoch 500 \
  --height 1024 \
  --width 1024 \
  --center_crop \
  --precision "16" \
  --learning_rate 1e-4 \
  --lora_rank 4 \
  --lora_alpha 4 \
  --use_gradient_checkpointing
```
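
The `--pretrained_path` argument above is a single comma-separated string of four checkpoint paths, which is easy to mistype by hand. As a rough sketch, the same command can be assembled in Python. The function `build_train_command` and the constants `MODEL_DIR` and `CHECKPOINTS` are hypothetical names introduced here; every flag and value is copied from the command above.

```python
import shlex

MODEL_DIR = "models/stable_diffusion_3"

# Same four files, in the same order, as in the command above.
CHECKPOINTS = [
    f"{MODEL_DIR}/text_encoders/clip_g.safetensors",
    f"{MODEL_DIR}/text_encoders/clip_l.safetensors",
    f"{MODEL_DIR}/text_encoders/t5xxl_fp16.safetensors",
    f"{MODEL_DIR}/sd3.5_large.safetensors",
]


def build_train_command(checkpoints, dataset_path="data/dog", output_path="./models"):
    """Assemble the training command as an argument list (hypothetical helper)."""
    return [
        "python", "examples/train/stable_diffusion_3/train_sd3_lora.py",
        "--pretrained_path", ",".join(checkpoints),  # comma-separated, no spaces
        "--dataset_path", dataset_path,
        "--output_path", output_path,
        "--max_epochs", "1",
        "--steps_per_epoch", "500",
        "--height", "1024",
        "--width", "1024",
        "--center_crop",
        "--precision", "16",
        "--learning_rate", "1e-4",
        "--lora_rank", "4",
        "--lora_alpha", "4",
        "--use_gradient_checkpointing",
    ]


if __name__ == "__main__":
    # Print a shell-quoted version that can be pasted into a terminal.
    print(shlex.join(build_train_command(CHECKPOINTS)))
```

The printed command does not include `CUDA_VISIBLE_DEVICES="0"`; set that variable in the shell, or in the environment passed to the process, if you launch the list from Python.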

For more information about the parameters, run `python examples/train/stable_diffusion_3/train_sd3_lora.py -h`.

After training, use `model_manager.load_lora` to load the LoRA for inference.

```python
from diffsynth import ModelManager, SD3ImagePipeline
import torch

# Load the three text encoders and the SD3.5 Large model in half precision.
model_manager = ModelManager(torch_dtype=torch.float16, device="cuda",
                             file_path_list=[
                                 "models/stable_diffusion_3/text_encoders/clip_g.safetensors",
                                 "models/stable_diffusion_3/text_encoders/clip_l.safetensors",
                                 "models/stable_diffusion_3/text_encoders/t5xxl_fp16.safetensors",
                                 "models/stable_diffusion_3/sd3.5_large.safetensors"
                             ])
# Load the LoRA checkpoint written by the training run.
model_manager.load_lora("models/lightning_logs/version_0/checkpoints/epoch=0-step=500.ckpt", lora_alpha=1.0)
pipe = SD3ImagePipeline.from_model_manager(model_manager)

# Fix the seed so the generated image is reproducible.
torch.manual_seed(0)
image = pipe(
    prompt="a dog is jumping, flowers around the dog, the background is mountains and clouds",
    num_inference_steps=30, cfg_scale=7
)
image.save("image_with_lora.jpg")
```
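
The checkpoint path in the example above is hard-coded to `version_0`. If you have trained more than once, the checkpoint you want may sit in a different `version_*` directory. The sketch below picks the most recently modified `.ckpt` file; it assumes the output layout matches the path shown above (`<output_path>/lightning_logs/version_*/checkpoints/`), and `latest_checkpoint` is a hypothetical helper, not a DiffSynth-Studio function.

```python
from pathlib import Path


def latest_checkpoint(output_path="models"):
    """Return the most recently modified .ckpt under lightning_logs, or None.

    Assumes the directory layout seen in the example path above;
    adjust the glob pattern if your runs are stored differently.
    """
    candidates = Path(output_path).glob("lightning_logs/version_*/checkpoints/*.ckpt")
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    ckpt = latest_checkpoint()
    print(ckpt if ckpt is not None else "No checkpoint found.")
```

The returned path can then be passed to `model_manager.load_lora` as a string, for example `model_manager.load_lora(str(ckpt), lora_alpha=1.0)`.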

### Stable Diffusion 3

The training script requires only one file. You can use [`sd3_medium_incl_clips.safetensors`](https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/sd3_medium_incl_clips.safetensors) (without T5 encoder) or [`sd3_medium_incl_clips_t5xxlfp16.safetensors`](https://huggingface.co/stabilityai/stable-diffusion-3-medium/resolve/main/sd3_medium_incl_clips_t5xxlfp16.safetensors) (with T5 encoder). Please use the following code to download these files:
@@ -285,7 +351,7 @@ CUDA_VISIBLE_DEVICES="0" python examples/train/stable_diffusion_3/train_sd3_lora
   --height 1024 \
   --width 1024 \
   --center_crop \
-  --precision "16-mixed" \
+  --precision "16" \
   --learning_rate 1e-4 \
   --lora_rank 4 \
   --lora_alpha 4 \