mirror of https://github.com/modelscope/DiffSynth-Studio.git
synced 2026-03-19 06:48:12 +00:00
Add files via upload
The previous upload went into the docs folder by mistake; this commit corrects that.
@@ -1,6 +1,6 @@
# ControlNet, LoRA, and IP-Adapter: Precision Control Techniques

On top of a base text-to-image model, various adapter-based models can be used to control the generation process.

Let's download the models we'll be using in the upcoming examples:
@@ -4,7 +4,7 @@ When generating images, we need to write prompt words to describe the content of
## Translation

Most text-to-image models currently only support English prompts, which can be challenging for users who are not native English speakers. To address this, we can use an open-source translation model to translate prompts into English. In the following example, we take "一个女孩" (a girl) as the prompt and use the opus-mt-zh-en model (which can be downloaded from [HuggingFace](https://huggingface.co/Helsinki-NLP/opus-mt-zh-en) or [ModelScope](https://modelscope.cn/models/moxying/opus-mt-zh-en)) for translation.
```python
from diffsynth import ModelManager, SDXLImagePipeline, Translator
import torch
```
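The pipeline code in this hunk is truncated, but the translate-then-generate flow it sets up can be sketched with a stub translator. `translate_zh_to_en` and `generate` below are hypothetical stand-ins, not DiffSynth's real API, so the flow is visible without downloading any weights:

```python
# Sketch of the translate-before-generate flow. translate_zh_to_en is a
# hypothetical stand-in for the opus-mt-zh-en model.
def translate_zh_to_en(prompt: str) -> str:
    glossary = {"一个女孩": "a girl"}  # a real model translates arbitrary text
    return glossary.get(prompt, prompt)

def generate(prompt: str) -> str:
    english_prompt = translate_zh_to_en(prompt)  # translate before generation
    return f"generated image for: {english_prompt}"

print(generate("一个女孩"))  # → generated image for: a girl
```

The key design point is simply ordering: the translator runs as a preprocessing step, so the diffusion model itself only ever sees English text.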
@@ -1,16 +1,12 @@
# Training Framework

We have implemented a training framework for text-to-image diffusion models, allowing users to effortlessly train LoRA models with our framework. Our provided scripts come with the following features:

* **Comprehensive Functionality**: Our training framework supports multi-GPU and multi-node configurations, is optimized for acceleration with DeepSpeed, and includes gradient checkpointing to accommodate models with higher memory requirements.
* **Succinct Code**: We have avoided large, complex code blocks. The general module is implemented in `diffsynth/trainers/text_to_image.py`, while model-specific training scripts contain only the minimal code necessary for the model architecture, facilitating ease of use for academic researchers.
* **Modular Design**: Built on the versatile PyTorch Lightning framework, our training framework is decoupled in functionality, enabling developers to easily incorporate additional training techniques by modifying our scripts to suit their specific needs.

Examples of images fine-tuned with LoRA. Prompts are "一只小狗蹦蹦跳跳,周围是姹紫嫣红的鲜花,远处是山脉" (a little dog jumping around, with colorful flowers all around and mountains in the distance) for Chinese models, or "a dog is jumping, flowers around the dog, the background is mountains and clouds" for English models.

||FLUX.1-dev|Kolors|Stable Diffusion 3|Hunyuan-DiT|
|-|-|-|-|-|
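The gradient-checkpointing feature listed above trades compute for memory: instead of caching every intermediate activation for the backward pass, only the input is kept and activations are recomputed when needed. A framework-free toy sketch of that trade-off (not DiffSynth's implementation):

```python
# Toy illustration of gradient checkpointing for a chain of layers:
# without checkpointing, every intermediate activation is saved;
# with checkpointing, only the input is saved and activations are
# recomputed on demand during the backward pass.
def run_layers(layers, x, checkpoint=False):
    if not checkpoint:
        saved = [x]
        for f in layers:
            x = f(x)
            saved.append(x)  # memory grows with depth
        return x, saved
    y = x
    for f in layers:
        y = f(y)
    return y, [x]            # memory stays constant

def recompute_activation(layers, x, k):
    # Rebuild the activation after layer k from the stored input.
    for f in layers[:k + 1]:
        x = f(x)
    return x

layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
out, saved = run_layers(layers, 5, checkpoint=True)
print(out, len(saved))
```

The output is identical either way; checkpointing only changes how much is stored versus recomputed.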
@@ -53,11 +49,11 @@ Please note that if the model is a Chinese model (e.g., Hunyuan-DiT and Kolors),
```
file_name,text
00.jpg,一只小狗
01.jpg,一只小狗
02.jpg,一只小狗
03.jpg,一只小狗
04.jpg,一只小狗
```

Here "一只小狗" means "a little dog".
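As a minimal sketch, the `metadata.csv` above can be generated with Python's standard `csv` module. The file names and captions are just the placeholders from the example; substitute your own images and captions:

```python
import csv

# Write a metadata.csv mapping each image file to its caption.
# "一只小狗" (a little dog) mirrors the example above; replace the
# rows with your own dataset's files and captions.
rows = [(f"{i:02d}.jpg", "一只小狗") for i in range(5)]

with open("metadata.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["file_name", "text"])
    writer.writerows(rows)

print(open("metadata.csv", encoding="utf-8").read())
```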
## Train LoRA Model
@@ -1,4 +1,4 @@
# Training FLUX LoRA

The following files will be used to build the FLUX model. You can download them from [HuggingFace](https://huggingface.co/black-forest-labs/FLUX.1-dev) or [ModelScope](https://www.modelscope.cn/models/ai-modelscope/flux.1-dev), or you can use the following code to download these files:

```python
from diffsynth import download_models

download_models(["FLUX.1-dev"])
```
@@ -14,7 +14,7 @@ models/stable_diffusion
└── v1-5-pruned-emaonly.safetensors
```

Start the training task with the following command:

```
CUDA_VISIBLE_DEVICES="0" python examples/train/stable_diffusion/train_sd_lora.py \
```
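The remaining flags of the command above are elided in this hunk. As a rough sketch of how such a launcher reads `CUDA_VISIBLE_DEVICES` and its CLI flags — the flag names below are hypothetical placeholders, not `train_sd_lora.py`'s real interface:

```python
import argparse
import os

# CUDA_VISIBLE_DEVICES="0" makes only GPU 0 visible to the process;
# the training script itself just parses its CLI flags.
# Flag names here are hypothetical, not the script's real interface.
parser = argparse.ArgumentParser(description="LoRA training launcher sketch")
parser.add_argument("--pretrained_path", type=str, required=True)
parser.add_argument("--dataset_path", type=str, required=True)
parser.add_argument("--lora_rank", type=int, default=4)

args = parser.parse_args([
    "--pretrained_path", "models/stable_diffusion/v1-5-pruned-emaonly.safetensors",
    "--dataset_path", "data/dog_dataset",
])

visible_gpus = os.environ.get("CUDA_VISIBLE_DEVICES", "all")
print(f"training on GPU(s): {visible_gpus}, lora_rank={args.lora_rank}")
```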
@@ -1,5 +1,7 @@
# Installation

Currently, DiffSynth-Studio supports installation by cloning from GitHub or via pip. We recommend cloning from GitHub to get the latest features.

## From Source

1. Clone the source repository:
@@ -1,6 +1,6 @@
# Models

So far, the models supported by DiffSynth Studio are as follows:

* [CogVideoX](https://huggingface.co/THUDM/CogVideoX-5b)
* [FLUX](https://huggingface.co/black-forest-labs/FLUX.1-dev)
@@ -28,7 +28,7 @@ pipe = SDXLImagePipeline.from_model_manager(model_manager, prompt_refiner_classe
### Prompt Extenders

When loading the model pipeline, you can specify the desired prompt extender using the `prompt_extender_classes` parameter. For example code, refer to [omost_flux_text_to_image.py](examples/image_synthesis/omost_flux_text_to_image.py).

```python
pipe = FluxImagePipeline.from_model_manager(model_manager, prompt_extender_classes=[OmostPromter])
```
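Conceptually, a prompt extender rewrites a short user prompt into a richer one before generation, and the pipeline applies the extenders it was given in order. A minimal stub of that idea — the class and method names here are hypothetical, not DiffSynth's real interface:

```python
# Minimal sketch of the prompt-extender idea: each extender transforms
# the prompt, and the pipeline applies them in order before generation.
# Class/method names are hypothetical, not DiffSynth's real interface.
class DetailExtender:
    def extend(self, prompt: str) -> str:
        return prompt + ", highly detailed, cinematic lighting"

class Pipeline:
    def __init__(self, extenders):
        self.extenders = extenders

    def build_prompt(self, prompt: str) -> str:
        for extender in self.extenders:
            prompt = extender.extend(prompt)
        return prompt

pipe = Pipeline(extenders=[DetailExtender()])
print(pipe.build_prompt("a girl"))
```

Passing extender classes at pipeline construction, as `prompt_extender_classes` does, keeps the generation call itself unchanged while letting the prompt-rewriting strategy vary.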