mirror of https://github.com/modelscope/DiffSynth-Studio.git synced 2026-03-19 14:58:12 +00:00

Translation and Polishing: The Magic of Prompts

When generating images, we write a prompt to describe the desired content. The prompt directly affects the result, and writing a good one is something of an art: a well-crafted prompt can produce images with strong aesthetic appeal. We provide several models to help users refine their prompts.

Translation

Most text-to-image models currently support only English prompts, which is a hurdle for users who are not native English speakers. To address this, we can use an open-source translation model to translate the prompt into English. The following example takes "一个女孩" (a girl) as the prompt and translates it with the opus-mt-zh-en model (which can be downloaded from HuggingFace or ModelScope).

from diffsynth import ModelManager, SDXLImagePipeline, Translator
import torch

model_manager = ModelManager(
    torch_dtype=torch.float16, device="cuda",
    model_id_list=["BluePencilXL_v200", "opus-mt-zh-en"]
)
pipe = SDXLImagePipeline.from_model_manager(model_manager, prompt_refiner_classes=[Translator])

torch.manual_seed(0)
prompt = "一个女孩"
image = pipe(
    prompt=prompt, negative_prompt="",
    height=1024, width=1024, num_inference_steps=30
)
image.save("image_1.jpg")
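
Because the translator only matters for non-English prompts, it can be useful to check whether a prompt contains Chinese text before enabling it. The following is a minimal sketch and not part of DiffSynth-Studio: `contains_cjk` is a hypothetical helper that looks only at the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), which covers common Chinese characters but not every script or extension block.

```python
def contains_cjk(prompt: str) -> bool:
    """Return True if the prompt has at least one CJK Unified Ideograph.

    Hypothetical helper, not a DiffSynth-Studio API. Only the basic block
    U+4E00..U+9FFF is checked, so rare characters in extension blocks
    are not detected.
    """
    return any("\u4e00" <= ch <= "\u9fff" for ch in prompt)


if __name__ == "__main__":
    for prompt in ["一个女孩", "a girl"]:
        # Report whether each prompt would benefit from translation.
        print(repr(prompt), contains_cjk(prompt))
```

A check like this could decide whether to pass `prompt_refiner_classes=[Translator]` when building the pipeline, though whether that is worthwhile depends on how the prompts are collected.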

Polishing

Detailed prompts produce images with richer detail. We can use a prompt polishing model such as BeautifulPrompt (which can be downloaded from HuggingFace or ModelScope) to embellish simple prompts. This model tends to make the overall style of the image more visually rich.

This module can be enabled together with the translation module, but the order matters: translate first, then polish. In practice, list Translator before BeautifulPrompt in prompt_refiner_classes.

from diffsynth import ModelManager, SDXLImagePipeline, Translator, BeautifulPrompt
import torch

model_manager = ModelManager(
    torch_dtype=torch.float16, device="cuda",
    model_id_list=["BluePencilXL_v200", "opus-mt-zh-en", "BeautifulPrompt"]
)
pipe = SDXLImagePipeline.from_model_manager(model_manager, prompt_refiner_classes=[Translator, BeautifulPrompt])

torch.manual_seed(0)
prompt = "一个女孩"
image = pipe(
    prompt=prompt, negative_prompt="",
    height=1024, width=1024, num_inference_steps=30
)
image.save("image_2.jpg")
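
To see why the order matters, here is a toy sketch of a refiner chain. It assumes refiners are applied one after another in list order, as the ordering note above suggests. `toy_translate`, `toy_polish`, and `apply_refiners` are hypothetical stand-ins written for illustration; they are not the DiffSynth-Studio implementation and use no real models.

```python
from typing import Callable, List

# A refiner maps a prompt string to a new prompt string (assumed interface).
Refiner = Callable[[str], str]


def toy_translate(prompt: str) -> str:
    """Stand-in for a translation model: a one-entry lookup table."""
    table = {"一个女孩": "a girl"}
    return table.get(prompt, prompt)


def toy_polish(prompt: str) -> str:
    """Stand-in for a polishing model that only understands English.

    Non-ASCII prompts are returned unchanged, mimicking a polisher that
    cannot do anything useful with untranslated text.
    """
    if prompt.isascii():
        return prompt + ", highly detailed, beautiful lighting"
    return prompt


def apply_refiners(prompt: str, refiners: List[Refiner]) -> str:
    """Apply each refiner in list order, feeding the output to the next."""
    for refiner in refiners:
        prompt = refiner(prompt)
    return prompt


if __name__ == "__main__":
    # Translate first, then polish: the polisher sees English text.
    print(apply_refiners("一个女孩", [toy_translate, toy_polish]))
    # Polish first, then translate: the polisher is skipped in effect.
    print(apply_refiners("一个女孩", [toy_polish, toy_translate]))
```

In this toy setup, the first ordering yields an embellished English prompt, while the second yields only the bare translation because the polisher never sees English input.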

We have also integrated a Tongyi Qwen model that handles both translation and polishing of the prompt in a single step.

from diffsynth import ModelManager, SDXLImagePipeline, QwenPrompt
import torch

model_manager = ModelManager(
    torch_dtype=torch.float16, device="cuda",
    model_id_list=["BluePencilXL_v200", "QwenPrompt"]
)
pipe = SDXLImagePipeline.from_model_manager(model_manager, prompt_refiner_classes=[QwenPrompt])

torch.manual_seed(0)
prompt = "一个女孩"
image = pipe(
    prompt=prompt, negative_prompt="",
    height=1024, width=1024, num_inference_steps=30
)
image.save("image_3.jpg")
