mirror of https://github.com/modelscope/DiffSynth-Studio.git (synced 2026-03-22 16:50:47 +00:00)
support ltx2.3 inference
@@ -251,11 +251,27 @@ class Modality:
    Input data for a single modality (video or audio) in the transformer.
    Bundles the latent tokens, timestep embeddings, positional information,
    and text conditioning context for processing by the diffusion transformer.
    Attributes:
        latent: Patchified latent tokens, shape ``(B, T, D)`` where *B* is
            the batch size, *T* is the total number of tokens (noisy +
            conditioning), and *D* is the input dimension.
        timesteps: Per-token timestep embeddings, shape ``(B, T)``.
        positions: Positional coordinates, shape ``(B, 3, T)`` for video
            (time, height, width) or ``(B, 1, T)`` for audio.
        context: Text conditioning embeddings from the prompt encoder.
        enabled: Whether this modality is active in the current forward pass.
        context_mask: Optional mask for the text context tokens.
        attention_mask: Optional 2-D self-attention mask, shape ``(B, T, T)``.
            Values in ``[0, 1]`` where ``1`` = full attention and ``0`` = no
            attention. ``None`` means unrestricted (full) attention between
            all tokens. Built incrementally by conditioning items; see
            :class:`~ltx_core.conditioning.types.attention_strength_wrapper.ConditioningItemAttentionStrengthWrapper`.
    """

    latent: (
        torch.Tensor
    )  # Shape: (B, T, D) where B is the batch size, T is the number of tokens, and D is input dimension
    sigma: torch.Tensor  # Shape: (B,). Current sigma value, used for cross-attention timestep calculation.
    timesteps: torch.Tensor  # Shape: (B, T) where T is the number of timesteps
    positions: (
        torch.Tensor
@@ -263,6 +279,7 @@ class Modality:
    context: torch.Tensor
    enabled: bool = True
    context_mask: torch.Tensor | None = None
    attention_mask: torch.Tensor | None = None


def to_denoised(
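
The shape contract in the `Modality` docstring can be checked mechanically. The sketch below is not part of this commit or of `ltx_core`: `check_modality_shapes` is a hypothetical helper that works on plain shape tuples (for example `tuple(tensor.shape)`), so it runs without `torch`. It assumes only the shapes stated in the docstring and field comments above.

```python
from typing import Optional, Tuple

Shape = Tuple[int, ...]


def check_modality_shapes(
    latent: Shape,
    timesteps: Shape,
    positions: Shape,
    sigma: Shape,
    attention_mask: Optional[Shape] = None,
) -> None:
    """Hypothetical helper: raise ValueError if shapes disagree with the docstring."""
    # latent is documented as (B, T, D).
    if len(latent) != 3:
        raise ValueError(f"latent must be (B, T, D), got {latent}")
    batch, tokens, _ = latent

    # timesteps is per token: (B, T).
    if timesteps != (batch, tokens):
        raise ValueError(f"timesteps must be {(batch, tokens)}, got {timesteps}")

    # positions is (B, 3, T) for video or (B, 1, T) for audio.
    if positions not in ((batch, 3, tokens), (batch, 1, tokens)):
        raise ValueError(f"positions must be (B, 3, T) or (B, 1, T), got {positions}")

    # sigma holds one value per batch element: (B,).
    if sigma != (batch,):
        raise ValueError(f"sigma must be {(batch,)}, got {sigma}")

    # attention_mask is optional; None means unrestricted attention.
    if attention_mask is not None and attention_mask != (batch, tokens, tokens):
        raise ValueError(
            f"attention_mask must be {(batch, tokens, tokens)}, got {attention_mask}"
        )


# Video modality: batch of 2, 8 tokens, input dimension 16, no attention mask.
check_modality_shapes((2, 8, 16), (2, 8), (2, 3, 8), (2,))
# Audio modality with an explicit (B, T, T) attention mask.
check_modality_shapes((2, 8, 16), (2, 8), (2, 1, 8), (2,), (2, 8, 8))
```

With real tensors, a call would pass `tuple(m.latent.shape)` and so on for a `Modality` instance `m`. Whether the upstream code performs any such validation is not shown in this diff.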