A Diffusion Transformer model for 3D video-like data was introduced in Wan 2.1 by the Alibaba Wan Team.
The model can be loaded with the following code snippet:
import torch
from diffusers import WanTransformer3DModel

transformer = WanTransformer3DModel.from_pretrained("Wan-AI/Wan2.1-T2V-1.3B-Diffusers", subfolder="transformer", torch_dtype=torch.bfloat16)
[[autodoc]] WanTransformer3DModel
[[autodoc]] models.modeling_outputs.Transformer2DModelOutput