Diffusion is a random process that generates a different output every time. For testing and reproducing results, you may want to generate the same output on every run. This guide explains how to control sources of randomness and enable deterministic algorithms.

1. Using Generators

To generate reproducible results, you can pass a seeded torch.Generator to the pipeline:

import torch
from diffusers import DDIMPipeline

pipeline = DDIMPipeline.from_pretrained("google/ddpm-cifar10-32")
# create a dedicated, seeded Generator and pass it to the pipeline
generator = torch.Generator(device="cpu").manual_seed(0)
# DDIMPipeline is unconditional, so it takes no prompt argument
image = pipeline(generator=generator).images[0]

Warning: Pass the Generator object itself to the pipeline rather than an integer seed. A Generator holds a random state that is consumed and advanced each time it is used, so once consumed, the same Generator object produces different results in subsequent calls, even across different pipelines, because its state has changed.
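
For example, a minimal sketch of this behavior with the same unconditional pipeline as above:

import torch
from diffusers import DDIMPipeline

pipeline = DDIMPipeline.from_pretrained("google/ddpm-cifar10-32")

generator = torch.Generator(device="cpu").manual_seed(0)
first = pipeline(generator=generator).images[0]   # consumes the Generator's state
second = pipeline(generator=generator).images[0]  # a different image: the state has advanced

# to reproduce `first`, seed a fresh Generator instead of reusing the consumed one
generator = torch.Generator(device="cpu").manual_seed(0)
reproduced = pipeline(generator=generator).images[0]  # matches `first`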

2. Deterministic Algorithms

PyTorch supports deterministic algorithms for certain operations to produce the same results, though they may be slower.
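
The PyTorch switch behind this is torch.use_deterministic_algorithms, which selects deterministic implementations where they exist and raises an error for operations that have none:

import torch

# opt in to deterministic kernels; operations without a deterministic
# implementation will raise a RuntimeError instead of running non-deterministically
torch.use_deterministic_algorithms(True)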

You can use Diffusers’ enable_full_determinism function to enable deterministic algorithms:

import torch
from diffusers.utils.testing_utils import enable_full_determinism

enable_full_determinism()
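
Call enable_full_determinism once at the start of your script, before loading models or running any CUDA work, since the CUBLAS_WORKSPACE_CONFIG environment variable is only guaranteed to take effect if it is set before cuBLAS is initialized.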

Under the hood, enable_full_determinism does the following (a rough code sketch appears after this list):

  • Setting the environment variable CUBLAS_WORKSPACE_CONFIG to :16:8 so that only one workspace buffer size is used during runtime; cuBLAS can otherwise behave non-deterministically when operations run in more than one CUDA stream.
  • Disabling cuDNN's benchmarking of convolution algorithms by setting torch.backends.cudnn.benchmark=False; the benchmark may select a different algorithm on each run depending on hardware or benchmarking noise.
  • Disabling TensorFloat-32 (TF32) operations in favor of more precise and consistent full-precision float32 operations.
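
Put together, the function is roughly equivalent to the following sketch (the exact implementation in diffusers.utils.testing_utils may differ slightly):

import os
import torch

# a rough sketch of enable_full_determinism(), not its exact source
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":16:8"  # one cuBLAS workspace size only
torch.use_deterministic_algorithms(True)         # error on non-deterministic ops
torch.backends.cudnn.deterministic = True        # deterministic cuDNN kernels
torch.backends.cudnn.benchmark = False           # no convolution autotuning
torch.backends.cuda.matmul.allow_tf32 = False    # full-precision float32 matmuls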

Source: Hugging Face Diffusers Documentation