Two months ago, HuggingFace open-sourced "state-of-the-art diffusion models for image and audio generation in PyTorch" at github.com/huggingface/diffusers.
"Diffusers provides pretrained diffusion models across multiple modalities, such as vision and audio, and serves as a modular toolbox for inference and training of diffusion models."
Here's a text-to-image example from the repository's README.
# !pip install diffusers transformers
from diffusers import DiffusionPipeline
model_id = "CompVis/ldm-text2im-large-256"
# load model and scheduler
ldm = DiffusionPipeline.from_pretrained(model_id)
# run pipeline in inference (sample random noise and denoise)
prompt = "A painting of a squirrel eating a burger"
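# guidance_scale controls classifier-free guidance (higher values follow the
# prompt more closely at the cost of diversity); eta sets the DDIM sampling
# noise, with 0 giving deterministic sampling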
images = ldm([prompt], num_inference_steps=50, eta=0.3, guidance_scale=6)["sample"]
# save images
for idx, image in enumerate(images):
    image.save(f"squirrel-{idx}.png")
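The same from_pretrained interface covers the repository's other pipelines, which is what the "modular toolbox" claim above refers to. Here's a minimal sketch of unconditional image generation based on the README's DDPM example of that era; the google/ddpm-celebahq-256 checkpoint name and the "sample" dict key match early releases, while newer versions return an output object with an images attribute instead.
from diffusers import DDPMPipeline
# load an unconditional DDPM pipeline trained on CelebA-HQ faces
ddpm = DDPMPipeline.from_pretrained("google/ddpm-celebahq-256")
# sample one image from pure noise and save it; the "sample" key mirrors
# the text-to-image example above
image = ddpm()["sample"][0]
image.save("ddpm-generated.png")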
Diffusion models are the architecture behind Google's Imagen and OpenAI's DALL·E 2, both for generating images from text and for increasing the resolution of their outputs. Latent diffusion, the variant loaded in the example above, runs the denoising process in a compressed latent space learned by an autoencoder rather than directly on pixels, which makes training and sampling far cheaper.
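You can see that latent-space machinery by inspecting the components of the pipeline loaded earlier. A minimal sketch, assuming the attribute names exposed by the CompVis LDM pipeline (vqvae, bert, unet, scheduler); other pipelines and library versions expose different modules.
# the modular pieces inside the latent diffusion pipeline loaded above;
# attribute names are assumptions based on the CompVis LDM checkpoint
print(type(ldm.unet))       # denoising U-Net that operates on latents
print(type(ldm.vqvae))      # autoencoder mapping pixels to and from latents
print(type(ldm.bert))       # text encoder that conditions the U-Net
print(type(ldm.scheduler))  # noise schedule driving the denoising loop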