CVPR 2026 Tutorial • Colorado Convention Center, Denver, CO • June 3/4

Accelerated Diffusion Models:
From Theory to Interactive World Models

Overview

How can we make diffusion models fast enough for real-time interactive applications?

Diffusion models and flow-based methods have revolutionized generative learning in the visual domain, setting new standards for image, video, and 3D content creation. However, as the field shifts toward interactive applications—such as real-time editing, world models, and embodied AI—the need for low-latency feedback has become critical. Currently, the high computational cost of iterative sampling hinders real-world deployment. While various acceleration techniques exist, the lack of a unified resource makes it difficult to bridge the gap between theory and practice.

To address this challenge, this tutorial offers a practice-oriented course designed to equip researchers and practitioners with the tools to accelerate diffusion pipelines, supported by the open-source FastGen library. The curriculum covers three primary areas: general sampling acceleration, training-based distillation for efficient few-step samplers, and applications in video and interactive world models. The session will conclude with a panel discussion on open problems and the future of real-time diffusion.

Organizers & Presenters

Panelists

Robin Rombach
Black Forest Labs
Jiaming Song
Luma AI
Ruiqi Gao
Google DeepMind

Schedule

45 min General Paradigms for Accelerating Diffusion Models
Covering advanced differential equation solvers, low-dimensional latent diffusion, improved noising processes, and architecture-based acceleration.
Arash Vahdat
15 min break
45 min Accelerating Diffusion Models with Step Distillation
Covering trajectory-based distillation approaches (such as progressive distillation, consistency models, and flow maps) and distribution distillation methods (such as DMD, denoising diffusion GANs, and LADD).
Julius Berner
15 min break
45 min From Images to Interactive World Models
Covering key challenges in video-based interactive world models (such as real-time sampling, long-context memory, and block-wise causal generation) and representative approaches (such as CausVid, Self-Forcing, and APT2).
Weili Nie
15 min break
45 min Panel Discussion and Q&A
Covering research challenges in fast diffusion models, trade-offs between acceleration strategies, and practical considerations when choosing algorithms for real-world applications.
Panelists