SDXL Turbo Just Released by Stability AI: A Real-Time Text-to-Image Generation Model

Dee Miller November 28, 2023

Stability AI has released SDXL Turbo, a distilled version of its Stable Diffusion XL (SDXL) model built for real-time text-to-image generation.

The model relies on a new distillation technique that cuts image generation from 50 sampling steps to a single step while keeping image quality high.

Turbocharged Performance

SDXL Turbo stands out for single-step image generation. Cutting the sampling process to one step is notable chiefly because image quality remains high.

What powers this performance, you might ask? According to the accompanying research paper, it is a distillation technique that combines adversarial training with score distillation.
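Schematically, a combination of adversarial training and score distillation can be written as a weighted sum of two loss terms. This is only a sketch: the symbols and the weighting factor are chosen here for illustration and are not taken from the article.

```latex
\mathcal{L}_{\text{ADD}} = \mathcal{L}_{\text{adv}} + \lambda \, \mathcal{L}_{\text{distill}}
```

Here \mathcal{L}_{\text{adv}} stands for an adversarial term that rewards the fast model for producing images a discriminator cannot tell from real ones, \mathcal{L}_{\text{distill}} stands for a score distillation term that pulls its outputs toward those of a pretrained teacher diffusion model, and \lambda is an assumed weight balancing the two.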

Key capabilities include:

Real-time image generation – SDXL Turbo can create images with a single forward pass of the network, enabling real-time synthesis suitable for applications like video games or augmented reality.
High resolution output – The model generates crisp, detailed 512×512 images, improving on prior GAN-based approaches limited to 256×256.
Retains iterative refinement – Unlike GANs, SDXL Turbo can iteratively enhance output quality over multiple sampling steps.
State-of-the-art fidelity – With four steps, SDXL Turbo matches or beats top diffusion models that need 16-50 steps, demonstrating world-class sample quality.

Availability and Licensing

For AI enthusiasts, the weights and code for the model are available on Hugging Face, released under a non-commercial research license that permits personal and non-commercial use.

It’s essential to note that SDXL Turbo, despite its exceptional offerings, is not intended for commercial use yet.
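For readers who want to try the published weights, the following is a minimal sketch using the Hugging Face diffusers library. It assumes the model ID stabilityai/sdxl-turbo, a CUDA-capable GPU, and installed torch and diffusers packages; the helper names turbo_call_kwargs and generate are hypothetical and exist only for this example.

```python
"""Sketch: few-step text-to-image with SDXL Turbo via Hugging Face diffusers."""


def turbo_call_kwargs(prompt: str, steps: int = 1) -> dict:
    """Build the pipeline call arguments for a Turbo-style run."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return {
        "prompt": prompt,
        "num_inference_steps": steps,  # 1 for real-time use, up to 4 for refinement
        "guidance_scale": 0.0,         # Turbo is run without classifier-free guidance
        "height": 512,
        "width": 512,
    }


def generate(prompt: str, steps: int = 1, device: str = "cuda"):
    """Download the weights (several GB) and return a PIL image."""
    # Imported lazily so the helper above works without torch/diffusers installed.
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo",  # assumed Hugging Face model ID
        torch_dtype=torch.float16,
        variant="fp16",
    )
    pipe.to(device)
    return pipe(**turbo_call_kwargs(prompt, steps)).images[0]
```

Calling generate("a cinematic photo of a lighthouse at dusk").save("lighthouse.png") would write a single-step image to disk; passing steps=4 trades some speed for extra refinement. The non-commercial license terms apply to anything produced this way.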

SDXL Turbo in Focus

SDXL Turbo builds upon the foundation of its predecessor with a novel distillation technique called Adversarial Diffusion Distillation (ADD).

ADD is the first method to unlock single-step, real-time image synthesis with foundation models. This capability opens new possibilities in applications requiring rapid image generation.

The speed and quality of image generation make ADD a promising candidate for various applications, including gaming, virtual reality, and real-time content creation.

Advantages Over Other Models

The ADD distillation method gives SDXL Turbo an edge over competing models by enabling single-step image generation. It also avoids the blurriness and artifacts often associated with other few-step approaches to diffusion models.

Figure: User preference study (single step). SDXL-Turbo beats other models in human evaluation tests.

In blind comparisons against StyleGAN-T++, OpenMUSE, IF-XL, SDXL, and LCM-XL using identical prompts, human evaluators preferred SDXL Turbo. With a single step it outperformed a four-step configuration of LCM-XL, and with four steps it outperformed a 50-step configuration of the original SDXL.

On an A100 GPU, SDXL-Turbo can generate a 512×512 image in 207ms (prompt encoding, a single denoising step, and decoding, all in fp16), with a single UNet forward evaluation accounting for 67ms of that time.
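To put those timings in perspective, here is a small back-of-the-envelope sketch. It uses only the two reported figures (207ms and 67ms) and assumes, purely for illustration, that prompt encoding and decoding cost the same regardless of step count and that each extra step adds exactly one UNet evaluation. The function names are hypothetical.

```python
TOTAL_MS = 207.0  # reported: prompt encoding + one denoising step + decoding (fp16)
UNET_MS = 67.0    # reported: one UNet forward evaluation


def estimated_latency_ms(steps: int) -> float:
    """Estimate latency, assuming each extra step costs one more UNet pass."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    fixed_ms = TOTAL_MS - UNET_MS  # encoding + decoding, assumed constant
    return fixed_ms + steps * UNET_MS


def images_per_second(steps: int) -> float:
    """Convert the estimated latency into a sequential throughput figure."""
    return 1000.0 / estimated_latency_ms(steps)


for steps in (1, 4):
    print(
        f"{steps} step(s): ~{estimated_latency_ms(steps):.0f} ms, "
        f"~{images_per_second(steps):.1f} images/s"
    )
```

Under these assumptions, a single step works out to just under five images per second, and a four-step run would take roughly 408ms. The four-step number is an extrapolation, not a measured result.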

Demo Time with Clipdrop

If you’re interested, you can try SDXL Turbo yourself on Stability AI’s image editing platform, Clipdrop. It’s open to anyone with a web browser as a free trial.

Author

Dee Miller
Dee Miller is the founder of and writer at Go Find AI. Dee has been into AI since about 2018, when he first realized how revolutionary the technology is. Now that AI is all the hype, Dee is looking to make navigating the AI space easier for newcomers.