SDXL Turbo: Generate images from text on your computer

Mar 3, 2024

I've been experimenting with text-to-image generation lately, using OpenAI's GPT-4 and DALL-E models, which cost me nothing extra because I already pay for an OpenAI subscription. However, I recently came across Stable Diffusion and Midjourney, which seem more powerful.

I haven't explored other image generation options before, but these models look promising. I'm working on a side project that requires generating images, so I'm excited to experiment with them.

Generating images with SDXL-Turbo

One of the models that caught my attention is SDXL-Turbo from Stability AI. Both the model and the code are available on Hugging Face under a non-commercial license.

If you want to learn more about how these models work, check out this paper. The main thing that stands out about this model is its ability to generate images quickly. On my computer, it takes just a few seconds, while on a more powerful machine in the cloud it's almost instantaneous. They also offer some interesting tools for working with images on their platform.

Installing and running the environment

With that said, let's set up the environment locally. You'll need Python installed.

First, install the libraries needed to run the models:

python3 -m pip install diffusers transformers accelerate --upgrade
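
To confirm that the install worked before loading any model, a small check like the following can help. This is a minimal sketch using only the standard library; the `installed_versions` helper is a name made up for this example, and `torch` is included in the list on the assumption that it was pulled in as a dependency, since the script in this post imports it.

```python
from importlib import metadata

# Packages the script in this post relies on. `torch` is not in the pip
# command, so its presence here is an assumption about the dependency chain.
PACKAGES = ["torch", "diffusers", "transformers", "accelerate"]


def installed_versions(packages=PACKAGES):
    """Return a dict mapping each package name to its version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as not installed can be added with the same `python3 -m pip install` command.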

If you have an NVIDIA graphics card, make sure PyTorch runs on the CUDA backend (device "cuda"). On Apple Silicon (M1, M2, M3) machines, use the MPS backend (device "mps") instead.

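I encountered the error "Torch not compiled with CUDA enabled". It appears when the code requests the "cuda" device but the installed PyTorch build has no CUDA support, which is the case on a Mac.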
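
One way to avoid hard-coding the device, and with it the CUDA error above, is to ask PyTorch which backends are available and fall back to the CPU otherwise. The sketch below is an illustration, not part of the original setup: `pick_device` and `detect_device` are hypothetical helper names, and the order of preference (CUDA, then MPS, then CPU) is an assumption.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a PyTorch device string from backend availability flags."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends; fall back to "cpu" if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return pick_device(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )


if __name__ == "__main__":
    print(f"Using device: {detect_device()}")
```

The returned string can then be passed to `pipe.to(...)` in place of a fixed device name. Note that half-precision weights may run slowly or fail on the CPU, so the "cpu" fallback is best treated as a last resort.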

Now you can use the following script as a base for generating images:

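from diffusers import AutoPipelineForText2Image
import torch

# Download (on first run) and load SDXL-Turbo in half precision to save memory.
pipe = AutoPipelineForText2Image.from_pretrained("stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16")
# Use "mps" on Apple Silicon or "cuda" on an NVIDIA GPU.
pipe.to("mps")

prompt = "monk wearing a golden tunic meditating in a waterfall with a golden aura"

# SDXL-Turbo is built for single-step generation and does not use guidance,
# so guidance_scale is set to 0.0.
image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images[0]
image.save("output.png")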
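
Since loading the model is the slow part, it makes sense to load the pipeline once and reuse it for several prompts. The following is a rough sketch of that idea: `generate_images` is a hypothetical helper, the `outputs` folder and file naming are arbitrary choices, and `pipe` is assumed to behave like the pipeline object created in the script above.

```python
from pathlib import Path


def generate_images(pipe, prompts, out_dir="outputs"):
    """Run each prompt through `pipe` and save one PNG per prompt.

    `pipe` is assumed to be callable with prompt, num_inference_steps and
    guidance_scale, and to return an object whose `.images` list holds
    images that have a `.save(path)` method.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, prompt in enumerate(prompts):
        # Same single-step, no-guidance settings as the base script.
        result = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0)
        path = out / f"image_{index:03d}.png"
        result.images[0].save(path)
        saved.append(path)
    return saved
```

With the pipeline from the script already loaded, a call such as `generate_images(pipe, ["a red fox in the snow", "a lighthouse at dusk"])` would write `image_000.png` and `image_001.png` into the `outputs` folder.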

The generated images are very realistic, although the model struggles to render people's faces correctly. I'll explain more in another post.

Here's an example image generated using the SDXL-Turbo model:

Monk generated using the SDXL-Turbo model.
