On Wednesday, Apple released optimizations that enable the Stable Diffusion AI image generator to run on Apple Silicon using Core ML, Apple’s proprietary framework for machine learning models. The optimizations allow app developers to use Apple Neural Engine hardware to run Stable Diffusion about twice as fast as previous Mac-based methods.
Launched in August, Stable Diffusion (SD) is an open source AI image synthesis model that generates new images from text input. For example, typing “astronaut on a dragon” into SD will typically produce an image of an astronaut on a dragon.
By releasing the new SD optimizations – available as conversion scripts on GitHub – Apple aims to unlock the full potential of image synthesis on its devices, according to the Apple Research announcement page, which states: “With Stable Diffusion’s growing number of uses, it’s important to ensure developers can effectively leverage this technology to create apps that creatives everywhere can use.”
Apple also cites privacy and avoiding cloud computing costs as benefits to running an AI generation model locally on a Mac or Apple device.
“End user privacy is protected because any data that the user provides as input to the model remains on the user’s device,” Apple says. “Second, after the initial download, users do not need an Internet connection to use the model. Finally, the local implementation of this model allows developers to reduce or eliminate their server-related costs.”
Currently, Stable Diffusion generates images fastest on high-end Nvidia GPUs when run locally on a Windows or Linux PC. For example, generating a 512×512 image with 50 steps on an RTX 3060 takes about 8.7 seconds on our machine.
By comparison, the conventional method of running Stable Diffusion on an Apple Silicon Mac is much slower, taking about 69.8 seconds to generate a 512×512 image in 50 steps using Diffusion Bee in our tests on an M1 Mac Mini.
According to Apple’s benchmarks on GitHub, Apple’s new Core ML SD optimizations can generate a 512×512 50-step image on an M1 chip in 35 seconds. An M2 does the job in 23 seconds, and Apple’s most powerful Apple Silicon chip, the M1 Ultra, can achieve the same result in just nine seconds. That’s a dramatic improvement: on the M1, generation time drops from the roughly 69.8 seconds we measured with Diffusion Bee to 35 seconds, cutting it almost in half.
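For readers who want to check the arithmetic, the comparison can be sketched in a few lines of Python. The timings are the ones quoted above; the variable and function names are ours, chosen for illustration. Only the M1 figure is a like-for-like comparison with our Diffusion Bee test, since that test also ran on an M1; the M2 and M1 Ultra figures are compared against Apple’s own M1 Core ML number instead.

```python
# Timings in seconds for a 512x512, 50-step image, as quoted in the article.
DIFFUSION_BEE_M1_S = 69.8  # our test, Diffusion Bee on an M1 Mac Mini
RTX_3060_S = 8.7           # our test, RTX 3060 PC
CORE_ML_S = {"M1": 35.0, "M2": 23.0, "M1 Ultra": 9.0}  # Apple's GitHub benchmarks


def speedup(old_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new timing is than the old one."""
    return old_seconds / new_seconds


def time_saved_fraction(old_seconds: float, new_seconds: float) -> float:
    """Return the fraction of the old generation time that is eliminated."""
    return 1.0 - new_seconds / old_seconds


# Like-for-like: Diffusion Bee vs. Core ML, both on an M1.
m1_speedup = speedup(DIFFUSION_BEE_M1_S, CORE_ML_S["M1"])
m1_saved = time_saved_fraction(DIFFUSION_BEE_M1_S, CORE_ML_S["M1"])
print(f"M1: {m1_speedup:.2f}x faster, {m1_saved:.1%} less time")

# Other chips relative to Apple's M1 Core ML figure.
for chip in ("M2", "M1 Ultra"):
    ratio = speedup(CORE_ML_S["M1"], CORE_ML_S[chip])
    print(f"{chip}: {ratio:.2f}x faster than M1 with Core ML")

# The M1 Ultra result next to the RTX 3060 timing from our PC.
print(f"M1 Ultra vs. RTX 3060: {CORE_ML_S['M1 Ultra'] - RTX_3060_S:+.1f} s")
```

Keep in mind that the numbers mix our measurements with Apple’s published benchmarks, so the ratios are rough indications rather than a controlled comparison.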
Apple’s GitHub release is a Python package that converts Stable Diffusion models from PyTorch to Core ML and includes a Swift package for model deployment. The optimizations work for Stable Diffusion 1.4, 1.5 and the recently released 2.0.
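To give a sense of what the conversion step looks like, here is a minimal Python sketch that assembles the command line for Apple’s converter without running it. The module name and flags reflect our reading of the repository’s README at the time of writing and may have changed, so treat them as assumptions and check the README before use. The helper function, the output directory name, and the choice of model ID are ours, for illustration only.

```python
import shlex


def build_conversion_command(output_dir, model_version=None):
    """Assemble (but do not run) a PyTorch-to-Core ML conversion command.

    Flag names are assumed from Apple's README and should be verified there.
    """
    command = [
        "python", "-m", "python_coreml_stable_diffusion.torch2coreml",
        "--convert-unet",
        "--convert-text-encoder",
        "--convert-vae-decoder",
        "--convert-safety-checker",
        "-o", output_dir,
    ]
    if model_version is not None:
        # Hugging Face model ID of the Stable Diffusion checkpoint to convert.
        command += ["--model-version", model_version]
    return command


# Hypothetical output directory; the model ID is the Stable Diffusion 1.4 checkpoint.
command = build_conversion_command("coreml-models", "CompVis/stable-diffusion-v1-4")
print(shlex.join(command))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; on a suitably configured Apple Silicon Mac, the printed line could be pasted into a terminal.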
Right now, the experience of setting up Stable Diffusion locally with Core ML on a Mac is aimed at developers and requires some basic command-line skills, but Hugging Face has published a comprehensive guide to setting up Apple’s Core ML optimizations for those who want to experiment.
For those who are less tech-savvy, the aforementioned Diffusion Bee app makes it easy to run Stable Diffusion on Apple Silicon, but it doesn’t yet integrate Apple’s new optimizations. You can also run Stable Diffusion on an iPhone or iPad using the Draw Things app.