Last week, a hobbyist experimenting with the new Flux AI image synthesis model discovered that it is unexpectedly good at rendering custom-trained reproductions of fonts. Far more efficient methods of rendering computer fonts have existed for decades, but the discovery matters to AI image hobbyists because Flux can render text accurately, and users can now insert words set in custom fonts directly into AI-generated images.
We’ve had the technology to produce accurately flowing computer-rendered fonts in custom shapes since the 1980s (1970s in the research space), so creating an AI-replicated font in itself isn’t big news. But a new technique means you could see a particular font appear in AI-generated images, say of a chalkboard menu in a photorealistic restaurant, or a printed business card held by a cyborg fox.
Shortly after the rise of mainstream AI image synthesis models like Stable Diffusion in 2022, some people started asking: how can I insert my own product, garment, character, or style into an AI-generated image? One answer that emerged came in the form of LoRA (low-rank adaptation), a technique discovered in 2021 that allows users to augment knowledge in a base AI model with modular, custom-trained add-ons.
These LoRAs, as the modules are called, allow image synthesis models to render concepts that are absent from (or poorly represented in) the base model's training data. In practice, image synthesis enthusiasts use them to reproduce unique styles (rendering everything as chalk drawings, for example) or subjects (say, detailed images of Spider-Man). Each LoRA must be specially trained on user-supplied examples.
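The core trick behind LoRA is simple enough to sketch: rather than fine-tuning a full weight matrix, you freeze it and train two small matrices whose product forms a low-rank correction. Here is a minimal NumPy illustration — the dimensions, scaling factor, and function names are illustrative, not Flux's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, rank = 64, 64, 4  # toy sizes; real model layers are far larger

W = rng.standard_normal((d_out, d_in))        # frozen base weight (e.g., an attention layer)
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable "down" projection
B = np.zeros((d_out, rank))                   # trainable "up" projection, initialized to zero
alpha = 1.0                                   # scaling factor for the adapter

def adapted_forward(x):
    # Base output plus the low-rank LoRA correction B @ A
    return W @ x + alpha * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Because B starts at zero, the adapter changes nothing before training:
assert np.allclose(adapted_forward(x), W @ x)
```

Only `A` and `B` are trained and saved, so a LoRA file stores roughly `rank * (d_in + d_out)` values per adapted layer instead of full weight copies — which is why a trained style fits in hundreds of megabytes rather than the many gigabytes of the base model.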
Until Flux, most AI image generators weren’t very good at accurately rendering text within a scene. If you asked Stable Diffusion 1.5 to render a sign that said “cheese,” it would return gibberish. OpenAI’s DALL-E 3, released last year, was the first mainstream model that could render text reasonably well. Flux still makes mistakes with words and letters, but it’s the most capable AI model for rendering “in-world text” (if you will) that we’ve seen yet.
Because Flux is an open model that is available for download and refinement, the past month was the first time that training a font with LoRA might make sense. That's exactly what AI enthusiast Vadim Fedenko (who had not responded to a request for an interview by press time) recently discovered. “I'm really impressed with how this turned out,” Fedenko wrote in a Reddit post. “Flux picks up on what letters look like in a certain style/font, making it possible to train Loras with specific fonts, typefaces, etc. I'll be training more of these soon.”
For his first experiment, Fedenko chose a cheerful “Y2K” font reminiscent of the styles popular in the late 1990s and early 2000s. The resulting model was published on the Civitai platform on August 20. Two days later, a Civitai user named “AggravatingScree7189” posted a second LoRA that reproduces a font similar to one found in the Cyberpunk 2077 video game.
“Text was so bad it never occurred to me you could do this,” wrote a Reddit user named eggs-benedryl when responding to Fedenko’s post about the Y2K font. Another Redditor wrote, “I didn’t know the Y2K journal was fake until I zoomed in.”
Is it overkill?
It's true that using a deeply trained image synthesis neural network to render a plain old font on a simple background is probably overkill. You probably wouldn't want to use this method to replace Adobe Illustrator when designing a document.
“This looks good, but it's funny how we're reinventing the idea of fonts as 300MB LoRAs,” one Reddit commenter wrote in a thread about the Cyberpunk2077 font.
Generative AI is often criticized for its environmental impact, a legitimate concern where massive cloud data centers are involved. But Flux can insert these fonts into AI-generated scenes while running locally on an RTX 3060 in a quantized (smaller) form (and the full dev model can run on an RTX 3090), with electricity consumption comparable to playing a video game on the same PC. The same goes for LoRA creation: the creator of the Cyberpunk 2077 font trained the LoRA in three hours on a 3090 GPU.
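“Quantized” here just means the model's weights are stored at lower numeric precision to shrink memory use, which is what lets Flux fit on a consumer GPU. A toy symmetric int8 quantization of a weight tensor in NumPy shows the trade-off — real Flux quantization schemes (FP8, NF4, and the like) are more sophisticated, and the array here is a stand-in, not actual model weights:

```python
import numpy as np

rng = np.random.default_rng(1)
weights = rng.standard_normal(1024).astype(np.float32)  # stand-in for one layer's weights

# Symmetric int8 quantization: map floats into [-127, 127] with a single scale factor
scale = np.abs(weights).max() / 127.0
q = np.round(weights / scale).astype(np.int8)  # 1 byte per weight instead of 4
dequant = q.astype(np.float32) * scale         # approximate reconstruction at load time

assert q.nbytes == weights.nbytes // 4         # 4x smaller in memory
assert np.abs(weights - dequant).max() <= scale  # rounding error bounded by the scale
```

The cost is a small, bounded rounding error per weight; in practice that degrades image quality only slightly while cutting VRAM requirements enough to run on mid-range cards.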
There are also ethical issues with using AI image generators, such as how they’re trained on harvested data without the content owners’ permission. While the technology is divisive among some artists, a large community of people use it every day and share the results online via social media platforms like Reddit, leading to new applications of the technology like this one.
At the time of writing, only two custom Flux font LoRAs exist, but we've heard that more are planned. While it's still early days, LoRA-trained fonts could become a fundamental technique if AI image synthesis is more widely deployed in the future. Adobe, with its own image synthesis models, is likely watching.