On Saturday, AI image service Midjourney began testing alpha version 4 (“v4”) of its text-to-image synthesis model, which is available to subscribers on its Discord server. The new model offers more detail than was previously available and inspired some AI artists to note that v4 makes it almost “too easy” to get high-quality results with simple prompts.
Midjourney opened to the public in March as part of an early wave of AI image synthesis models. It quickly gained a large following for its distinct style and for being publicly available before DALL-E and Stable Diffusion. Before long, artwork created by Midjourney made headlines by winning art competitions, providing material for potentially historic copyright registrations, and appearing on stock illustration websites (later banned).
Over time, Midjourney has refined its model with more training, new features, and finer detail. The current standard model, known as “v3,” debuted in August. Now Midjourney v4 is being put to the test by thousands of members of the service’s Discord server, who create images through the Midjourney bot. Users can currently try v4 by appending “--v 4” to their prompts.
“V4 is a brand new codebase and a brand new AI architecture,” wrote Midjourney founder David Holz in a Discord announcement. “It’s our first model trained on a new Midjourney AI supercluster and it’s been more than 9 months in the making.”
In our testing of Midjourney’s v4 model, we found that it offers much more detail than v3, a better understanding of prompts, better scene composition, and sometimes better subject proportions. For photorealistic images, some of the results we’ve seen at lower resolutions can be difficult to distinguish from real photos.
According to Holz, other features of v4 include:
– Much more knowledge (of creatures, places and more)
– Much better at getting small details right (in all situations)
– Handles more complex prompts (with multiple levels of detail)
– Better with multi-object/multi-character scenes
– Supports advanced functionality such as image prompts and multi-prompts
– Supports the --chaos argument (set from 0 to 100) to control the variety of image grids
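Putting those pieces together, a v4 prompt in Discord might look something like the following sketch. The `--v 4` flag and the `--chaos` range (0 to 100) come from Midjourney’s announcement; the subject text and the chosen chaos value here are purely illustrative:

```
/imagine prompt: a lighthouse on a rocky coast at dusk --v 4 --chaos 50
```

Here `--v 4` selects the alpha model instead of the default v3, and a mid-range `--chaos` value asks the bot for more varied results across the four-image grid it returns.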
The response to Midjourney v4 on the service’s Discord has been positive, and it has caught the attention of fans of other image synthesis models, who regularly wrangle complex prompts to get good results.
A Redditor named Jon Bristow posted to the r/StableDiffusion community: “Does anyone else feel like Midjourney v4 is ‘too easy’? This was ‘Close-up photography of a face,’ and it feels like I didn’t make it, as if it were preprocessed.” In response, one commenter quipped, “Sad for prompt engineers, who will lose the new job they created a month ago.”
Midjourney says v4 is still in alpha, so it will continue to iron out the new model’s quirks over time. The company plans to increase the resolution and quality of v4’s upscaled images, add custom aspect ratios (as v3 offers), improve image sharpness, and reduce text artifacts. Midjourney is available via subscription plans that range from US$10 to $50 per month.
Given the progress Midjourney has made in eight months of work, we wonder what next year’s progress in image synthesis will bring.