“One of the most powerful things about this technology is that, like DALL-E, it does what you tell it to do,” said Nate Bennett, one of the researchers who works in the lab at the University of Washington. “From a single prompt, it can generate an endless number of designs.”
To generate images, DALL-E relies on what artificial intelligence researchers call a neural network, a mathematical system loosely modeled after the network of neurons in the brain. This is the same technology that recognizes the commands you bark into your smartphone, enables self-driving cars to identify (and avoid) pedestrians, and translates languages on services like Skype.
A neural network learns skills by analyzing massive amounts of digital data. For example, by locating patterns in thousands of corgi photos, it can learn to recognize a corgi. To build DALL-E, researchers trained a neural network that looked for patterns as it analyzed millions of digital images and the text captions that described what each of those images represented. In this way it learned to recognize the connections between the images and the words.
When you describe an image for DALL-E, a neural network generates a set of key features that the image might include. One feature might be the roundness of a teddy bear’s ear. Another might be the line along the edge of a skateboard. Then a second neural network, called a diffusion model, generates the pixels needed to realize these features.
The diffusion model is trained on a series of images in which noise (random imperfection) is gradually added to a photo until it becomes a sea of random pixels. As it analyzes these images, the model learns to run this process in reverse. When you feed it random pixels, it removes the noise and transforms the pixels into a coherent image.
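For readers who want to see the noising half of that process in concrete terms, here is a minimal Python sketch. It is an illustration, not DALL-E's actual code: the function names, the eight-value stand-in "image," the step count and the noise level `beta` are all assumptions chosen for the example. It covers only the step of gradually adding noise; the reverse, denoising step requires a trained neural network and is not shown.

```python
import math
import random


def add_noise_step(pixels, beta, rng):
    """One noising step: shrink each pixel slightly and mix in Gaussian noise.

    `beta` (an assumed value, not taken from DALL-E) sets how much noise
    is blended in at each step.
    """
    keep = math.sqrt(1.0 - beta)
    spread = math.sqrt(beta)
    return [keep * p + spread * rng.gauss(0.0, 1.0) for p in pixels]


def forward_diffusion(pixels, steps=200, beta=0.05, seed=0):
    """Return every intermediate image, from the clean one to near-pure noise."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    trajectory = [list(pixels)]
    for _ in range(steps):
        trajectory.append(add_noise_step(trajectory[-1], beta, rng))
    return trajectory


def signal_remaining(steps, beta):
    """Fraction of each original pixel value that survives after `steps` steps."""
    return math.sqrt(1.0 - beta) ** steps


# A hypothetical "image": eight pixel values standing in for a photo.
image = [1.0, -1.0, 0.5, -0.5, 0.0, 1.0, -1.0, 0.25]
trajectory = forward_diffusion(image)

print("clean image:", trajectory[0])
print("after 200 steps:", [round(p, 2) for p in trajectory[-1]])
print("share of original signal left:", signal_remaining(200, 0.05))
```

With these assumed settings, less than 1 percent of each original pixel value survives the 200 steps, so the final list is effectively random. A diffusion model is trained on sequences like `trajectory` so that it can learn to step backward, from the noisy end toward the clean image.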
At the University of Washington, other academic labs and new start-ups, researchers are using similar techniques to make new proteins.
Proteins start out as strands of chemical compounds, which then twist and fold into three-dimensional shapes that dictate how they behave. In recent years, artificial intelligence labs like DeepMind, which is owned by Alphabet, Google’s parent company, have shown that neural networks can accurately guess the three-dimensional shape of any protein in the body based on the smaller compounds it contains, a huge scientific advancement.