The artificial intelligence Dall-E turns any idea expressed in a sentence into an image

Nacho Palou    11 August, 2022
Fictional image generated by Dall-E 2, OpenAI's artificial intelligence.

Generating photorealistic images from naturally expressed concepts such as “an astronaut riding a horse” or “a bowl of soup that looks like a monster”. Or anything else you can imagine, no matter how surreal.

That’s what Dall-E 2 does. It is the latest evolution of the artificial intelligence (AI) system announced by the research and development company OpenAI, which counts Elon Musk among its founders.

It’s true that we’ve seen similar apps and AI systems before, which generate images from text or keywords. However, Dall-E’s latest demo generates images that leave no one indifferent due to their quality and realism, as well as their dreamlike and surreal style.

The name Dall-E combines the names of the Pixar character Wall-E and Salvador Dalí, the master of surrealism.

The tool is not yet available to the public, but you can see the results on OpenAI’s website and Instagram account.

Some images generated by OpenAI’s AI model Dall-E

The company has shared examples of the images Dall-E produces when concepts, features and styles are combined into a short phrase.

Thus the phrase “a bowl of soup that looks like a monster made of play dough” would result in this image and its variants.

“A bowl of soup that looks like a monster made of play dough”, according to Dall-E. Image: OpenAI

Whereas “a bowl of soup that looks like a monster woven out of wool” would result in this other image – and its variants.

“A bowl of soup that looks like a monster woven out of wool”. Image: OpenAI

Different combinations can be tested on the OpenAI website, and this video shows other examples and explains a bit more about what Dall-E is and how it works.

Dall-E’s neural network “has learned the relationship between images and the texts that describe them,” the researchers explain.

“It not only understands individual objects such as a horse or an astronaut,” they say, but has also learned “how objects and actions relate to each other”. This is how Dall-E ‘knows’ how it should realistically represent an astronaut riding a horse.

Image generated by Dall-E artificial intelligence when asked for an astronaut riding a horse. Image: OpenAI

To generate the image, Dall-E uses a process called “diffusion”, which starts from a pattern of random dots and gradually modifies it until the desired result is obtained, producing “images that have not existed before”.
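The core idea can be illustrated with a toy sketch: start from pure noise and refine it step by step. This is only an analogy, with an invented `toy_diffusion` function and a known target; real diffusion models use a trained neural network to predict and remove noise, with no target image available in advance.

```python
import random

def toy_diffusion(target, steps=50, seed=0):
    """Toy illustration of the 'diffusion' idea: begin with random
    dots and nudge them toward a coherent result a little at a time.
    (Hypothetical sketch -- not OpenAI's actual algorithm, which
    relies on a learned denoiser rather than a known target.)"""
    rng = random.Random(seed)
    image = [rng.random() for _ in target]  # pattern of random dots
    for _ in range(steps):
        # each step removes a fraction of the remaining "noise"
        image = [px + 0.2 * (t - px) for px, t in zip(image, target)]
    return image

# Desired "pixel" values for a tiny 4-pixel image
target = [0.0, 0.5, 1.0, 0.5]
result = toy_diffusion(target)
print([round(px, 2) for px in result])
```

After enough steps the random starting pattern converges on the target; in a real model, the denoising direction at each step is predicted from the text prompt instead of being known beforehand.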

Dall-E is an example “of how human imagination and systems can work together to create new things, amplifying our creative potential”.

Dall-E aims to be an example of “useful and secure” AI

According to the researchers, the development of Dall-E fulfils three essential premises for the development of a “useful and secure” AI:

  • It allows people to express themselves in ways that were not possible before.
  • It reveals whether the AI system “understands” what is asked of it in writing, or whether it just repeats what it has learned.
  • It helps us understand how the AI system sees and understands the world.

Compared to the first version of Dall-E, announced just over a year ago, Dall-E 2 adds new features and improves prompt comprehension, image quality and complexity, and generation speed. Among the new capabilities:

  • Start from an existing image and create complex variations, such as changing the angle of a portrait and its style.
  • Edit an existing image to replace one object with another, or to add an object that does not exist in the original image, taking into account style, shadows, reflections and textures. It can even change the meaning of the image.

However, limitations on the use of Dall-E can lead to bias

In addition to limiting its availability –the tool will be available to a small group of people, mainly AI researchers and some non-commercial artists– OpenAI has implemented some restrictions on the use of its new artificial intelligence model.

These restrictions are intended to avoid harmful or offensive use of the tool by preventing it from generating violent, sexual or politically charged images. It also prevents the generation of images that include known or recognisable people.

Avoiding bias and stereotyping is one of the great challenges for artificial intelligence

These limitations can, however, encourage bias in AI models such as Dall-E. OpenAI researchers found that removing sexual content to prevent Dall-E from producing adult images causes it to generate fewer images of women overall.

This is not a good thing, reports the publication Vox, “because it makes women invisible.” But this is not a problem unique to Dall-E: avoiding bias and the persistence of stereotypes is now one of the biggest challenges “for the entire AI developer community.”
