DALL-E 2 has been the most striking development in artificial intelligence in recent months, but Google has not said its last word. Today the company presented Imagen, its new AI capable of creating ultra-realistic images from a brief description. It is an alternative to the OpenAI tool that, according to Google's own tests and research, produces more accurate results.
Unlike OpenAI, which has promised to open DALL-E 2 to more users this summer, Google has presented Imagen as a research project, arguing that for ethical reasons it is better for it to stay non-commercial and remain a tool for scholars and experts.
Taking photorealism with AI to new heights
Imagen works much like DALL-E 2: the AI converts a short text into a highly detailed image that matches the description. The combinations are almost unlimited, and in most cases DALL-E 2 managed to give us an image very similar to what we asked for. Now Google says it has ironed out some of the shortcomings of the OpenAI tool and has managed to generate images that humans prefer.
AI can unlock joint human/computer creativity! Imagen is one direction we are pursuing: https://t.co/LTlE3pqq4W
“A high contrast portrait of a very happy fuzzy panda dressed as a chef in a high end kitchen making dough. There is a painting of flowers on the wall behind him.” pic.twitter.com/SrqEv9jeHf
— Jeff Dean (@🏡) (@JeffDean) May 24, 2022
Imagen is based on the T5 Transformer model, introduced in 2020. The AI first produces 64 x 64 pixel images, which are then upscaled to 1024 x 1024 pixels, the same resolution as DALL-E 2. This upscaling approach is what eases the computational load and allows images to be generated in a few seconds.
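To get a feel for why starting small saves computation, the scaling described above can be put into numbers. The following is only a rough sketch, not Google's code: the 64 and 1024 pixel sizes come from the article, the intermediate 256 pixel stage is an assumption based on how Google's paper describes the cascade, and the names STAGES, pixel_count and cascade_summary are hypothetical helpers invented for this illustration.

```python
# Rough arithmetic sketch of a low-to-high resolution cascade.
# 64 and 1024 come from the article; the 256 stage is an assumption.
STAGES = [64, 256, 1024]


def pixel_count(side: int) -> int:
    """Number of pixels in a square image with the given side length."""
    return side * side


def cascade_summary(stages):
    """Return (side, pixels, ratio to the first stage) for each stage."""
    base = pixel_count(stages[0])
    return [
        (side, pixel_count(side), pixel_count(side) // base)
        for side in stages
    ]


if __name__ == "__main__":
    for side, pixels, ratio in cascade_summary(STAGES):
        print(f"{side} x {side}: {pixels} pixels ({ratio}x the base stage)")
```

The final image has 256 times as many pixels as the 64 x 64 starting point, so the first stage, where the text is turned into a picture, works on a far smaller canvas than the final output.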
To check which AI produces the best images, Google has created the 'DrawBench' benchmark. According to the results in the paper, Google's AI misunderstood the prompt less often when building the image. One example given is "A panda making latte art". Google's AI understood that the animal should be performing the action, while DALL-E 2 simply drew a coffee with the face of a panda.
Jeff Dean, VP of Google AI, has posted several examples of what Imagen is capable of on his Twitter profile. Additionally, users can try a small interactive demo of how this AI works, swapping between different animals, clothes, vehicles and backgrounds.
Unfortunately, Google is still concerned about the misuse of this AI, a concern that also applies to DALL-E 2, and for this reason it has decided not to make it available to users for the time being. Still, it's fascinating to see how AI is steadily getting better. At this rate, who knows what we will be able to do in a few years.