Meta joins the DALL-E party and introduces its own text-to-image generator
Rivals for DALL-E keep appearing. OpenAI is no longer alone in the field of generating images from arbitrary text. A few weeks ago we saw Google Research present Imagen, and more recently an independent research lab unveiled Midjourney. Now it's Meta's turn, with a proposal called "Make-A-Scene".
The AI from the company led by Mark Zuckerberg has, as Meta explains in a blog post, an artistic bent: its results resemble (and are based on) work sketched by hand. It lets users "create a digital painting without even picking up a brush" and is intended to empower the creativity of artists and non-artists alike. Let's take a look.
How Meta's new AI generates images
Make-A-Scene works a little differently from the other AIs we've seen in recent months. Imagen, for example, uses a diffusion model (a technique also applied to other tasks, such as upscaling images) to generate an ultra-realistic rendering from text alone. Meta's system, on the other hand, takes a composition or sketch as its base.
But what is that base? According to the company, it is part of a new research direction that tackles one of the biggest problems with image generators of this kind: they don't always accurately reflect what we ask for. For example, given the text "a painting of a zebra riding a bicycle", the resulting bicycle might face the wrong way, and the zebra might come out too big or too small.
The solution? Guide the AI with an outline that clearly delimits its working area. As the examples show, it does not need to be an elaborate drawing. The model first learns the key aspects of the base sketch, then generates the artistic rendering from the entered text as a 2048 x 2048 pixel image.
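To make the idea concrete, here is a hypothetical sketch in plain Python (Meta has not published a Make-A-Scene API; the `Region` type and `compose_scene` function are invented for illustration) of what sketch conditioning buys you: the outline pins down *where* each entity goes, while the text describes *what* the scene should look like.

```python
from dataclasses import dataclass

@dataclass
class Region:
    """A labeled area of the user's sketch (normalized 0-1 coordinates)."""
    label: str
    x: float
    y: float
    w: float
    h: float

def compose_scene(prompt: str, sketch: list[Region]) -> dict:
    """Toy illustration of sketch-conditioned generation: the sketch
    fixes the position and size of each entity, the prompt describes
    their appearance. A real model would return pixels; here we just
    return the resolved generation plan."""
    return {
        "prompt": prompt,
        "resolution": (2048, 2048),  # output size cited by Meta
        "layout": {r.label: (r.x, r.y, r.w, r.h) for r in sketch},
    }

# The zebra-on-a-bicycle example from the article: without a sketch the
# model may misplace or mis-scale the entities; with one, placement is
# decided before generation begins.
plan = compose_scene(
    "a painting of a zebra riding a bicycle",
    [Region("bicycle", 0.25, 0.55, 0.50, 0.35),
     Region("zebra", 0.30, 0.20, 0.40, 0.50)],
)
print(plan["layout"]["zebra"])  # → (0.3, 0.2, 0.4, 0.5)
```

The point of the sketch step is exactly this separation of concerns: layout is resolved deterministically from the drawing, so the text model no longer has to guess relative positions and sizes.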
Make-A-Scene isn't just for artists, says Meta. Andy Boyatzis, a program manager at the company, used the AI with his children, aged two and four. One of them drew a sketch for the text input "A monster robot bear riding a train", and the result was quite faithful: a mechanical bear riding a train, much as described.
This Meta AI, like many others in development, is limited to closed testing. The company has granted access to a handful of artists so far but has not said whether it will open up to everyone later. "We will continue to push the boundaries of what is possible using this new class of creative tools," they say, so we may learn more about its progress in time.
In Xataka | AIs have a problem: they are opaque and closed. BLOOM is the great open source project that wants to change everything