Artificial intelligence promises to mark a before and after in many areas, but one in particular is taking almost all the limelight this year. In April we talked about the enormous possibilities of DALL-E 2, an AI capable of generating images from text. Later came DALL-E Mini, a generator that surprised us with its crazy creations. Now it’s the turn of Parti, an alternative that bets on a new and promising model to generate photorealistic images.
Unlike DALL-E 2, which uses a diffusion model to generate images from text, Parti (Pathways Autoregressive Text-to-Image) relies on an autoregressive model that allows longer text inputs and is capable of complex compositions. As we can see in the featured image, Parti’s results look more like a work of art than the amorphous figures offered by DALL-E 2 (image below).
Google’s new image generator
Google researchers explain in a blog post that they tested Parti at four model sizes (350M, 750M, 3B and 20B parameters) under the same conditions, that is, with the same text inputs. They found that the largest model especially excels at prompts that are abstract or that require world knowledge, specific perspectives, or the rendering of symbols.
In one of the tests, they used the following input text: “A map of the United States made out of sushi. It is on a table next to a glass of red wine.” As we can see, the 350M model produces a confusing representation, things improve at 750M, the 3B model shows some “creativity”, and the 20B model delivers an amazing result.
We can also see a test in which the researchers evaluated Parti in different complex scenarios. They entered the text “Portrait of a tiger wearing a train conductor’s hat and holding a skateboard that has a yin-yang symbol on it”.
They also asked for variants as a photograph, a comic illustration, an oil painting and a marble statue, among others. Surprisingly, the AI demonstrated its ability to adhere to specific image formats and styles, although not always with equally good results. While Parti produces high-quality results for a wide range of prompts, the model nonetheless has many limitations.
The Mountain View giant will continue to train and refine its AI models to “improve human creativity and productivity.” It should be noted that for safety reasons (Google wants to avoid misuse), Parti is not available to the public, unlike DALL-E Mini, so we will not be able to create our own images from text. However, we are left with the alternative of browsing the large number of examples on the project page and consulting the full research paper.