
The weak point of AI: it cannot mix animals in the same scene

April 3, 2023

If you have tried to put a snake and a rabbit in the same image using an AI image generator such as Midjourney or DALL-E, you will have seen that the results are disastrous.

Rabbits with scales, snakes with fur… and if on top of that you ask for something like “a snake eating a rabbit”, going for a National Geographic look, things get really bad. Here are some examples of the result:

This is one of the specific problems of the Midjourney system: creating images that involve animals of different species. I have also tried it with DALL-E, and the result is not very different. Something similar happens with v5 of Midjourney. Adobe Firefly does somewhat better, although the difference is small.

Snake and Rabbit with Adobe Firefly

Why is this happening

This problem is largely due to the way the Midjourney system works. In essence, the AI system learns to generate images from patterns it finds in existing image data sets. The system breaks the image down into a number of features, such as shapes, textures, and colors, and then uses these to create a new image that matches the textual description provided. However, when asked to generate images involving multiple concepts, such as animals of different species, the system may have difficulty assigning each feature to the right animal, so the features of both end up mixed together.
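The mixing described above can be pictured with a deliberately simplified sketch. It is a toy illustration only, not how Midjourney or any real generator is implemented: the feature dictionaries, the numbers, and the function names `blend` and `render_scene` are all hypothetical. It assumes that every animal in the scene is drawn from one shared, averaged description, and compares that with the case where each animal keeps its own features.

```python
# Toy illustration only: real image generators do not work on hand-written
# feature dictionaries. Every name and number here is hypothetical.

SNAKE = {"scales": 1.0, "fur": 0.0, "long_ears": 0.0}
RABBIT = {"scales": 0.0, "fur": 1.0, "long_ears": 1.0}


def blend(concepts):
    """Average the features of all concepts into one shared description."""
    keys = concepts[0].keys()
    return {k: sum(c[k] for c in concepts) / len(concepts) for k in keys}


def render_scene(concepts, shared):
    """Return the features each animal ends up being drawn with.

    shared=True:  every animal is drawn from the single blended description.
    shared=False: every animal keeps its own features.
    """
    if shared:
        mixed = blend(concepts)
        return [dict(mixed) for _ in concepts]
    return [dict(c) for c in concepts]


leaky = render_scene([SNAKE, RABBIT], shared=True)
clean = render_scene([SNAKE, RABBIT], shared=False)

print("shared description:", leaky[0])    # the "snake" picks up fur and ears
print("separate descriptions:", clean[0])  # the snake stays a snake
```

In the shared case both animals come out identical, each half snake and half rabbit, which is the toy equivalent of rabbits with scales and snakes with fur.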

Another contributing factor to this problem is the lack of diversity in the training data sets used by the system. Image data sets may be biased toward certain features or patterns, limiting the system’s ability to generate images involving unusual combinations of objects or concepts.
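One rough way to check for this kind of gap is to count how often two animals appear together in the captions of a data set. The sketch below assumes a tiny, invented list of captions and a hypothetical helper, `count_cooccurrence`; the training data of real systems is not public in this form, so this only illustrates the idea.

```python
from collections import Counter
from itertools import combinations

# Invented captions standing in for a training set (an assumption, not real data).
CAPTIONS = [
    "a snake coiled on a rock",
    "a rabbit in tall grass",
    "a rabbit eating a carrot",
    "a snake in the desert",
    "a snake eating a rabbit",
]
ANIMALS = {"snake", "rabbit"}


def count_cooccurrence(captions, vocabulary):
    """Count captions mentioning each animal, and each pair of animals together."""
    singles = Counter()
    pairs = Counter()
    for caption in captions:
        # Animals from the vocabulary that occur in this caption, in a stable order.
        present = sorted(vocabulary & set(caption.lower().split()))
        singles.update(present)
        pairs.update(combinations(present, 2))
    return singles, pairs


singles, pairs = count_cooccurrence(CAPTIONS, ANIMALS)
print("captions per animal:", dict(singles))
print("captions with both:", dict(pairs))
```

If the count for a pair is far lower than the count for each animal alone, the system has seen few examples of that combination to learn from.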

What can be done to solve the problem

Although the problem of generating images involving animals of different species is challenging, there are strategies that AI system developers can use to improve the quality of the generated images. One strategy is to train the system with more diverse data sets that include a wide variety of images involving multiple objects or concepts. Another approach is the use of image generation techniques based on adversarial learning, which allow the system to progressively improve the quality of the images generated as it receives feedback on its performance.