Of course, both models can generate videos at between three and 30 frames per second. According to the announcement on the company's website, these two Stable Video Diffusion models (the white paper has all the details) were initially trained on a dataset of millions of videos and then fine-tuned on a smaller dataset of hundreds of thousands to around a million video clips.

The next question is where all those videos come from. The information is not very clear, but the company implies that many of them come from public sources, so it is difficult to know whether or not they are under copyright. Both models can generate videos of up to four seconds, and in terms of quality they are on par with Meta’s video generation model or the examples produced by Google and the AI startups Runway and Pika Labs.

A technology with its limits

If image generation has already had its setbacks and pitfalls, such as how hard it can be to create two hands that are expressive and full of detail, video generation is going down the same path or a worse one. Here are some of the current limits of Stable Video Diffusion:

The models cannot produce motionless videos or slow camera pans.

They cannot be controlled by text.

They cannot render text.

At the moment they cannot generate faces or people properly.

According to the company, the next steps are the creation of a variety of models that use the two current ones, SVD and SVD-XT, as a base, as well as a text-to-video tool that will bring text prompts to the models on the web.

The main goal is to commercialize this tool and take it to fields such as advertising, education, entertainment and more. As Semafor and Forbes have reported, Stability is looking for a breakthrough that lets it start generating profits, since its investors are currently putting pressure on it over the almost literal burning of existing capital without really seeing the fruits in economic terms.

For now, we will have to wait for the web tool to be launched, since this is a preview that shows how the generative AI technology used for video creation works. The company also launched Stable Audio, its music generation tool, so it has some of the most disruptive technologies in its hands.