Meta has released a new generative AI tool for audio that can generate musical melodies from text prompts, much as OpenAI’s DALL-E creates images from text. The tool is called AudioCraft and consists of three models, MusicGen, AudioGen and EnCodec, all of which are open source. The models were trained on a large catalog of licensed music and public domain sound effects, and produce high-quality sound with minimal artifacts.

From text prompts, AudioCraft can also generate a variety of sound effects such as birdsong, moving map sounds, and more. Meta says the tool can be used not only for ordinary music but even for epic music to accompany bedtime stories.

Meta emphasizes that AudioCraft is easier to use than competing platforms and hopes it will become a useful tool for businesses and content creators who want to add unique sound effects to their videos on platforms like Instagram.

AudioCraft uses the EnCodec neural audio codec, which represents audio as a sequence of discrete tokens, much as AI chatbots represent text. In the text prompt, you can specify the tone and the sound sources you want in order to create a unique sound clip.
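To make the idea of "tokenized" audio more concrete, here is a minimal sketch in Python. It is not EnCodec's actual method: EnCodec is a learned neural codec, whereas this toy example simply rounds each sample to one of 256 evenly spaced levels. The function names `tokenize` and `detokenize` and the `levels` parameter are hypothetical, chosen only for illustration.

```python
import math

def tokenize(samples, levels=256):
    """Map float samples in [-1.0, 1.0] to integer tokens in [0, levels - 1]."""
    tokens = []
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clamp out-of-range samples
        tokens.append(round((s + 1.0) / 2.0 * (levels - 1)))
    return tokens

def detokenize(tokens, levels=256):
    """Map integer tokens back to approximate float samples."""
    return [t / (levels - 1) * 2.0 - 1.0 for t in tokens]

# A short 440 Hz sine wave sampled at 8 kHz stands in for real audio.
sample_rate = 8000
wave = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(16)]

tokens = tokenize(wave)        # audio as a sequence of integers
restored = detokenize(tokens)  # approximate audio recovered from the tokens
max_error = max(abs(a - b) for a, b in zip(wave, restored))

print(tokens)
print(f"max reconstruction error: {max_error:.4f}")
```

Once audio is a sequence of integers like this, a model can in principle predict the next token in the same way a chatbot predicts the next word, and a decoder turns the predicted tokens back into sound.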

AudioCraft is Meta’s latest venture into generative AI. The company also offers Voicebox, which generates audio clips in six languages, and CM3leon, which generates images and text.

It should be noted, however, that AudioCraft does not give you the full control over a sound clip that you would have with a real instrument or a professional synthesizer.

