Google continues to expand Gemini’s features, and this time it is audio’s turn. After adding support for images, documents and videos, the AI chatbot, now a pillar of the Mountain View giant’s ecosystem, lets users upload audio files directly from a smartphone or browser. It is a much-requested feature that inevitably opens the way to new use cases.
How audio file uploads work in Gemini
The feature is already available on Android, iOS and the web. As our screenshots show, just open the menu marked with the “+” and tap “Files” on mobile, or click “Upload file” on desktop, then select a file in one of the supported formats (MP3, M4A, WAV, etc.). Gemini will recognize the audio file and its content without problems.
Google notes that the audio length limits vary by subscription plan:
- Free users: up to 10 minutes of audio per file.
- Pro and Ultra subscribers: up to 3 hours of audio in a single upload.
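To make the plan-based limits concrete, here is a minimal Python sketch of how a duration check could look. It is illustrative only: the plan keys and the function name are hypothetical, the limit is treated as applying per file, and nothing here calls a real Gemini API.

```python
# Illustrative only: the limits come from the article; the names are
# hypothetical and nothing here talks to Gemini.
AUDIO_LIMIT_SECONDS = {
    "free": 10 * 60,       # up to 10 minutes of audio
    "pro": 3 * 60 * 60,    # up to 3 hours of audio
    "ultra": 3 * 60 * 60,  # same limit as Pro
}


def audio_fits_plan(duration_seconds: float, plan: str) -> bool:
    """Return True if audio of this length is within the plan's limit."""
    key = plan.lower()
    if key not in AUDIO_LIMIT_SECONDS:
        raise ValueError(f"unknown plan: {plan!r}")
    # A zero or negative duration is treated as invalid input.
    return 0 < duration_seconds <= AUDIO_LIMIT_SECONDS[key]


if __name__ == "__main__":
    lecture = 25 * 60  # a 25-minute recording, in seconds
    print("free:", audio_fits_plan(lecture, "free"))
    print("pro:", audio_fits_plan(lecture, "pro"))
```

With these assumptions, a 25-minute lecture would be rejected on the free plan and accepted on Pro or Ultra.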
An interesting aspect is that Gemini lets you work with audio like any other uploaded file: analysis, automatic transcription, summaries and even the extraction of insights from conversations or recordings. The feature joins the existing support for videos, which has different limits (5 minutes on the free plan, up to 1 hour for subscribers, 2 GB maximum).
Not just audio: all the updated file limits in Gemini
Alongside the announcement, Google took the opportunity to summarize the specifications and limits for the other formats:
- Generic files: up to 100 MB each, with a maximum of 10 files per chat.
- Video: max 5 minutes on the free plan, 1 hour on Pro/Ultra, up to 2 GB.
- Code folders or GitHub repositories: up to 5,000 files, for a total of 100 MB.
- ZIP files: maximum 10 items per archive.
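The size and count limits listed above can be expressed as a small validation sketch in Python. This is an assumption-laden illustration, not Gemini's actual logic: the `Upload` type and the function names are hypothetical, and a megabyte is assumed to be 1024 × 1024 bytes because the article does not specify.

```python
from dataclasses import dataclass

MB = 1024 * 1024  # assumption: binary megabytes

# Limits as reported in the article.
MAX_FILES_PER_CHAT = 10
MAX_FILE_BYTES = 100 * MB
MAX_REPO_FILES = 5_000
MAX_REPO_BYTES = 100 * MB


@dataclass
class Upload:
    """Hypothetical description of one file a user wants to attach."""
    name: str
    size_bytes: int


def check_chat_uploads(files: list[Upload]) -> list[str]:
    """Return a list of problems; an empty list means within limits."""
    problems = []
    if len(files) > MAX_FILES_PER_CHAT:
        problems.append(
            f"{len(files)} files exceed the limit of {MAX_FILES_PER_CHAT} per chat"
        )
    for f in files:
        if f.size_bytes > MAX_FILE_BYTES:
            problems.append(f"{f.name} is larger than 100 MB")
    return problems


def check_repository(file_sizes: list[int]) -> list[str]:
    """Check a code folder or repository, given the size of each file in bytes."""
    problems = []
    if len(file_sizes) > MAX_REPO_FILES:
        problems.append(f"{len(file_sizes)} files exceed the limit of {MAX_REPO_FILES}")
    if sum(file_sizes) > MAX_REPO_BYTES:
        problems.append("total size is larger than 100 MB")
    return problems


if __name__ == "__main__":
    batch = [Upload("notes.pdf", 2 * MB), Upload("scan.tiff", 150 * MB)]
    for problem in check_chat_uploads(batch):
        print(problem)
```

Returning a list of messages rather than raising on the first failure lets a caller report every violated limit at once.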
In practice, Gemini increasingly establishes itself as an “open” platform for any type of input, with flexible constraints for those who use the free version and much wider margins for those who pay for a subscription.
The ability to upload voice recordings, interviews, lessons or meetings is a leap forward in the app’s practical usefulness. If until now Gemini was often seen as an assistant for text, images and code, it now also becomes a transcriber and analyst of audio content.
Google also stressed that this was the most popular request from users in recent months, a sign of how central audio has become in creating and managing both recreational and professional content.
Gemini is increasingly becoming a universal hub
With the arrival of audio support, Gemini completes an important piece of its strategy: it now lets users interact with any type of file without leaving the app. So no longer just text and prompts, but also documents, images, videos and now audio files.
The aim is to transform Gemini from a simple chatbot into a universal productivity and analysis tool for any type of file, one that can become a point of reference for every type of user.