– Google has updated its Gemini Pro 1.5 AI model to include the ability to listen to audio and video files and provide information without a written transcript
– The update allows Gemini Pro 1.5 to process audio and video content, generate transcripts for video clips, and find specific moments within the files
– The update is currently available only through Vertex AI, Google Cloud’s developer dashboard, and is aimed mainly at developers, enterprises, and researchers
Google has enhanced its Gemini Pro 1.5 artificial intelligence model to include the ability to hear the contents of an audio or video file. This update allows the model to listen to uploaded clips and provide information without the need for a written transcript. The Gemini family of models has been trained on various forms of data simultaneously, including audio, video, text, and code, enabling the model to process videos and generate transcripts for video clips.
The latest update to Gemini Pro 1.5 includes a one-million-token context window and the ability to process sound from audio files. The model can identify key moments or specific mentions in podcasts or in the audio tracks of video files. Gemini Pro 1.5 sits in the middle tier of the Gemini family, yet it offers more advanced capabilities than the Ultra version. Gemini Pro is currently accessible to developers, enterprises, and researchers through Vertex AI, the Google Cloud developer dashboard.
Google also announced updates to the DeepMind AI image model Imagen 2, which powers Gemini’s image-generation capabilities. The updates include inpainting and outpainting features that allow users to add elements to generated images or remove elements from them. Google plans to integrate AI responses across Gemini and other platforms with Google Search to ensure up-to-date information.
Overall, Google is focused on creating more multimodal models that can understand various types of input beyond text. The advancements in Gemini Pro 1.5 and Imagen 2 demonstrate the company’s commitment to innovation in artificial intelligence and machine learning technology.