On December 11, Google introduced an upgraded version of its flagship AI model, Gemini. The latest iteration, Gemini 2.0, is designed to power advances in virtual agents, a step forward as competition among tech giants for leadership in AI intensifies, the company said in a blog post.
Google introduced Gemini 2.0 a year after debuting its Gemini AI model family, which was built from the start with multimodal capabilities. Gemini was Google’s first AI model released after the April 2023 merger of its AI research divisions, DeepMind and Google Brain, into a unified entity called Google DeepMind, led by DeepMind CEO Demis Hassabis.
Google CEO Sundar Pichai said, “If Gemini 1.0 was about organising and understanding information, Gemini 2.0 is about making it much more useful.”
Starting today, developers can experiment with the new Gemini 2.0 Flash model, accessible through the Gemini API in Google AI Studio and Vertex AI. The new version builds on Gemini 1.5 Flash, which was introduced in May and optimised for fast, low-latency applications.
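As a minimal sketch of what that access looks like, the model can be called through the Gemini API’s Python SDK (google-generativeai) with an API key from Google AI Studio; the model identifier "gemini-2.0-flash-exp" below is an assumption based on the experimental release and may differ.

```python
# Minimal sketch: calling Gemini 2.0 Flash via the Gemini API's Python SDK.
# The model identifier "gemini-2.0-flash-exp" is an assumption based on the
# experimental release and may differ from the final name.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key obtained from Google AI Studio

model = genai.GenerativeModel("gemini-2.0-flash-exp")
response = model.generate_content("Summarise what an AI agent is in two sentences.")
print(response.text)
```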
Along with enhanced performance, Gemini 2.0 Flash introduces new features, including multimodal output support that combines text and natively generated images, as well as customisable text-to-speech (TTS) in multiple languages. Additionally, it now supports native integration with tools like Google Search, code execution, and user-defined third-party functions.
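The user-defined function support can be sketched with the SDK’s automatic function calling, where a plain Python function is passed as a tool; the weather helper and model name here are illustrative assumptions, not part of Google’s announcement.

```python
# Hedged sketch: letting Gemini 2.0 Flash call a user-defined function.
# The get_weather helper is a stand-in assumption; the tools= parameter and
# automatic function calling come from the google-generativeai SDK.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

def get_weather(city: str) -> str:
    """Return a stubbed weather report for the given city."""
    return f"It is sunny and 21°C in {city}."

# The SDK converts the typed, documented function into a tool declaration.
model = genai.GenerativeModel("gemini-2.0-flash-exp", tools=[get_weather])
chat = model.start_chat(enable_automatic_function_calling=True)
print(chat.send_message("What's the weather in Paris right now?").text)
```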
Multimodal input and text output features are now available to all developers, while text-to-speech and native image generation are currently limited to early-access partners. These features are expected to be widely available by January, along with additional model sizes.
Google also plans to launch a new Multimodal Live API that will enable real-time audio and video streaming input, along with the ability to use multiple tools simultaneously. The offering aims to help developers create dynamic, interactive applications, as outlined in the company’s blog post.
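Since the API has only been announced, the following is a speculative sketch of what a streaming session might look like; the google-genai client, the live.connect signature, and the config keys are all assumptions about the forthcoming interface, not a confirmed API.

```python
# Speculative sketch of a Multimodal Live API session. The SDK (google-genai),
# the live.connect signature, and the config keys are assumptions about the
# forthcoming interface, not a confirmed API.
import asyncio
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

async def main():
    # Text output for brevity; audio and video streaming are also planned.
    config = {"response_modalities": ["TEXT"]}
    async with client.aio.live.connect(
        model="gemini-2.0-flash-exp", config=config
    ) as session:
        await session.send(input="Hello, Gemini!", end_of_turn=True)
        # Responses stream back incrementally over the open connection.
        async for message in session.receive():
            if message.text:
                print(message.text, end="")

asyncio.run(main())
```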
Consumers worldwide will soon be able to use a chat-optimised version of Gemini 2.0 via the Gemini AI chatbot on desktop and mobile web, with support in the mobile app to follow.
Pichai also highlighted the introduction of a new feature, Deep Research, which utilises advanced reasoning and long-context capabilities to act as a research assistant. The feature can analyse complex topics and compile reports for users; it will be available to subscribers of Gemini Advanced, the premium tier of the Gemini chatbot.
Google is incorporating Gemini 2.0 into several new research projects, including Project Astra, a prototype universal AI assistant; Project Mariner, an experimental Chrome extension that can take actions; and Jules, an AI-driven code assistant currently in development.