After OpenAI's GPT-4o launch, Google announces its competitor, Project Astra

Project Astra rivals OpenAI’s GPT-4o by offering a similar model capable of understanding and generating content across various modalities.

15 May 2024 08:56 IST

Google has unveiled Project Astra at this year’s Google I/O, a day after OpenAI announced its new GPT-4o model. OpenAI’s move, strategically timed ahead of Google’s major event, did not catch the tech behemoth off guard. Instead, Google was ready with a powerful counter, showcasing its advancements in AI technology through its Gemini suite, including the new and formidable multi-modal AI model Project Astra.

The announcement came with a demonstration that quickly drew attention on the social media platform X (formerly Twitter). In a clear response to OpenAI’s latest release, Google showed how its own AI, Gemini, could analyse a room and make intelligent guesses about the activities taking place in it. The feature mirrors the capabilities OpenAI showed with its latest ChatGPT model, suggesting head-to-head competition in multi-modal AI technology.

Project Astra is the centrepiece of Google’s recent AI announcements. It aims to rival OpenAI’s GPT-4o by offering a similarly sophisticated model capable of understanding and generating content across various modalities. However, Astra is only one of many new developments within the Gemini portfolio that Google showcased during the event.

A notable introduction is the Gemini 1.5 Flash model. This model is designed to perform common tasks like summarisation and captioning at a much faster rate, addressing the increasing demand for speed and efficiency in AI operations. Speed is also a key feature of another new model, Gemini Nano, which is optimised for use on local devices such as smartphones. Google claims that Nano’s performance surpasses previous iterations, making it the fastest model for on-device applications.

The tech giant also introduced Veo, a model capable of generating videos from text prompts.

In addition to these models, Google has expanded the context window of Gemini 1.5 Pro. The context window, which determines how much information the model can process in a single query, has been doubled from 1 million to 2 million tokens. This enhancement allows the model to handle more complex instructions and provide more detailed responses, further solidifying its position as a top-tier AI model.

To enhance user interaction with these advanced models, Google has introduced Gemini Live. This new product is a voice-only assistant designed for seamless conversational interactions. Users can engage in natural, back-and-forth dialogue with the AI, even interrupting it if it becomes too verbose or revisiting earlier parts of the conversation. This feature promises a more user-friendly and intuitive experience, making AI assistance more accessible to a broader audience.

Another innovative feature announced at Google I/O is the new capability in Google Lens, which now allows users to search the web by recording and narrating a video. This integration of video input into web search represents a significant leap in how users can interact with and utilise search technology, providing a more dynamic and engaging way to find information online.
