Google's latest AI model, Gemini 2, is a significant step forward in transforming personal computing, web search, and our interaction with the physical world. The new version of Google's flagship AI model has been trained to plan and execute tasks on a user's computer and on the web, to chat like a person, and to make sense of the physical world, acting as a kind of virtual butler.
Gemini 2 is primarily another step up in intelligence, as measured by the benchmarks commonly used to gauge AI models. The model also has improved "multimodal" abilities, meaning it is more skilled at parsing video and audio and at conversing in speech. It can understand more about the world around you, think multiple steps ahead, and take action on your behalf under your supervision.
The company believes that so-called AI agents could be the next big leap forward for technology, revolutionizing personal computing by routinely booking flights, arranging meetings, and analyzing and organizing documents. However, getting the technology to follow open-ended commands reliably remains a challenge, with the risk of errors translating into costly and hard-to-undo mistakes.
To demonstrate Gemini 2's agentic potential, Google is introducing two specialized AI agents: one for coding and another for data science. These agents can take on more complex work, such as checking code into repositories or combining data to enable analysis. The company is also showing off Project Mariner, an experimental Chrome extension that can take over web navigation to do useful chores for users.
In one demonstration, the agent was asked to help plan a meal. It navigated to the website of the supermarket chain Sainsbury's, logged in to a user's account, and added relevant items to the shopping basket. When certain items were unavailable, the model chose suitable replacements based on its own knowledge of cooking.
Through Project Mariner, Google is also exploring how the user interface might be reimagined with AI. The company launched Gemini in December 2023 as part of an effort to catch up with OpenAI, the startup behind the wildly popular chatbot ChatGPT. With its new model, Google now offers a chatbot as capable as ChatGPT.
Gemini 2 has also been trained to make sense of its surroundings, as viewed through a smartphone camera or another device, and to converse naturally in a humanlike voice about what it sees. This is demonstrated by the new version of an experimental project called Astra, in which Gemini 2 assessed several wine bottles in view, providing geographical information, details of taste characteristics, and pricing sourced from the web.
The company believes that Astra could be the ultimate recommendation system, connecting, for example, the books you like to read with the food you like to eat. The model can not only search the web for information relevant to a user's surroundings but also remember what it has seen and heard, although users would be able to delete that data.
Google acknowledges that bringing AI into the physical world could result in unexpected behaviors. Google CEO Sundar Pichai believes that companies might be able to pay to have their products highlighted by Astra, which would open up a business model based on advertising or recommendations.
In conclusion, Gemini 2 pushes Google's AI further into personal computing, web search, and the physical world. While challenges remain, particularly in getting agents to act reliably, the new model demonstrates Google's commitment to advancing its technology and creating a more intelligent and capable virtual assistant.