IT Brief New Zealand - Technology news for CIOs & IT decision-makers

Google reveals sweeping AI upgrades for Gemini & launches new Beam


Google has unveiled developments in artificial intelligence (AI) across its Gemini platform alongside expansions to products such as Search, Workspace, and video communications.

Google Chief Executive Sundar Pichai delivered a wide-ranging update highlighting the company's swift pace of AI advancement, referencing significant growth metrics, new solutions, and an emphasis on making AI more accessible.

Pichai noted how the typical lead-up to annual events has changed: "Normally, you wouldn't have heard much from us in the weeks leading up to I/O, because we'd be saving up our best models for the stage. But in our Gemini era, we're just as likely to ship our most intelligent model on a Tuesday in March, or announce a really cool breakthrough like AlphaEvolve a week before."

He explained the company's objective: "We want to get our best models into your hands and our products ASAP. And so we're shipping faster than ever."

The introduction of the seventh-generation Tensor Processing Unit (TPU), called Ironwood, was highlighted as part of Google's infrastructure advancements, with Pichai noting, "Our seventh-generation TPU, Ironwood, is the first designed specifically to power thinking and inferential AI workloads at scale. It delivers 10 times the performance over the previous generation, and packs an incredible 42.5 exaflops compute per pod — just amazing."

Pichai described how improvements in infrastructure have contributed to lowering model costs while maintaining performance: "Our infrastructure strength, down to the TPU, is what helps us deliver dramatically faster models, even as model prices are coming down significantly. Over and over, we've been able to deliver the best models at the most effective price point. Not only is Google leading the Pareto Frontier, we've fundamentally shifted the frontier itself."

The company has reported a rapid increase in the adoption of its AI technology. "This time last year, we were processing 9.7 trillion tokens a month across our products and APIs. Now, we're processing over 480 trillion — that's 50 times more. Over 7 million developers are building with Gemini, five times more than this time last year, and Gemini usage on Vertex AI is up 40 times. The Gemini app now has over 400 million monthly active users. We are seeing strong growth and engagement particularly with the 2.5 series of models. For those using 2.5 Pro in the Gemini app, usage has gone up 45%," Pichai stated.

On bringing research into tangible application, Pichai said, "What all this progress means is that we're in a new phase of the AI platform shift. Where decades of research are now becoming reality for people, businesses and communities all over the world."

The progression of Project Starline was addressed, now rebranded as Google Beam. "We debuted Project Starline, our breakthrough 3D video technology, at I/O a few years back. The goal was to create a feeling of being in the same room as someone, even if you were far apart," said Pichai.

He continued, "We've continued to make technical advances. Today we're ready to introduce the next chapter: Google Beam, a new AI-first video communications platform. Beam uses a new state-of-the-art video model to transform 2D video streams into a realistic 3D experience, using an array of six cameras and AI to merge video streams together and render you on a 3D lightfield display. It has near perfect head tracking, down to the millimeter, and at 60 frames per second, all in real-time. The result is a much more natural and deeply immersive conversational experience. In collaboration with HP, the first Google Beam devices will be available for early customers later this year."

Pichai also highlighted speech translation advances in Google Meet, allowing for highly natural, cross-lingual communication: "In near real time, it can match the speaker's voice and tone, and even their expressions — bringing us closer to natural and free-flowing conversation across languages. Translation in English and Spanish is rolling out to Google AI Pro and Ultra subscribers in beta, with more languages coming in the next few weeks. This will come to Workspace business customers for early testing this year."

Gemini Live has added Project Astra's camera and screen-sharing features, broadening real-world applications. "People are using it in interesting ways, from interview preparation to marathon training. This feature is already available to all Android users and rolling out to iOS users starting today," Pichai said.

Discussing AI agents, Pichai presented developments from Project Mariner, now enabling new multitasking and learning techniques: "We think of agents as systems that combine the intelligence of advanced AI models with access to tools, so they can take actions on your behalf and under your control."

He provided detail on forthcoming availability: "We're bringing Project Mariner's computer use capabilities to developers via the Gemini API. Trusted testers like Automation Anywhere and UiPath are already starting to build with it, and it will be available more broadly this summer."

Pichai mentioned the "teach and repeat" method, where an agent shown a task once can learn to perform similar tasks in future. He also pointed to interoperability efforts: "Like our open Agent2Agent Protocol, so that agents can talk to each other, or the Model Context Protocol introduced by Anthropic, so agents can access other services. And today, we're excited to announce that our Gemini API and SDK are now compatible with MCP tools."

Agentic services will feature in the Gemini app, including a new mode for scheduling, filtering, and more, as Pichai explained: "For example, a new Agent Mode in the Gemini app will help you get even more done. If you're apartment hunting, it will help find listings that match your criteria on websites like Zillow, adjust filters and use MCP to access the listings and even schedule a tour for you. An experimental version of Agent Mode in the Gemini app will be coming soon to subscribers. And it's great for companies like Zillow, bringing in new customers and improving conversion rates."

Pichai outlined his vision for personalisation in AI, referencing "personal context" features: "With your permission, Gemini models can use relevant personal context across your Google apps in a way that is private, transparent and fully under your control."

He provided an example: "If your friend emails you for advice about a road trip that you've done in the past, Gemini can do the work of searching your past emails and files in Google Drive, such as itineraries you created in Google Docs, to suggest a response with specific details that are on point. It will match your typical greeting and capture your tone, style and even favorite word choices, all to generate a reply that's more relevant and sounds authentically like you. Personalized Smart Replies will be available for subscribers later this year."

In relation to Google Search, Pichai discussed AI Overviews and the introduction of a comprehensive AI Mode. "Since launching last year, AI Overviews have scaled to over 1.5 billion users and are now in 200 countries and territories. As people use AI Overviews, we see they're happier with their results, and they search more often. In our biggest markets like the U.S. and India, AI Overviews are driving over 10% growth in the types of queries that show them, and this growth increases over time. It's one of the most successful launches in Search in the past decade."

He elaborated on the new AI Mode: "For those who want an end-to-end AI Search experience, we're introducing an all-new AI Mode. It's a total reimagining of Search. With more advanced reasoning, you can ask AI Mode longer and more complex queries. In fact, early testers have been asking queries that are two to three times the length of traditional searches, and you can go further with follow-up questions. All of this is available as a new tab right in Search."

Pichai expressed his perspective as a user: "I've been using it a lot, and it's completely changed how I use Search. And I'm excited to share that AI Mode is coming to everyone in the U.S., starting today. With our latest Gemini models our AI responses are at the quality and accuracy you've come to expect from Search, and are the fastest in the industry. And starting this week, Gemini 2.5 is coming to Search in the U.S., as well."

The Gemini 2.5 model series received updates, with Pichai stating, "Our powerful and most efficient workhorse model, Gemini 2.5 Flash, has been incredibly popular with developers who love its speed and low cost. And the new 2.5 Flash is better in nearly every dimension — improving across key benchmarks for reasoning, multimodality, code and long context. It's second only to 2.5 Pro on the LMArena leaderboard."

He announced new features: "We're making 2.5 Pro even better by introducing an enhanced reasoning mode we're calling Deep Think. It uses our latest cutting-edge research in thinking and reasoning, including parallel thinking techniques."

Gemini app enhancements include the ability to connect with Google Drive and Gmail to support personalised research, create multimedia content through Canvas, and generate dynamic output such as infographics, quizzes, and podcasts in multiple languages. Pichai indicated, "We're making Deep Research more personal, allowing you to upload your own files and soon connect to Google Drive and Gmail, enhancing its ability to generate custom research reports. We're also integrating it with Canvas, enabling the creation of dynamic infographics, quizzes and even podcasts in numerous languages with a single click."

For creative professionals, Google announced Veo 3 for video generation with native audio capability, alongside Imagen 4 for images. These models are intended to broaden creative possibilities, and a new tool called Flow aims to support filmmakers by extending short clips into cinematic scenes.

Pichai closed his remarks by reflecting on the broader impact of AI advancements: "The opportunity with AI is truly as big as it gets. And it will be up to this wave of developers, technology builders and problem solvers to make sure its benefits reach as many people as possible. And it's especially inspiring to think about the research we're working on today that will become the foundation of tomorrow's reality, from robotics to quantum, AlphaFold and Waymo."

He added a personal anecdote: "This opportunity to improve lives is not something I take for granted. And a recent experience brought that home for me. I was in San Francisco with my parents. The first thing they wanted to do was ride in a Waymo, which I'm learning is becoming one of the city's top tourist attractions. I had taken Waymos before, but my father, who is in his 80s, was totally amazed; I saw the progress in a whole new light. It was a reminder of the incredible power of technology to inspire, to awe and to move us forward. And I can't wait to see the amazing things we'll build together next."
