Google Launches Gemini 2.5 Computer Use Model to Power Smarter, UI-Navigating AI Agents
Early testers, including Google’s own payments team, Autotab, and Poke.com, report significant speed and reliability gains
Google has unveiled the Gemini 2.5 Computer Use model, a new AI system built on Gemini 2.5 Pro that enables agents to interact directly with user interfaces.
Now available in preview through the Gemini API in Google AI Studio and Vertex AI, the model lets AI agents perform on-screen actions like clicking, typing, and form-filling — essentially using apps and websites the way humans do.
💻 Introducing Gemini 2.5 Computer Use, available in preview via the API. It builds on Gemini 2.5 Pro’s vision & reasoning capabilities to power agent interactions with UIs. It completes tasks with lower latency, & outperforms alternatives on web & mobile control benchmarks.
— Google AI Developers (@googleaidevs) October 7, 2025
Unlike models that interact with software through structured APIs, Gemini 2.5 Computer Use handles digital tasks that demand hands-on interaction with an interface, such as navigating logins or completing web forms.
The model’s new computer_use tool processes user inputs, screenshots, and past actions, then decides the next step — from clicking a button to requesting user confirmation before executing sensitive actions.
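For developers, that loop looks roughly like the sketch below. It is a minimal illustration based on Google's preview announcement, assuming the google-genai Python SDK; the preview model name and the ComputerUse tool types follow Google's documentation but may change, and take_screenshot() and execute_action() are hypothetical helpers standing in for a real browser-automation layer such as Playwright.

```python
# Minimal sketch of the computer_use agent loop. Assumed surface: the
# google-genai SDK; the model name and ComputerUse tool types follow
# Google's preview docs and may differ in the shipped API.
# take_screenshot() and execute_action() are hypothetical helpers
# backed by a browser driver (e.g. Playwright).
from google import genai
from google.genai import types

client = genai.Client()  # expects GEMINI_API_KEY in the environment

config = types.GenerateContentConfig(
    tools=[types.Tool(computer_use=types.ComputerUse(
        environment=types.Environment.ENVIRONMENT_BROWSER))],
)

def take_screenshot() -> bytes:
    """Hypothetical: capture the current browser viewport as a PNG."""
    raise NotImplementedError

def execute_action(call: types.FunctionCall) -> None:
    """Hypothetical: replay the proposed action (click, type, scroll...)."""
    raise NotImplementedError

contents = [types.Content(role="user", parts=[
    types.Part(text="Open the signup form and fill in my details."),
    types.Part.from_bytes(data=take_screenshot(), mime_type="image/png"),
])]

while True:
    response = client.models.generate_content(
        model="gemini-2.5-computer-use-preview-10-2025",  # assumed preview name
        contents=contents,
        config=config,
    )
    calls = response.function_calls or []
    if not calls:
        break  # plain-text answer: the task is finished (or was refused)
    for call in calls:
        execute_action(call)
    # Close the loop: echo the model's turn, then send a function response
    # carrying a fresh screenshot so the model can observe the effect.
    contents.append(response.candidates[0].content)
    contents.append(types.Content(role="user", parts=[
        types.Part.from_function_response(
            name=calls[0].name, response={"status": "ok"}),
        types.Part.from_bytes(data=take_screenshot(), mime_type="image/png"),
    ]))
```

Each iteration hands the model an up-to-date view of the screen, which is what lets it recover when a page loads slowly or a click lands on the wrong element.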
Google says the model outperforms leading alternatives in both accuracy and latency, citing benchmarks run by Browserbase alongside internal evaluations. While it is currently optimized for web environments, it also shows strong promise for mobile UIs.
To ensure safety, Google has embedded guardrails to prevent misuse, prompt injections, and unauthorized actions. Developers can also enforce manual confirmations for high-risk tasks.
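In practice, that confirmation step can be a simple gate in the agent loop above. The sketch below assumes a safety_decision payload attached to the proposed action, as described in Google's preview materials; the exact field names are an assumption.

```python
# Hedged sketch of a human-in-the-loop confirmation gate; the
# safety_decision field names are assumptions based on the preview docs.
def confirmed(call) -> bool:
    """Ask a human before executing actions the model flags as sensitive."""
    decision = (call.args or {}).get("safety_decision")
    if decision and decision.get("decision") == "require_confirmation":
        print(f"Model proposes {call.name}: {decision.get('explanation')}")
        return input("Proceed? [y/N] ").strip().lower() == "y"
    return True  # nothing flagged; safe to execute

# In the loop, only approved actions run:
#     for call in calls:
#         if confirmed(call):
#             execute_action(call)
```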
Early testers, including Google’s own payments team, Autotab, and Poke.com, report significant speed and reliability gains, with workflows completing up to 50% faster.