Gemini 2.5 Computer use model: A smarter AI for real tasks

Google has launched the Gemini 2.5 Computer Use model, a major upgrade that enables AI agents to interact directly with apps and websites, much as a human user would. The model lets developers automate complex web tasks, fill out forms, click buttons, and even manage workflows through the Gemini API.

Image: Gemini 2.5 Computer Use model (Source: Google)
Written By: Saumya Nigam @snigam04
New Delhi:

Google has announced the launch of its Gemini 2.5 Computer Use model, designed to enable AI systems to control and navigate graphical user interfaces (GUIs). Unlike traditional AI models that work with structured data or APIs, this version can perform human-like actions such as clicking, typing, scrolling, and submitting forms directly on websites and mobile apps.

According to Google, the new model outperforms competing models on several web and mobile automation benchmarks while responding with lower latency, which makes it faster and more practical for real-world use.

What makes it different from existing Gemini models

While earlier Gemini models focused mainly on text, vision, and reasoning, the new Computer Use model adds a new layer: the ability to interact directly with live user interfaces.

It’s powered by Gemini 2.5 Pro’s advanced visual reasoning, enabling it to understand screenshots, identify clickable buttons, and execute commands in real time.

The model operates in a loop. After every action it receives a fresh screenshot, interprets the current screen, and decides the next move, repeating the cycle until the task is complete.
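The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Google's implementation: the `Action` class, the `run_agent_loop` function, and the scripted "model" are all hypothetical stand-ins, and no real Gemini API call is made.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """A hypothetical UI action: what to do and what to do it to."""
    name: str
    target: str = ""


def run_agent_loop(
    goal: str,
    capture_screen: Callable[[], str],
    propose_action: Callable[[str, str, List[Action]], Optional[Action]],
    execute_action: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Observe the screen, ask for the next action, execute it, and repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture_screen()
        action = propose_action(goal, screenshot, history)
        if action is None:  # the stand-in model signals that the task is done
            break
        execute_action(action)
        history.append(action)
    return history


# A toy "screen" and a scripted stand-in for the model (both assumptions).
screen = {"state": "form_empty"}


def capture_screen() -> str:
    return screen["state"]


def scripted_model(goal: str, screenshot: str, history: List[Action]) -> Optional[Action]:
    if screenshot == "form_empty":
        return Action("type", "name field")
    if screenshot == "form_filled":
        return Action("click", "submit button")
    return None


def execute_action(action: Action) -> None:
    screen["state"] = "form_filled" if action.name == "type" else "submitted"


history = run_agent_loop("Submit the form", capture_screen, scripted_model, execute_action)
print([a.name for a in history])
```

The `max_steps` cap is a design choice in this sketch: it stops the loop if the task never reaches a finished state.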

Benefits to users and developers

The new Gemini model can be a game-changer for automation. Developers can now use it for:

  • UI testing: Automatically finding and fixing app interface errors.
  • Workflow automation: Performing repetitive online tasks like filling forms or sorting data.
  • Personal assistants: Building smarter agents capable of handling digital tasks independently.

Early testers, including companies like Autotab and Poke.com, reported that the Gemini 2.5 Computer Use model was up to 50 per cent faster and more accurate than competing models. Google’s internal teams are already using it to improve software testing and payment systems.

How to use the Gemini 2.5 Computer Use Model

Developers can access the new model through the Gemini API on Google AI Studio and Vertex AI. Here’s how to get started:

  • Access the public preview via Google AI Studio or Vertex AI.
  • Set up your project using Playwright or Browserbase for browser-based automation.
  • Use the computer_use tool, which takes inputs like screenshots, user requests, and action history.
  • Run the agent loop: the model proposes an action, your code executes it and sends back an updated screenshot, and the cycle continues until the task is done.
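On the developer side, the steps above mostly come down to turning each action the model suggests into a real browser call. The sketch below shows one way that could look. It is an illustration under stated assumptions: the action names (`click_at`, `type_text_at`), the 0 to 999 coordinate grid, and the `FakePage` class are used here for demonstration only, and `FakePage` simply records calls where a Playwright or Browserbase page would drive a real browser.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class FakePage:
    """Stand-in for a browser page: records calls instead of driving a browser."""
    width: int = 1440
    height: int = 900
    log: List[Tuple] = field(default_factory=list)

    def click(self, x: int, y: int) -> None:
        self.log.append(("click", x, y))

    def type_text(self, text: str) -> None:
        self.log.append(("type", text))


def denormalize(value: int, size: int, grid: int = 1000) -> int:
    """Map a coordinate on an assumed 0..grid-1 scale to a pixel offset."""
    return value * size // grid


def execute(page: FakePage, action: Dict[str, Any]) -> None:
    """Dispatch one suggested action (assumed shape: name plus args) to the page."""
    name = action["name"]
    args = action.get("args", {})
    if name == "click_at":
        page.click(denormalize(args["x"], page.width),
                   denormalize(args["y"], page.height))
    elif name == "type_text_at":
        # Focus the field first, then type into it.
        page.click(denormalize(args["x"], page.width),
                   denormalize(args["y"], page.height))
        page.type_text(args["text"])
    else:
        raise ValueError(f"unsupported action: {name}")


page = FakePage()
execute(page, {"name": "click_at", "args": {"x": 500, "y": 500}})
execute(page, {"name": "type_text_at", "args": {"x": 250, "y": 100, "text": "hello"}})
print(page.log)
```

Rejecting unknown action names with an error, instead of ignoring them, keeps the agent from silently skipping a step it was asked to perform.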

Users can also try a live demo through Browserbase to see Gemini’s automation in action.

Safety comes first

Google emphasised that safety and responsible use are at the heart of the Gemini 2.5 Computer Use model. It includes built-in safety checks, per-step verification, and developer controls that prevent harmful or high-risk actions, such as bypassing CAPTCHA or making unauthorised transactions.
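To give a sense of what a developer-side control might look like, here is a small hypothetical gate that reviews each proposed action before it runs. It is not Google's safety system: the keyword lists, the `Decision` values, and the `gate` function are assumptions made for this example, and a real deployment would need far more robust checks than keyword matching.

```python
from enum import Enum
from typing import Callable


class Decision(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"  # needs explicit approval from a human
    BLOCK = "block"


# Illustrative keyword lists only; these are assumptions, not a real policy.
BLOCKED_KEYWORDS = ("captcha",)
CONFIRM_KEYWORDS = ("pay", "purchase", "transfer")


def review_action(description: str) -> Decision:
    """Classify a proposed action from its plain-text description."""
    text = description.lower()
    if any(word in text for word in BLOCKED_KEYWORDS):
        return Decision.BLOCK
    if any(word in text for word in CONFIRM_KEYWORDS):
        return Decision.CONFIRM
    return Decision.ALLOW


def gate(description: str, confirm: Callable[[str], bool]) -> bool:
    """Return True only if the action may be executed."""
    decision = review_action(description)
    if decision is Decision.BLOCK:
        return False
    if decision is Decision.CONFIRM:
        return bool(confirm(description))
    return True


# Example: the user declines the payment, so it is not executed.
print(gate("Click the Search button", confirm=lambda d: False))
print(gate("Click Pay now", confirm=lambda d: False))
```

Running the check before every step, and not once per task, mirrors the per-step verification idea described above.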

The future of AI-powered automation

With this release, Google is taking a major step toward fully autonomous digital agents. The Gemini 2.5 Computer Use model not only enhances productivity for developers but also signals the start of a future where AI can safely and efficiently perform human-like digital tasks.

 
