Anthropic Computer Use vs Project Mariner: Two Visions for AI Web Agents
Claude controls your screen through vision. Gemini controls your Chrome tab. Two different architectures, two different availability stories, one emerging category.
The question of whether AI can operate a computer like a human has moved from science fiction to active API documentation in about two years. Anthropic Computer Use and Project Mariner are the two most-discussed attempts to answer that question in 2026. They approach the same goal from different architectural foundations, have very different availability stories, and have found different audiences as a result.
This isn't a comparison between two mature, competing products. It's a comparison between a production API capability and a research preview. That distinction shapes almost every practical conclusion here.
The 30-second answer
Anthropic Computer Use is a production-ready API that lets Claude control a full desktop through vision today. Developers are building with it, teams are running it in workflows, and it handles more than just a browser. Project Mariner is Google's research-preview Chrome extension that shows what a Gemini-powered web agent can do in a real browser session. Mariner is more polished and more browser-specific in its demonstrations. Computer Use is available and buildable right now. If you need to ship something, that's the deciding factor.
What Anthropic Computer Use actually is
Anthropic Computer Use is a set of tools available in the Claude API that lets Claude interact with a computer by seeing screenshots and taking actions. You spin up a virtual machine or a container, give Claude screenshot access, and Claude can click, type, scroll, drag, open applications, run commands, and navigate the web. It's not a consumer product or a Chrome extension. It's an API capability that developers integrate into their own systems.
The architecture is vision-based. Claude looks at a screenshot of the current screen state, reasons about what's on it, decides on the next action, calls a tool (mouse click, keyboard input, scroll), and then processes the next screenshot. This loop repeats until the task is complete or Claude reaches a decision point that requires user input.
The full-desktop scope is the key capability difference from browser-only agents. Claude can open a file manager, work in a spreadsheet application, run terminal commands, switch between applications, and interact with any software that has a visible interface, not just pages in a browser. For tasks that touch multiple parts of a computer environment, this is genuinely useful.
The beta label is honest. Computer Use works well for tasks that are within Claude's visual understanding and action space, but it fails on tasks with unusual UI patterns, very small click targets, or long sequences where any single error compounds. You need to handle failures in your integration code because tasks don't always complete cleanly.
Pricing is API token-based. You pay per screenshot processed and per action taken, at standard Claude API rates. Claude 3.7 Sonnet balances cost and speed for most tasks. Claude 4 Opus handles the hardest visual reasoning challenges but costs more.
What Project Mariner actually is
Project Mariner is a research project from Google DeepMind, first demonstrated in December 2024. It runs as a Chrome extension and uses Gemini 2.5 to understand and interact with web pages in the user's active Chrome tab. The agent sees the rendered page using visual understanding, identifies what actions are available, and executes steps to complete the user's stated goal.
The Chrome-native design is both a strength and a constraint. Because Mariner runs inside the user's real Chrome session, it has access to the user's existing logins, cookies, and browser state. Tasks like "check my email for the confirmation number and fill in this form" are natural fits because the agent is already authenticated everywhere the user is. This is more convenient than running a separate browser instance that needs its own authentication setup.
Google's research demonstrations have been notably impressive. Mariner handles visually complex pages well, adapts to unusual layouts, and completes multi-step flows on real consumer websites with fewer errors than many comparable systems. This likely reflects both Gemini 2.5's strong visual reasoning and the tight integration between the agent and the Chrome rendering engine.
The limitation is access. Mariner remains a research preview with no public API, no SDK, and no timeline for general availability. Developers cannot integrate it. Businesses cannot deploy it. You can experience it only if you have been granted preview access through Google's research program.
Head-to-head: architecture
| Anthropic Computer Use | Project Mariner | |
|---|---|---|
| Scope | Full desktop + browser | Browser only (Chrome) |
| Vision approach | Screenshot-based | Rendered page visual understanding |
| Runs in user's session | No (separate environment) | Yes (active Chrome tab) |
| Model | Claude 3.7 Sonnet / Claude 4 Opus | Gemini 2.5 |
| Integration with user auth | Requires setup | Native (user's Chrome session) |
| API available | Yes | No |
The architecture gap matters for what kinds of tasks each tool is suited for. Mariner's use of the user's existing session is more convenient for personal task completion, no re-authentication, no separate browser state to manage. Computer Use's separate environment is better for programmatic workflows where you want reproducibility and isolation from the user's personal session.
Full desktop access in Computer Use is an expansion of scope that Mariner simply doesn't match. Anything that happens outside a browser window is out of Mariner's reach.
Head-to-head: reliability and failure handling
This is where both tools show their limitations, and it's worth being direct about it.
Anthropic Computer Use is genuinely useful but also genuinely brittle on hard tasks. The screenshot-to-action loop accumulates errors in long task sequences. A misclick on step 4 of a 12-step task can send the agent down a bad path. The API gives you the tools to build retry logic and human-in-the-loop checkpoints, but that complexity falls on the developer.
Anthropic recommends running Computer Use with human oversight on any task that has real-world consequences, form submissions, file deletions, purchases, emails sent. The model can hallucinate screen states, mistake one UI element for another, and proceed confidently on incorrect assumptions. The beta label reflects this reality.
Project Mariner's research demonstrations show fewer errors than you'd expect from a beta product, but research demonstrations are not the same as production performance. Google shows Mariner succeeding at carefully chosen tasks. Real-world performance on arbitrary user goals will inevitably surface failures that don't appear in curated demos.
Both tools are in a similar position: impressive, useful for specific tasks under supervision, not ready to run autonomously on high-stakes workflows without guardrails.
Head-to-head: use cases
Anthropic Computer Use is currently most useful for:
- Automated testing of desktop and web applications, where you want to verify that a UI behaves correctly without writing explicit test scripts.
- Browser automation where the target site blocks traditional scraping approaches but renders normally in a browser with standard interaction patterns.
- Multi-application workflows that span a browser, a terminal, and a file system in a single task.
- Internal tools at companies that need to automate work in web applications that don't expose an API.
Project Mariner, in its current research-preview form, is best understood as a demonstration of what's possible rather than a tool you deploy. If you have preview access, it's worth using to understand the state of the art. If you're evaluating it for production use, you're waiting for general availability.
Head-to-head: developer experience
Computer Use has real documentation, a real API, clear tool specifications, and a community of developers who have built with it. Anthropic publishes examples and usage guidance. The tool set is well-defined: computer_20241022 tool type with screenshot, click, type, scroll, and key actions. Integration requires setting up a VNC or virtual display environment, which adds infrastructure overhead but is well-understood by developers.
Project Mariner has no developer experience to evaluate. It's a research preview with a Chrome extension interface and no public SDK. Developers interested in Mariner's approach should watch the Computer Use API for Anthropic's continued iteration and watch Google's announcements for when Mariner capabilities become accessible to builders.
The competitive context
This category is moving fast. Both Anthropic and Google are investing heavily, and the gap between research preview and production capability is narrowing.
Browser Use is the open-source alternative that lets you build DOM-based browser agents with any LLM today. OpenAI Operator is OpenAI's entry in this space, a more polished consumer product than Computer Use's raw API. Skyvern offers a hosted browser automation service that handles some of the infrastructure complexity that Computer Use requires you to manage.
For developers who want full-desktop control right now, Computer Use is the only production-ready API option in the major model providers' offerings.
When Anthropic Computer Use wins
Computer Use is the right choice when you need to build a working system today. For developers automating workflows that span a browser and other desktop software, for teams automating web interactions on sites that don't have APIs, or for QA workflows that need to test real browser behavior, Computer Use is the production-ready path.
It's also the right choice when Gemini isn't your preferred model. Computer Use is model-agnostic in the sense that it's a tool set you use through Claude, if Claude's reasoning, instruction-following, and tool use are aligned with your existing stack, Computer Use fits naturally.
When Project Mariner is worth watching
Mariner is worth watching if your use case is specifically about in-session personal web task completion and you can wait for general availability. The architecture is well-suited for tasks where user authentication and browser state matter, the kind of personal assistant use cases where the agent works in your session on your behalf.
If you're planning product roadmaps around web agents and want to understand where Google is heading with Gemini-native browser capabilities, following Mariner closely is reasonable. When it ships broadly, the combination of Gemini 2.5 vision and tight Chrome integration will likely produce a capable competitor to what Computer Use offers today.
The verdict
This comparison ultimately comes down to availability. Anthropic Computer Use is a real API that developers use in production. Project Mariner is research that isn't available to build with yet.
On the technical merits, they're building toward similar outcomes from different positions. Computer Use has the broader scope, full desktop, any application, but requires more infrastructure overhead to deploy. Mariner's Chrome-native approach is cleaner for browser-specific tasks and has shown strong results in demonstrations, but you can't access it.
For 2026, the practical answer is simple: use Computer Use if you're building, and watch Mariner for when Google's browser agent capabilities become broadly available. The research preview signal is strong enough that when Mariner does ship, it will be worth a direct production comparison. Until then, Computer Use is what building in this category actually looks like.
Anthropic Computer Use
Claude's computer-use capability that powers desktop and browser agents
Paid
Read full review →Project Mariner
Google DeepMind's experimental browser agent for completing web tasks
From $20/mo
Read full review →Side-by-side comparison
| Anthropic Computer Use | Project Mariner | |
|---|---|---|
| Tagline | Claude's computer-use capability that powers desktop and browser agents | Google DeepMind's experimental browser agent for completing web tasks |
| Pricing | Paid | From $20/mo |
| Categories | autonomous, computer-use, api | autonomous, browser-agent, research |
| Made by | Anthropic | Google DeepMind |
| Launched | 2024-10 | 2024-12 |
| Platforms | macOS, Linux, Windows, API | Chrome browser |
| Status | active | active |
Anthropic Computer Use highlights
- + Screenshot capture with pixel-accurate coordinate targeting and zoom
- + Mouse control: click, drag, scroll, double-click, right-click
- + Keyboard input: type text, key combos, modifier-key chords
- + Bash tool for shell command execution alongside visual control
- + Text editor tool for direct file reads and string-replace edits
Project Mariner highlights
- + Chrome extension that takes over the active browser tab to complete multi-step tasks
- + Gemini 2.0 multimodal brain reads pixels, web elements, text, forms, and images simultaneously
- + Sandboxed execution: agent is limited to the currently active tab and cannot access other tabs or local files
- + Human-in-the-loop confirmation gates for sensitive actions such as purchases or form submissions
- + 83.5% score on the WebVoyager benchmark for end-to-end web task completion