Fetch

Let AI agents read any web page, converted to clean markdown

The Fetch MCP server is Anthropic's official reference implementation for giving AI agents access to the web. It exposes a single fetch tool that takes a URL, retrieves the page over HTTP, converts the HTML to markdown, and returns the result to the agent. No browser, no JavaScript execution. Just clean, readable content the model can reason over.

When an AI agent encounters a URL in a task, the question is always the same: can it actually read that page? Most of the time you do not need a browser. You need something that fetches the HTML, strips the noise, and hands the model clean text. The Fetch MCP server does exactly that and nothing more. It is one of the official reference implementations from Anthropic, written in Python, and it is the simplest way to give an agent access to the public web.

Quick verdict

If your agents need to read documentation, check a GitHub README, pull a reference page from MDN, or grab the text of any publicly accessible static site, Fetch MCP is the right tool. Install it in two commands, configure it once, and it works. The limitation is just as clear: it does not run JavaScript. If the page you need is a React application that loads data from an API after the initial HTML lands, Fetch MCP will return a nearly empty document. Know the boundary and you will never be frustrated by it.

What the Fetch MCP server actually does

The server exposes one tool called fetch. You give it a URL, it makes an HTTP GET request from Python, takes whatever HTML comes back, runs it through an HTML-to-markdown converter, and returns the result. If Node.js is installed alongside the server, it uses a more robust HTML simplifier, which is worth having for pages with complex layouts.

The output is plain markdown. Headers become markdown headers. Lists become markdown lists. Links are preserved with their href values. Code blocks in <pre> tags survive the conversion. What the simplifier aims to strip is everything that does not help the model: navigation menus, cookie banners, analytics scripts, inline CSS, JavaScript.
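To make the conversion described above concrete, here is a minimal sketch built on Python's standard html.parser module. It is not the server's converter, which relies on its own libraries; the class name TinyMarkdown, the helper to_markdown, and the set of skipped tags are assumptions chosen for illustration, and only a handful of tags are handled.

```python
import re
from html.parser import HTMLParser

# Tags whose contents are dropped entirely in this sketch (an assumption,
# not the server's actual rule set).
SKIPPED = {"script", "style", "nav"}
HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}


class TinyMarkdown(HTMLParser):
    """Convert a small subset of HTML into markdown-like text."""

    def __init__(self):
        super().__init__()
        self.parts = []      # collected output fragments
        self.skip_depth = 0  # greater than 0 while inside a skipped tag
        self.href = ""       # href of the most recently opened <a>

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self.skip_depth += 1
        elif self.skip_depth:
            return
        elif tag in HEADINGS:
            self.parts.append("\n\n" + HEADINGS[tag])
        elif tag == "p":
            self.parts.append("\n\n")
        elif tag == "li":
            self.parts.append("\n- ")
        elif tag == "a":
            self.href = dict(attrs).get("href") or ""
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in SKIPPED:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == "a" and not self.skip_depth:
            self.parts.append(f"]({self.href})")

    def handle_data(self, data):
        # Keep visible text only, collapsing runs of whitespace.
        if not self.skip_depth and data.strip():
            self.parts.append(re.sub(r"\s+", " ", data))


def to_markdown(html):
    parser = TinyMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


sample = (
    "<nav><a href='/'>Home</a></nav><h1>Title</h1>"
    "<p>See <a href='/docs'>docs</a> now.</p>"
    "<ul><li>one</li><li>two</li></ul><script>track()</script>"
)
print(to_markdown(sample))
```

The navigation block and the script vanish, while the heading, the link, and the list items survive in markdown form. A production converter handles far more cases (tables, nested lists, <pre> blocks), which is why the server delegates this work to dedicated libraries.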

The tool accepts four parameters:

url (required): the address to retrieve
max_length (optional): character limit for the response, default 5000
start_index (optional): character position to start extraction from, default 0
raw (optional): return content without markdown conversion, useful for non-HTML documents
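As a rough illustration of how these four parameters travel over the wire, the sketch below builds an MCP tools/call request as a plain dictionary. An MCP client normally assembles this message for you; the helper name build_fetch_call and the example URL are hypothetical.

```python
import json


def build_fetch_call(url, max_length=5000, start_index=0, raw=False, request_id=1):
    """Build a JSON-RPC 'tools/call' request targeting the fetch tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "fetch",
            "arguments": {
                "url": url,
                "max_length": max_length,
                "start_index": start_index,
                "raw": raw,
            },
        },
    }


# Ask for the second 5000-character chunk of a page.
request = build_fetch_call("https://example.com/docs", start_index=5000)
print(json.dumps(request, indent=2))
```

Only url is required; the other three arguments could be omitted from the dictionary and the server would fall back to the defaults listed above.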

The combination of max_length and start_index is the key to reading long pages. If a documentation page is 20,000 characters and you want all of it, the agent makes multiple calls: first with start_index 0, then 5000, then 10000, and so on until the page is covered. This is chunked reading without any additional plumbing.
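The arithmetic behind that sequence of calls can be sketched in a few lines. The function fetch_chunk below is a stand-in that slices a local string; it is an assumption made so the example runs without a network, and a real agent would call the fetch tool instead. The real tool's response may also include framing text around the content, so a real client should not rely on chunk length alone to detect the end of a page.

```python
def chunk_offsets(total_chars, max_length=5000):
    """Return the start_index values needed to cover a page of total_chars."""
    return list(range(0, total_chars, max_length))


def read_all(fetch_chunk, max_length=5000):
    """Call fetch_chunk repeatedly until a short (or empty) chunk comes back."""
    parts, start = [], 0
    while True:
        chunk = fetch_chunk(start, max_length)
        parts.append(chunk)
        if len(chunk) < max_length:
            break
        start += max_length
    return "".join(parts)


# Stand-in for the fetch tool: slices a 20,000-character local "page".
page = "x" * 20000
fake_fetch = lambda start, length: page[start:start + length]

print(chunk_offsets(20000))  # [0, 5000, 10000, 15000]
print(len(read_all(fake_fetch)))
```

For the 20,000-character page in the example, four offsets cover the content. The loop makes one extra call that returns nothing, which is how it learns the page has ended when the length is an exact multiple of max_length.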

Installation and configuration

The server is Python-based and the recommended install path uses uvx, the package runner from the uv project:

uvx mcp-server-fetch

If you prefer pip:

pip install mcp-server-fetch
python -m mcp_server_fetch

To add it to Claude Desktop, update your claude_desktop_config.json:

{
  "mcpServers": {
    "fetch": {
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    }
  }
}

For Claude Code, the same mcpServers block goes in a .mcp.json file at the project root, or you can register the server with the claude mcp add command. Restart the client and the fetch tool appears in the agent's tool list.

Optional flags worth knowing:

--ignore-robots-txt: skip robots.txt checks for all requests
--user-agent=YourAgent: override the default user-agent string
--proxy-url=http://proxy:8080: route requests through a proxy
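If you generate client configs programmatically, the flags above simply become extra entries in the args array. The sketch below assembles a server entry in the same shape as the Claude Desktop config shown earlier; the helper name fetch_server_entry is hypothetical, and the proxy address is the placeholder from the flag list.

```python
import json


def fetch_server_entry(ignore_robots=False, user_agent=None, proxy_url=None):
    """Build the 'fetch' server entry, appending only the flags requested."""
    args = ["mcp-server-fetch"]
    if ignore_robots:
        args.append("--ignore-robots-txt")
    if user_agent:
        args.append(f"--user-agent={user_agent}")
    if proxy_url:
        args.append(f"--proxy-url={proxy_url}")
    return {"command": "uvx", "args": args}


config = {
    "mcpServers": {
        "fetch": fetch_server_entry(proxy_url="http://proxy:8080"),
    }
}
print(json.dumps(config, indent=2))
```

The printed JSON can be pasted into the client config as is. Leaving all three keyword arguments at their defaults reproduces the plain config with no flags.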

Windows users may need PYTHONIOENCODING=utf-8 set as an environment variable if they see encoding errors on pages with non-ASCII characters.

How robots.txt handling works

This is one of the more thoughtful details in the implementation. The server sends two different user-agent strings depending on how the request originated.

For model-initiated requests, where the agent decided on its own to fetch a URL, the user-agent is:

ModelContextProtocol/1.0 (Autonomous; +https://github.com/modelcontextprotocol/servers)

For user-initiated requests, where a human explicitly asked for a URL (in MCP terms, through a prompt rather than a model tool call), the user-agent is:

ModelContextProtocol/1.0 (User-Specified; +https://github.com/modelcontextprotocol/servers)

Model-initiated requests respect robots.txt by default. User-initiated requests do not. The reasoning is that a human choosing to retrieve a page is exercising their own judgment, while an autonomous agent should follow the same etiquette a well-behaved crawler would. You can disable robots.txt checking entirely with --ignore-robots-txt if you have a specific use case that requires it.
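To see what a robots.txt check amounts to, here is a minimal sketch using Python's standard urllib.robotparser. It is not the server's own implementation; the helper name allowed_by_robots and the sample rules are assumptions, and the rules are parsed from a string so the example runs offline.

```python
from urllib.robotparser import RobotFileParser

# The autonomous user-agent string quoted above.
AUTONOMOUS_UA = (
    "ModelContextProtocol/1.0 "
    "(Autonomous; +https://github.com/modelcontextprotocol/servers)"
)


def allowed_by_robots(robots_txt, url, user_agent=AUTONOMOUS_UA):
    """Return True if the given robots.txt text permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks one directory for every crawler.
SAMPLE = """User-agent: *
Disallow: /private/
"""

print(allowed_by_robots(SAMPLE, "https://example.com/docs/intro"))      # True
print(allowed_by_robots(SAMPLE, "https://example.com/private/report"))  # False
```

Under the default behavior described above, a model-initiated request to the second URL would be refused, while a user-initiated request for the same URL would skip this check.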

When Fetch MCP is enough

For AI agents doing research tasks, Fetch MCP covers a surprisingly large fraction of what they need to read. Most developer documentation is static HTML or server-rendered. GitHub READMEs, Hacker News threads, Wikipedia articles, npm package pages, PyPI documentation, official language references, most blog posts, news articles, and plain-text files at public URLs all return useful content from a basic HTTP GET.

The markdown conversion is good enough that agents can parse structure from it. A docs page with nested sections comes back with heading hierarchy intact. A changelog with a list of fixes comes back as a proper markdown list. API reference pages with code examples preserve the <pre> blocks. The agent does not need to work hard to extract meaning from the output.

For use cases like checking whether a library has a specific method, reading a framework's getting-started guide, pulling a recent blog post to summarize, or verifying a configuration option in official docs, Fetch MCP handles all of it without any browser overhead.

When you need something more

The boundary is JavaScript. Fetch MCP makes one HTTP request and works with what comes back. Any page that defers its content to client-side JavaScript will not give the agent useful output.

This covers a lot of the modern web:

Single-page applications built with React, Vue, or Angular where content is injected after the initial render
Dashboards that require authentication tokens refreshed in the browser
Sites that detect headless requests and serve a challenge page or redirect
Dynamic tables or infinite scroll lists that load data from APIs on user interaction
Paywalled content that requires a cookie from an interactive login flow

For these cases you need a browser-based MCP server. The Puppeteer MCP server controls a Chromium instance, executes JavaScript, waits for page load events, and returns the fully rendered DOM. It handles authentication flows and cookie-dependent sessions, and it can interact with UI elements. It is heavier, slower, and requires more setup than Fetch MCP. But if the page requires JavaScript to display its content, Puppeteer is the right tool.

The practical pattern is to default to Fetch MCP and escalate to a browser-based server only when content does not come through. Most research tasks that agents run stay within the static web, so Fetch MCP handles the majority without the complexity tax.
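One way to automate that escalation is a simple length heuristic on the fetched markdown. The sketch below is only an assumption about how an orchestration layer might decide; the function names and the 200-character threshold are hypothetical and would need tuning against real pages.

```python
def looks_empty(markdown, min_chars=200):
    """Heuristic: very little visible text suggests a JavaScript-rendered shell."""
    visible = "".join(markdown.split())  # ignore whitespace when counting
    return len(visible) < min_chars


def choose_tool(markdown):
    """Stay with fetch when content came through, otherwise escalate."""
    return "browser" if looks_empty(markdown) else "fetch"


print(choose_tool("# My App\n\nLoading"))
print(choose_tool("A paragraph of real documentation text. " * 20))
```

A length check will misjudge pages that are legitimately short, so treat it as a first filter and not a verdict.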

Security considerations

Two things to know before you deploy this server.

First, the server can make requests to local and internal IP addresses. If an agent running with Fetch MCP can be influenced by external content to request http://192.168.1.1 or http://localhost:8080/admin, that is a Server-Side Request Forgery (SSRF) vector. In a development setup on your own machine, this is mostly an academic concern. In a shared environment or a server-side deployment, it requires explicit attention. Either restrict what URLs the agent is permitted to construct, or deploy the server behind a network policy that blocks private address ranges.
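A URL filter in front of the tool is one way to restrict what the agent may request. The sketch below uses only the standard library and rejects literal private, loopback, and link-local addresses as well as localhost. The function name is_internal_target is hypothetical, and the check is deliberately incomplete: it does not resolve DNS names or follow redirects, so a hostname that points at an internal address would still pass. A network-level policy remains the stronger control.

```python
import ipaddress
from urllib.parse import urlsplit


def is_internal_target(url):
    """Return True when the URL's host is obviously local or private."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # refuse URLs that carry no host at all
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; this sketch does not resolve it
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved


for candidate in (
    "http://192.168.1.1",
    "http://localhost:8080/admin",
    "https://example.com/docs",
):
    print(candidate, is_internal_target(candidate))
```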

Second, the dual user-agent system means that some servers may respond differently to Fetch MCP requests than they would to a browser. Sites can detect the user-agent string and block it, redirect it, or serve a simplified version of the page. This is not a security issue. It is just something to account for when an agent reports that a page returned no content.

Comparing Fetch MCP to alternatives

Fetch MCP vs Puppeteer MCP: Fetch is a Python HTTP client with HTML conversion. Puppeteer runs a real Chromium browser. Fetch is fast, lightweight, and requires no external browser installation. Puppeteer handles JavaScript, interactive elements, and complex authentication. Use Fetch for static content, Puppeteer when the page requires JavaScript to render. See the Puppeteer MCP server for the full comparison.

Fetch MCP vs browser-use (the agent framework): Browser-use is a full agent framework built around browser automation, not just an MCP server. It handles multi-step navigation, form filling, clicking, and maintaining state across a session. Fetch MCP is a single-request tool. If your agent needs to navigate a multi-page flow to get to the content, browser-use is the right level of abstraction. If your agent needs to read a URL it already has, Fetch MCP is the right level.

Fetch MCP vs writing a custom HTTP tool: For a one-off project, you could write a small function that calls requests.get() and returns the content. Fetch MCP wraps that in the MCP standard, which means any MCP-compatible client can use it without custom code per client. Once you have it configured, it works with Claude Desktop, Claude Code, and any other MCP client you adopt later. The standardization is the value.

Using Fetch MCP in practice

The most common pattern is an agent that encounters a URL during a research task and needs to read it before continuing. With Fetch MCP configured, the agent can do this without any special instruction. It calls the fetch tool, gets back markdown, and keeps going.

For a task like "summarize the release notes for version 4.2 of this library," an agent with Fetch MCP will fetch the CHANGELOG or releases page, parse the markdown structure, find the version 4.2 section, and return a summary. No human copy-paste involved.

For multi-page documentation, the agent can follow links it finds in fetched pages. It reads the index page, sees links to sub-pages, fetches the relevant ones, and builds a complete picture from multiple requests. This works because the markdown output preserves [link text](href) format and agents can extract and act on those URLs.
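Link following comes down to pulling [text](href) pairs out of the markdown and resolving relative paths against the page's URL. Here is a minimal sketch of that step; the regular expression covers only simple inline links (it would also match image links), and the function name extract_links and the example URLs are hypothetical.

```python
import re
from urllib.parse import urljoin

# Matches simple inline markdown links: [text](href)
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def extract_links(markdown, base_url):
    """Return (text, absolute_url) pairs for each inline link found."""
    return [
        (text, urljoin(base_url, href))
        for text, href in LINK_RE.findall(markdown)
    ]


index_page = "Start with [Install](install.html), then read the [API](/api/index.html)."
for text, url in extract_links(index_page, "https://example.com/docs/index.html"):
    print(text, url)
```

An agent would then pass the resolved URLs that look relevant back into the fetch tool, one call per page.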

The chunked reading pattern matters most for long API references or specification documents. If a page returns 50,000 characters and the agent needs a specific section from near the end, it can jump to a later start_index and probe from there instead of reading the full document sequentially. Smart agents will check whether they found the relevant content and adjust start_index accordingly rather than always reading from the beginning.
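A sketch of that adjustment: scan forward from a chosen start_index, stop as soon as the target heading appears, and report its absolute offset so the next call can begin exactly there. As before, fetch_chunk is a local stand-in for the fetch tool and find_section is a hypothetical helper. A small overlap is carried between chunks so a heading split across a chunk boundary is still found.

```python
def find_section(fetch_chunk, needle, start_index=0, max_length=5000, max_calls=20):
    """Return the absolute offset of needle, or None if it is not found."""
    overlap = len(needle) - 1
    tail = ""  # last few characters of the text already scanned
    for _ in range(max_calls):
        chunk = fetch_chunk(start_index, max_length)
        if not chunk:
            return None  # ran past the end of the page
        window = tail + chunk
        pos = window.find(needle)
        if pos != -1:
            # window[0] sits len(tail) characters before start_index.
            return start_index - len(tail) + pos
        tail = window[-overlap:] if overlap > 0 else ""
        start_index += len(chunk)
    return None


# Stand-in page with the target heading 12,000 characters in.
page = "a" * 12000 + "## Version 4.2" + "b" * 3000
fake_fetch = lambda start, length: page[start:start + length]

print(find_section(fake_fetch, "## Version 4.2"))                     # 12000
print(find_section(fake_fetch, "## Version 4.2", start_index=10000))  # 12000
```

Starting at 10000 finds the heading in one call where a scan from 0 takes three, which is the saving the paragraph above describes.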

Getting started

Install uv if you do not have it, then run:

uvx mcp-server-fetch

Add the server block to your MCP client config as shown above. Restart the client. The fetch tool is now available.

To verify it works, ask your agent to fetch a URL you know well, like the documentation for the Python requests library or the README for a GitHub project you use. Check that the response is clean markdown with the content you expected. If the page is JavaScript-dependent and comes back nearly empty, that is the correct behavior: the server is working, and the page simply requires a browser to render.

For agents doing web research alongside code work, this is the first MCP server worth adding after the filesystem server and whatever version control tools you use. It has zero ongoing cost beyond the compute of making HTTP requests, and it opens a significant fraction of the public web to your agents without any browser infrastructure to maintain.

The bottom line

Fetch MCP is the smallest useful tool in the MCP ecosystem. It does one thing: retrieve a URL and return the content as markdown. The implementation is clean, the robots.txt handling is thoughtful, and the chunked reading via start_index solves the long-page problem without added complexity. It is not a browser and it does not pretend to be one. For agents that need to read the static web, that is exactly right.

Features

HTTP GET to any URL with automatic HTML-to-markdown conversion
Chunked reading via start_index for long pages
robots.txt honored by default for model-initiated requests
Dual user-agent strings distinguishing autonomous vs user-directed requests
Raw mode for non-HTML content
Configurable user-agent and proxy support

How to set up the Fetch MCP server

Install Python and uv if not already present
Run uvx mcp-server-fetch to start the server, or pip install mcp-server-fetch and then python -m mcp_server_fetch
Add the server to your Claude Desktop or Claude Code MCP config
Restart your client and the fetch tool appears

Frequently Asked Questions

What is the Fetch MCP server?

It is an official Anthropic reference server for the Model Context Protocol. It gives AI agents a single fetch tool that retrieves a URL over HTTP and returns the page content as markdown. No browser is involved. It is best suited for static pages and documentation sites.

Does Fetch MCP handle JavaScript-rendered pages?

No. The server makes a plain HTTP GET request and converts whatever HTML comes back into markdown. If a page relies on JavaScript to populate content, the agent will only see the initial HTML shell, which is often empty or incomplete. For JavaScript-heavy sites you need a browser-based MCP server like Puppeteer.

How does Fetch MCP handle robots.txt?

By default, model-initiated requests respect robots.txt. User-initiated requests (where a human explicitly provides the URL in a prompt) bypass robots.txt. You can disable robots.txt checking entirely with the --ignore-robots-txt flag when starting the server.

Can I read long web pages with Fetch MCP?

Yes. The fetch tool accepts a start_index parameter that sets the character position where extraction begins. Combined with the max_length parameter (default 5000 characters), you can read large pages in chunks by incrementing start_index across multiple calls.

Is Fetch MCP secure to run locally?

With some caveats. The server can access local IP addresses and internal network URLs, not just public websites. If an agent can be prompted to fetch internal addresses, that is a risk to consider. Run it with the same care you would give any tool that makes outbound HTTP requests.