web-scraping TypeScript Official

Firecrawl MCP Server

Scrape, crawl, and extract structured data from any website using Firecrawl's API inside an AI agent

The Firecrawl MCP server is the official Model Context Protocol integration for Firecrawl's web scraping and crawling API. It lets AI agents scrape individual pages, crawl entire sites, and extract structured data from web content in Markdown or JSON format. Firecrawl handles JavaScript rendering, anti-bot measures, and clean content extraction so the agent gets readable output rather than raw HTML.

Most AI agents have a web problem. Give them a URL and they either fetch raw HTML (which is mostly unusable noise) or call a search API that returns snippets without full page content. Neither is great when the agent actually needs to read and work with what is on a page.

Firecrawl solves this properly. It runs a real headless browser, strips the markup and boilerplate, and returns clean readable content. The MCP server wraps that capability so any agent can use it without API client code. For agents that need to work with real web content, this is one of the more practical integrations available.

What the server exposes

The tool surface covers five distinct scraping and crawling operations:

scrape. Fetches a single URL and returns the content as clean Markdown (or JSON if you prefer). The Markdown output removes navigation menus, sidebars, footers, ads, and cookie banners, leaving the main content. For a typical article or documentation page, the output is immediately readable and well-structured. This is what you want when you need the content of one specific page.
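To make the scrape call concrete, here is a rough sketch of the request an MCP client sends when the agent invokes a tool. The JSON-RPC tools/call method comes from the MCP specification. The tool name firecrawl_scrape and the formats and onlyMainContent arguments are assumptions for illustration, so check the tool list your client reports for the exact names.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build a scrape request for one URL.
// ASSUMPTION: the tool name and argument names below are illustrative,
// not confirmed by this article.
function buildScrapeCall(id: number, url: string): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "firecrawl_scrape",
      arguments: { url, formats: ["markdown"], onlyMainContent: true },
    },
  };
}

const scrapeRequest = buildScrapeCall(1, "https://example.com/docs/getting-started");
```

In normal use the agent and the MCP client build this request for you; the sketch only shows what travels over the wire.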

crawl. Starts at a URL and follows internal links to collect content from multiple pages under the same domain. You configure the maximum crawl depth and the maximum number of pages. The tool returns a list of page objects, each with its URL and content. Useful for indexing documentation sites, scraping a product catalog, or collecting all posts from a blog.
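The interaction between the depth limit and the page limit is easier to see on a toy example. The sketch below walks an in-memory link graph breadth-first and stops at whichever limit is hit first. It only mimics the effect of the two settings; it is not Firecrawl's crawler, and the names boundedCrawl, maxDepth, and maxPages are hypothetical.

```typescript
// A toy site: each path maps to the internal links found on that page.
type LinkGraph = Record<string, string[]>;

// Breadth-first traversal bounded by link depth and total page count.
function boundedCrawl(
  graph: LinkGraph,
  start: string,
  maxDepth: number,
  maxPages: number,
): string[] {
  const visited = new Set<string>([start]);
  const order: string[] = [];
  let frontier: string[] = [start];

  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const page of frontier) {
      if (order.length >= maxPages) return order; // page limit reached
      order.push(page);
      for (const link of graph[page] ?? []) {
        if (!visited.has(link)) {
          visited.add(link);
          next.push(link);
        }
      }
    }
    frontier = next; // pages one link further from the start
  }
  return order;
}

const site: LinkGraph = {
  "/": ["/docs", "/blog"],
  "/docs": ["/docs/api", "/docs/setup"],
  "/blog": ["/blog/post-1"],
  "/docs/api": ["/docs/api/auth"],
};
```

With this graph, boundedCrawl(site, "/", 1, 10) returns the start page plus "/docs" and "/blog", while boundedCrawl(site, "/", 3, 4) stops after four pages even though deeper pages remain.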

extract. Takes a URL and a JSON schema describing the structure you want. Firecrawl uses AI extraction to identify the matching content on the page and returns a structured object matching your schema. If you want product names and prices from an e-commerce page, or speaker names and bios from a conference site, the extract tool handles this without custom parsing logic.
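For the product example, the schema could look like the sketch below. The field names are hypothetical and should be adapted to the page you are extracting from. Because the extraction is AI-driven, a small runtime check on the returned object is a reasonable precaution; the guard shown here is one possible way to do that, not part of Firecrawl.

```typescript
// JSON Schema describing the fields we want back from a product listing page.
// ASSUMPTION: property names are illustrative.
const productSchema = {
  type: "object",
  properties: {
    products: {
      type: "array",
      items: {
        type: "object",
        properties: {
          name: { type: "string" },
          price: { type: "number" },
        },
        required: ["name", "price"],
      },
    },
  },
  required: ["products"],
};

// The TypeScript shape a result matching the schema should have.
interface Product {
  name: string;
  price: number;
}
interface ProductList {
  products: Product[];
}

// Runtime guard: verify the extracted object before trusting it.
function isProductList(value: unknown): value is ProductList {
  if (typeof value !== "object" || value === null) return false;
  const products = (value as { products?: unknown }).products;
  return (
    Array.isArray(products) &&
    products.every(
      (p) =>
        typeof p === "object" &&
        p !== null &&
        typeof (p as Product).name === "string" &&
        typeof (p as Product).price === "number",
    )
  );
}
```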

search. Runs a web search and returns the content of the top results, not just the search result links. This is meaningfully different from other search tools: instead of getting URLs the agent then has to visit, you get the actual content of those pages in one call. For research tasks where the agent needs multiple sources, this reduces the number of tool calls significantly.

map. Discovers all URLs accessible under a domain without scraping the full content of each page. The output is a list of URLs with their link structure. Useful as a first step before a targeted crawl, or for auditing what pages exist on a site.
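When map is used as a first step, the agent (or your own glue code) usually narrows the URL list before scraping anything. A minimal sketch of that filtering step, assuming the map output has already been reduced to a plain array of URL strings; selectUrls is a hypothetical helper, not a Firecrawl function.

```typescript
// Keep only URLs under a given prefix, ignoring query strings and fragments,
// and drop duplicates while preserving first-seen order.
function selectUrls(urls: string[], prefix: string): string[] {
  const seen = new Set<string>();
  for (const raw of urls) {
    const clean = raw.replace(/[?#].*$/, "");
    if (clean.startsWith(prefix)) seen.add(clean);
  }
  return Array.from(seen);
}

const mapped = [
  "https://example.com/docs/a",
  "https://example.com/docs/a#intro",
  "https://example.com/blog/x",
  "https://example.com/docs/b?ref=nav",
];
const docsOnly = selectUrls(mapped, "https://example.com/docs/");
```

Here docsOnly holds the two documentation URLs once each, which is the list you would then hand to scrape or use to bound a crawl.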

Setup

The setup is simple. You need a Firecrawl account and an API key.

Create an account at firecrawl.dev and find your API key in the dashboard. The free tier includes enough credits to start evaluating without entering payment information.

Add to Claude Desktop at ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "fc-your-api-key-here"
      }
    }
  }
}

For Claude Code, the same server block goes in a project-level .mcp.json file, or you can register the server with the claude mcp add command.

The fc- prefix is part of the Firecrawl API key format. Make sure to copy the full key including the prefix.

That is the entire setup. No local dependencies, no browser installation. Firecrawl runs the headless browser on its servers and returns processed output.

Real use cases

Research and summarization. An agent doing research on a topic can use the search tool to pull content from multiple sources, then synthesize it. The difference from using a search-only tool is that the agent gets the actual article or documentation content, not just a headline and snippet. For deep research tasks, this closes a major gap.

Competitor analysis. Scrape a competitor's pricing page, product list, or documentation and ask the agent to compare it to yours. The extract tool works well here if the information has a consistent structure across pages. Map the site first to understand what content exists, then scrape the pages you care about.

Documentation indexing. Crawl an entire documentation site and ask the agent to summarize it, find gaps, identify outdated content, or extract all API endpoint descriptions. A documentation crawl with the depth limit set to 3 or 4 will pull most of what a typical docs site contains in a few minutes.

Lead and contact research. Company websites often have structured contact and team information. The extract tool with an appropriate schema can pull names, titles, and contact information from an about page or team page. Useful for sales research or building outreach lists.

Content monitoring. Scrape a specific page and compare the result to a previous scrape. The agent can identify what changed (prices, product availability, policy text, contact information) when you give it two versions to compare. Without Firecrawl, this would require storing and diffing raw HTML.
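If you prefer to pre-compute the comparison rather than hand the agent two full documents, a line-level diff of the two Markdown snapshots is often enough. The sketch below is deliberately simple: it treats each snapshot as a set of non-empty trimmed lines, so it ignores line order and repeated lines. The function name and sample content are made up for illustration.

```typescript
interface LineDiff {
  added: string[];
  removed: string[];
}

// Compare two Markdown snapshots as sets of non-empty, trimmed lines.
function diffSnapshots(previous: string, current: string): LineDiff {
  const toLines = (text: string): string[] =>
    text
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0);
  const before = new Set(toLines(previous));
  const after = new Set(toLines(current));
  return {
    added: Array.from(after).filter((line) => !before.has(line)),
    removed: Array.from(before).filter((line) => !after.has(line)),
  };
}

const lastWeek = "# Pricing\n\nStarter: $19/month\nPro: $49/month";
const today = "# Pricing\n\nStarter: $19/month\nPro: $59/month\nEnterprise: contact us";
const changes = diffSnapshots(lastWeek, today);
```

For these two snapshots, changes.removed contains the old Pro price and changes.added contains the new Pro price and the Enterprise line. Because the input is clean Markdown rather than raw HTML, markup-only changes do not show up as noise.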

E-commerce data collection. Product pages typically have structured data (name, price, description, specifications) that the extract tool handles well. You can pull product information from multiple pages without writing custom scrapers for each site's markup.

Knowledge base building. For teams building internal RAG systems or knowledge bases from public documentation, Firecrawl MCP lets an agent crawl a target site, pull the content, and pipe it into whatever storage system you have configured. Pairing it with Filesystem MCP lets the agent write the crawled content to local files for further processing.

Comparing to the Fetch MCP server

The Fetch MCP server does simple HTTP requests. It is lightweight, free, has no rate limits, and works well for pages that return clean HTML with static content.

Firecrawl is the right tool when:

The page content is loaded by JavaScript (React, Vue, Angular apps, dynamic tables)
You want clean Markdown rather than raw HTML
You need to crawl multiple pages from one starting URL
You want structured data extraction from unstructured page content
The site has anti-bot measures that block simple HTTP requests

The tradeoffs: Firecrawl costs money above the free tier, introduces API latency (headless browser rendering takes longer than a simple HTTP request), and routes your requests through Firecrawl's servers rather than directly. For content that works fine with a plain HTTP fetch, the Fetch MCP server is faster and simpler.

Use both: Fetch for simple cases, Firecrawl when you actually need what it provides.
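The decision rule above can be written down as a small helper, which is handy if you route requests in your own code rather than leaving the choice to the agent. This is only a sketch of the criteria listed in this section; the type and function names are hypothetical.

```typescript
// One flag per criterion from the list above.
interface PageNeeds {
  rendersWithJavaScript: boolean;
  wantsCleanMarkdown: boolean;
  multiPage: boolean;
  structuredExtraction: boolean;
  blocksPlainHttp: boolean;
}

type Tool = "fetch" | "firecrawl";

// Prefer the plain HTTP fetch unless at least one criterion calls for Firecrawl.
function chooseTool(needs: PageNeeds): Tool {
  const needsFirecrawl =
    needs.rendersWithJavaScript ||
    needs.wantsCleanMarkdown ||
    needs.multiPage ||
    needs.structuredExtraction ||
    needs.blocksPlainHttp;
  return needsFirecrawl ? "firecrawl" : "fetch";
}

const staticPage: PageNeeds = {
  rendersWithJavaScript: false,
  wantsCleanMarkdown: false,
  multiPage: false,
  structuredExtraction: false,
  blocksPlainHttp: false,
};
```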

Responsible use

Web scraping exists in a legally and ethically complicated space. A few things are worth being clear about:

Check robots.txt. Firecrawl respects robots.txt by default, but the MCP server does not enforce this independently. Be aware of what a site's robots.txt says before crawling it aggressively.
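If you want your own pre-flight check before starting a crawl, a simplified robots.txt reader might look like the sketch below. It only handles Disallow prefixes in groups addressed to all user agents and ignores Allow rules, wildcards, and agent-specific groups, so treat it as an illustration rather than a compliant parser. Both function names are hypothetical.

```typescript
// Collect Disallow path prefixes that apply to all user agents ("*").
function disallowedPrefixes(robotsTxt: string): string[] {
  const prefixes: string[] = [];
  let appliesToAll = false;
  let lastWasAgent = false;
  for (const rawLine of robotsTxt.split("\n")) {
    const line = rawLine.replace(/#.*$/, "").trim(); // strip comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      // Consecutive User-agent lines belong to the same group.
      appliesToAll = lastWasAgent ? appliesToAll || value === "*" : value === "*";
      lastWasAgent = true;
    } else {
      lastWasAgent = false;
      if (field === "disallow" && appliesToAll && value !== "") {
        prefixes.push(value);
      }
    }
  }
  return prefixes;
}

// True when no collected Disallow prefix matches the start of the path.
function isPathAllowed(robotsTxt: string, path: string): boolean {
  return !disallowedPrefixes(robotsTxt).some((prefix) => path.startsWith(prefix));
}

const sampleRobots = [
  "User-agent: *",
  "Disallow: /admin/",
  "Disallow: /cart",
  "",
  "User-agent: BadBot",
  "Disallow: /",
].join("\n");
```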

Rate limiting. Aggressive crawls can place meaningful load on small sites. Firecrawl has built-in rate limiting, but for very large crawls you should configure reasonable page limits and not run bulk crawls against sites where the traffic would be noticeable.

Terms of service. Many sites prohibit automated scraping in their ToS. Scraped data used for training, redistribution, or commercial purposes carries additional risks. This is a judgment call that depends on the specific site and your intended use.

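API key security. Your Firecrawl API key is tied to your billing account. Keep it out of logs and out of any config file that is committed to version control. The Claude Desktop config shown in the setup section stores the key inline in its env block, so treat that file as a secret; where your client or launcher allows it, supply the key from an environment variable instead.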
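In scripts that sit around the MCP server, two small habits help: read the key from the environment, and mask it before anything is logged. The sketch below shows one way to do both. Only the FIRECRAWL_API_KEY variable name and the fc- prefix come from this article; the helper names and the masking format are assumptions.

```typescript
// Read the key from an environment-like object (pass process.env in Node).
function readFirecrawlKey(env: Record<string, string | undefined>): string {
  const key = env["FIRECRAWL_API_KEY"];
  if (!key || !key.startsWith("fc-")) {
    throw new Error("FIRECRAWL_API_KEY is missing or lacks the fc- prefix");
  }
  return key;
}

// Mask the key so that only the prefix and last four characters can reach a log.
function maskKey(key: string): string {
  return key.length <= 7 ? "fc-****" : `${key.slice(0, 3)}****${key.slice(-4)}`;
}
```

For example, maskKey("fc-abcdef123456") yields "fc-****3456", which is enough to tell keys apart in a log without exposing them.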

Pairing with other servers

Firecrawl MCP pairs naturally with storage and processing tools.

Combined with Filesystem MCP, the agent can crawl a site, clean the content, and write the output to a local directory for offline use or further processing.
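One detail worth settling before writing crawled pages to disk is how URLs become filenames. The sketch below plans the files without touching the filesystem, leaving the actual write to whatever tool you use. The CrawledPage shape and both helper names are hypothetical; the real crawl output may carry more fields.

```typescript
// ASSUMPTION: a crawled page reduced to its URL and Markdown content.
interface CrawledPage {
  url: string;
  markdown: string;
}

// Turn a page URL into a safe, flat Markdown filename.
function toFilename(url: string): string {
  const withoutScheme = url.replace(/^https?:\/\//, "").replace(/[?#].*$/, "");
  const slug = withoutScheme.replace(/[^a-zA-Z0-9]+/g, "-").replace(/^-+|-+$/g, "");
  return `${slug || "index"}.md`;
}

// Pair each page with a target path and record the source URL at the top.
function planFiles(
  pages: CrawledPage[],
  outDir: string,
): { path: string; content: string }[] {
  return pages.map((page) => ({
    path: `${outDir}/${toFilename(page.url)}`,
    content: `Source: ${page.url}\n\n${page.markdown}`,
  }));
}

const planned = planFiles(
  [{ url: "https://example.com/docs/api/auth?x=1", markdown: "# Auth" }],
  "crawl-output",
);
```

Keeping the source URL in each file makes it possible to trace a passage back to its page later. Distinct URLs can still map to the same slug, so a real pipeline would need a collision check.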

Combined with Memory MCP, scraped content can be stored in a persistent memory graph that the agent queries across sessions. The agent effectively builds a local index of crawled content.

For research workflows, pairing Firecrawl with Brave Search MCP gives the agent both the ability to find relevant URLs and the ability to read their full content. Search finds the pages; Firecrawl reads them.

Bottom line

For agents that need to work with real web content, Firecrawl MCP is one of the most capable options available. The combination of JavaScript rendering, clean Markdown output, multi-page crawling, and structured data extraction covers the main reasons simple HTTP fetching falls short.

The free tier is enough to evaluate it seriously. For any agent workflow that involves research, content monitoring, or data extraction from the web, this belongs in the config.

Features

Scrape single URLs to clean Markdown or structured JSON
Crawl entire websites with configurable depth and page limits
Extract structured data using a JSON schema from any page
Search the web and scrape results in one call
Map site structure by discovering all URLs under a domain
Handles JavaScript-rendered pages transparently
Returns clean Markdown without boilerplate, ads, or navigation
Authenticates via Firecrawl API key

How to set up the Firecrawl MCP server

Create an account at firecrawl.dev and copy your API key from the dashboard
Add the server block to your Claude Desktop or Claude Code MCP config
Set FIRECRAWL_API_KEY in the env block
Restart your MCP client and verify Firecrawl tools appear in the tool list
Test by asking the agent to scrape a URL and return it as Markdown

Frequently Asked Questions

What is the Firecrawl MCP server?

It is the official MCP integration for Firecrawl's scraping and crawling API. It gives AI agents tools to fetch and clean web content, crawl sites, extract structured data from pages, and map site structure. Firecrawl handles JavaScript rendering and content cleaning, so the agent receives readable output rather than raw HTML.

How does Firecrawl differ from the Fetch MCP server?

The Fetch MCP server does a simple HTTP GET and returns the raw or lightly processed response. Firecrawl does significantly more: it renders JavaScript, bypasses common anti-bot measures, strips navigation and ads, returns clean Markdown, and can crawl multi-page sites or extract structured JSON from pages. Firecrawl is the right tool when the content you want is loaded by JavaScript or when you need clean, usable output rather than raw HTML.

Does Firecrawl MCP handle JavaScript-rendered pages?

Yes. Firecrawl runs a headless browser on the server side and returns the rendered content. This is the key advantage over simple HTTP scraping tools. Pages that require JavaScript to display their main content (single-page apps, dynamic tables, infinite scroll lists) work correctly with Firecrawl.

What's the difference between the scrape and crawl tools?

The scrape tool fetches a single URL and returns its content. The crawl tool starts at a URL and follows links to discover and scrape multiple pages under the same domain, up to a configurable depth and page limit. Use scrape when you need one specific page; use crawl when you need content from an entire site or section.

Does the Firecrawl MCP server require a paid plan?

Firecrawl has a free tier with a monthly credit limit. For light agent use (a few scrapes per session), the free tier is usually sufficient. Heavier crawls and high-volume use require a paid plan. Check firecrawl.dev for current pricing.

Can the agent extract structured data from pages?

Yes. The extract tool accepts a URL and a JSON schema describing the data structure you want. Firecrawl uses AI extraction to map the page content to your schema and returns a structured JSON object. This is useful for pulling product data, contact information, pricing tables, or any consistently structured content from pages.