
Firecrawl MCP Server: Deep Website Crawling for AI

Crawl and extract structured data from entire websites. Learn how to use Firecrawl MCP for comprehensive web scraping, content extraction, and site analysis.

Rajesh Praharaj

Aug 11, 2025 · Updated Dec 27, 2025

TL;DR - Firecrawl MCP Quick Start

Crawl entire websites for AI analysis - Deep extraction and structured data.

🆕 2025: Firecrawl v2.5 introduces a Semantic Index for 40% faster data access, an Agent endpoint, and AI-native search capabilities! For an introduction to MCP, see the MCP Introduction guide.

Quick Setup:

{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "fc-your-api-key-here"
      }
    }
  }
}

What you can do:

  • 🕷️ Crawl: Entire websites following links
  • 📄 Extract: Structured content and metadata
  • 🔍 Scrape: Single pages with JS rendering
  • 📊 Map: Discover all pages on a site
  • 🎯 Extract: Custom data schemas

Example conversation:

You: Crawl docs.example.com and summarize the documentation structure

Claude: Crawling docs.example.com...

        **Site Structure (47 pages)**
        
        📁 Getting Started (5 pages)
        ├── Introduction
        ├── Installation
        ├── Quick Start
        ├── Configuration
        └── First Steps
        
        📁 Core Concepts (12 pages)
        ├── Architecture
        ├── Components
        ...
        
        📁 API Reference (30 pages)
        ├── Authentication
        ├── Endpoints
        ...

💡 Requires Firecrawl API key - Get one at firecrawl.dev

🤖 AI-Native: Designed specifically for LLMs, RAG pipelines, and agentic systems. For more on RAG, see the RAG, Embeddings & Vector Databases guide.


Firecrawl vs Other Web MCPs

When to use each web-focused MCP:

| MCP | Best For | JS Rendering | Speed |
|-----|----------|--------------|-------|
| Firecrawl | Entire sites, structured extraction | ✅ Yes | Fast (parallel) |
| Fetch | Single pages, simple content | ❌ No | Fastest |
| Playwright | Interactive pages, forms, testing | ✅ Yes | Slower |

Decision Guide

Need to crawl a whole site?

    ┌────┴────┐
    ▼         ▼
   Yes        No
    │         │
    ▼         ▼
Firecrawl   Single page?

        ┌────┴────┐
        ▼         ▼
       Yes        No (interactive)
        │         │
        ▼         ▼
      Fetch    Playwright

Prerequisites

1. Firecrawl API Key

  1. Go to firecrawl.dev
  2. Sign up for an account
  3. Navigate to API Keys
  4. Create and copy your API key

Pricing Tiers:

| Tier | Pages/Month | Features |
|------|-------------|----------|
| Free | 500 | Basic crawling |
| Starter | 3,000 | Custom extraction |
| Standard | 50,000 | Priority processing |
| Scale | Unlimited | Dedicated support |

2025 Updated Pricing

| Plan | Price | Credits/Month |
|------|-------|---------------|
| Free | $0 | 500 |
| Hobby | $16/mo | 3,000 |
| Standard | $83/mo | 100,000 |
| Growth | $333/mo | 500,000 |
| Enterprise | Custom | Unlimited |

2. Node.js v18+

node --version  # Should be v18+

Installation & Configuration

Claude Desktop Setup

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "fc-xxxxxxxxxxxxx"
      }
    }
  }
}
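The server also reads a few optional environment variables, e.g. for pointing at a self-hosted Firecrawl instance or tuning retries. A hedged example follows; the variable names below match the firecrawl-mcp README at the time of writing, so verify them against the current docs before relying on them:

```json
{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "fc-xxxxxxxxxxxxx",
        "FIRECRAWL_API_URL": "https://firecrawl.your-domain.example",
        "FIRECRAWL_RETRY_MAX_ATTEMPTS": "5"
      }
    }
  }
}
```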

Cursor Setup

Add to .cursor/mcp.json:

{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "fc-xxxxxxxxxxxxx"
      }
    }
  }
}

Verify Installation

Restart your AI client and test:

You: Crawl example.com

Claude: Crawling example.com...

        **Crawl Results:**
        - Pages found: 1
        - Content extracted: Yes
        
        **Page: Example Domain**
        This domain is for use in illustrative examples...

Available Tools

Core Operations

| Tool | Description | Example Prompt |
|------|-------------|----------------|
| firecrawl_crawl | Crawl entire website | "Crawl all of docs.example.com" |
| firecrawl_scrape | Scrape single page | "Scrape the pricing page" |
| firecrawl_map | Get site map/structure | "Map all pages on example.com" |
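Under MCP, each of these tools receives a small JSON arguments object from your client. As a rough sketch of the shape (parameter names follow the firecrawl-mcp tool schemas as of this writing; check the server's tool listing for the authoritative version), a single-page scrape call might look like:

```json
{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com/pricing",
    "formats": ["markdown"]
  }
}
```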

Extraction Options

| Option | Description | Use Case |
|--------|-------------|----------|
| limit | Max pages to crawl | Large sites |
| excludePaths | Skip certain paths | Avoid duplicates |
| includePaths | Only crawl matching paths | Focus on section |
| extractSchema | Custom data extraction | Prices, contacts |
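These options combine in a single crawl call. A hedged sketch of what a focused crawl's arguments might look like (option names are taken from the table above; Firecrawl treats include/exclude paths as patterns, and exact schemas can differ between server versions):

```json
{
  "name": "firecrawl_crawl",
  "arguments": {
    "url": "https://docs.example.com",
    "limit": 50,
    "includePaths": ["/guides/.*", "/api/.*"],
    "excludePaths": ["/deprecated/.*"]
  }
}
```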

Practical Workflows

Workflow 1: Documentation Analysis

Crawl and analyze documentation sites. For more on AI research tools, see the AI Search Engines guide.

You: Crawl the React documentation and create a summary 
     of all available hooks

Claude: Crawling React docs...

        **React Hooks Summary** (from react.dev)
        
        **State Hooks:**
        - `useState` - Add state to components
        - `useReducer` - Complex state management
        
        **Context Hooks:**
        - `useContext` - Subscribe to context
        
        **Ref Hooks:**
        - `useRef` - Reference DOM elements
        - `useImperativeHandle` - Customize ref behavior
        
        **Effect Hooks:**
        - `useEffect` - Side effects
        - `useLayoutEffect` - DOM-sync effects
        - `useInsertionEffect` - CSS-in-JS
        
        **Performance Hooks:**
        - `useMemo` - Memoize values
        - `useCallback` - Memoize functions
        - `useTransition` - Non-blocking updates
        - `useDeferredValue` - Defer updates
        
        **Other Hooks:**
        - `useId` - Unique IDs
        - `useSyncExternalStore` - External stores
        - `useActionState` - Form actions
        
        Found 15 documented hooks across 47 pages.

Workflow 2: Competitive Analysis

Analyze competitor websites:

You: Crawl competitor.com and analyze their product offerings

Claude: Crawling competitor.com...

        **Competitor Analysis**
        
        **Products Found (8):**
        
        | Product | Price | Category |
        |---------|-------|----------|
        | Pro Plan | $29/mo | SaaS |
        | Team Plan | $99/mo | SaaS |
        | Enterprise | Custom | SaaS |
        | Widget Toolkit | $199 | One-time |
        | API Access | $0.01/call | Usage |
        
        **Key Features Advertised:**
        - Real-time collaboration
        - 99.9% uptime SLA
        - SOC 2 compliance
        - 24/7 support (Enterprise)
        
        **Content Strategy:**
        - 23 blog posts (SEO focus)
        - 15 case studies
        - 8 integration guides
        
        **Target Audience Signals:**
        - Enterprise language prominent
        - Security messaging strong
        - Developer-focused docs

Workflow 3: Content Inventory

Audit a website’s content:

You: Map all pages on blog.company.com and categorize 
     the content types

Claude: Mapping blog.company.com...

        **Content Inventory** (156 pages)
        
        **By Category:**
        | Category | Count | % of Total |
        |----------|-------|------------|
        | Engineering | 45 | 29% |
        | Product Updates | 38 | 24% |
        | Tutorials | 32 | 21% |
        | Company News | 25 | 16% |
        | Case Studies | 16 | 10% |
        
        **By Year:**
        - 2024: 42 posts
        - 2023: 58 posts
        - 2022: 36 posts
        - 2021: 20 posts
        
        **Top Authors:**
        - Alice Chen: 28 posts
        - Bob Developer: 24 posts
        - Carol Writer: 19 posts
        
        **Content Gaps Identified:**
        - No posts about AI/ML topics
        - Limited video content
        - Few beginner tutorials

Workflow 4: Structured Data Extraction

Extract specific data fields:

You: Crawl realestate.example.com and extract all property listings 
     with price, location, and number of bedrooms

Claude: Crawling with extraction schema...

        **Property Listings** (45 found)
        
        | Address | Price | Beds | Baths | Sqft |
        |---------|-------|------|-------|------|
        | 123 Oak St | $450,000 | 3 | 2 | 1,850 |
        | 456 Maple Ave | $325,000 | 2 | 1 | 1,200 |
        | 789 Pine Rd | $875,000 | 5 | 3 | 3,400 |
        ...
        
        **Statistics:**
        - Average price: $524,000
        - Median beds: 3
        - Price range: $250K - $1.2M
        
        **Location Breakdown:**
        - Downtown: 12 listings
        - Suburbs: 28 listings
        - Waterfront: 5 listings

Workflow 5: Site Migration Prep

Prepare for website migration. For more on AI-powered workflows, see the AI-Powered Workflows guide.

You: Crawl oldsite.com completely and create an inventory 
     for migration to the new platform

Claude: Full site crawl starting...

        **Migration Inventory** (234 pages)
        
        **Page Types:**
        | Type | Count | Action Needed |
        |------|-------|---------------|
        | Static pages | 45 | Direct migrate |
        | Blog posts | 128 | CMS import |
        | Product pages | 32 | DB migration |
        | Landing pages | 18 | Redesign |
        | Redirects needed | 11 | 301 setup |
        
        **URLs with Issues:**
        - 5 pages return 404
        - 3 pages have duplicate content
        - 7 pages missing meta descriptions
        
        **External Links Found:** 156
        - Need verification: 23 (old/potentially broken)
        
        **Images:** 892 total
        - Average size: 245KB
        - Need optimization: 234 (over 500KB)
        
        **Recommended Migration Order:**
        1. Static pages (simple)
        2. Product pages (business critical)
        3. Blog content (volume)
        4. Legacy pages (assess need)

Example Prompts

Full Site Crawling

# Complete crawl
"Crawl all of docs.example.com"
"Crawl company.com with a limit of 100 pages"
"Crawl blog.site.com but exclude /archive/*"

# Focused crawl
"Crawl only the /docs section of example.com"
"Crawl pages matching /api/* on the docs site"
"Crawl the getting-started guides only"

Site Mapping

# Structure discovery
"Map all pages on example.com"
"Show me the site structure of docs.company.io"
"What pages exist on this documentation site?"

# Link analysis
"Find all internal links on this site"
"What external sites does this page link to?"
"Show the navigation structure"

Single Page Scraping

# Content extraction
"Scrape the pricing page from competitor.com"
"Extract the main content from this article"
"Get the product details from this page"

# Specific data
"Scrape contact information from the about page"
"Extract all code examples from this tutorial"
"Get the FAQ content from the support page"

Structured Extraction

# E-commerce
"Crawl the shop and extract product names, prices, and availability"
"Find all products under $50 on this store"

# Real estate
"Extract property listings with addresses and prices"
"Get all rental listings from this site"

# Jobs
"Crawl the careers page and list all open positions"
"Extract job postings with titles and locations"

# Events
"Find all upcoming events with dates and venues"
"Extract conference schedule information"

Analysis Requests

# Content analysis
"Crawl the blog and identify the most common topics"
"Analyze the documentation structure and suggest improvements"
"Find content gaps by comparing to competitor's site"

# SEO analysis
"Crawl the site and check for missing meta descriptions"
"Find pages with duplicate content"
"Analyze internal linking patterns"

# Technical audit
"Check for broken internal links"
"Find pages with slow load indicators"
"Identify missing alt text on images"

Advanced Features

Custom Extraction Schemas

Extract specific structured data:

You: Crawl products.example.com and extract:
     - Product name
     - Price
     - SKU
     - Availability (in stock / out of stock)

Claude: Crawling with custom schema...

        **Extracted Products:**
        
        ```json
        [
          {
            "name": "Widget Pro 2000",
            "price": 149.99,
            "sku": "WP-2000",
            "availability": "in_stock"
          },
          {
            "name": "Gadget Plus",
            "price": 79.99,
            "sku": "GP-100",
            "availability": "out_of_stock"
          }
          ...
        ]
        ```

        Extracted 45 products with complete data.
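Behind the scenes, a request like this maps to a JSON Schema that Firecrawl's extraction uses to shape its output. A plausible schema for the prompt above (illustrative only; the MCP server builds its own schema from your wording):

```json
{
  "type": "object",
  "properties": {
    "products": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "price": { "type": "number" },
          "sku": { "type": "string" },
          "availability": { "type": "string", "enum": ["in_stock", "out_of_stock"] }
        },
        "required": ["name", "price"]
      }
    }
  }
}
```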


Crawl Limits and Filters

Control crawl scope:
You: Crawl docs.example.com but:
     - Limit to 50 pages max
     - Only include /guides and /api sections
     - Exclude any /deprecated pages

Claude: Crawling with filters...

        Configuration:
        - Max pages: 50
        - Include: /guides/*, /api/*
        - Exclude: /deprecated/*
        
        Found 47 matching pages...

Handling Large Sites

For enterprise sites:

You: This is a large site. Crawl it in sections:
     First, map the structure, then we'll crawl each section

Claude: Good approach for large sites!

        **Step 1: Site Map**
        Main sections found:
        - /docs (estimated 200 pages)
        - /blog (estimated 500 pages)
        - /api (estimated 150 pages)
        - /help (estimated 100 pages)
        
        Total estimated: ~950 pages
        
        Which section should I crawl first?
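Expressed as raw tool calls, that two-step pattern is a firecrawl_map followed by scoped firecrawl_crawl calls, one per section. A hedged sketch (same caveat as earlier about exact parameter names):

```json
[
  { "name": "firecrawl_map", "arguments": { "url": "https://example.com" } },
  {
    "name": "firecrawl_crawl",
    "arguments": {
      "url": "https://example.com",
      "includePaths": ["/docs/.*"],
      "limit": 200
    }
  }
]
```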

Combining with Other MCPs

Firecrawl + Filesystem

Save crawled content:

You: Crawl the documentation and save each page as markdown 
     in my docs folder

Claude: [Uses Firecrawl to crawl docs]
        [Uses Filesystem to save files]
        
        Saved 47 documentation pages:
        /docs/external/
        ├── getting-started.md
        ├── configuration.md
        ├── api-reference/
        │   ├── authentication.md
        │   └── endpoints.md
        ...

Firecrawl + Memory

Remember site structures:

You: Crawl the competitor's site and remember how it's organized

Claude: [Uses Firecrawl to analyze site]
        [Uses Memory to store structure]
        
        Memorized competitor site structure:
        - Product tiers: Free, Pro, Enterprise
        - Content sections: Blog, Docs, Resources
        - Key differentiators: Real-time sync, API access
        
        I'll remember this for comparisons!

Firecrawl + Notion

Store findings in workspace:

You: Crawl the documentation and create a summary page in Notion

Claude: [Uses Firecrawl to extract docs]
        [Uses Notion to create summary page]
        
        Created Notion page: "External API Documentation Summary"
        - 15 endpoints documented
        - Authentication patterns
        - Rate limits and quotas
        - Code examples saved

Troubleshooting

Issue: “API key invalid”

Symptoms: Authentication fails

Solutions:

| Cause | Solution |
|-------|----------|
| Wrong key | Re-copy the key from the Firecrawl dashboard |
| Key expired | Generate a new key |
| No key set | Check the env variable in your config |

Issue: “Crawl taking too long”

Symptoms: Timeout or slow progress

Solutions:

  • Add page limit: limit: 50
  • Focus on specific paths
  • Check if site is slow
  • Large sites may need multiple crawls

Issue: “Blocked by site”

Symptoms: Access denied errors

Solutions:

| Cause | Solution |
|-------|----------|
| robots.txt blocking | Check site policies |
| Rate limit | Slow down requests |
| Bot detection | May not be possible to crawl |
| IP blocked | Contact Firecrawl support |

Issue: “Missing content”

Symptoms: Pages not fully extracted

Solutions:

  • JS-heavy content should render (Firecrawl uses a headless browser)
  • Check if content is loaded dynamically after delay
  • Login-required content won’t be accessible

Best Practices

Ethical Crawling

| ✅ Do | ❌ Don’t |
|-------|----------|
| Check robots.txt first | Ignore site policies |
| Respect rate limits | Overload servers |
| Crawl public content | Scrape personal data |
| Use for legitimate purposes | Violate terms of service |

For more on responsible AI tool usage, see the Understanding AI Safety, Ethics, and Limitations guide.

Efficient Usage

| Practice | Why |
|----------|-----|
| Start with map | Understand site structure first |
| Set limits | Avoid unnecessary API usage |
| Filter paths | Focus on needed content |
| Cache results | Don’t re-crawl unchanged content |

Complementary MCP Servers

| Server | Complements Firecrawl By… |
|--------|---------------------------|
| Fetch MCP | Quick single-page fetches |
| Playwright MCP | Interactive automation |
| Filesystem MCP | Saving crawled content |
| Memory MCP | Remembering site analysis |

Summary

The Firecrawl MCP Server enables comprehensive website crawling:

  • Full site crawling with link following
  • JavaScript rendering for modern sites
  • Structured extraction for specific data
  • Site mapping for structure discovery
  • Fast parallel processing
  • AI-native - designed for LLMs and RAG

2025 Updates (v2.5):

  • Semantic Index - 40% faster, historical data access
  • Agent endpoint - for agentic AI systems
  • Stealth proxies - access difficult sites
  • AI-native search - built for LLMs

Best use cases:

  • Documentation analysis
  • Competitive research
  • Content inventories
  • Site migration prep
  • Structured data extraction

Comparison:

  • Firecrawl: Whole sites, structured data
  • Fetch: Single pages, simple content
  • Playwright: Interactive, testing

🎉 Phase 3 Complete! Continue to Phase 4 for enterprise integrations.


Questions about Firecrawl MCP? Check firecrawl.dev/docs or the Firecrawl GitHub.
