## TL;DR - Firecrawl MCP Quick Start
Crawl entire websites for AI analysis, with deep extraction and structured data output.
🆕 2025: Firecrawl v2.5 introduces a Semantic Index for 40% faster data access, an Agent endpoint, and AI-native search capabilities! For an introduction to MCP, see the MCP Introduction guide.
Quick Setup:
```json
{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "fc-your-api-key-here"
      }
    }
  }
}
```
What you can do:
- 🕷️ Crawl: Entire websites following links
- 📄 Extract: Structured content and metadata
- 🔍 Scrape: Single pages with JS rendering
- 📊 Map: Discover all pages on a site
- 🎯 Extract: Custom data schemas
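All of these operations wrap Firecrawl's REST API, so you can reproduce them outside the MCP server too. A minimal sketch of a single-page scrape, assuming the v1 endpoint shape (`POST https://api.firecrawl.dev/v1/scrape`); field names may differ between API versions, so treat this as illustrative:
```typescript
// Sketch: scraping a single page through Firecrawl's REST API.
// Assumes the v1 endpoint and { data: { markdown } } response shape;
// check the current Firecrawl docs for exact field names.
async function scrapePage(url: string): Promise<string> {
  const res = await fetch("https://api.firecrawl.dev/v1/scrape", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.FIRECRAWL_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ url, formats: ["markdown"] }),
  });
  if (!res.ok) throw new Error(`Scrape failed: HTTP ${res.status}`);
  const json = await res.json();
  return json.data?.markdown ?? ""; // markdown content of the page
}

scrapePage("https://example.com").then((md) => console.log(md.slice(0, 200)));
```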
Example conversation:
```text
You: Crawl docs.example.com and summarize the documentation structure

Claude: Crawling docs.example.com...

**Site Structure (47 pages)**

📁 Getting Started (5 pages)
├── Introduction
├── Installation
├── Quick Start
├── Configuration
└── First Steps

📁 Core Concepts (12 pages)
├── Architecture
├── Components
...

📁 API Reference (30 pages)
├── Authentication
├── Endpoints
...
```
💡 Requires Firecrawl API key - Get one at firecrawl.dev
🤖 AI-Native: Designed specifically for LLMs, RAG pipelines, and agentic systems. For more on RAG, see the RAG, Embeddings & Vector Databases guide.
## Firecrawl vs Other Web MCPs
When to use each web-focused MCP:
| MCP | Best For | JS Rendering | Speed |
|---|---|---|---|
| Firecrawl | Entire sites, structured extraction | ✅ Yes | Fast (parallel) |
| Fetch | Single pages, simple content | ❌ No | Fastest |
| Playwright | Interactive pages, forms, testing | ✅ Yes | Slower |
### Decision Guide
```text
Need to crawl a whole site?
           │
      ┌────┴────┐
      ▼         ▼
     Yes        No
      │          │
      ▼          ▼
 Firecrawl   Single page?
                 │
            ┌────┴────┐
            ▼         ▼
           Yes   No (interactive)
            │         │
            ▼         ▼
          Fetch   Playwright
```
## Prerequisites
1. Firecrawl API Key
- Go to firecrawl.dev
- Sign up for an account
- Navigate to API Keys
- Create and copy your API key
Pricing Tiers:
| Tier | Pages/Month | Features |
|---|---|---|
| Free | 500 | Basic crawling |
| Starter | 3,000 | Custom extraction |
| Standard | 50,000 | Priority processing |
| Scale | Unlimited | Dedicated support |
2025 Updated Pricing
| Plan | Price | Credits/Month |
|---|---|---|
| Free | $0 | 500 |
| Hobby | $16/mo | 3,000 |
| Standard | $83/mo | 100,000 |
| Growth | $333/mo | 500,000 |
| Enterprise | Custom | Unlimited |
2. Node.js v18+
```bash
node --version  # Should be v18+
```
## Installation & Configuration
### Claude Desktop Setup
Add to `claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "fc-xxxxxxxxxxxxx"
      }
    }
  }
}
```
### Cursor Setup
Add to `.cursor/mcp.json`:
```json
{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "fc-xxxxxxxxxxxxx"
      }
    }
  }
}
```
### Verify Installation
Restart your AI client and test:
```text
You: Crawl example.com

Claude: Crawling example.com...

**Crawl Results:**
- Pages found: 1
- Content extracted: Yes

**Page: Example Domain**
This domain is for use in illustrative examples...
```
## Available Tools
### Core Operations
| Tool | Description | Example Prompt |
|---|---|---|
| `firecrawl_crawl` | Crawl entire website | "Crawl all of docs.example.com" |
| `firecrawl_scrape` | Scrape single page | "Scrape the pricing page" |
| `firecrawl_map` | Get site map/structure | "Map all pages on example.com" |
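For context on what your client sends when you type those prompts: MCP tools are invoked with a standard JSON-RPC `tools/call` message. A sketch of a `firecrawl_scrape` call; the argument names here are assumptions about the firecrawl-mcp server's schema, so verify them with a `tools/list` request:
```typescript
// Sketch: the JSON-RPC message an MCP client sends to invoke a tool.
// "tools/call" is the standard MCP method; the tool's argument names
// (url, formats) are assumed - check the server's published schema.
const toolCall = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "firecrawl_scrape",
    arguments: {
      url: "https://example.com/pricing",
      formats: ["markdown"], // assumed option name
    },
  },
};
console.log(JSON.stringify(toolCall, null, 2));
```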
### Extraction Options
| Option | Description | Use Case |
|---|---|---|
| `limit` | Max pages to crawl | Large sites |
| `excludePaths` | Skip certain paths | Avoid duplicates |
| `includePaths` | Only crawl matching paths | Focus on a section |
| `extractSchema` | Custom data extraction | Prices, contacts |
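These options translate directly into crawl parameters. A hedged sketch of a filtered crawl request against the v1 REST API (option names like `includePaths`/`excludePaths` follow the v1 crawl endpoint as I understand it; confirm against the current Firecrawl docs):
```typescript
// Sketch: starting a filtered crawl job via the v1 REST API.
// Crawl jobs are asynchronous: the POST returns a job id to poll later.
const crawlRequest = {
  url: "https://docs.example.com",
  limit: 50,                         // max pages to crawl
  includePaths: ["/guides", "/api"], // only crawl matching sections
  excludePaths: ["/deprecated"],     // skip stale content
  scrapeOptions: { formats: ["markdown"] },
};

const res = await fetch("https://api.firecrawl.dev/v1/crawl", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.FIRECRAWL_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify(crawlRequest),
});
const { id } = await res.json(); // poll GET /v1/crawl/{id} with this
```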
## Practical Workflows
### Workflow 1: Documentation Analysis
Crawl and analyze documentation sites. For more on AI research tools, see the AI Search Engines guide.
```text
You: Crawl the React documentation and create a summary
of all available hooks

Claude: Crawling React docs...

**React Hooks Summary** (from react.dev)

**State Hooks:**
- `useState` - Add state to components
- `useReducer` - Complex state management

**Context Hooks:**
- `useContext` - Subscribe to context

**Ref Hooks:**
- `useRef` - Reference DOM elements
- `useImperativeHandle` - Customize ref behavior

**Effect Hooks:**
- `useEffect` - Side effects
- `useLayoutEffect` - DOM-sync effects
- `useInsertionEffect` - CSS-in-JS

**Performance Hooks:**
- `useMemo` - Memoize values
- `useCallback` - Memoize functions
- `useTransition` - Non-blocking updates
- `useDeferredValue` - Defer updates

**Other Hooks:**
- `useId` - Unique IDs
- `useSyncExternalStore` - External stores
- `useActionState` - Form actions

Found 15 documented hooks across 47 pages.
```
### Workflow 2: Competitive Analysis
Analyze competitor websites:
```text
You: Crawl competitor.com and analyze their product offerings

Claude: Crawling competitor.com...

**Competitor Analysis**

**Products Found (8):**
| Product | Price | Category |
|---------|-------|----------|
| Pro Plan | $29/mo | SaaS |
| Team Plan | $99/mo | SaaS |
| Enterprise | Custom | SaaS |
| Widget Toolkit | $199 | One-time |
| API Access | $0.01/call | Usage |

**Key Features Advertised:**
- Real-time collaboration
- 99.9% uptime SLA
- SOC 2 compliance
- 24/7 support (Enterprise)

**Content Strategy:**
- 23 blog posts (SEO focus)
- 15 case studies
- 8 integration guides

**Target Audience Signals:**
- Enterprise language prominent
- Security messaging strong
- Developer-focused docs
```
### Workflow 3: Content Inventory
Audit a website’s content:
```text
You: Map all pages on blog.company.com and categorize
the content types

Claude: Mapping blog.company.com...

**Content Inventory** (156 pages)

**By Category:**
| Category | Count | % of Total |
|----------|-------|------------|
| Engineering | 45 | 29% |
| Product Updates | 38 | 24% |
| Tutorials | 32 | 21% |
| Company News | 25 | 16% |
| Case Studies | 16 | 10% |

**By Year:**
- 2024: 42 posts
- 2023: 58 posts
- 2022: 36 posts
- 2021: 20 posts

**Top Authors:**
- Alice Chen: 28 posts
- Bob Developer: 24 posts
- Carol Writer: 19 posts

**Content Gaps Identified:**
- No posts about AI/ML topics
- Limited video content
- Few beginner tutorials
```
### Workflow 4: Structured Data Extraction
Extract specific data fields:
```text
You: Crawl realestate.example.com and extract all property listings
with price, location, and number of bedrooms

Claude: Crawling with extraction schema...

**Property Listings** (45 found)

| Address | Price | Beds | Baths | Sqft |
|---------|-------|------|-------|------|
| 123 Oak St | $450,000 | 3 | 2 | 1,850 |
| 456 Maple Ave | $325,000 | 2 | 1 | 1,200 |
| 789 Pine Rd | $875,000 | 5 | 3 | 3,400 |
...

**Statistics:**
- Average price: $524,000
- Median beds: 3
- Price range: $250K - $1.2M

**Location Breakdown:**
- Downtown: 12 listings
- Suburbs: 28 listings
- Waterfront: 5 listings
```
### Workflow 5: Site Migration Prep
Prepare for website migration. For more on AI-powered workflows, see the AI-Powered Workflows guide.
```text
You: Crawl oldsite.com completely and create an inventory
for migration to the new platform

Claude: Full site crawl starting...

**Migration Inventory** (234 pages)

**Page Types:**
| Type | Count | Action Needed |
|------|-------|---------------|
| Static pages | 45 | Direct migrate |
| Blog posts | 128 | CMS import |
| Product pages | 32 | DB migration |
| Landing pages | 18 | Redesign |
| Redirects needed | 11 | 301 setup |

**URLs with Issues:**
- 5 pages return 404
- 3 pages have duplicate content
- 7 pages missing meta descriptions

**External Links Found:** 156
- Need verification: 23 (old/potentially broken)

**Images:** 892 total
- Average size: 245KB
- Need optimization: 234 (over 500KB)

**Recommended Migration Order:**
1. Static pages (simple)
2. Product pages (business critical)
3. Blog content (volume)
4. Legacy pages (assess need)
```
## Example Prompts
### Full Site Crawling
```text
# Complete crawl
"Crawl all of docs.example.com"
"Crawl company.com with a limit of 100 pages"
"Crawl blog.site.com but exclude /archive/*"

# Focused crawl
"Crawl only the /docs section of example.com"
"Crawl pages matching /api/* on the docs site"
"Crawl the getting-started guides only"
```
### Site Mapping
```text
# Structure discovery
"Map all pages on example.com"
"Show me the site structure of docs.company.io"
"What pages exist on this documentation site?"

# Link analysis
"Find all internal links on this site"
"What external sites does this page link to?"
"Show the navigation structure"
```
### Single Page Scraping
```text
# Content extraction
"Scrape the pricing page from competitor.com"
"Extract the main content from this article"
"Get the product details from this page"

# Specific data
"Scrape contact information from the about page"
"Extract all code examples from this tutorial"
"Get the FAQ content from the support page"
```
### Structured Extraction
```text
# E-commerce
"Crawl the shop and extract product names, prices, and availability"
"Find all products under $50 on this store"

# Real estate
"Extract property listings with addresses and prices"
"Get all rental listings from this site"

# Jobs
"Crawl the careers page and list all open positions"
"Extract job postings with titles and locations"

# Events
"Find all upcoming events with dates and venues"
"Extract conference schedule information"
```
### Analysis Requests
```text
# Content analysis
"Crawl the blog and identify the most common topics"
"Analyze the documentation structure and suggest improvements"
"Find content gaps by comparing to competitor's site"

# SEO analysis
"Crawl the site and check for missing meta descriptions"
"Find pages with duplicate content"
"Analyze internal linking patterns"

# Technical audit
"Check for broken internal links"
"Find pages with slow load indicators"
"Identify missing alt text on images"
```
## Advanced Features
### Custom Extraction Schemas
Extract specific structured data:
````text
You: Crawl products.example.com and extract:
- Product name
- Price
- SKU
- Availability (in stock / out of stock)

Claude: Crawling with custom schema...

**Extracted Products:**
```json
[
  {
    "name": "Widget Pro 2000",
    "price": 149.99,
    "sku": "WP-2000",
    "availability": "in_stock"
  },
  {
    "name": "Gadget Plus",
    "price": 79.99,
    "sku": "GP-100",
    "availability": "out_of_stock"
  }
  ...
]
```

Extracted 45 products with complete data.
````
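Under the hood, requests like this attach a JSON Schema describing the fields to extract. A sketch of what that might look like in a raw API call; the exact option names (the `extract` format, `extract.schema`) have changed across Firecrawl versions, so treat this purely as an illustration:
```typescript
// Sketch: structured extraction via a JSON Schema.
// The "extract" format and extract.schema option names are assumptions;
// consult the current Firecrawl docs before relying on them.
const extractRequest = {
  url: "https://products.example.com/catalog",
  formats: ["extract"],
  extract: {
    schema: {
      type: "object",
      properties: {
        products: {
          type: "array",
          items: {
            type: "object",
            properties: {
              name: { type: "string" },
              price: { type: "number" },
              sku: { type: "string" },
              availability: {
                type: "string",
                enum: ["in_stock", "out_of_stock"],
              },
            },
            required: ["name", "price"],
          },
        },
      },
    },
  },
};
```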
### Crawl Limits and Filters
Control crawl scope:
```text
You: Crawl docs.example.com but:
- Limit to 50 pages max
- Only include /guides and /api sections
- Exclude any /deprecated pages

Claude: Crawling with filters...

Configuration:
- Max pages: 50
- Include: /guides/*, /api/*
- Exclude: /deprecated/*

Found 47 matching pages...
```
### Handling Large Sites
For enterprise sites:
```text
You: This is a large site. Crawl it in sections:
First, map the structure, then we'll crawl each section

Claude: Good approach for large sites!

**Step 1: Site Map**
Main sections found:
- /docs (estimated 200 pages)
- /blog (estimated 500 pages)
- /api (estimated 150 pages)
- /help (estimated 100 pages)

Total estimated: ~950 pages

Which section should I crawl first?
```
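The map-first strategy is easy to script as well. A sketch that uses the v1 `/map` endpoint to count pages per top-level section before committing to a crawl (it assumes the endpoint returns `{ links: string[] }`; verify against the current docs):
```typescript
// Sketch: map a site first, then decide which sections to crawl.
// Assumes the v1 /map endpoint returns { links: string[] }.
async function mapSections(site: string): Promise<Map<string, number>> {
  const res = await fetch("https://api.firecrawl.dev/v1/map", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.FIRECRAWL_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ url: site }),
  });
  const { links } = await res.json();
  // Count pages per top-level path segment, e.g. /docs, /blog, /api.
  const sections = new Map<string, number>();
  for (const link of links as string[]) {
    const segment = new URL(link).pathname.split("/")[1] || "(root)";
    sections.set(segment, (sections.get(segment) ?? 0) + 1);
  }
  return sections;
}

mapSections("https://example.com").then((s) => console.table([...s.entries()]));
```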
## Combining with Other MCPs
### Firecrawl + Filesystem
Save crawled content:
```text
You: Crawl the documentation and save each page as markdown
in my docs folder

Claude: [Uses Firecrawl to crawl docs]
[Uses Filesystem to save files]

Saved 47 documentation pages:
/docs/external/
├── getting-started.md
├── configuration.md
├── api-reference/
│   ├── authentication.md
│   └── endpoints.md
...
```
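If you prefer to script this flow rather than chain MCP servers, it is one crawl job plus file writes. A sketch, assuming v1 async crawl jobs (`POST /v1/crawl` returns a job id, polled via `GET /v1/crawl/{id}`) and a `data[].markdown` / `metadata.sourceURL` response shape; both are assumptions to verify against the current docs:
```typescript
import { mkdir, writeFile } from "node:fs/promises";

// Sketch: crawl a docs site and save each page as markdown.
// Assumed API shape: POST /v1/crawl -> { id }, then polling
// GET /v1/crawl/{id} eventually -> { status: "completed", data: [...] }.
const headers = {
  Authorization: `Bearer ${process.env.FIRECRAWL_API_KEY}`,
  "Content-Type": "application/json",
};

async function crawlToFiles(site: string, outDir: string) {
  const start = await fetch("https://api.firecrawl.dev/v1/crawl", {
    method: "POST",
    headers,
    body: JSON.stringify({
      url: site,
      limit: 50,
      scrapeOptions: { formats: ["markdown"] },
    }),
  });
  const { id } = await start.json();

  // Poll until the crawl job finishes.
  let job;
  do {
    await new Promise((r) => setTimeout(r, 5000));
    const poll = await fetch(`https://api.firecrawl.dev/v1/crawl/${id}`, { headers });
    job = await poll.json();
  } while (job.status !== "completed");

  await mkdir(outDir, { recursive: true });
  for (const page of job.data) {
    // Derive a filename from the page URL, e.g. /docs/setup -> docs-setup.md
    const name =
      new URL(page.metadata.sourceURL).pathname
        .replace(/\//g, "-")
        .replace(/^-|-$/g, "") || "index";
    await writeFile(`${outDir}/${name}.md`, page.markdown ?? "");
  }
}

crawlToFiles("https://docs.example.com", "./docs/external");
```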
### Firecrawl + Memory
Remember site structures:
```text
You: Crawl the competitor's site and remember how it's organized

Claude: [Uses Firecrawl to analyze site]
[Uses Memory to store structure]

Memorized competitor site structure:
- Product tiers: Free, Pro, Enterprise
- Content sections: Blog, Docs, Resources
- Key differentiators: Real-time sync, API access

I'll remember this for comparisons!
```
### Firecrawl + Notion
Store findings in workspace:
```text
You: Crawl the documentation and create a summary page in Notion

Claude: [Uses Firecrawl to extract docs]
[Uses Notion to create summary page]

Created Notion page: "External API Documentation Summary"
- 15 endpoints documented
- Authentication patterns
- Rate limits and quotas
- Code examples saved
```
## Troubleshooting
### Issue: “API key invalid”
Symptoms: Authentication fails
Solutions:
| Cause | Solution |
|---|---|
| Wrong key | Verify copy from Firecrawl dashboard |
| Key expired | Generate new key |
| No key set | Check env variable in config |
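To rule out key problems quickly, call the API directly and bypass the MCP server entirely. A minimal sketch (a one-page scrape of example.com is a cheap authenticated request):
```typescript
// Sketch: verify the API key by making one authenticated request.
// A 401/403 means the key is wrong or expired; anything else suggests
// the key is fine and the problem is elsewhere (config, network).
const res = await fetch("https://api.firecrawl.dev/v1/scrape", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.FIRECRAWL_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ url: "https://example.com" }),
});
console.log(
  res.status === 401 || res.status === 403
    ? "Key rejected - regenerate it in the Firecrawl dashboard"
    : `Key accepted (HTTP ${res.status})`,
);
```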
### Issue: “Crawl taking too long”
Symptoms: Timeout or slow progress
Solutions:
- Add a page limit (e.g. `limit: 50`)
- Focus on specific paths
- Check if the site itself is slow
- Large sites may need multiple crawls
### Issue: “Blocked by site”
Symptoms: Access denied errors
Solutions:
| Cause | Solution |
|---|---|
| robots.txt blocking | Check site policies |
| Rate limit | Slow down requests |
| Bot detection | May not be possible to crawl |
| IP blocked | Contact Firecrawl support |
### Issue: “Missing content”
Symptoms: Pages not fully extracted
Solutions:
- JS-heavy content should render (Firecrawl uses a headless browser)
- Check whether content loads dynamically after a delay
- Login-required content won’t be accessible
## Best Practices
### Ethical Crawling
| ✅ Do | ❌ Don’t |
|---|---|
| Check robots.txt first | Ignore site policies |
| Respect rate limits | Overload servers |
| Crawl public content | Scrape personal data |
| Use for legitimate purposes | Violate terms of service |
For more on responsible AI tool usage, see the Understanding AI Safety, Ethics, and Limitations guide.
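Checking robots.txt before a large crawl is simple to automate. A deliberately naive sketch: it only reads `Disallow` rules under `User-agent: *`, so use a proper robots.txt parser for anything serious:
```typescript
// Sketch: naive robots.txt check before crawling a path.
// Only handles "User-agent: *" blocks and simple Disallow prefixes;
// real robots.txt parsing has more cases (wildcards, Allow, etc.).
async function isDisallowed(site: string, path: string): Promise<boolean> {
  const res = await fetch(new URL("/robots.txt", site));
  if (!res.ok) return false; // no robots.txt: nothing explicitly disallowed
  const lines = (await res.text()).split("\n").map((l) => l.trim());
  let applies = false;
  for (const line of lines) {
    const [rawKey, ...rest] = line.split(":");
    const key = rawKey.toLowerCase();
    const value = rest.join(":").trim();
    if (key === "user-agent") applies = value === "*";
    else if (applies && key === "disallow" && value && path.startsWith(value))
      return true;
  }
  return false;
}

isDisallowed("https://example.com", "/private/").then(console.log);
```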
### Efficient Usage
| Practice | Why |
|---|---|
| Start with map | Understand site structure first |
| Set limits | Avoid unnecessary API usage |
| Filter paths | Focus on needed content |
| Cache results | Don’t re-crawl unchanged content |
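The caching advice is cheap to implement. A sketch of a small on-disk cache keyed by URL hash, with a TTL; the filename scheme and one-day TTL are arbitrary choices, and `scrapePage` stands in for any Firecrawl scrape call (like the sketch near the top of this guide):
```typescript
import { createHash } from "node:crypto";
import { mkdir, readFile, stat, writeFile } from "node:fs/promises";

// Sketch: cache scraped markdown on disk, keyed by URL hash, with a TTL.
// scrapePage is any function that fetches fresh content for a URL.
async function cachedScrape(
  url: string,
  scrapePage: (u: string) => Promise<string>,
  ttlMs = 24 * 60 * 60 * 1000, // re-scrape after one day
): Promise<string> {
  const file = `.cache/${createHash("sha256").update(url).digest("hex")}.md`;
  try {
    const { mtimeMs } = await stat(file);
    if (Date.now() - mtimeMs < ttlMs) return readFile(file, "utf8"); // fresh hit
  } catch {
    // cache miss: fall through and scrape
  }
  const content = await scrapePage(url);
  await mkdir(".cache", { recursive: true });
  await writeFile(file, content);
  return content;
}
```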
## Related MCP Servers
| Server | Complements Firecrawl by… |
|---|---|
| Fetch MCP | Quick single-page fetches |
| Playwright MCP | Interactive automation |
| Filesystem MCP | Saving crawled content |
| Memory MCP | Remembering site analysis |
## Summary
The Firecrawl MCP Server enables comprehensive website crawling:
- ✅ Full site crawling with link following
- ✅ JavaScript rendering for modern sites
- ✅ Structured extraction for specific data
- ✅ Site mapping for structure discovery
- ✅ Fast parallel processing
- ✅ AI-native - designed for LLMs and RAG
2025 Updates (v2.5):
- Semantic Index - 40% faster, historical data access
- Agent endpoint - for agentic AI systems
- Stealth proxies - access difficult sites
- AI-native search - built for LLMs
Best use cases:
- Documentation analysis
- Competitive research
- Content inventories
- Site migration prep
- Structured data extraction
Comparison:
- Firecrawl: Whole sites, structured data
- Fetch: Single pages, simple content
- Playwright: Interactive, testing
🎉 Phase 3 Complete! Continue to Phase 4 for enterprise integrations.
Questions about Firecrawl MCP? Check firecrawl.dev/docs or the Firecrawl GitHub.