How to Convert Any Webpage to Markdown from Claude Code with UnWeb MCP

If you’re building RAG pipelines, you’ve hit this wall: you need clean text from web pages, but 30% of them are JS-rendered SPAs that return empty markup. You don’t find out until your retrieval quality craters.

The UnWeb MCP server (@mbsoftsystems/unweb-mcp) gives Claude Code, Cursor, and Windsurf five tools for web-to-markdown conversion — and every response includes a content quality score (0–100) so your agent knows immediately whether the extraction worked.


Setup: 3 Lines of Config

Claude Code

Add this to .mcp.json in your project root (or to ~/.claude.json to make the server available in every project):

{
  "mcpServers": {
    "unweb": {
      "command": "npx",
      "args": ["-y", "@mbsoftsystems/unweb-mcp"],
      "env": { "UNWEB_API_KEY": "unweb_your_key_here" }
    }
  }
}

Cursor

Same config in .cursor/mcp.json:

{
  "mcpServers": {
    "unweb": {
      "command": "npx",
      "args": ["-y", "@mbsoftsystems/unweb-mcp"],
      "env": { "UNWEB_API_KEY": "unweb_your_key_here" }
    }
  }
}

Windsurf uses the same format. Get your API key at app.unweb.info — the free tier gives you 500 credits/month, no credit card required.
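If you already have other MCP servers configured, pasting the block above over your file would wipe them out. Here is a minimal sketch of a merge helper that adds the unweb entry and leaves everything else alone. The function name add_mcp_server is hypothetical (it is not part of UnWeb or any editor), and the sketch assumes the config file is plain JSON with a top-level mcpServers object, as in the examples above.

```python
import json
from pathlib import Path

# The server entry from the config examples above.
UNWEB_ENTRY = {
    "command": "npx",
    "args": ["-y", "@mbsoftsystems/unweb-mcp"],
    "env": {"UNWEB_API_KEY": "unweb_your_key_here"},
}


def add_mcp_server(config_path: str, name: str, entry: dict) -> dict:
    """Merge one server entry into a JSON config, keeping existing servers."""
    path = Path(config_path)
    # Start from the existing config if there is one, otherwise from scratch.
    config = json.loads(path.read_text()) if path.exists() else {}
    config.setdefault("mcpServers", {})[name] = entry
    # Create the parent directory (for example .cursor/) if it is missing.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling add_mcp_server(".cursor/mcp.json", "unweb", UNWEB_ENTRY) would write the Cursor config shown above while preserving any servers already listed there.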


What You Get: 5 Tools

Once configured, your AI assistant has access to:

| Tool | What it does | Credits |
| --- | --- | --- |
| convert_url | Convert any webpage URL to markdown | 1 |
| convert_html | Convert raw HTML string to markdown | 1 |
| crawl_start | Start crawling a docs site (path-bounded BFS) | 1/page |
| crawl_status | Check crawl progress | 0 |
| crawl_download | Download all crawled pages as markdown | 0 |

Every conversion response includes a quality score (0–100). The scorer checks content ratio, SPA framework markers (React, Next.js, Nuxt), script density, and semantic tag presence. A score below 40 means the page likely needs a headless browser — you know this before wasting tokens on garbage.
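To make the threshold concrete, here is a small sketch of how a pipeline might gate on the score before indexing. The Conversion class and its quality_score field are hypothetical stand-ins, since the exact response shape isn't shown here; only the 0–100 range and the cutoff of 40 come from the description above.

```python
from dataclasses import dataclass

# From the text above: a score below 40 suggests the page needs a headless browser.
LOW_QUALITY_THRESHOLD = 40


@dataclass
class Conversion:
    url: str
    markdown: str
    quality_score: int  # hypothetical field name for the 0-100 score


def route(conversion: Conversion, threshold: int = LOW_QUALITY_THRESHOLD) -> str:
    """Return "index" for usable pages and "retry_headless" for likely SPA shells."""
    if not 0 <= conversion.quality_score <= 100:
        raise ValueError(f"score out of range: {conversion.quality_score}")
    return "index" if conversion.quality_score >= threshold else "retry_headless"
```

Keeping the decision in one function means the low-scoring pages never reach your vector store, and you can tune the threshold in a single place.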


Use Case 1: Converting Docs for RAG Context

You’re working in Claude Code and need the API reference for a library. Instead of copy-pasting, just ask:

Convert https://docs.stripe.com/api/charges to markdown

Claude Code calls convert_url, gets clean CommonMark markdown with the quality score, and has the full reference in context. No browser switching. No manual formatting.

This is especially useful when you’re deep in a coding session and need to reference documentation for an unfamiliar library. The markdown output is clean — headers, code blocks, links all preserved — not raw text with no structure.

Use Case 2: Crawling an Entire Docs Site for Your Vector Store

This is where UnWeb pulls ahead of single-page converters. Say you’re building a support chatbot and need to ingest an entire documentation site:

Crawl https://docs.example.com starting from /guides/ — get all the pages under that path

Claude Code calls crawl_start with the URL and allowed path. The crawler runs a path-bounded BFS, staying within /guides/ and converting each page to markdown.
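If "path-bounded BFS" is unfamiliar, the idea can be sketched in a few lines. This is an illustration of the general technique, not UnWeb's implementation: crawl_order, get_links, and the in-memory SITE map are all hypothetical, and no network requests are made.

```python
from collections import deque
from urllib.parse import urljoin, urlparse

# In-memory stand-in for a site: page URL -> links found on that page.
SITE = {
    "https://docs.example.com/guides/": [
        "getting-started", "/guides/authentication", "/blog/news",
    ],
    "https://docs.example.com/guides/getting-started": [
        "/guides/authentication#tokens", "https://other.example.com/guides/x",
    ],
    "https://docs.example.com/guides/authentication": ["/guides/"],
}


def crawl_order(start_url, allowed_path, get_links, max_pages=100):
    """Return URLs in BFS order, following only same-host links under allowed_path."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for href in get_links(url):
            # Resolve relative links and drop fragments so each page is queued once.
            link = urljoin(url, href).split("#", 1)[0]
            parsed = urlparse(link)
            in_bounds = parsed.netloc == host and parsed.path.startswith(allowed_path)
            if in_bounds and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

With the sample map, crawl_order("https://docs.example.com/guides/", "/guides/", lambda u: SITE.get(u, [])) visits the three /guides/ pages and skips both /blog/news and the link to another host.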

Your assistant can then check progress with crawl_status and download results with crawl_download — all pages returned as concatenated markdown with page separators:

--- Page: /guides/getting-started.md ---
# Getting Started
Content here...

--- Page: /guides/authentication.md ---
# Authentication
Content here...
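Most vector store loaders want one document per page rather than one big string. Below is a minimal sketch that splits the concatenated output back into pages. It assumes separators look exactly like the `--- Page: <path> ---` lines in the sample above; split_pages is a hypothetical helper name.

```python
import re

# Matches separator lines like: --- Page: /guides/getting-started.md ---
PAGE_SEPARATOR = re.compile(r"^--- Page: (?P<path>.+?) ---[ \t]*$", re.MULTILINE)


def split_pages(bundle: str) -> dict[str, str]:
    """Map each page path to its markdown body."""
    matches = list(PAGE_SEPARATOR.finditer(bundle))
    pages = {}
    for i, match in enumerate(matches):
        # A page body runs from the end of its separator to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(bundle)
        pages[match.group("path")] = bundle[match.end():end].strip()
    return pages
```

Because the pattern requires the literal "Page:" prefix, a plain `---` horizontal rule inside a page body is not mistaken for a separator.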

The crawler natively exports LangChain JSONL and LlamaIndex JSON formats. Set exportFormat to langchain or llamaindex and skip the format wrangling between scraping and your vector store loader.
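As a rough illustration of consuming a JSONL export, the sketch below reads one JSON object per line using only the standard library. UnWeb's exact JSONL schema isn't shown here, so the page_content and metadata keys are an assumption that mirrors the field names on LangChain's Document class; check a real export before relying on them. iter_documents is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_documents(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (text, metadata) pairs from JSONL lines, skipping blank lines."""
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        record = json.loads(line)
        # Assumed keys, mirroring LangChain's Document fields.
        if "page_content" not in record:
            raise KeyError(f"line {line_number}: missing 'page_content'")
        yield record["page_content"], record.get("metadata", {})
```

Since an open file is an iterable of lines, you could pass the file handle straight in, for example list(iter_documents(open("crawl.jsonl"))), where crawl.jsonl is a placeholder filename.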


Why Not Just Use Firecrawl or Jina?

Both have MCP servers. Here’s where UnWeb differs:

| Feature | UnWeb |
| --- | --- |
| Content quality score | Every response includes a 0–100 quality score. Neither Firecrawl nor Jina tells you how well the extraction worked. When you’re processing thousands of pages, this saves you from polluting your vector store with skeleton HTML. |
| Pricing | Starts at $12/month (Starter, 2,000 credits) vs. Firecrawl at $16/month. Free tier: 500 credits/month, enough to evaluate properly before committing. |
| Native AI framework exports | The crawler outputs LangChain and LlamaIndex formats directly. No intermediate parsing step. |
| Broader tooling | Python SDK (pip install unweb), Node.js SDK, Go CLI, and browser extensions for Chrome and Firefox. Pick the interface that fits your workflow. |

Getting Started

  1. Get an API key at app.unweb.info (free, no credit card)
  2. Add the config to your Claude Code, Cursor, or Windsurf settings (see above)
  3. Start converting — ask your assistant to convert a URL or crawl a docs site

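The MCP server runs via npx, so there is no global install to maintain. npx may reuse a cached copy of the package, so append @latest to the package name in args if you always want the newest release.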

Full documentation: docs.unweb.info
npm package: @mbsoftsystems/unweb-mcp
GitHub: github.com/mbsoft-systems/unweb-mcp