web-fetch-tool

Fetches web content and returns page text, truncated to 100KB by default. Works out of the box with no configuration for simple pages.

  1. Direct fetch — makes a standard HTTP request with a browser-like User-Agent. Works for static HTML, APIs, and simple pages.
  2. Firecrawl fallback — if FIRECRAWL_API_KEY is set, automatically falls back to Firecrawl in two cases:
    • Truncated content — response exceeds the size limit
    • JS-dependent pages — the HTML looks like an empty shell that needs JavaScript to render (SPAs, React/Vue/Angular/Next.js apps)
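
The two-step flow above can be sketched as a small decision function. This is an illustration only, not the tool's actual code: the function name `choose_source`, its parameters, and the `"direct"` label are assumptions (the docs only confirm the `"firecrawl"` source value), and the size check assumes the limit is measured in UTF-8 bytes.

```python
def choose_source(
    html: str,
    max_bytes: int,
    has_firecrawl_key: bool,
    looks_js_dependent: bool,
) -> str:
    """Decide which fetch result to use after the direct request returns.

    Hypothetical helper: returns "firecrawl" when the fallback should run,
    otherwise "direct" (the "direct" label is an assumption).
    """
    # Case 1: the response exceeds the size limit and would be truncated.
    truncated = len(html.encode("utf-8")) > max_bytes

    # The fallback only applies when an API key is configured.
    if has_firecrawl_key and (truncated or looks_js_dependent):
        return "firecrawl"
    return "direct"
```

Under these assumptions, a small JS-dependent page still goes to Firecrawl when a key is present, while the same page without a key stays on the direct result.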

The JS detection runs automatically on every HTML response and checks three signals:

  • Empty SPA roots: <div id="root"></div>, <div id="app"></div>, <div id="__next"></div>, <div id="__nuxt"></div>
  • Low text-to-markup ratio: pages over 1KB with fewer than 200 characters of visible text after stripping tags
  • Framework bundles with no content: _app, main, bundle, or chunk script references combined with fewer than 500 characters of visible text

If any of these signals match and Firecrawl is available, the tool fetches the page through Firecrawl's headless browser automatically, even if the page is well under the size limit. Sites like Excalidraw, Miro, draw.io, and Vue Playground all trigger this detection.
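
As a rough illustration of the three signals, the sketch below approximates them with regular expressions. It is not the tool's implementation: the regexes, the helper names `visible_text` and `looks_js_dependent`, and the reading of "1KB" as 1024 bytes are all assumptions, and the patterns are deliberately loose (for example, `main` would also match inside an unrelated script URL).

```python
import re

# Signal 1: an empty SPA mount point such as <div id="root"></div>.
EMPTY_ROOT = re.compile(
    r'<div\s+id=["\'](?:root|app|__next|__nuxt)["\']\s*>\s*</div>',
    re.IGNORECASE,
)
# Signal 3: a script src that mentions a typical bundle name.
BUNDLE_SCRIPT = re.compile(
    r'<script\b[^>]*\bsrc=["\'][^"\']*(?:_app|main|bundle|chunk)[^"\']*["\']',
    re.IGNORECASE,
)
SCRIPT_OR_STYLE = re.compile(
    r"<(script|style)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL
)
TAG = re.compile(r"<[^>]+>")


def visible_text(html: str) -> str:
    """Strip script/style bodies and tags, then collapse whitespace."""
    without_code = SCRIPT_OR_STYLE.sub(" ", html)
    return " ".join(TAG.sub(" ", without_code).split())


def looks_js_dependent(html: str) -> bool:
    """Return True if any of the three documented signals matches."""
    text_len = len(visible_text(html))
    if EMPTY_ROOT.search(html):
        return True
    # Signal 2: over 1KB of markup but fewer than 200 visible characters.
    if len(html.encode("utf-8")) > 1024 and text_len < 200:
        return True
    return bool(BUNDLE_SCRIPT.search(html)) and text_len < 500
```

A page whose body is only `<div id="root"></div>` plus a script tag matches the first signal, while a long static article matches none of them.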

Without Firecrawl, you get the raw HTML as-is (which for JS-dependent pages means an empty shell).

A plain fetch sends an HTTP request and gets back whatever the server returns. For many modern sites, that’s an empty shell with a <script> tag — the actual content is rendered by JavaScript in the browser.

Firecrawl runs a headless browser that:

  • Executes JavaScript and waits for dynamic content to load
  • Handles anti-bot protections and CAPTCHAs
  • Navigates through proxies to avoid IP blocking
  • Extracts the main content as clean markdown, stripping nav, ads, and boilerplate

This means SPAs (React, Vue, Angular), dashboards behind authentication walls, and JS-heavy documentation sites all return usable content instead of empty HTML skeletons.

Firecrawl is optional. Without it, web-fetch still works — you just get raw (possibly truncated) HTML.

Sign up at firecrawl.dev and create an API key. The free tier includes 500 credits (pages) per month — enough for typical agent usage.

Add FIRECRAWL_API_KEY to your environment so tallow can read it at runtime. Where you put it depends on your setup:

Option A: Shell profile (simplest)

# ~/.zshrc or ~/.bashrc
export FIRECRAWL_API_KEY="fc-your-key-here"

Option B: direnv (per-project)

# .envrc in your project root
export FIRECRAWL_API_KEY="fc-your-key-here"

Run tallow and fetch a JavaScript-heavy page. If Firecrawl is active, you’ll see source: "firecrawl" in the tool details and the content will be clean markdown instead of raw HTML.
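
If the tool details never show the Firecrawl source, one thing worth ruling out is that the variable did not reach the process environment. The check below is a minimal sketch for that; `firecrawl_configured` is a hypothetical helper, not part of tallow, and it only tests that the variable is present and non-blank, not that the key is valid.

```python
import os
from typing import Mapping, Optional


def firecrawl_configured(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if FIRECRAWL_API_KEY is set to a non-blank value.

    Pass a mapping to check something other than the current process
    environment (handy for tests).
    """
    env = os.environ if env is None else env
    return bool(env.get("FIRECRAWL_API_KEY", "").strip())


if __name__ == "__main__":
    print("Firecrawl key visible:", firecrawl_configured())
```

Running this from the same shell (or direnv-enabled directory) used to launch tallow shows whether the export took effect there.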

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| url | string | | URL to fetch |
| maxBytes | number | 100000 | Max bytes before truncation/Firecrawl |
| format | string | "text" | Output format hint: "text", "markdown", "html" |

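
To make the maxBytes limit concrete, here is a minimal sketch of byte-based truncation. It assumes the limit applies to the UTF-8 encoded text, which the docs do not spell out; `truncate_to_bytes` is a hypothetical name and not the tool's API.

```python
from typing import Tuple


def truncate_to_bytes(text: str, max_bytes: int = 100_000) -> Tuple[str, bool]:
    """Cut text to at most max_bytes of UTF-8 and report whether it was cut."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text, False
    # errors="ignore" drops a multi-byte character split by the cut,
    # so the result is always valid text.
    return data[:max_bytes].decode("utf-8", errors="ignore"), True
```

The returned flag corresponds to the "truncated content" case that can trigger the Firecrawl fallback when a key is configured.
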
  • Reading documentation or articles
  • Checking API responses
  • Fetching page content for summarization
  • Any URL where you need the text, not the HTML