Improve Claude Code WebFetch with Zyte API

Claude Code's built-in WebFetch tool is incredibly useful for pulling content from the web during development sessions. However, it has limitations — many sites block automated requests, return incomplete content, or serve CAPTCHA pages. This is where Zyte API comes in.

The Problem

If you've used WebFetch extensively, you've likely run into situations where:

- Sites return 403 Forbidden or CAPTCHA pages
- JavaScript-rendered content doesn't load
- Anti-bot protections strip the content you need
- Rate limiting kicks in after a few requests

These limitations can break your workflow when you're trying to research documentation, pull reference implementations, or gather data during a coding session.

What is Zyte API?

Zyte API is a web scraping and data extraction service that handles all the complexity of reliably fetching web content. It manages browser rendering, proxy rotation, CAPTCHA solving, and anti-bot bypass automatically. You send it a URL, and it returns clean, complete content.

The Setup

The beauty of this approach is how simple it is. All you need to do is add instructions to your CLAUDE.md file. Here's the idea:

1. Get a Zyte API key from zyte.com
2. Add instructions to your CLAUDE.md telling Claude that when WebFetch fails with a 403 or bot protection error, it should fall back to the Zyte API
3. Include your API key and a link to the Zyte API documentation so Claude knows how to use it
4. Specify the two request modes — a simple HTTP proxy request using httpResponseBody, and a JavaScript-rendered request using browserHtml for pages that require a full browser

That's it. Claude is smart enough to figure out when a simple proxy request will work and when it needs full JavaScript rendering. You don't need to configure anything beyond the CLAUDE.md instructions.
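As a rough illustration of step 4, the two request modes differ only in which flag goes into the JSON body sent to Zyte API. The sketch below is a minimal Python helper; the function name `build_zyte_payload` and the `render_js` parameter are hypothetical names chosen for this example, not part of Zyte's API.

```python
def build_zyte_payload(url: str, render_js: bool = False) -> dict:
    """Return the JSON body for a Zyte API extract request.

    render_js=False -> plain HTTP request (httpResponseBody)
    render_js=True  -> full browser rendering (browserHtml)
    """
    if render_js:
        return {"url": url, "browserHtml": True}
    return {"url": url, "httpResponseBody": True}


if __name__ == "__main__":
    # Hypothetical URLs, used only to show the two payload shapes.
    print(build_zyte_payload("https://example.com/article"))
    print(build_zyte_payload("https://example.com/app", render_js=True))
```

Either payload would be POSTed to the same endpoint with the same authentication; only the flag changes.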

How It Works in Practice

When Claude encounters a WebFetch failure, it follows the CLAUDE.md instructions and makes a curl request to the Zyte API endpoint. It then processes the returned HTML just like it would a normal WebFetch response.

For static pages, it uses the httpResponseBody parameter for a fast, lightweight request; the API returns the page Base64-encoded, and Claude decodes it. For JavaScript-heavy sites like SPAs or pages that load content dynamically, it switches to browserHtml, which renders the page in a real browser and returns the resulting HTML as plain text.
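The response handling for the two modes can be sketched as follows. This is a minimal example, assuming the parsed JSON response contains either an `httpResponseBody` field (Base64-encoded) or a `browserHtml` field (a regular string); `extract_html` is a hypothetical helper name, not something Zyte provides.

```python
import base64


def extract_html(data: dict) -> str:
    """Pull HTML text out of a parsed Zyte API response."""
    if "browserHtml" in data:
        # Browser-rendered HTML arrives as a regular string.
        return data["browserHtml"]
    if "httpResponseBody" in data:
        # Raw response bodies arrive Base64-encoded.
        raw = base64.b64decode(data["httpResponseBody"])
        # Assumes UTF-8; undecodable bytes are replaced rather than raising.
        return raw.decode("utf-8", errors="replace")
    raise KeyError("response has neither browserHtml nor httpResponseBody")


if __name__ == "__main__":
    # Fake response built locally, so the sketch runs without network access.
    sample = {"httpResponseBody": base64.b64encode(b"<h1>Hello</h1>").decode("ascii")}
    print(extract_html(sample))
```

Checking for `browserHtml` first means a single helper can serve both request modes without the caller tracking which one was used.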

I put this in my global CLAUDE.md context so it works across every project without any additional setup. In my use so far it has worked reliably: pages that used to be blocked outright now come back with full content.

Why This Approach Works So Well

- Zero infrastructure — no servers to build, deploy, or maintain
- Works immediately — just edit a text file and you're done
- Portable — put it in your global context and it works on every project
- Claude handles the logic — it decides when to use the proxy and when to use browser rendering based on the error it encounters

The Code

Here's exactly what I added to my CLAUDE.md. Drop this into your own and replace the API key with yours:

## Zyte API for Web Fetching

When WebFetch fails with 403 errors (bot protection), use the Zyte API to bypass restrictions.

**API Key:** `YOUR_ZYTE_API_KEY`

### Method 1: Bash Script (Recommended)

Create a reusable fetch script:

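  #!/bin/bash
  # Usage: ./fetch_zyte.sh URL OUTPUT_FILE
  API_KEY="YOUR_ZYTE_API_KEY"
  URL="$1"
  OUTPUT="$2"

  # Build the JSON body with jq so the URL is expanded and properly escaped
  # (inside a single-quoted string, ${URL} would be sent literally).
  curl -s --basic --user "${API_KEY}:" \
       -H 'Content-Type: application/json' \
       -d "$(jq -n --arg url "$URL" '{url: $url, httpResponseBody: true}')" \
       --compressed \
       'https://api.zyte.com/v1/extract' | jq -r '.httpResponseBody' | base64 --decode > "$OUTPUT"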

Usage: ./fetch_zyte.sh "https://example.com/article" "output.html"

### Method 2: Python

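  import base64

  import requests

  API_KEY = "YOUR_ZYTE_API_KEY"

  def fetch_with_zyte(url):
      """Fetch a page through Zyte API and return its HTML as text."""
      response = requests.post(
          "https://api.zyte.com/v1/extract",
          auth=(API_KEY, ""),  # API key as username, empty password
          json={"url": url, "httpResponseBody": True},
      )
      # Fail loudly on auth or quota errors instead of a confusing KeyError below
      response.raise_for_status()
      data = response.json()
      # httpResponseBody is Base64-encoded: decode to bytes, then to text
      html = base64.b64decode(data["httpResponseBody"]).decode("utf-8")
      return html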

### Key Notes

- With "httpResponseBody": true, the API returns the page as Base64-encoded data in httpResponseBody
- Decode httpResponseBody with base64 --decode (bash) or base64.b64decode() (Python)
- Auth uses the API key as username with empty password: --user "${API_KEY}:"
- For browser-rendered JavaScript content, use "browserHtml": true instead of "httpResponseBody": true; the result comes back in browserHtml as plain text and needs no Base64 decoding
- Docs: https://docs.zyte.com/zyte-api/usage/http.html

What You Need

- A Zyte API account (they have a free tier to get started)
- The block of text above added to your CLAUDE.md with your API key filled in

If you're a heavy Claude Code user who frequently researches the web during coding sessions, this is one of the highest-value additions you can make to your setup.