PullMD Help

This page explains what PullMD does, how the cache works, and how to set the service up in Claude Code or any other AI agent.

What PullMD does

PullMD fetches any URL and returns it as clean Markdown. It picks one of three extraction paths depending on the source:

Reddit: a dedicated pipeline for posts, comments, and subreddit listings (depth and limit are configurable).
Cloudflare-Markdown: when the source natively supports Accept: text/markdown, its response is passed through directly (cleanest output).
Readability + Turndown: the fallback for everything else. Mozilla Readability extracts the main content, and Turndown converts it to Markdown.

The path that was used is reported in the X-Source response header and shown next to each entry in the history.
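A client can read the extraction path from the response headers. The following is a minimal sketch, not part of PullMD itself: it assumes the header names X-Source, X-Quality and X-Share-Id as listed in the drop-in prompt further down this page, and the names `PullInfo` and `parse_pull_headers` are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Values documented for the X-Source header.
KNOWN_SOURCES = {"reddit", "cloudflare", "readability"}


@dataclass
class PullInfo:
    source: Optional[str]     # extraction path, or None if missing/unknown
    quality: Optional[float]  # extraction confidence, or None if unparsable
    share_id: Optional[str]   # permalink id usable as /s/<id>


def parse_pull_headers(headers: Mapping[str, str]) -> PullInfo:
    """Extract PullMD metadata from a mapping of response headers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}

    source = lowered.get("x-source")
    if source not in KNOWN_SOURCES:
        source = None

    raw_quality = lowered.get("x-quality")
    try:
        quality = float(raw_quality) if raw_quality is not None else None
    except ValueError:
        quality = None

    share_id = lowered.get("x-share-id") or None
    return PullInfo(source=source, quality=quality, share_id=share_id)
```

With `urllib.request`, the mapping could be built as `dict(response.headers.items())`; treating unknown X-Source values as `None` is a choice made for this sketch.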

Cache & TTL

Every pull is stored in a SQLite database. Two time spans matter:

What	Value	When does it reset?
Re-fetch from source	`1 hour`	On every successful pull of the same URL, whether it came through `/api?url=…` or `/s/:id`.
Share link lifetime	`90 days`	On every re-fetch (that is, every cache write). `/s/:id` requests extend it too, since they trigger a re-fetch once the cache is older than 1 h.

How `/s/:id` behaves

Cache < 1 h old → returns the stored version immediately.
Cache ≥ 1 h old → re-fetches the source, writes the result back, and serves the new Markdown.
Source unreachable (404, network, …) → falls back to the last stored snapshot. No silent failure.
Cache > 90 days without a re-fetch → the entry is pruned on the next write; the share link returns 404.
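The four cases above can be modelled as a small decision function. This is only a sketch of the documented behaviour, not the server's actual code: the function name and the returned labels are hypothetical, and it simplifies pruning by treating any entry older than 90 days as already gone.

```python
from datetime import timedelta

REFETCH_AFTER = timedelta(hours=1)   # documented re-fetch interval
SHARE_LIFETIME = timedelta(days=90)  # documented share link lifetime


def share_link_action(age: timedelta, source_reachable: bool = True) -> str:
    """Return what a /s/:id request is expected to do for a cache entry.

    `age` is the time since the last successful fetch of the entry.
    """
    if age > SHARE_LIFETIME:
        # Simplification: assume the entry has already been pruned.
        return "not_found"
    if age < REFETCH_AFTER:
        return "serve_cached"
    if source_reachable:
        return "refetch"
    # The source failed, so the last stored version is served instead.
    return "serve_snapshot"
```

For example, `share_link_action(timedelta(minutes=30))` returns `"serve_cached"`, and `share_link_action(timedelta(hours=2), source_reachable=False)` returns `"serve_snapshot"`.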

Tip: subreddit as a live feed

Pull a subreddit listing URL once, note the share ID, and from then on call only /s/:id. Once an hour has passed, the next request triggers a fresh fetch; the share ID stays stable while the content updates. This is handy for AI agents that need a fixed endpoint with regularly refreshed content.
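An agent or script could read such a feed as sketched below. The helper names `share_url` and `read_feed` are hypothetical, the hex check follows the description of share IDs as hex permalinks in the drop-in prompt, and UTF-8 decoding of the response is an assumption.

```python
import urllib.request

BASE_URL = "https://pullmd.hiten-patel.co.uk"


def share_url(share_id: str, base_url: str = BASE_URL) -> str:
    """Build the stable /s/<id> URL for a share ID."""
    if not share_id or any(c not in "0123456789abcdef" for c in share_id.lower()):
        raise ValueError(f"share ID should be a hex string, got {share_id!r}")
    return f"{base_url.rstrip('/')}/s/{share_id}"


def read_feed(share_id: str, timeout: float = 30.0) -> str:
    """Fetch the current Markdown behind a share link.

    The server decides whether to serve the cached copy or re-fetch the
    source, so the caller can simply request the same URL every time.
    """
    with urllib.request.urlopen(share_url(share_id), timeout=timeout) as response:
        # Assumption: the body is UTF-8 encoded text.
        return response.read().decode("utf-8")
```

Because the cache is refreshed at most once per hour, polling more often than that only returns the stored version again.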

Set up in AI agents

Option 1 (universal): paste a prompt

Works in any chat-style agent (ChatGPT, Claude.ai, Gemini, Perplexity, …). Copy the block and paste it as a system or custom instruction:

prompt · drop-in

When you need to read a web page, fetch via PullMD instead of raw HTML:

  GET https://pullmd.hiten-patel.co.uk/api?url=<URL>

Returns clean Markdown (text/markdown). Optional query params:

  comments=false        skip Reddit comments
  comment_depth=N       comment nesting depth (default 3)
  frontmatter=true      prepend YAML metadata block
  format=text           strip Markdown, return plain text
  nocache=true          bypass the 1h cache and refetch
  lang=de|en            language for the comments section header

Response headers worth checking:
  X-Source       reddit | cloudflare | readability
  X-Quality      0.0-1.0 extraction confidence
  X-Share-Id     8-hex permalink, openable as /s/<id>

Reddit URLs are auto-detected (incl. redd.it short links and /s/ shares).
Use this whenever you would otherwise fetch raw HTML — the markdown is
much cleaner and saves significant context window space.

Option 2: Claude Code skill

For Claude Code there is a ready-made skill that automatically routes WebFetch through PullMD (with fallback). Download the zip and unpack it into ~/.claude/skills/:

Download web-reader.zip

install · shell

curl -O https://pullmd.hiten-patel.co.uk/web-reader.zip
mkdir -p ~/.claude/skills
unzip web-reader.zip -d ~/.claude/skills/
# Restart Claude Code; the skill activates on web-reading requests.

Option 3: MCP server (remote)

PullMD runs as a remote MCP server at https://pullmd.hiten-patel.co.uk/mcp (Streamable HTTP transport, stateless). It exposes three tools: read_url, get_share, and list_recent. Server-side updates reach every client automatically, so no local install is needed.

Claude Code: paste this prompt and Claude will install the server for you:

prompt · claude code

Install the PullMD MCP server in Claude Code (user scope):
- Name: pullmd
- Transport: http
- URL: https://pullmd.hiten-patel.co.uk/mcp

Run: claude mcp add --transport http pullmd https://pullmd.hiten-patel.co.uk/mcp
Then: claude mcp list to verify.

Claude Code, directly in the terminal:

claude code · cli

claude mcp add --transport http pullmd https://pullmd.hiten-patel.co.uk/mcp

Claude Desktop, Cursor, and other clients (JSON config):

mcp config snippet

{
  "mcpServers": {
    "pullmd": {
      "type": "http",
      "url": "https://pullmd.hiten-patel.co.uk/mcp"
    }
  }
}

Once registered, the three tools appear natively in the agent. No prompt instructions are needed, because the LLM discovers the tools through their schema descriptions.
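When a client already has a config file with other servers, the snippet needs to be merged in rather than pasted over the file. The sketch below shows one way to do that on an already loaded config. The function name is hypothetical, and where the config file lives depends on the client, so loading and saving are left out.

```python
import json
from typing import Any, Dict

# The server entry from the config snippet above.
PULLMD_ENTRY = {"type": "http", "url": "https://pullmd.hiten-patel.co.uk/mcp"}


def add_pullmd_server(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `config` with the pullmd entry under mcpServers.

    Other top-level keys and already registered servers are preserved;
    the input dictionary is not modified.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers") or {})
    servers["pullmd"] = dict(PULLMD_ENTRY)
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Starting from an empty config reproduces the snippet shown above.
    print(json.dumps(add_pullmd_server({}), indent=2))
```

An existing `pullmd` entry is overwritten, which keeps the function safe to run more than once.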

API parameters (GET /api)

Param	Default	Description
`url`	(none)	Required. Any public URL.
`comments`	`true`	Include Reddit comments. `false` returns just the post.
`comment_depth`	`3`	Maximum nesting depth (1–10).
`comment_limit`	(none)	Optional cap on top-level comments (Reddit returns ~200 by default).
`frontmatter`	`false`	Prepend YAML frontmatter with metadata.
`format`	`md`	`text` = strip Markdown formatting and return plain text. `json` = structured output with metadata.
`nocache`	`false`	Bypass the 1-hour cache and always refetch.
`lang`	`de`	Language of the comments section header (`de` or `en`).
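The parameters in the table can be assembled into a request URL as sketched below. The function name `build_api_url` and the choice to omit parameters that equal their documented default are assumptions of this example; the endpoint, parameter names, and value ranges come from the table.

```python
from typing import Optional
from urllib.parse import urlencode

API_ENDPOINT = "https://pullmd.hiten-patel.co.uk/api"


def build_api_url(
    url: str,
    *,
    comments: bool = True,
    comment_depth: int = 3,
    comment_limit: Optional[int] = None,
    frontmatter: bool = False,
    fmt: str = "md",
    nocache: bool = False,
    lang: str = "de",
) -> str:
    """Build a GET /api URL, sending only non-default parameters."""
    if not 1 <= comment_depth <= 10:
        raise ValueError("comment_depth must be between 1 and 10")
    if fmt not in ("md", "text", "json"):
        raise ValueError("fmt must be 'md', 'text' or 'json'")
    if lang not in ("de", "en"):
        raise ValueError("lang must be 'de' or 'en'")

    params = {"url": url}
    if not comments:
        params["comments"] = "false"
    if comment_depth != 3:
        params["comment_depth"] = str(comment_depth)
    if comment_limit is not None:
        params["comment_limit"] = str(comment_limit)
    if frontmatter:
        params["frontmatter"] = "true"
    if fmt != "md":
        params["format"] = fmt
    if nocache:
        params["nocache"] = "true"
    if lang != "de":
        params["lang"] = lang
    # urlencode percent-encodes the target URL so it survives as one value.
    return f"{API_ENDPOINT}?{urlencode(params)}"
```

Encoding the target URL matters when it carries its own query string; otherwise its `&` and `=` characters would be read as additional PullMD parameters.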

Frontmatter

With ?frontmatter=true, a YAML metadata block is prepended to the content. Fields with an empty value are omitted:

example

---
title: "Why I migrated my side-project from Postgres to SQLite"
url: https://news.ycombinator.com/item?id=42424242
source: readability
fetched: 2026-04-25T13:53:00Z
quality: 0.85
author: kentonv
published: 2026-04-24T18:42:00Z
description: "After two years on managed Postgres..."
language: en
share_id: a3f9c2e1
---

Fields: title, url, source, fetched, quality, author, published, modified, description, language, image, site, extractor_reason, share_id.
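On the client side, the metadata block can be separated from the Markdown body with a few lines of code. This is a minimal sketch that assumes the flat `key: value` layout shown in the example above; `split_frontmatter` is a hypothetical helper, all values stay strings, and a full YAML parser would be the more robust choice for anything beyond this shape.

```python
from typing import Dict, Tuple


def split_frontmatter(document: str) -> Tuple[Dict[str, str], str]:
    """Split a PullMD response into (metadata, markdown body).

    Returns an empty dict and the unchanged document when there is no
    complete frontmatter block at the top.
    """
    lines = document.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, document

    meta: Dict[str, str] = {}
    for index in range(1, len(lines)):
        line = lines[index]
        if line.strip() == "---":
            # Closing delimiter: everything after it is the body.
            body = "\n".join(lines[index + 1:]).lstrip("\n")
            return meta, body
        # Split on the first colon only, so URLs and timestamps stay intact.
        key, separator, value = line.partition(":")
        if separator:
            value = value.strip()
            if len(value) >= 2 and value[0] == value[-1] == '"':
                value = value[1:-1]
            meta[key.strip()] = value
    # No closing delimiter found: treat the input as having no frontmatter.
    return {}, document
```

Applied to the example above, `meta["source"]` would be `"readability"` and `meta["quality"]` the string `"0.85"`.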

Client detection

Every history entry gets a client badge that shows where the pull originated:

Browser: the regular web UI in a browser tab.
PWA: the installed Progressive Web App (standalone mode, Android/iOS/desktop).
Claude: the User-Agent contains "Claude" (e.g. via the skill).
API: anything else (curl, scripts, MCP wrappers, …).