Use the Defuddle CLI to extract clean, readable content from web pages. Prefer it over WebFetch for standard web pages: it removes navigation, ads, and clutter, which reduces token usage.
If not installed: npm install -g defuddle
Always use --md for markdown output:
defuddle parse <url> --md
Save to file:
defuddle parse <url> --md -o content.md
Extract specific metadata:
defuddle parse <url> -p title
defuddle parse <url> -p description
defuddle parse <url> -p domain
| Flag | Format |
|---|---|
| `--md` | Markdown (default choice) |
| `--json` | JSON with both HTML and markdown |
| (none) | HTML |
| `-p <name>` | Specific metadata property |
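
The commands above can be combined into one small wrapper. The following is a minimal sketch, not part of Defuddle itself: the function name `fetch_page` and the `content.md` default are hypothetical, and it assumes `defuddle` exits with a non-zero status when parsing fails. It uses only the flags listed in this document (`--md`, `-o`, `-p`).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Defuddle): save a page as markdown,
# then print its title. Relies only on the flags documented above.

fetch_page() {
  local url="$1"
  local out="${2:-content.md}"   # output path; defaults to content.md

  # Fail early with the install hint if the CLI is not on PATH.
  if ! command -v defuddle >/dev/null 2>&1; then
    echo "defuddle not found; install with: npm install -g defuddle" >&2
    return 127
  fi

  # Assumption: defuddle returns non-zero on failure, so we stop here
  # instead of querying metadata for a page that could not be parsed.
  defuddle parse "$url" --md -o "$out" || return 1

  # Print the page title as a quick confirmation of what was saved.
  defuddle parse "$url" -p title
}
```

Calling `fetch_page <url> notes.md` would write the markdown to `notes.md` and print the title; omitting the second argument falls back to `content.md`.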