Mirror websites for AI. Extract HTML, CSS, JS, fonts, screenshots, and design tokens into a structured capture that AI agents can understand and reproduce.
decant is an AI-native website mirroring tool written in Rust. It downloads a site's complete UI layer (HTML, CSS, JS, web fonts, and images), rewrites all references to relative local paths so the capture works offline, and outputs four machine-readable artifacts:
| File | Purpose |
|---|---|
| context.md | Token-budgeted LLM primer; paste this into your prompt first |
| design-tokens.json | Full color palette, type scale, spacing, breakpoints, shadows |
| manifest.json | Page tree, component regions, and asset catalog |
| repair-hints.json | Clone health, failed asset categories, and AI repair actions |
When used with --render chrome, decant also captures browser-observed runtime resources and multi-viewport screenshots (mobile, tablet, desktop) so an AI agent has a visual ground truth to compare against. decant verify then compares the live site and local preview screenshots and writes a machine-readable repair report.
- Static Mirroring: extremely fast HTML/CSS/JS/asset crawl without a browser
- Headless Chrome: full JavaScript rendering (--render chrome), runtime network resource capture, cookie injection, multi-viewport screenshots
- Lightpanda: ultra-fast (~11× faster, ~16× less RAM than Chrome) JS rendering (--render lightpanda); no screenshots, pure DOM extraction
- Design Token Extraction: colors, font families, spacing scale, breakpoints, radii, shadows from every stylesheet
- URL Rewriting: all absolute and root-relative references rewritten to relative local paths for offline use, with query-distinct assets saved separately
- JS Chunk Discovery: scans compiled JS bundles to find and download dynamically-imported chunks (Vite, webpack, etc.)
- Cookie & Header Auth: inline cookie string (--cookies) or Netscape cookie jar (--cookie-file), plus custom request headers (--header)
- Selective Capture: choose exactly which aspects to collect via --capture html,css,js,fonts,screenshots,tokens,context
- Multi-Viewport Screenshots: mobile (390×844), tablet (768×1024), desktop (1440×900)
- TUI Dashboard: real-time crawl progress with a ratatui terminal UI
- Offline Preview: serve any capture directory locally with decant serve
- Visual Verification: compare live/local screenshots with decant verify
- Repair Hints: every clone writes repair-hints.json so agents know whether blocked, missing, malformed, or retryable assets need follow-up
- robots.txt: respected by default; bypass with --ignore-robots
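To make the URL Rewriting feature concrete, here is a minimal Python sketch of turning absolute and root-relative references into relative local paths. It is not decant's Rust implementation: the function names `local_path` and `rewrite_reference` are hypothetical, the on-disk layout is assumed to mirror the URL path (with `index.html` for directory URLs, as in the output tree), and query strings and cross-origin hosts are ignored for brevity.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def local_path(url: str) -> str:
    """Map a URL to a path inside the capture directory (assumed layout)."""
    path = urlsplit(url).path
    if path == "" or path.endswith("/"):
        path += "index.html"  # directory-style URLs become <dir>/index.html
    return path.lstrip("/")


def rewrite_reference(page_url: str, ref: str) -> str:
    """Rewrite `ref`, as found on `page_url`, relative to that page's local file."""
    target = local_path(urljoin(page_url, ref))  # resolve absolute or root-relative refs
    page_dir = posixpath.dirname(local_path(page_url))
    return posixpath.relpath(target, start=page_dir or ".")


# A root-relative stylesheet referenced from a nested page climbs one level.
nested = rewrite_reference("https://example.com/about/", "/assets/app.css")
# An absolute reference from the home page becomes a plain relative path.
home = rewrite_reference("https://example.com/", "https://example.com/assets/app.css")
```

Resolving every reference against the page URL first means absolute and root-relative references go through the same code path before being made relative.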
# Static crawl only (no browser dependency — fastest install)
cargo install decant
# With headless-browser support (Chrome + Lightpanda)
cargo install decant --features render

After install, decant is on your $PATH immediately.
# Install globally
npm install -g decant-cli
# Or run without installing (one-shot)
npx decant-cli clone https://example.com --output ./capture

curl -fsSL https://raw.githubusercontent.com/codingstark-dev/decant/main/install.sh | sh

git clone https://github.com/codingstark-dev/decant
cd decant
# Static crawl only
cargo build --release
# With headless-browser support
cargo build --release --features render
# Add to PATH
export PATH="$PWD/target/release:$PATH"

- Chrome/Chromium: required only when using --render chrome
- Lightpanda: required only when using --render lightpanda; install from lightpanda.io
decant ships a SKILL.md, a structured instruction file that teaches any AI coding assistant (Claude Code, Antigravity, Codex, OpenCode, etc.) how to use decant autonomously, without you having to explain it.
# Using the skills CLI
npx skills add codingstark-dev/decant --skill decant -g

Copy SKILL.md into your project's .agents/skills/decant/ directory:
mkdir -p .agents/skills/decant
curl -fsSL https://raw.githubusercontent.com/codingstark-dev/decant/main/SKILL.md \
-o .agents/skills/decant/SKILL.md

Once installed, your AI assistant will automatically know how to:
| Task | Command it will use |
|---|---|
| Clone a website statically | decant clone <URL> --output ./site |
| Clone an SPA with Chrome | decant clone <URL> --render chrome |
| Clone with authentication | decant clone <URL> --cookies "session=abc" |
| Deep crawl multiple pages | decant clone <URL> --depth 2 |
| Extract design tokens | decant tokens ./site |
| Serve a capture locally | decant serve ./site --port 8080 |
| Serve without JS (SPA fix) | decant serve ./site --noscript |
| Compare live vs local | decant verify <LIVE_URL> http://127.0.0.1:8080 |
| Inspect repair guidance | cat ./site/repair-hints.json |
decant clone https://example.com --output ./example-capture

decant clone https://example.com \
--render chrome \
--screenshots mobile,tablet,desktop \
--output ./example-chrome

Runtime capture defaults to auto when rendering with Chrome. Use --runtime-capture on when you want decant to fail clearly unless Chrome runtime capture is available:
decant clone https://example.com \
--render chrome \
--runtime-capture on \
--output ./example-chrome

After the clone, check the native repair artifact:

jq . ./example-chrome/repair-hints.json

If status is needs_repair, use the listed issue categories before deciding the clone is visually wrong. For example, malformed_css_url means Decant's URL rewriting needs a parser fix; blocked_asset means rerun with cookies or headers; third_party_optional means compare screenshots before treating it as harmless.
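The triage described above could be scripted roughly as follows. This is a sketch under stated assumptions: only the `status` value `needs_repair` and the three category names come from this README; the `issues` list and its `category` field are hypothetical names for wherever repair-hints.json stores them, so adjust them to the real schema.

```python
# Follow-up per category, paraphrased from the guidance in this README.
NEXT_STEP = {
    "malformed_css_url": "URL rewriting needs a parser fix",
    "blocked_asset": "rerun with cookies or headers",
    "third_party_optional": "compare screenshots before treating it as harmless",
}


def triage(hints: dict) -> list:
    """Return one suggested follow-up per reported issue, or [] if no repair is needed."""
    if hints.get("status") != "needs_repair":
        return []
    steps = []
    for issue in hints.get("issues", []):  # "issues" is an assumed field name
        category = issue.get("category", "unknown")  # "category" is assumed too
        advice = NEXT_STEP.get(category, "no guidance in this sketch; inspect manually")
        steps.append(f"{category}: {advice}")
    return steps


# Hypothetical sample; in practice load it with json.load() from repair-hints.json.
sample = {"status": "needs_repair", "issues": [{"category": "blocked_asset"}]}
suggestions = triage(sample)
```

Unknown categories fall through to a manual-inspection message instead of raising, so the script stays usable if decant reports categories this sketch does not list.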
Serve and verify the capture:
decant serve ./example-chrome --port 8080 --noscript
decant verify https://example.com http://127.0.0.1:8080 \
--viewports desktop \
--output ./example-chrome/verify-report.json \
--screenshots-dir ./example-chrome/verify-screenshots

verify-report.json contains status: "match" or status: "needs_repair", per-viewport similarity scores, live/local screenshot paths, and AI next steps.
decant clone https://spa.example.com \
--render lightpanda \
--output ./example-lp

decant clone https://app.example.com \
--cookies "session=abc123; csrf=xyz789" \
--header "Authorization: Bearer eyJhbGciOi..." \
--output ./auth-capture

decant clone https://example.com \
--render chrome \
--capture html,screenshots \
--output ./visual-only

# Standard serve
decant serve ./example-capture --port 8080
# SPA captures (Next.js / React) — strip scripts to prevent client-side errors
decant serve ./example-capture --port 8080 --noscript
# Open http://localhost:8080

./output-dir/
├── index.html # Rewritten HTML (relative links)
├── about/
│ └── index.html # Nested pages follow URL structure
├── assets/
│ ├── app-71RbDVbK.css # Stylesheets with rewritten font/image references
│ ├── index-B8uUuc7f.js # JS entry chunk
│ └── chunk-XYZ.js # Dynamically discovered code-split chunks
├── screenshots/
│ ├── index/
│ │ ├── mobile.png # 390×844 — iPhone 14 viewport
│ │ ├── tablet.png # 768×1024 — iPad viewport
│ │ └── desktop.png # 1440×900 — laptop viewport
│ └── about/
│ └── ...
├── context.md # AI-native LLM primer (start here)
├── design-tokens.json # Full design system tokens
├── manifest.json # Page tree + asset catalog
└── repair-hints.json # Clone health + AI repair actions
Example: context.md
# decant capture — context.md
> Read this file first. It is a token-budgeted primer.
> Then consult `design-tokens.json` and `manifest.json` for full detail.
## Overview
- **Seed URL**: https://example.com/
- **Captured at**: 2026-06-16 12:56:22 UTC
- **Render mode**: rendered
- **Pages captured**: 1
- **Assets captured**: 23
- **Total size**: 2.1 MB
## Pages
- **/** → `index.html`
- Title: Example Site
- Regions: header, nav, main, footer, hero, testimonial
## Captured Screenshots
- `screenshots/index/mobile.png`
- `screenshots/index/tablet.png`
- `screenshots/index/desktop.png`
## Key Design Tokens
### Colors
Palette: `#0b0b0c` `#ffffff` `#000000` `#6366f1` `#e0e7ff`
### Spacing
`2px` `4px` `8px` `12px` `16px` `24px` `32px` `48px`
### Breakpoints
640px, 768px, 1024px, 1280px
## How to Use This Capture
1. **You are here** — this file orients you without burning context.
2. **`design-tokens.json`** — full color palette, type scale, spacing, radii, shadows, breakpoints. Translate into your target stack (Tailwind config / CSS custom properties / design system tokens).
3. **`manifest.json`** — page tree and component region breakdown. Build one component per listed region.
4. **Raw HTML/CSS files** — open only when you need exact markup or a specific class name. Reproduce structure and styling approach, not class names verbatim.
5. **`screenshots/`** — visual ground truth. Compare your output against these images at matching breakpoints.

Example: design-tokens.json
{
"schemaVersion": "1.0",
"source": "https://example.com/",
"capturedAt": "2026-06-16T12:56:01.537658Z",
"colors": {
"swatches": [
"#fff",
"#000",
"#0b0b0c",
"#6366f1",
"#e0e7ff"
],
"byUsage": {}
},
"typography": {
"fontFamilies": ["Inter", "sans-serif"],
"scale": ["12px", "14px", "16px", "20px", "24px", "32px", "48px"],
"lineHeights": ["1.2", "1.5"],
"fontWeights": ["400", "500", "700"]
},
"spacing": [2.0, 4.0, 8.0, 12.0, 16.0, 24.0, 32.0, 48.0],
"breakpoints": [640, 768, 1024, 1280],
"radii": ["4px", "8px", "12px", "9999px"],
"shadows": ["0 1px 2px 0 rgba(0, 0, 0, 0.05)", "0 4px 6px -1px rgba(0, 0, 0, 0.1)"]
}

| Option | Default | Description |
|---|---|---|
| --output <DIR> | <hostname> | Output directory |
| --depth <N> | 0 | Link-follow recursion depth (0 = single page) |
| --render <chrome\|lightpanda> | (off) | Enable JS rendering via headless browser |
| --runtime-capture <auto\|on\|off> | auto | Capture browser-observed static resources when rendering with Chrome |
| --screenshots <VIEWPORTS> | mobile,tablet,desktop | Comma list of viewports (requires --render chrome) |
| --no-screenshots | false | Disable screenshots even with --render chrome |
| --capture <ASPECTS> | html,css,js,fonts,images,tokens,context | Comma list: html, css, js, fonts, images, screenshots, tokens, context |
| --cookies <STRING> | (none) | Inline cookie: "name=val; name2=val2" |
| --cookie-file <PATH> | (none) | Netscape cookie jar file path |
| --header <KEY:VALUE> | (none) | Extra request header (repeatable) |
| --user-agent <STRING> | decant/ | Override the User-Agent header |
| --concurrency <N> | 16 | Max parallel HTTP requests |
| --rate-limit <R> | 4 | Max requests/second per host |
| --ignore-robots | false | Bypass robots.txt |
| --same-origin | true | Only follow links on the same origin |
| --allow-domains <D1,D2> | (none) | Extra domains to crawl (e.g. CDN hosts) |
| --no-tokens | false | Skip design-tokens.json |
| --no-manifest | false | Skip manifest.json / context.md |
| --tui <true\|false> | auto | Force TUI on/off (auto-detects TTY) |
Note
Why is --depth 0 the default? Unlike general-purpose web crawlers (like HTTrack) that recursively scrape entire websites by default, decant defaults to a single-page mirror (--depth 0). For AI code generation and visual reference, you usually want to feed a single, specific reference page at a time to stay within LLM token budgets. If you need multi-page crawling, pass --depth 2 or higher.
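To illustrate how a depth limit bounds a crawl, here is a small breadth-first sketch in Python. It assumes that depth N means following links up to N hops from the seed; this README only confirms that 0 means a single page. The `links` mapping is a made-up in-memory site, and `crawl_order` is a hypothetical helper, not decant's frontier.

```python
from collections import deque


def crawl_order(seed: str, links: dict, max_depth: int) -> list:
    """Return pages in breadth-first order, following links at most max_depth hops."""
    seen = {seed}
    queue = deque([(seed, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # at the depth limit: capture the page but do not follow its links
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return order


# Hypothetical link graph for a tiny site.
links = {"/": ["/about", "/blog"], "/about": ["/team"], "/blog": ["/blog/post-1"]}
single_page = crawl_order("/", links, max_depth=0)
two_levels = crawl_order("/", links, max_depth=2)
```

Under this assumption each extra level of depth can multiply the number of pages captured, which is why a single-page default keeps captures small.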
| Option | Default | Description |
|---|---|---|
| --port <N> | 8080 | Port to listen on |
| --host <ADDR> | 127.0.0.1 | Address to bind to |
| --noscript | false | Strip all script tags on-the-fly to prevent client-side hydration crashes on modern SPAs (Next.js, React, etc.) |
Tip
Hydration crashes on local preview: Modern SPAs (such as Next.js or TanStack Start apps) attempt client-side hydration when loaded in the browser. Because the backend APIs do not exist on the local port, hydration can crash or wipe out the DOM. Use decant serve <DIR> --noscript to strip scripts on the fly so you can preview the static mirrored UI and styling cleanly.
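The effect of stripping scripts can be approximated with a few lines of Python. This is only an illustration of the idea, not how decant serve implements --noscript: it uses a regular expression in place of a real HTML parser, and `strip_scripts` is a hypothetical name.

```python
import re

# Matches a <script ...> element through its closing tag, across lines.
# A regex is a simplification; it does not touch inline event handlers.
SCRIPT_RE = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def strip_scripts(html: str) -> str:
    """Remove every <script> element so no client-side hydration can run."""
    return SCRIPT_RE.sub("", html)


page = (
    '<html><head><script src="/app.js"></script></head>'
    '<body><div id="root">Hi</div><script>hydrate()</script></body></html>'
)
static_page = strip_scripts(page)
```

The server-rendered markup inside `#root` survives, which is what makes the static preview useful for checking layout and styling.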
decant tokens <DIR> re-runs design-token extraction on an existing capture directory without re-downloading anything.
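Once design-tokens.json exists, translating it into a target stack is mostly mechanical. The sketch below turns the `colors.swatches`, `spacing`, and `radii` fields from the example file into CSS custom properties. The variable naming scheme (`--color-1`, `--space-8`, `--radius-1`) is invented for illustration, and spacing values are assumed to be pixels, as the context.md example suggests.

```python
def tokens_to_css(tokens: dict) -> str:
    """Render a :root block of CSS custom properties from a design-tokens dict."""
    lines = [":root {"]
    for i, color in enumerate(tokens.get("colors", {}).get("swatches", []), start=1):
        lines.append(f"  --color-{i}: {color};")
    for value in tokens.get("spacing", []):
        # Spacing is numeric in the example file; ':g' drops the trailing '.0'.
        lines.append(f"  --space-{value:g}: {value:g}px;")
    for i, radius in enumerate(tokens.get("radii", []), start=1):
        lines.append(f"  --radius-{i}: {radius};")
    lines.append("}")
    return "\n".join(lines)


# A trimmed-down token set; in practice load it with json.load() from design-tokens.json.
sample_tokens = {
    "colors": {"swatches": ["#fff", "#000"]},
    "spacing": [2.0, 4.0],
    "radii": ["4px"],
}
css = tokens_to_css(sample_tokens)
```

The same loop structure would work for emitting a Tailwind theme object instead; only the output format changes.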
| Site Type | Support |
|---|---|
| Static HTML sites | ✅ Full support |
| React/Vue SPAs with SSR/prerendering | ✅ Full support |
| React/Vue client-only SPAs | ✅ Use --render chrome or --render lightpanda |
| Sites behind session auth | ✅ Use --cookies or --cookie-file |
| Browser-inserted JS/CSS/images/fonts/media | ✅ Use --render chrome with runtime capture |
| Live backend state, WebSocket/SSE streams, DRM/protected media, bot-gated APIs | ⚠️ Not guaranteed (runtime capture is static-resource focused) |
| Multi-page sites | ✅ Use --depth 2 or higher to follow links |
| Sites with Cloudflare/bot protection | |
| React Native / Electron apps | ❌ Not applicable |
Measured against multigres.com (a real-world Astro/Vite SPA) on Apple M-series.
All three modes use --depth 0 (single-page render) for a fair head-to-head comparison.
Run: python3 benchmark.py https://your-site.com
decant benchmark — target: https://multigres.com
============================================================
RESULTS (depth=0, single-page, same URL)
============================================================
Mode Time (s) Peak RAM (MB)
------------------------------------------------------------
Static (no browser) 23.0 20
Lightpanda (JS) 28.2 45 (1.2× slower)
Chrome (headless) 26.5 1066 (1.2× slower)
------------------------------------------------------------
→ Lightpanda uses 24× less RAM than Chrome
→ Static mode uses 53× less RAM than Chrome
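The ratios in the output above follow directly from the table. A short Python check, using only the numbers printed there (the `RESULTS` dictionary and `summarize` helper are names made up for this sketch, not part of benchmark.py):

```python
# (time in seconds, peak RAM in MB), copied from the benchmark table above.
RESULTS = {
    "static": (23.0, 20),
    "lightpanda": (28.2, 45),
    "chrome": (26.5, 1066),
}


def summarize(results: dict):
    """Return (slowdown vs static, RAM saving vs Chrome) for each mode."""
    base_time = results["static"][0]
    chrome_ram = results["chrome"][1]
    slowdown = {mode: round(t / base_time, 1) for mode, (t, _) in results.items()}
    ram_saving = {mode: round(chrome_ram / ram) for mode, (_, ram) in results.items()}
    return slowdown, ram_saving


slowdown, ram_saving = summarize(RESULTS)
# slowdown   -> {'static': 1.0, 'lightpanda': 1.2, 'chrome': 1.2}
# ram_saving -> {'static': 53, 'lightpanda': 24, 'chrome': 1}
```

Both browser modes round to 1.2× even though Lightpanda took 28.2 s against Chrome's 26.5 s, so the one-decimal figure hides a small difference between them.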
Key observations:
- All three modes finish within about 25% of each other (23.0 to 28.2 s): decant's rate limiter (4 req/s) is the bottleneck, not the browser, so JS rendering adds little wall-clock time.
- Lightpanda: 45 MB vs Chrome: 1 GB — 24× less RAM for the same JS execution. The right choice for servers, CI, and Docker containers.
- Static: 20 MB — zero browser overhead. Perfect for sites where JS rendering doesn't change the HTML (most blogs, docs, marketing sites).
When to use each mode:

| Mode | Use when |
|---|---|
| static (default) | Most sites: fastest, least RAM |
| --render lightpanda | SPAs on Linux/Docker servers |
| --render chrome | SPAs + browser-observed runtime assets + screenshots |
Install the bundled decant coding skill so Claude Code, Cursor, and Codex know how to use decant's output to reproduce a UI:
npx skills add codingstark-dev/decant

| Feature | Chrome | Lightpanda |
|---|---|---|
| Full JavaScript execution | ✅ | ✅ |
| Screenshots | ✅ | ❌ |
| Runtime network resource capture | ✅ | ❌ |
| React / Vue / SPA HTML | ✅ | ✅ |
| Cookie injection (CDP) | ✅ | ✅ |
| Speed | Same (rate-limited) | Same (rate-limited) |
| RAM usage | ~1 GB | ~18 MB |
| Protocol | CDP WebSocket | CDP WebSocket |
Both backends speak the same Chrome DevTools Protocol (CDP) WebSocket. When --render chrome is set, runtime capture is enabled by default in auto mode and screenshots are captured automatically unless --no-screenshots is passed.
Runtime capture is static-resource focused. It captures browser-observed scripts, stylesheets, images, fonts, media, manifests, and module chunks such as .mjs; it does not promise perfect cloning of authenticated sessions, live backend APIs, WebSocket/SSE updates, protected media, checkout flows, or server-personalized state.
Note: Lightpanda is in active development. On macOS arm64, some sites trigger a known CDP navigation hang — decant works around this with a 15-second per-page timeout that automatically falls back to static fetch.
decant CLI binary — argument parsing, orchestration, TUI
├── decant-core Fetch engine, URL frontier, rate limiter, robots.txt, file writer
├── decant-extract HTML/CSS/JS parsing, link rewriting, design-token & manifest extraction
├── decant-render Headless browser backends (Chrome + Lightpanda via CDP) [feature-gated]
└── decant-tui ratatui dashboard, progress reporter, app state
- Feature gate: decant-render is compiled only when --features render is passed. The default binary has zero browser dependencies.
- No OpenSSL: Uses rustls exclusively; no native TLS library is required.
- Pure transform: decant-extract has no network I/O. It receives bytes and returns structured data.
- Arc-based frontier: The Frontier is shared across all tokio tasks via Arc<Mutex<_>> with fragment-normalized URL deduplication.
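The frontier design in the last bullet can be sketched in Python as an analogue, with `threading.Lock` playing the role of the Rust `Mutex` and a shared object standing in for the `Arc`. This is not decant's code: the class layout and the `push`/`pop` method names are assumptions, and "fragment-normalized" is taken to mean that the `#fragment` is dropped before deduplication.

```python
import threading
from urllib.parse import urldefrag


class Frontier:
    """Thread-safe URL queue that treats URLs differing only by #fragment as one."""

    def __init__(self):
        self._lock = threading.Lock()
        self._seen = set()
        self._queue = []

    def push(self, url: str) -> bool:
        """Queue the URL unless its fragment-free form was already seen."""
        key = urldefrag(url).url  # "https://a/b#x" -> "https://a/b"
        with self._lock:
            if key in self._seen:
                return False
            self._seen.add(key)
            self._queue.append(key)
            return True

    def pop(self):
        """Return the next URL in FIFO order, or None when the queue is empty."""
        with self._lock:
            return self._queue.pop(0) if self._queue else None


frontier = Frontier()
first = frontier.push("https://example.com/docs#intro")   # new page
repeat = frontier.push("https://example.com/docs#usage")  # same page, other anchor
```

Doing the seen-check and the enqueue under one lock is what keeps two workers from both queuing the same page.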
# Run all tests
cargo test --workspace
# Run tests including network integration tests (requires internet)
cargo test --workspace -- --include-ignored
# Check all lints
cargo clippy --workspace --all-targets -- -D warnings
# Build docs
cargo doc --workspace --no-deps --open

- Fork the repository
- Create a feature branch: git checkout -b feat/my-feature
- Commit your changes: git commit -m "feat: add my feature"
- Push to the branch: git push origin feat/my-feature
- Open a Pull Request
Please make sure cargo clippy --workspace -- -D warnings and cargo test --workspace both pass before submitting.
This project is licensed under the MIT License.
⚠️ Legal notice: Mirroring a site's layout and design system for reference/study purposes is generally permissible, but you may not redistribute a site's copyrighted text, images, or branding. Always review a site's Terms of Service before crawling it.
