Curated Claude Code catalog
Updated 07.05.2026 · 19:39 CET
01 / Skill
JuliusBrussee

caveman

Quality
10.0

This skill rewrites LLM responses and memory files in a compressed 'caveman' dialect. The README claims about 75% fewer output tokens (65% mean across its benchmarks, range 22-87%) and about 46% fewer input tokens for compressed memory files, with technical detail kept intact. It targets developers who want lower API costs, faster responses, and terser output from AI coding agents.

USP

Unlike generic summarizers, Caveman applies one 'caveman-speak' pattern across 30+ agents, with six intensity levels (lite, full, ultra, and three wenyan variants) and input compression for memory files.

Use cases

  • 01 Reducing LLM API costs
  • 02 Speeding up AI agent responses
  • 03 Making AI-generated code reviews more concise
  • 04 Compressing agent memory files for efficient context
  • 05 Improving readability of AI explanations

Detected files (8)

  • .windsurf/skills/caveman/SKILL.md (skill, 3916 bytes)
    ---
    name: caveman
    description: >
      Ultra-compressed communication mode. Cuts token usage ~75% by speaking like caveman
      while keeping full technical accuracy. Supports intensity levels: lite, full (default), ultra,
      wenyan-lite, wenyan-full, wenyan-ultra.
      Use when user says "caveman mode", "talk like caveman", "use caveman", "less tokens",
      "be brief", or invokes /caveman. Also auto-triggers when token efficiency is requested.
    ---
    
    Respond terse like smart caveman. All technical substance stay. Only fluff die.
    
    ## Persistence
    
    ACTIVE EVERY RESPONSE. No revert after many turns. No filler drift. Still active if unsure. Off only: "stop caveman" / "normal mode".
    
    Default: **full**. Switch: `/caveman lite|full|ultra`.
    
    ## Rules
    
    Drop: articles (a/an/the), filler (just/really/basically/actually/simply), pleasantries (sure/certainly/of course/happy to), hedging. Fragments OK. Short synonyms (big not extensive, fix not "implement a solution for"). Technical terms exact. Code blocks unchanged. Errors quoted exact.
    
    Pattern: `[thing] [action] [reason]. [next step].`
    
    Not: "Sure! I'd be happy to help you with that. The issue you're experiencing is likely caused by..."
    Yes: "Bug in auth middleware. Token expiry check use `<` not `<=`. Fix:"
    
    ## Intensity
    
    | Level | What change |
    |-------|------------|
    | **lite** | No filler/hedging. Keep articles + full sentences. Professional but tight |
    | **full** | Drop articles, fragments OK, short synonyms. Classic caveman |
    | **ultra** | Abbreviate prose words (DB/auth/config/req/res/fn/impl), strip conjunctions, arrows for causality (X → Y), one word when one word enough. Code symbols, function names, API names, error strings: never abbreviate |
    | **wenyan-lite** | Semi-classical. Drop filler/hedging but keep grammar structure, classical register |
    | **wenyan-full** | Maximum classical terseness. Fully 文言文. 80-90% character reduction. Classical sentence patterns, verbs precede objects, subjects often omitted, classical particles (之/乃/為/其) |
    | **wenyan-ultra** | Extreme abbreviation while keeping classical Chinese feel. Maximum compression, ultra terse |
    
    Example — "Why React component re-render?"
    - lite: "Your component re-renders because you create a new object reference each render. Wrap it in `useMemo`."
    - full: "New object ref each render. Inline object prop = new ref = re-render. Wrap in `useMemo`."
    - ultra: "Inline obj prop → new ref → re-render. `useMemo`."
    - wenyan-lite: "組件頻重繪,以每繪新生對象參照故。以 useMemo 包之。"
    - wenyan-full: "物出新參照,致重繪。useMemo .Wrap之。"
    - wenyan-ultra: "新參照→重繪。useMemo Wrap。"
    
    Example — "Explain database connection pooling."
    - lite: "Connection pooling reuses open connections instead of creating new ones per request. Avoids repeated handshake overhead."
    - full: "Pool reuse open DB connections. No new connection per request. Skip handshake overhead."
    - ultra: "Pool = reuse DB conn. Skip handshake → fast under load."
    - wenyan-full: "池reuse open connection。不每req新開。skip handshake overhead。"
    - wenyan-ultra: "池reuse conn。skip handshake → fast。"
    
    ## Auto-Clarity
    
    Drop caveman when:
    - Security warnings
    - Irreversible action confirmations
    - Multi-step sequences where fragment order or omitted conjunctions risk misread
    - Compression itself creates technical ambiguity (e.g., `"migrate table drop column backup first"` — order unclear without articles/conjunctions)
    - User asks to clarify or repeats question
    
    Resume caveman after clear part done.
    
    Example — destructive op:
    > **Warning:** This will permanently delete all rows in the `users` table and cannot be undone.
    > ```sql
    > DROP TABLE users;
    > ```
    > Caveman resume. Verify backup exist first.
    
    ## Boundaries
    
    Code/commits/PRs: write normal. "stop caveman" or "normal mode": revert. Level persist until changed or session end.
  • .cursor/skills/caveman/SKILL.md (skill, 3916 bytes)
    Content identical to .windsurf/skills/caveman/SKILL.md above.
  • plugins/caveman/skills/cavecrew/SKILL.md (skill, 3936 bytes)
    ---
    name: cavecrew
    description: >
      Decision guide for delegating to caveman-style subagents. Tells the main
      thread WHEN to spawn `cavecrew-investigator` (locate code), `cavecrew-builder`
      (1-2 file edit), or `cavecrew-reviewer` (diff review) instead of doing the
      work inline or using vanilla `Explore`. Subagent output is caveman-compressed
      so the tool-result injected back into main context is ~60% smaller — main
      context lasts longer across long sessions.
      Trigger: "delegate to subagent", "use cavecrew", "spawn investigator/builder/reviewer",
      "save context", "compressed agent output".
    ---
    
    Cavecrew = three subagent presets that emit caveman output. Same job as Anthropic defaults (`Explore`, edit-style agents, reviewer); difference is the tool-result they return is compressed, so main context shrinks per delegation.
    
    ## When to use cavecrew vs alternatives
    
    | Task | Use |
    |---|---|
    | "Where is X defined / what calls Y / list uses of Z" | `cavecrew-investigator` |
    | Same but you also want suggestions/architecture commentary | `Explore` (vanilla) |
    | Surgical edit, ≤2 files, scope obvious | `cavecrew-builder` |
    | New feature / 3+ files / cross-cutting refactor | Main thread or `feature-dev:code-architect` |
    | Review diff, branch, or file for bugs | `cavecrew-reviewer` |
    | Deep code review with rationale + alternatives | `Code Reviewer` (vanilla) |
    | One-line answer you already know | Main thread, no subagent |
    
    Rule of thumb: **if you'd want the subagent's output in 1/3 the tokens, pick cavecrew. If you'd want prose, pick vanilla.**
    
    ## Why this exists (the real win)
    
    Subagent tool results get injected into main context verbatim. A vanilla `Explore` that returns 2k tokens of prose costs 2k tokens of main-context budget every time. The same finding from `cavecrew-investigator` returns ~700 tokens. Across 20 delegations in one session that's the difference between context exhaustion and finishing the task.
    
    ## Output contracts
    
    What main thread can rely on per agent:
    
    **`cavecrew-investigator`**
    ```
    <Header>:
    - path:line — `symbol` — short note
    totals: <counts>.
    ```
    Or `No match.` Always file-path-first, line-number-attached, backticked symbols. Safe to grep with `path:\d+`.
    
    **`cavecrew-builder`**
    ```
    <path:line-range> — <change ≤10 words>.
    verified: <re-read OK | mismatch @ path:line>.
    ```
    Or one of: `too-big.` / `needs-confirm.` / `ambiguous.` / `regressed.` (terminal first token).
    
    **`cavecrew-reviewer`**
    ```
    path:line: <emoji> <severity>: <problem>. <fix>.
    totals: N🔴 N🟡 N🔵 N❓
    ```
    Or `No issues.` Findings sorted file → line ascending.
    
    ## Chaining patterns
    
    **Locate → fix → verify** (most common):
    1. `cavecrew-investigator` returns site list.
    2. Main thread picks 1-2 sites, hands paths to `cavecrew-builder`.
    3. `cavecrew-reviewer` audits the diff.
    
    **Parallel scout** (when investigation is broad):
    Spawn 2-3 `cavecrew-investigator` calls in one message (different angles: defs vs callers vs tests). Aggregate in main thread.
    
    **Single-shot edit** (when site is already known):
    Skip investigator. Hand exact path:line to `cavecrew-builder` directly.
    
    ## What NOT to do
    
    - Don't use `cavecrew-builder` when you don't already know the file. Spawn investigator first or main thread will eat tokens passing context.
    - Don't chain `cavecrew-investigator → cavecrew-builder` for a 5-file refactor. Builder will return `too-big.` and you'll have wasted a turn.
    - Don't ask `cavecrew-reviewer` for "general feedback" — it returns findings only, no architecture opinions. Use `Code Reviewer` for that.
    - Don't expect prose. Cavecrew output is structured, sometimes terse to the point of cryptic. If a human will read it directly, paraphrase.
    
    ## Auto-clarity (inherited)
    
    Subagents drop caveman → normal English for security warnings, irreversible-action confirmations, and any output where fragment ambiguity could be misread. Resume caveman after.
    
  • plugins/caveman/skills/caveman-stats/SKILL.md (skill, 607 bytes)
    ---
    name: caveman-stats
    description: >
      Show real token usage and estimated savings for the current session.
      Reads directly from the Claude Code session log — no AI estimation.
      Triggers on /caveman-stats. Output is injected by the mode-tracker hook;
      the model itself does not compute the numbers.
    ---
    
    This skill is delivered by `hooks/caveman-stats.js` (read by `hooks/caveman-mode-tracker.js` on `/caveman-stats`). The model does not need to do anything when this skill fires — the hook returns `decision: "block"` with the formatted stats as the reason. The user sees the numbers immediately.
    
  • caveman-compress/SKILL.md (skill, 4525 bytes)
    ---
    name: caveman-compress
    description: >
      Compress natural language memory files (CLAUDE.md, todos, preferences) into caveman format
      to save input tokens. Preserves all technical substance, code, URLs, and structure.
      Compressed version overwrites the original file. Human-readable backup saved as FILE.original.md.
      Trigger: /caveman:compress FILEPATH or "compress memory file"
    ---
    
    # Caveman Compress
    
    ## Purpose
    
    Compress natural language files (CLAUDE.md, todos, preferences) into caveman-speak to reduce input tokens. Compressed version overwrites original. Human-readable backup saved as `<filename>.original.md`.
    
    ## Trigger
    
    `/caveman:compress <filepath>` or when user asks to compress a memory file.
    
    ## Process
    
    1. The compression scripts live in `caveman-compress/scripts/` (adjacent to this SKILL.md). If the path is not immediately available, search for `caveman-compress/scripts/__main__.py`.
    
    2. Run:
    
    cd caveman-compress && python3 -m scripts <absolute_filepath>
    
    3. The CLI will:
    - detect file type (no tokens)
    - call Claude to compress
    - validate output (no tokens)
    - if errors: cherry-pick fix with Claude (targeted fixes only, no recompression)
    - retry up to 2 times
    - if still failing after 2 retries: report error to user, leave original file untouched
    
    4. Return result to user
    
    ## Compression Rules
    
    ### Remove
    - Articles: a, an, the
    - Filler: just, really, basically, actually, simply, essentially, generally
    - Pleasantries: "sure", "certainly", "of course", "happy to", "I'd recommend"
    - Hedging: "it might be worth", "you could consider", "it would be good to"
    - Redundant phrasing: "in order to" → "to", "make sure to" → "ensure", "the reason is because" → "because"
    - Connective fluff: "however", "furthermore", "additionally", "in addition"
    
    ### Preserve EXACTLY (never modify)
    - Code blocks (fenced ``` and indented)
    - Inline code (`backtick content`)
    - URLs and links (full URLs, markdown links)
    - File paths (`/src/components/...`, `./config.yaml`)
    - Commands (`npm install`, `git commit`, `docker build`)
    - Technical terms (library names, API names, protocols, algorithms)
    - Proper nouns (project names, people, companies)
    - Dates, version numbers, numeric values
    - Environment variables (`$HOME`, `NODE_ENV`)
    
    ### Preserve Structure
    - All markdown headings (keep exact heading text, compress body below)
    - Bullet point hierarchy (keep nesting level)
    - Numbered lists (keep numbering)
    - Tables (compress cell text, keep structure)
    - Frontmatter/YAML headers in markdown files
    
    ### Compress
    - Use short synonyms: "big" not "extensive", "fix" not "implement a solution for", "use" not "utilize"
    - Fragments OK: "Run tests before commit" not "You should always run tests before committing"
    - Drop "you should", "make sure to", "remember to" — just state the action
    - Merge redundant bullets that say the same thing differently
    - Keep one example where multiple examples show the same pattern
    
    CRITICAL RULE:
    Anything inside ``` ... ``` must be copied EXACTLY.
    Do not:
    - remove comments
    - remove spacing
    - reorder lines
    - shorten commands
    - simplify anything
    
    Inline code (`...`) must be preserved EXACTLY.
    Do not modify anything inside backticks.
    
    If file contains code blocks:
    - Treat code blocks as read-only regions
    - Only compress text outside them
    - Do not merge sections around code
    
    ## Pattern
    
    Original:
    > You should always make sure to run the test suite before pushing any changes to the main branch. This is important because it helps catch bugs early and prevents broken builds from being deployed to production.
    
    Compressed:
    > Run tests before push to main. Catch bugs early, prevent broken prod deploys.
    
    Original:
    > The application uses a microservices architecture with the following components. The API gateway handles all incoming requests and routes them to the appropriate service. The authentication service is responsible for managing user sessions and JWT tokens.
    
    Compressed:
    > Microservices architecture. API gateway route all requests to services. Auth service manage user sessions + JWT tokens.
    
    ## Boundaries
    
    - ONLY compress natural language files (.md, .txt, .typ, .typst, .tex, extensionless)
    - NEVER modify: .py, .js, .ts, .json, .yaml, .yml, .toml, .env, .lock, .css, .html, .xml, .sql, .sh
    - If file has mixed content (prose + code), compress ONLY the prose sections
    - If unsure whether something is code or prose, leave it unchanged
    - Original file is backed up as FILE.original.md before overwriting
    - Never compress FILE.original.md (skip it)
    
  • caveman/SKILL.md (skill, 3916 bytes)
    Content identical to .windsurf/skills/caveman/SKILL.md above.
  • .agents/plugins/marketplace.json (marketplace, 368 bytes)
    {
      "name": "caveman-repo",
      "interface": {
        "displayName": "Caveman Repo"
      },
      "plugins": [
        {
          "name": "caveman",
          "source": {
            "source": "local",
            "path": "./plugins/caveman"
          },
          "policy": {
            "installation": "AVAILABLE",
            "authentication": "ON_INSTALL"
          },
          "category": "Productivity"
        }
      ]
    }
    
  • .claude-plugin/marketplace.json (marketplace, 525 bytes)
    {
      "$schema": "https://anthropic.com/claude-code/marketplace.schema.json",
      "name": "caveman",
      "description": "Ultra-compressed communication mode for Claude Code. Cuts ~75% of tokens while keeping full technical accuracy.",
      "owner": {
        "name": "Julius Brussee",
        "url": "https://github.com/JuliusBrussee"
      },
      "plugins": [
        {
          "name": "caveman",
          "description": "Talk like caveman. Cut ~75% tokens. Keep all technical accuracy.",
          "source": "./",
          "category": "productivity"
        }
      ]
    }
    

README

caveman

why use many token when few do trick


🪨 Caveman Ecosystem  ·  caveman talk less (you are here)  ·  cavemem remember more  ·  cavekit build better


A Claude Code skill/plugin and Codex plugin that makes agent talk like caveman — cutting ~75% of output tokens while keeping full technical accuracy. Now with 文言文 mode, terse commits / one-line reviews / lifetime stats, and a compression tool that cuts ~46% of input tokens every session.

Based on the viral observation that caveman-speak dramatically reduces LLM token usage without losing technical substance. So we made it a one-line install.

Before / After

🗣️ Normal Claude (69 tokens)

"The reason your React component is re-rendering is likely because you're creating a new object reference on each render cycle. When you pass an inline object as a prop, React's shallow comparison sees it as a different object every time, which triggers a re-render. I'd recommend using useMemo to memoize the object."

🪨 Caveman Claude (19 tokens)

"New object ref each render. Inline object prop = new ref = re-render. Wrap in useMemo."

🗣️ Normal Claude

"Sure! I'd be happy to help you with that. The issue you're experiencing is most likely caused by your authentication middleware not properly validating the token expiry. Let me take a look and suggest a fix."

🪨 Caveman Claude

"Bug in auth middleware. Token expiry check use < not <=. Fix:"

Same fix. 75% less word. Brain still big.
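
The saving in the first pair above can be checked from the two token counts it quotes (69 and 19). A minimal Python sketch; `reduction_pct` is a name chosen here for illustration and is not part of caveman:

```python
def reduction_pct(before: int, after: int) -> float:
    """Percent of tokens saved going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# Token counts quoted in the first Before / After pair.
saved = reduction_pct(69, 19)
print(f"{saved:.1f}% fewer tokens")  # prints "72.5% fewer tokens"
```

For that one example the result sits close to the ~75% headline figure.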

Pick your level of grunt:

🪶 Lite

"Your component re-renders because you create a new object reference each render. Inline object props fail shallow comparison every time. Wrap it in useMemo."

🪨 Full

"New object ref each render. Inline object prop = new ref = re-render. Wrap in useMemo."

🔥 Ultra

"Inline obj prop → new ref → re-render. useMemo."

📜 文言文

"物出新參照,致重繪。useMemo Wrap之。"

Same answer. You pick how many word.

┌─────────────────────────────────────┐
│  TOKENS SAVED          ████████ 75% │
│  TECHNICAL ACCURACY    ████████ 100%│
│  SPEED INCREASE        ████████ ~3x │
│  VIBES                 ████████ OOG │
└─────────────────────────────────────┘
  • Faster response — less token to generate = speed go brrr
  • Easier to read — no wall of text, just answer
  • Same accuracy — all technical info kept, only fluff dropped (science say so)
  • Save money — 65% mean output reduction across our benchmarks (range 22-87%)
  • Fun — every code review become comedy

Install

One line. Detect every agent. Install for each.

# macOS / Linux / WSL / Git Bash
curl -fsSL https://raw.githubusercontent.com/JuliusBrussee/caveman/main/install.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/JuliusBrussee/caveman/main/install.ps1 | iex

Detects 30+ agents (Claude Code, Gemini CLI, Codex, Cursor, Windsurf, Cline, Copilot, Continue, Kilo, Roo, Augment, Aider Desk, Amp, Bob, Crush, Devin, Droid, ForgeCode, Goose, iFlow, JetBrains Junie, Kiro CLI, Mistral Vibe, OpenHands, opencode, Qwen Code, Qoder, Rovo Dev, Tabnine, Trae, Warp, Replit Agent, Antigravity, …). Runs each one's native install. Skips what you not have. Safe to re-run.

By default the installer wires Claude Code's hooks + statusline + stats badge and registers the caveman-shrink MCP proxy on top of the plugin install. Pass --minimal to skip the extras and just install the plugin/extension. Pass --all to also drop per-repo rule files into the current directory.

| Flag | What |
|---|---|
| `--all` | Plugin + hooks + statusline + MCP shrink + per-repo rule files in `$PWD`. The full ride. |
| `--minimal` | Plugin/extension only. No hooks, no MCP shrink, no per-repo rules. |
| `--dry-run` | Preview, write nothing |
| `--only <agent>` | One target only (repeatable) |
| `--with-hooks` | Claude Code: also wire standalone hooks + statusline + stats badge. On by default. |
| `--with-mcp-shrink` | Claude Code: register the caveman-shrink MCP proxy via `npx caveman-shrink`. On by default. |
| `--with-init` | Drop always-on rule files into the current repo (Cursor / Windsurf / Cline / Copilot / AGENTS.md). Off by default; turned on by `--all`. |
| `--list` | Print full agent matrix and exit |
| `--force` | Re-run even if already installed |

install.sh --help for full reference.

Manual install per agent:

| Agent | Command |
|---|---|
| Claude Code | `claude plugin marketplace add JuliusBrussee/caveman && claude plugin install caveman@caveman` |
| Gemini CLI | `gemini extensions install https://github.com/JuliusBrussee/caveman` |
| Cursor / Windsurf / Cline / Copilot | `npx skills add JuliusBrussee/caveman -a <cursor\|windsurf\|cline\|github-copilot>` |
| Codex / opencode / Roo / Amp / Goose / Kiro / Augment / Aider Desk / Continue / Kilo / Junie / Trae / Warp / Tabnine / Mistral / Qwen / Devin / Droid / ForgeCode / Bob / Crush / iFlow / OpenHands / Qoder / Rovo Dev / Replit / Antigravity | `npx skills add JuliusBrussee/caveman -a <profile>` (see `install.sh --list` for the full slug list) |
| Anything else (40+ agents) | `npx skills add JuliusBrussee/caveman` (auto-detect) |

Standalone Claude Code hooks (without plugin): bash <(curl -s https://raw.githubusercontent.com/JuliusBrussee/caveman/main/hooks/install.sh). Windows: irm https://raw.githubusercontent.com/JuliusBrussee/caveman/main/hooks/install.ps1 | iex. Manual fallback for stubborn Windows envs lives in docs/install-windows.md.

Uninstall: disable the Claude plugin, gemini extensions uninstall caveman, or npx skills remove caveman.

What You Get

FeatureClaude CodeCodexGemini CLICursor / WindsurfCline / CopilotOthers*
Caveman modeYYYYYY
Auto-activate every sessionYYwith --with-initwith --with-initwith --with-init
/caveman commandYY
Mode switching (lite/full/ultra)YY
Statusline badgeY
caveman-commit / caveman-reviewYYYYY
caveman-compress / caveman-helpYYYYY
caveman-statsY
cavecrew (subagents)Y

* opencode, Roo, Amp, Goose, Kiro CLI, Augment, Aider Desk, Continue, Kilo, Junie (JetBrains), Trae, Warp, Tabnine, Mistral, Qwen, Devin, Droid, ForgeCode, Bob, Crush, iFlow, OpenHands, Qoder, Rovo Dev, Replit, Antigravity, and more via npx skills. AGENTS.md / IDE rule files reach Zed, generic agents, etc. via --with-init.

¹ Codex uses $caveman instead of /caveman. Auto-start ships when you run Codex inside this repo (via .codex/hooks.json); for other repos, copy the hook or use $caveman manually.

² Mode switching is on-demand via the skill, no slash command.

³ Compress only.

--with-init writes .cursor/rules/caveman.mdc, .windsurf/rules/caveman.md, .clinerules/caveman.md, .github/copilot-instructions.md, and AGENTS.md into the current repo so caveman auto-starts there.

Usage

Trigger with:

  • /caveman or Codex $caveman
  • "talk like caveman"
  • "caveman mode"
  • "less tokens please"

Stop with: "stop caveman" or "normal mode"

Intensity Levels

| Level | Trigger | What it do |
|---|---|---|
| Lite | `/caveman lite` | Drop filler, keep grammar. Professional but no fluff |
| Full | `/caveman full` | Default caveman. Drop articles, fragments, full grunt |
| Ultra | `/caveman ultra` | Maximum compression. Telegraphic. Abbreviate everything |

文言文 (Wenyan) Mode

Classical Chinese literary compression — same technical accuracy, but in the most token-efficient written language humans ever invented.

| Level | Trigger | What it do |
|---|---|---|
| Wenyan-Lite | `/caveman wenyan-lite` | Semi-classical. Grammar intact, filler gone |
| Wenyan-Full | `/caveman wenyan` | Full 文言文. Maximum classical terseness |
| Wenyan-Ultra | `/caveman wenyan-ultra` | Extreme. Ancient scholar on a budget |

Level stick until you change it or session end.
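
How the triggers, stop phrases, and sticky levels fit together can be sketched as a small state tracker. This illustrates the behaviour described above and is not the plugin's actual hook code: `CavemanMode` and `handle` are hypothetical names, and the mapping of `/caveman wenyan` to `wenyan-full` follows the Wenyan trigger table.

```python
from dataclasses import dataclass

# Levels named in the tables above.
LEVELS = {"lite", "full", "ultra", "wenyan-lite", "wenyan-full", "wenyan-ultra"}
# `/caveman wenyan` is listed as the trigger for Wenyan-Full.
ALIASES = {"wenyan": "wenyan-full"}
START_PHRASES = ("talk like caveman", "caveman mode", "less tokens")
STOP_PHRASES = ("stop caveman", "normal mode")


@dataclass
class CavemanMode:
    """Hypothetical per-session state: is caveman on, and at which level."""
    active: bool = False
    level: str = "full"  # documented default

    def handle(self, message: str) -> None:
        text = message.strip().lower()
        if any(p in text for p in STOP_PHRASES):
            self.active = False
            return
        parts = text.split()
        if parts and parts[0] == "/caveman":
            if len(parts) > 1:
                requested = ALIASES.get(parts[1], parts[1])
                if requested not in LEVELS:
                    raise ValueError(f"unknown level: {parts[1]}")
                self.level = requested
            self.active = True
        elif any(p in text for p in START_PHRASES):
            self.active = True


mode = CavemanMode()
mode.handle("/caveman ultra")
mode.handle("explain connection pooling")  # ordinary turn, level sticks
print(mode.active, mode.level)  # prints "True ultra"
mode.handle("normal mode")
print(mode.active, mode.level)  # prints "False ultra"
```

Stop phrases are checked first so that "stop caveman" never counts as a start trigger.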

Caveman Skills

| Skill | What |
| --- | --- |
| /caveman-commit | Terse commit messages. Conventional Commits, ≤50 char subject. Why over what. |
| /caveman-review | One-line PR comments: L42: 🔴 bug: user null. Add guard. No throat-clearing. |
| /caveman-help | Quick-reference card. All modes, skills, commands. |
| /caveman-stats | Real session token usage + estimated savings + USD. Lifetime aggregation via --all, time window via --since 7d, tweetable line via --share. Reads the Claude Code session JSONL directly, no model-side guessing. Claude Code only. |
| /caveman:compress <file> | Rewrites a memory file (e.g. CLAUDE.md) into caveman-speak. Saves backup as <file>.original.md. Cuts ~46% of input tokens every session start. Code/URLs/paths preserved byte-for-byte. |
| cavecrew-investigator/builder/reviewer | Caveman subagents for Claude Code. Subagent tool-output gets injected back into main context; these emit ~60% fewer tokens than vanilla Explore / reviewer agents, so main context lasts longer across long sessions. Investigator (read-only locator, haiku), builder (1-2 file surgical edit, refuses 3+), reviewer (one-line findings, haiku). |

Statusline savings badge — on by default. After your first /caveman-stats run the statusline appends [CAVEMAN] ⛏ 12.4k (lifetime tokens saved) and updates every time /caveman-stats runs. Don't want it? Set CAVEMAN_STATUSLINE_SAVINGS=0 to silence.

caveman-compress receipts

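| File | Original | Compressed | Saved |
| --- | --- | --- | --- |
| claude-md-preferences.md | 706 | 285 | 59.6% |
| project-notes.md | 1145 | 535 | 53.3% |
| claude-md-project.md | 1122 | 636 | 43.3% |
| todo-list.md | 627 | 388 | 38.1% |
| mixed-with-code.md | 888 | 560 | 36.9% |
| Average | 898 | 481 | 46% |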
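
The Saved column can be re-derived from the two token columns. A minimal check, assuming "Saved" means `1 - compressed / original`; `saved_pct` and `receipts` are names chosen here for illustration, and the numbers are copied from the receipts table.

```python
# (original, compressed) token counts copied from the receipts table
receipts = {
    "claude-md-preferences.md": (706, 285),
    "project-notes.md": (1145, 535),
    "claude-md-project.md": (1122, 636),
    "todo-list.md": (627, 388),
    "mixed-with-code.md": (888, 560),
}


def saved_pct(original: int, compressed: int) -> float:
    """Percentage of tokens removed by compression."""
    return 100 * (1 - compressed / original)


for name, (original, compressed) in receipts.items():
    print(f"{name}: {saved_pct(original, compressed):.1f}%")

# Savings over all five files pooled together
total_original = sum(o for o, _ in receipts.values())
total_compressed = sum(c for _, c in receipts.values())
overall = saved_pct(total_original, total_compressed)
print(f"overall: {overall:.0f}%")
```

Under that assumption the per-file values match the table, and the pooled figure rounds to the same 46%.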

Full docs: caveman-compress README. Snyk false-positive note.

caveman-shrink (MCP middleware)

Stdio proxy that wraps any MCP server, intercepts tools/list / prompts/list / resources/list responses, and compresses the description fields. Code, URLs, paths, identifiers stay byte-for-byte identical.

{
  "mcpServers": {
    "fs-shrunk": {
      "command": "npx",
      "args": ["caveman-shrink", "npx", "@modelcontextprotocol/server-filesystem", "/path/to/dir"]
    }
  }
}

Published on npm as caveman-shrink. V1 does not touch tool-call response bodies or request payloads. Auto-registered by install.sh (use --minimal to skip). Full docs: mcp-servers/caveman-shrink/.
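
To make the idea concrete, here is a rough sketch of rewriting only the description fields of a list response. It is not the caveman-shrink implementation: `compress_text` is a toy stand-in that drops articles and a few filler words, `shrink_list_response` is a hypothetical name, and the toy regex makes no attempt at the byte-for-byte guarantee for code or paths that appear inside a description.

```python
import copy
import re

# Toy compressor: drops articles and a few filler words. Stand-in only.
FILLER = re.compile(
    r"\b(?:a|an|the|just|really|basically|actually|simply)\b\s*",
    re.IGNORECASE,
)


def compress_text(text: str) -> str:
    """Remove filler words and collapse any doubled whitespace."""
    return re.sub(r"\s{2,}", " ", FILLER.sub("", text)).strip()


def shrink_list_response(response: dict) -> dict:
    """Return a copy of a tools/list, prompts/list, or resources/list
    response with only the top-level description fields compressed."""
    shrunk = copy.deepcopy(response)
    result = shrunk.get("result", {})
    for key in ("tools", "prompts", "resources"):
        for item in result.get(key, []):
            if isinstance(item.get("description"), str):
                item["description"] = compress_text(item["description"])
    return shrunk


# Example tools/list response (tool definition invented for illustration)
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "read_file",
                "description": "Simply read the complete contents of a file from the file system.",
                "inputSchema": {
                    "type": "object",
                    "properties": {"path": {"type": "string"}},
                },
            }
        ]
    },
}

shrunk = shrink_list_response(response)
print(shrunk["result"]["tools"][0]["description"])
```

Names and input schemas pass through unchanged; only the human-readable description gets shorter.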

Benchmarks

Real token counts from the Claude API (reproduce it yourself):

| Task | Normal (tokens) | Caveman (tokens) | Saved |
| --- | --- | --- | --- |
| Explain React re-render bug | 1180 | 159 | 87% |
| Fix auth middleware token expiry | 704 | 121 | 83% |
| Set up PostgreSQL connection pool | 2347 | 380 | 84% |
| Explain git rebase vs merge | 702 | 292 | 58% |
| Refactor callback to async/await | 387 | 301 | 22% |
| Architecture: microservices vs monolith | 446 | 310 | 30% |
| Review PR for security issues | 678 | 398 | 41% |
| Docker multi-stage build | 1042 | 290 | 72% |
| Debug PostgreSQL race condition | 1200 | 232 | 81% |
| Implement React error boundary | 3454 | 456 | 87% |
| Average | 1214 | 294 | 65% |

Range: 22%–87% savings across prompts.
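
The 65% in the Average row appears to be the mean of the ten per-task percentages, not the saving on the averaged token counts. A small sketch that computes both from the benchmark table (variable names are illustrative):

```python
# (normal, caveman) output-token counts copied from the benchmark table
benchmarks = [
    (1180, 159), (704, 121), (2347, 380), (702, 292), (387, 301),
    (446, 310), (678, 398), (1042, 290), (1200, 232), (3454, 456),
]

# Saving for each task, then the plain mean of those percentages
per_task = [100 * (1 - caveman / normal) for normal, caveman in benchmarks]
mean_of_percentages = sum(per_task) / len(per_task)

# Saving on the pooled token totals (long answers weigh more)
total_normal = sum(normal for normal, _ in benchmarks)
total_caveman = sum(caveman for _, caveman in benchmarks)
pooled = 100 * (1 - total_caveman / total_normal)

print(f"mean of per-task savings: {mean_of_percentages:.0f}%")
print(f"pooled savings:           {pooled:.0f}%")
```

The two figures differ because the longest answers also compress the most, so the pooled number comes out higher than the per-task mean.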

[!IMPORTANT] Caveman only affects output tokens — thinking/reasoning tokens are untouched. Caveman no make brain smaller. Caveman make mouth smaller. Biggest win is readability and speed, cost savings are a bonus.

A March 2026 paper "Brevity Constraints Reverse Performance Hierarchies in Language Models" found that constraining large models to brief responses improved accuracy by 26 percentage points on certain benchmarks and completely reversed performance hierarchies. Verbose not always better. Sometimes less word = more correct.

Evals

Caveman not just claim 75%. Caveman prove it.

The evals/ directory has a three-arm eval harness that measures real token compression against a proper control — not just "verbose vs skill" but "terse vs skill". Because comparing caveman to verbose Claude conflate the skill with generic terseness. That cheating. Caveman not cheat.

# Run the eval (needs claude CLI)
uv run python evals/llm_run.py

# Read results (no API key, runs offline)
uv run --with tiktoken python evals/measure.py
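
Why the terse control matters, as arithmetic. The sketch assumes the three arms are a verbose baseline, a plain "be terse" control, and the skill. The token counts are placeholders invented for illustration, not results from the repo, and `saved_pct` is a hypothetical helper.

```python
def saved_pct(reference_tokens: int, skill_tokens: int) -> float:
    """Percentage of tokens the skill saves relative to a reference arm."""
    return 100 * (1 - skill_tokens / reference_tokens)


# Placeholder token counts for one prompt. NOT measured data.
arms = {"verbose": 1000, "terse": 500, "skill": 300}

# Mixes the skill's effect with what any "be brief" instruction would achieve
vs_verbose = saved_pct(arms["verbose"], arms["skill"])

# Isolates what the skill adds on top of generic terseness
vs_terse = saved_pct(arms["terse"], arms["skill"])

print(f"skill vs verbose: {vs_verbose:.0f}% saved")
print(f"skill vs terse:   {vs_terse:.0f}% saved")
```

With these placeholder counts the verbose comparison looks far better than the terse one, which is the gap the three-arm design is meant to expose.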

Star This Repo

If caveman save you mass token, mass money — leave mass star. ⭐


🪨 The Caveman Ecosystem

Three tools. One philosophy: agent do more with less.

| Repo | What | One-liner |
| --- | --- | --- |
| caveman (you are here) | Output compression skill | why use many token when few do trick: ~75% fewer output tokens across Claude Code, Cursor, Gemini, Codex |
| cavemem | Cross-agent persistent memory | why agent forget when agent can remember: compressed SQLite + MCP, local by default |
| cavekit | Spec-driven autonomous build loop | why agent guess when agent can know: natural language → kits → parallel build → verified |

They compose: cavekit orchestrates the build, caveman compresses what the agent says, cavemem compresses what the agent remembers. Install one, some, or all — each stands alone.

Also by Julius Brussee

  • Revu — local-first macOS study app with FSRS spaced repetition, decks, exams, and study guides. revu.cards

License

MIT — free like mass mammoth on open plain.