A step-by-step guide to intercepting Claude Code's API traffic and extracting the full system prompt.
Prerequisites
- Node.js 18+
- Claude Code CLI installed (npm install -g @anthropic-ai/claude-code)
- A valid Anthropic API key configured
Quick Start (5 minutes)
1. Create a Proxy Server
Create proxy.js:
```js
const http = require('http');
const https = require('https');
const fs = require('fs');

const PROXY_PORT = 8888;
let requestCounter = 0;

// Ensure the captures directory exists
if (!fs.existsSync('captures')) fs.mkdirSync('captures');

const server = http.createServer((clientReq, clientRes) => {
  const id = String(++requestCounter).padStart(4, '0');
  let body = '';
  clientReq.on('data', chunk => body += chunk);
  clientReq.on('end', () => {
    // Save the request (fall back to the raw string if the body isn't JSON)
    let parsedBody = null;
    try { parsedBody = body ? JSON.parse(body) : null; } catch { parsedBody = body; }
    const reqData = {
      url: clientReq.url,
      method: clientReq.method,
      headers: clientReq.headers,
      body: parsedBody
    };
    fs.writeFileSync(`captures/${id}_request.json`, JSON.stringify(reqData, null, 2));

    // Forward to Anthropic
    const options = {
      hostname: 'api.anthropic.com',
      port: 443,
      path: clientReq.url,
      method: clientReq.method,
      headers: { ...clientReq.headers, host: 'api.anthropic.com' }
    };
    const proxyReq = https.request(options, proxyRes => {
      let responseBody = '';
      // Pass through status and headers
      clientRes.writeHead(proxyRes.statusCode, proxyRes.headers);
      proxyRes.on('data', chunk => {
        responseBody += chunk;
        clientRes.write(chunk); // Stream through to the client
      });
      proxyRes.on('end', () => {
        // Save the response
        fs.writeFileSync(`captures/${id}_response.json`, JSON.stringify({
          status: proxyRes.statusCode,
          body: responseBody
        }, null, 2));
        clientRes.end();
      });
    });
    proxyReq.on('error', err => {
      clientRes.writeHead(502);
      clientRes.end(`Proxy error: ${err.message}`);
    });
    if (body) proxyReq.write(body);
    proxyReq.end();
  });
});

server.listen(PROXY_PORT, () => {
  console.log(`Proxy running on http://localhost:${PROXY_PORT}`);
  console.log('Run Claude Code with:');
  console.log(`  ANTHROPIC_BASE_URL=http://localhost:${PROXY_PORT} claude "hello"`);
});
```
2. Run the Proxy
```bash
node proxy.js
```
3. Run Claude Code Through the Proxy
In another terminal:
```bash
ANTHROPIC_BASE_URL=http://localhost:8888 claude "What is 2+2?"
```
4. Extract the System Prompt
```bash
# The system prompt is in the first main request (usually 0002)
cat captures/0002_request.json | jq -r '.body.system[].text'
```
Or for just the first block (identity + instructions):
```bash
cat captures/0002_request.json | jq -r '.body.system[0].text' | head -100
```
What You'll Find
The system prompt is ~13,000 characters and includes:
- Identity - "You are Claude Code, Anthropic's official CLI for Claude"
- Tone/Style - No emojis, concise output, markdown formatting
- Professional Objectivity - Technical accuracy over validation
- Tool Usage Policy - When to use each tool
- Git/PR Guidelines - Commit message format, PR creation
- Security Rules - What not to do
- Environment Info - Working directory, OS, date
Understanding the Request Structure
- Request #1: Haiku warmup (quota check, max_tokens: 1)
- Request #2: Main Opus request (contains the system prompt)
- Requests #3-13: Token counting requests (run in parallel)
- Request #14+: Continuation requests (tool results, etc.)
Extracting Tool Definitions
Tools are in the same request:
```bash
# List all tools
cat captures/0002_request.json | jq -r '.body.tools[].name'

# Get a full tool definition
cat captures/0002_request.json | jq '.body.tools[] | select(.name == "Bash")'
```
Key Insights
| Component | Size | Notes |
|---|---|---|
| System prompt | ~13KB | Multiple text blocks with cache_control |
| Tools | 18-30 | Depends on MCP servers registered |
| Total request | ~110KB | System + tools + messages |
| Cache hit rate | 99%+ | After first request |
Pain Points in the Process
1. SSE Streaming Complexity
Responses use Server-Sent Events with incremental JSON deltas:
```
event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"input_json_delta","partial_json":"{\""}}
```
You need to reassemble these fragments to get complete tool calls.
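Reassembly can be sketched as follows. This assumes the saved response body is the raw SSE text captured by proxy.js, and that each delta event carries an index field identifying its content block (the snippet above omits it, but streamed events include one):

```javascript
// Reassemble streamed tool-call inputs from raw SSE text.
// Each content_block_delta carrying an input_json_delta holds a fragment of the
// tool input; concatenating fragments per block index yields complete JSON.
function reassembleToolInputs(sseText) {
  const partials = {};
  for (const line of sseText.split('\n')) {
    if (!line.startsWith('data: ')) continue;
    let event;
    try { event = JSON.parse(line.slice(6)); } catch { continue; } // skip non-JSON data lines
    if (event.type === 'content_block_delta' &&
        event.delta && event.delta.type === 'input_json_delta') {
      partials[event.index] = (partials[event.index] || '') + event.delta.partial_json;
    }
  }
  // Parse each accumulated fragment string into a complete object.
  return Object.values(partials).map(json => JSON.parse(json));
}
```

Feeding a saved `captures/NNNN_response.json` body string through this function yields one parsed input object per tool call in the response.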
2. Request Numbering
The first request (#1) is just a warmup check. The actual system prompt is in request #2, but there's no guarantee of this - it depends on session state.
3. Large Payloads
Requests are 100KB+, responses can be larger. Pretty-printing JSON can slow down analysis.
4. Parallel Token Counting
10+ parallel requests happen for token counting after the main request. They clutter the captures.
5. Interactive Mode Required for Full Flow
With --print flag, tools like AskUserQuestion error out. You need true interactive mode to capture the full conversation flow.
Advanced: Filtering Captures
To focus on just the main conversation:
```bash
# Skip token counting requests
for f in captures/*_request.json; do
  if ! cat "$f" | jq -r '.url' | grep -q 'count_tokens'; then
    echo "=== $f ==="
    cat "$f" | jq -r '.body.model'
  fi
done
```
Security Note
This technique uses the documented ANTHROPIC_BASE_URL environment variable. We're inspecting our own API traffic, similar to browser DevTools. No binary patching or security bypass is involved.
Created as part of the Claude Code Reverse Engineering project.