A step-by-step guide to intercepting Claude Code's API traffic and extracting the full system prompt.
Prerequisites
- Node.js 18+
- Claude Code CLI installed (npm install -g @anthropic-ai/claude-code)
- A valid Anthropic API key configured
Quick Start (5 minutes)
1. Create a Proxy Server
Create proxy.js:
```js
const http = require('http');
const https = require('https');
const fs = require('fs');

const PROXY_PORT = 8888;
let requestCounter = 0;

// Ensure the captures directory exists
if (!fs.existsSync('captures')) fs.mkdirSync('captures');

const server = http.createServer((clientReq, clientRes) => {
  const id = String(++requestCounter).padStart(4, '0');
  let body = '';
  clientReq.on('data', chunk => body += chunk);
  clientReq.on('end', () => {
    // Save the request (fall back to the raw string if the body isn't JSON)
    let parsedBody = null;
    try { parsedBody = body ? JSON.parse(body) : null; } catch { parsedBody = body; }
    const reqData = {
      url: clientReq.url,
      method: clientReq.method,
      headers: clientReq.headers,
      body: parsedBody
    };
    fs.writeFileSync(`captures/${id}_request.json`, JSON.stringify(reqData, null, 2));

    // Forward to Anthropic
    const options = {
      hostname: 'api.anthropic.com',
      port: 443,
      path: clientReq.url,
      method: clientReq.method,
      headers: { ...clientReq.headers, host: 'api.anthropic.com' }
    };
    const proxyReq = https.request(options, proxyRes => {
      let responseBody = '';
      // Pass through status and headers
      clientRes.writeHead(proxyRes.statusCode, proxyRes.headers);
      proxyRes.on('data', chunk => {
        responseBody += chunk;
        clientRes.write(chunk); // Stream through to the client
      });
      proxyRes.on('end', () => {
        // Save the response
        fs.writeFileSync(`captures/${id}_response.json`, JSON.stringify({
          status: proxyRes.statusCode,
          body: responseBody
        }, null, 2));
        clientRes.end();
      });
    });
    proxyReq.on('error', err => {
      clientRes.writeHead(502);
      clientRes.end(`Proxy error: ${err.message}`);
    });
    if (body) proxyReq.write(body);
    proxyReq.end();
  });
});

server.listen(PROXY_PORT, () => {
  console.log(`Proxy running on http://localhost:${PROXY_PORT}`);
  console.log('Run Claude Code with:');
  console.log(`  ANTHROPIC_BASE_URL=http://localhost:${PROXY_PORT} claude "hello"`);
});
```
2. Run the Proxy
```bash
node proxy.js
```
3. Run Claude Code Through the Proxy
In another terminal:
```bash
ANTHROPIC_BASE_URL=http://localhost:8888 claude "What is 2+2?"
```
4. Extract the System Prompt
```bash
# The system prompt is in the first main request (usually 0002)
cat captures/0002_request.json | jq -r '.body.system[].text'
```
Or for just the first block (identity + instructions):
```bash
cat captures/0002_request.json | jq -r '.body.system[0].text' | head -100
```
What You'll Find
The system prompt is ~13,000 characters and includes:
- Identity - "You are Claude Code, Anthropic's official CLI for Claude"
- Tone/Style - No emojis, concise output, markdown formatting
- Professional Objectivity - Technical accuracy over validation
- Tool Usage Policy - When to use each tool
- Git/PR Guidelines - Commit message format, PR creation
- Security Rules - What not to do
- Environment Info - Working directory, OS, date
Understanding the Request Structure
- Request #1: Haiku warmup (quota check, max_tokens: 1)
- Request #2: Main Opus request (contains the system prompt)
- Requests #3-13: Token counting requests (run in parallel)
- Request #14+: Continuation requests (tool results, etc.)
Extracting Tool Definitions
Tools are in the same request:
```bash
# List all tools
cat captures/0002_request.json | jq -r '.body.tools[].name'

# Get a full tool definition
cat captures/0002_request.json | jq '.body.tools[] | select(.name == "Bash")'
```
Key Insights
| Component | Size | Notes |
|---|---|---|
| System prompt | ~13KB | Multiple text blocks with cache_control |
| Tools | 18-30 | Depends on MCP servers registered |
| Total request | ~110KB | System + tools + messages |
| Cache hit rate | 99%+ | After first request |
Pain Points in the Process
1. SSE Streaming Complexity
Responses use Server-Sent Events with incremental JSON deltas:
```
event: content_block_delta
data: {"type":"content_block_delta","delta":{"type":"input_json_delta","partial_json":"{\""}}
```
You need to reassemble these fragments to get complete tool calls.
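Reassembly can be sketched as follows. This assumes the saved response body is the raw SSE text captured by proxy.js, and that each delta event carries an index field identifying its content block (the snippet above omits it, but streamed events include one):

```javascript
// Reassemble streamed tool-call inputs from raw SSE text.
// Each content_block_delta carrying an input_json_delta holds a fragment of the
// tool input; concatenating fragments per block index yields complete JSON.
function reassembleToolInputs(sseText) {
  const partials = {};
  for (const line of sseText.split('\n')) {
    if (!line.startsWith('data: ')) continue;
    let event;
    try { event = JSON.parse(line.slice(6)); } catch { continue; } // skip non-JSON data lines
    if (event.type === 'content_block_delta' &&
        event.delta && event.delta.type === 'input_json_delta') {
      partials[event.index] = (partials[event.index] || '') + event.delta.partial_json;
    }
  }
  // Parse each accumulated fragment string into a complete object.
  return Object.values(partials).map(json => JSON.parse(json));
}
```

Feeding a saved `captures/NNNN_response.json` body string through this function yields one parsed input object per tool call in the response.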
2. Request Numbering
The first request (#1) is just a warmup check. The actual system prompt is in request #2, but there's no guarantee of this - it depends on session state.
3. Large Payloads
Requests are 100KB+, responses can be larger. Pretty-printing JSON can slow down analysis.
4. Parallel Token Counting
10+ parallel requests happen for token counting after the main request. They clutter the captures.
5. Interactive Mode Required for Full Flow
With --print flag, tools like AskUserQuestion error out. You need true interactive mode to capture the full conversation flow.
Advanced: Filtering Captures
To focus on just the main conversation:
```bash
# Skip token counting requests
for f in captures/*_request.json; do
  if ! cat "$f" | jq -r '.url' | grep -q 'count_tokens'; then
    echo "=== $f ==="
    cat "$f" | jq -r '.body.model'
  fi
done
```
Security Note
This technique uses the documented ANTHROPIC_BASE_URL environment variable. We're inspecting our own API traffic, similar to browser DevTools. No binary patching or security bypass is involved.
Created as part of the Claude Code Reverse Engineering project.