## 1. Sub-Agents — Scaling Beyond a Single Loop
The single-threaded agentic loop is simple and predictable, but it cannot parallelize work. Claude Code addresses this with sub-agents — child agent instances that run their own isolated loops.
### How sub-agents work
When the main agent encounters a task that benefits from parallelism (e.g., “run tests, check linting, and update docs”), it can spawn sub-agents via the SpawnAgent tool. Each sub-agent:
- Has its own isolated context window — preventing “context collapse” in the parent session.
- Receives a scoped task description — a focused instruction, not the full conversation history.
- Has restricted tool permissions — sub-agents can be granted a subset of the parent’s tools.
- Returns a structured result to the parent when complete.
### Implementation
// The SpawnAgent tool — creates a child AgentLoop with isolated context
const SpawnAgentTool: Tool = {
name: "SpawnAgent",
description:
"Spawn a sub-agent with its own isolated context to perform a focused task.",
permissionCategory: "spawn",
inputSchema: z.object({
task: z.string().describe("The focused task description for the sub-agent"),
allowedTools: z
.array(z.string())
.optional()
.describe("Subset of tools the sub-agent can use"),
}),
async execute(input) {
const { task, allowedTools } = input as {
task: string;
allowedTools?: string[];
};
// Create a scoped tool registry for the sub-agent
const scopedRegistry = new ToolRegistry();
const parentTools = registry; // reference to parent's registry
// Only register allowed tools (or all if not specified)
const toolNames = allowedTools ?? Array.from(parentTools.listNames());
for (const name of toolNames) {
if (name === "SpawnAgent") continue; // prevent recursive spawning
try {
scopedRegistry.register(parentTools.get(name));
} catch {
// Tool not found — skip
}
}
// Sub-agent gets its own context manager (isolated context window)
const childContextManager = new ContextManager(
process.cwd(),
100_000, // sub-agents get a smaller context budget
);
// Sub-agent gets full permission (parent already approved the spawn)
const childPermissions = new PermissionSystem("auto", {
denyPatterns: [],
allowedPaths: [process.cwd()],
});
const childLoop = new AgentLoop(
scopedRegistry,
childPermissions,
new HookRunner(), // sub-agents inherit hook config in production
childContextManager,
);
// Run the sub-agent and return its result to the parent
const result = await childLoop.run(task);
return `[Sub-agent completed]\n${result}`;
},
};

// --- Parallel sub-agent orchestration ---
// The parent agent does not call SpawnAgent in parallel itself —
// it issues multiple SpawnAgent tool_use blocks in a single response,
// and the harness executes them concurrently:
async function executeToolsConcurrently(
toolCalls: ToolUseBlock[],
executeTool: (tc: ToolUseBlock) => Promise<string>,
): Promise<Map<string, string>> {
const results = new Map<string, string>();
// Separate SpawnAgent calls (can run in parallel) from others (sequential)
const spawnCalls = toolCalls.filter((tc) => tc.name === "SpawnAgent");
const otherCalls = toolCalls.filter((tc) => tc.name !== "SpawnAgent");
// Run spawn calls concurrently
const spawnResults = await Promise.all(
spawnCalls.map(async (tc) => ({
id: tc.id,
result: await executeTool(tc),
})),
);
for (const { id, result } of spawnResults) {
results.set(id, result);
}
// Run other calls sequentially (preserve ordering guarantees)
for (const tc of otherCalls) {
results.set(tc.id, await executeTool(tc));
}
return results;
}

This is architecturally similar to a worker pool in distributed systems: the parent acts as an orchestrator, the sub-agents are workers, and the tool interface is the communication protocol.
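For instance, a single parent response might carry several spawn requests at once; the harness fans them out concurrently. An illustrative payload (the `input` field on ToolUseBlock is assumed here, consistent with how `tc.id` and `tc.name` are used above):

```typescript
// Illustrative parent response: three SpawnAgent tool_use blocks in one turn,
// which executeToolsConcurrently above runs in parallel.
const toolCalls: ToolUseBlock[] = [
  { id: "tu_1", name: "SpawnAgent", input: { task: "Run the test suite and summarize failures" } },
  { id: "tu_2", name: "SpawnAgent", input: { task: "Run the linter and report violations" } },
  { id: "tu_3", name: "SpawnAgent", input: { task: "Update the docs for the new CLI flags" } },
];
```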
### Why isolation matters
Without isolation, parallel tool execution would mutate the parent’s conversation history concurrently — creating race conditions and incoherent context. By giving each sub-agent its own context, the harness maintains the single-writer invariant that keeps the system predictable.
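To make the invariant concrete, here is a minimal sketch of the write-back path, assuming a Message type that also accepts tool_result content blocks (as the Anthropic Messages API does). The fan-out is concurrent, but the append happens once, in order, by the parent:

```typescript
// Only the parent loop ever writes to the parent history; sub-agents never do.
async function mergeSubAgentResults(
  history: Message[],
  toolCalls: ToolUseBlock[],
  executeTool: (tc: ToolUseBlock) => Promise<string>,
): Promise<void> {
  // Concurrency happens here, against isolated child contexts...
  const results = await executeToolsConcurrently(toolCalls, executeTool);
  // ...while the write-back is a single ordered append by one writer.
  history.push({
    role: "user",
    content: toolCalls.map((tc) => ({
      type: "tool_result",
      tool_use_id: tc.id,
      content: results.get(tc.id) ?? "",
    })),
  });
}
```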
## 2. MCP — Model Context Protocol
Claude Code supports the Model Context Protocol (MCP), an open standard for connecting AI assistants to external tools and data sources. MCP acts as a universal adapter layer:
- Tool servers — External services that expose tools (databases, APIs, monitoring systems) via a standardized protocol.
- Resource providers — Services that provide context (documentation, codebase indices, knowledge bases).
### Implementation
// MCP tools are registered into the same ToolRegistry as built-in tools.
// The harness treats them identically — same schema validation,
// same permission gates, same hook system.
interface MCPServerConfig {
name: string;
url: string; // e.g. "http://localhost:3001/mcp"
}
async function registerMCPTools(
server: MCPServerConfig,
registry: ToolRegistry,
): Promise<void> {
// 1. Discover available tools from the MCP server
const response = await fetch(`${server.url}/tools/list`, { method: "POST" });
const { tools } = (await response.json()) as {
tools: { name: string; description: string; inputSchema: object }[];
};
// 2. Register each MCP tool as a local tool with a remote executor
for (const mcpTool of tools) {
registry.register({
name: `mcp_${server.name}_${mcpTool.name}`,
description: `[MCP: ${server.name}] ${mcpTool.description}`,
permissionCategory: "network", // all MCP tools go through network gates
inputSchema: z.any(), // schema comes from the MCP server
async execute(input) {
const result = await fetch(`${server.url}/tools/call`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ name: mcpTool.name, arguments: input }),
});
const { content } = (await result.json()) as {
content: { type: string; text: string }[];
};
return content.map((c) => c.text).join("\n");
},
});
}
}

From the harness’s perspective, MCP tools are indistinguishable from built-in tools: they go through the same schema validation, permission gates, and hook system. This means organizations can extend Claude Code’s capabilities without modifying the harness itself — a critical property for enterprise adoption.
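For example, the bootstrap might wire in a couple of servers like this (a sketch; the server names, ports, and resulting tool names are hypothetical):

```typescript
// Hypothetical wiring: two MCP servers registered into the shared registry.
const registry = new ToolRegistry();
await registerMCPTools({ name: "jira", url: "http://localhost:3001/mcp" }, registry);
await registerMCPTools({ name: "postgres", url: "http://localhost:3002/mcp" }, registry);

// The model now sees namespaced tools such as "mcp_jira_get_issue" alongside
// built-ins like ReadFile and Bash, all behind the same gates and hooks.
```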
## 3. Skills — On-Demand Procedural Knowledge
MCP gives the agent new tools (the ability to do things). Skills give the agent new expertise (knowledge of how to do things). A skill is a self-contained directory — containing instructions, scripts, templates, and configuration — that the harness injects into the conversation on demand, teaching the model a specific workflow and giving it executable utilities to carry it out, without permanently consuming context tokens.
### Skill directory structure
A skill is not just a single file; it’s a directory:
.claude/skills/
└── deploy/
├── SKILL.md # Required — entry point (instructions + config)
├── scripts/
│ ├── deploy.sh # Helper script the skill references
│ └── health-check.py # Another utility
├── assets/
│ └── deploy-config.yaml # Reference implementation
└── references/
    └── topic1.md            # Additional documentation

The scripts/ directory is particularly important: skills can bundle executable helpers that the model runs via the Bash tool during skill execution. This makes skills more than just instructions — they’re portable workflow packages.
### The progressive-disclosure pattern
Skills use a three-level loading strategy designed to conserve the context window:
| Level | What’s loaded | When | Context cost |
|---|---|---|---|
| Level 1: Metadata | Skill name and description from YAML frontmatter | Always — injected at session start | Very low (~50 tokens per skill) |
| Level 2: Instructions | Full SKILL.md body (the “playbook”) | On demand — when the skill is triggered | Moderate (hundreds to low thousands of tokens) |
| Level 3: Supporting files | Scripts, examples, templates in the skill directory | Lazy — only when the running skill reads them | Variable |
This is analogous to how an operating system loads shared libraries: metadata (the symbol table) is always available, but the actual code is only paged in when a symbol is referenced. The economics follow: with 20 skills installed, the always-resident Level 1 metadata costs roughly 20 × 50 ≈ 1,000 tokens — about 0.5% of a 200K context window.
### The SKILL.md file
The SKILL.md file has two parts: YAML frontmatter (configuration) and a markdown body (instructions).
---
name: deploy
description: Deploy the application to staging or production using our CI/CD pipeline
allowed-tools: [Bash, ReadFile, Grep] # restrict which tools this skill can use
disable-model-invocation: true # prevent autonomous triggering (require /deploy)
context: fork # run in an isolated sub-agent context
---
## Steps
1. Run `npm run build` and verify it exits cleanly.
2. Run the test suite with `npm test`. If any tests fail, stop and report.
3. Check the current branch — only `main` can deploy to production.
4. For staging: run the bundled deploy script:
```bash
bash scripts/deploy.sh staging
   ```
5. For production: run `bash scripts/deploy.sh production`, then verify the health check using the bundled script: `python3 scripts/health-check.py https://api.example.com/health`
## Rules
- Never deploy if there are uncommitted changes.
- Always run tests before deploying, even if the user says to skip them.
- After a production deploy, post a summary to #deployments on Slack.
### Frontmatter configuration
| Field | Purpose |
|---|---|
| `name` | Becomes the `/slash-command` and the identifier used by the UseSkill tool. Level 1 — always in context. |
| `description` | The signal Claude uses to match user intent to this skill. Level 1 — always in context. |
| `allowed-tools` | Restricts which tools the model can call while this skill is active. Omit to allow all tools. |
| `disable-model-invocation` | When `true`, prevents Claude from triggering this skill autonomously — it can only be invoked manually via `/deploy`. Essential for workflows with side effects. |
| `context` | Set to `fork` to run the skill in an isolated sub-agent context, preventing it from polluting the parent session’s history. |
The markdown body is Level 2 — loaded only when the skill is triggered. Notice that the instructions freely reference bundled scripts (scripts/deploy.sh, scripts/health-check.py) and harness tools (Bash, ReadFile). The model uses these references to orchestrate tool calls during execution.
### How skills are triggered
Skills can be activated in two ways:
- Autonomous discovery — The model reads the skill descriptions (Level 1) and decides, based on the user’s task, that a skill is relevant. It then invokes the skill to load Level 2 instructions. This requires no user action.
- Manual invocation — The user types a slash command (e.g., `/deploy`). This is preferred for workflows with side effects, where timing matters. A routing sketch for both paths follows below.
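A sketch of how the harness might route the two paths (`handleUserInput` is a hypothetical helper; `skillRegistry` and `AgentLoop` come from the implementations in this article). The slash-command path loads Level 2 deterministically; the plain-prompt path leaves discovery to the model via UseSkill:

```typescript
// Hypothetical router: manual invocation bypasses model discretion entirely.
async function handleUserInput(input: string, agent: AgentLoop): Promise<string> {
  const slash = input.match(/^\/(\S+)\s*(.*)$/);
  if (slash) {
    const [, skillName, args] = slash;
    // Manual path: inject the skill's Level 2 instructions directly.
    // This works even when disable-model-invocation is set.
    const skill = await skillRegistry.loadSkill(skillName);
    return agent.run(`${skill.instructions}\n\nUser input: ${args}`);
  }
  // Autonomous path: the model sees only Level 1 metadata and may
  // choose to call UseSkill itself.
  return agent.run(input);
}
```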
### Personal vs project skills
| Scope | Location | Use case |
|---|---|---|
| Personal | `~/.claude/skills/` | Preferences that follow you across projects — commit message style, preferred test frameworks, code review checklists |
| Project | `.claude/skills/` (in the repo) | Team workflows that travel with the codebase — deployment procedures, coding standards, architecture patterns |
Project skills are version-controlled and shared automatically with anyone who clones the repository.
### Implementation
import * as fs from "fs/promises";
import * as path from "path";
import * as yaml from "yaml";
interface SkillConfig {
allowedTools?: string[]; // e.g. ["Bash", "ReadFile", "Grep"]
disableModelInvocation?: boolean; // true = manual /slash-command only
context?: "inline" | "fork"; // fork = run in isolated sub-agent
}
interface SkillMetadata {
  name: string;
  description: string;
  basePath: string; // directory containing SKILL.md
  scripts: string[]; // relative paths to files in scripts/ (Level 3)
  config: SkillConfig;
}
interface LoadedSkill extends SkillMetadata {
  instructions: string; // the markdown body (Level 2)
}
class SkillRegistry {
private skills = new Map<string, SkillMetadata>();
// Called at startup — discovers all skills and loads Level 1 (metadata only)
async discoverSkills(searchPaths: string[]): Promise<void> {
    for (const searchPath of searchPaths) {
      // Tolerate a missing search path (e.g. no personal skills directory)
      const entries = await fs
        .readdir(searchPath, { withFileTypes: true })
        .catch(() => null);
      if (!entries) continue;
for (const entry of entries) {
if (!entry.isDirectory()) continue;
const skillDir = path.join(searchPath, entry.name);
const skillFile = path.join(skillDir, "SKILL.md");
try {
const raw = await fs.readFile(skillFile, "utf-8");
const metadata = this.parseFrontmatter(raw);
// Discover bundled scripts (Level 3)
const scripts = await this.discoverScripts(skillDir);
this.skills.set(metadata.name, {
...metadata,
basePath: skillDir,
scripts,
});
} catch {
// No SKILL.md in this directory — skip
}
}
}
}
// Scan the scripts/ directory for executable helpers
private async discoverScripts(skillDir: string): Promise<string[]> {
const scriptsDir = path.join(skillDir, "scripts");
try {
const entries = await fs.readdir(scriptsDir);
return entries.map((e) => path.join("scripts", e));
} catch {
return []; // no scripts/ directory
}
}
// Level 1: returns metadata for all skills (always in context)
getMetadataSummary(): string {
const lines = ["Available skills:"];
for (const [name, skill] of this.skills) {
lines.push(` /${name} — ${skill.description}`);
}
return lines.join("\n");
}
// Level 2: loads the full instructions for a specific skill
async loadSkill(name: string): Promise<LoadedSkill> {
const metadata = this.skills.get(name);
if (!metadata) throw new Error(`Unknown skill: ${name}`);
const raw = await fs.readFile(
path.join(metadata.basePath, "SKILL.md"),
"utf-8",
);
const instructions = this.extractBody(raw);
    // scripts were already discovered during the startup scan
    return { ...metadata, instructions };
}
// Level 3: read a supporting file from the skill's directory
async loadSupportingFile(
skillName: string,
relativePath: string,
): Promise<string> {
const metadata = this.skills.get(skillName);
if (!metadata) throw new Error(`Unknown skill: ${skillName}`);
const filePath = path.join(metadata.basePath, relativePath);
return await fs.readFile(filePath, "utf-8");
}
private parseFrontmatter(raw: string): Omit<SkillMetadata, "scripts"> {
const match = raw.match(/^---\n([\s\S]*?)\n---/);
if (!match) throw new Error("No frontmatter found");
const parsed = yaml.parse(match[1]) as {
name: string;
description: string;
"allowed-tools"?: string[];
"disable-model-invocation"?: boolean;
context?: "inline" | "fork";
};
return {
name: parsed.name,
description: parsed.description,
basePath: "", // filled in by caller
config: {
allowedTools: parsed["allowed-tools"],
disableModelInvocation: parsed["disable-model-invocation"],
context: parsed.context,
},
};
}
private extractBody(raw: string): string {
return raw.replace(/^---[\s\S]*?---\n*/, "").trim();
}
}

// --- The UseSkill tool: a meta-tool that loads skill instructions into context ---
const UseSkillTool: Tool = {
name: "UseSkill",
description:
"Load a skill's instructions into the conversation to guide task execution.",
permissionCategory: "read",
inputSchema: z.object({
skillName: z.string().describe("Name of the skill to load"),
}),
async execute(input) {
const { skillName } = input as { skillName: string };
try {
const skill = await skillRegistry.loadSkill(skillName);
// The skill's instructions + metadata are returned as a tool result,
// which means they enter the conversation history and guide
// the model's next steps.
const sections = [
`[Skill loaded: ${skill.name}]`,
`Base path: ${skill.basePath}`,
];
// Surface bundled scripts so the model knows what's available
if (skill.scripts.length > 0) {
sections.push(
`\nBundled scripts (can be executed via Bash):`,
...skill.scripts.map((s) => ` - ${s}`),
);
}
// Surface tool restrictions if configured
if (skill.config.allowedTools) {
sections.push(
`\nAllowed tools: ${skill.config.allowedTools.join(", ")}`,
);
}
sections.push("", skill.instructions);
return sections.join("\n");
} catch (err: any) {
return `Error loading skill: ${err.message}`;
}
},
};

// --- Skill-aware context building ---
// During context initialization, skill metadata (Level 1) is injected
// alongside CLAUDE.md so the model knows what skills exist.
class SkillAwareContextManager extends ContextManager {
private skillRegistry: SkillRegistry;
constructor(
projectRoot: string,
maxTokens: number,
skillRegistry: SkillRegistry,
) {
super(projectRoot, maxTokens);
this.skillRegistry = skillRegistry;
}
override buildInitialContext(): Message[] {
const messages = super.buildInitialContext();
// Inject skill metadata as a system message
// This is Level 1 — just names and descriptions, very cheap
const skillSummary = this.skillRegistry.getMetadataSummary();
if (skillSummary) {
messages.push({
role: "system",
content: `[Skills]\n${skillSummary}\n\nYou can use the UseSkill tool to load any skill when relevant.`,
});
}
return messages;
}
}

### The key insight: skills are instructions + utilities, not services
Unlike MCP, skills do not run as separate processes. They are loaded into the conversation as instructions, and the model uses the harness’s existing tools to act on them. But skills are not “just markdown” either — they can bundle:
- Executable scripts (`scripts/`) that the model calls via the Bash tool during execution.
- Templates and examples (`examples/`, `resources/`) that the model reads for reference.
- Tool restrictions (`allowed-tools`) that scope what the model can do while the skill is active.
- Isolation config (`context: fork`) that runs the skill in a sub-agent to protect the parent session.
The result is a portable workflow package — instructions plus the utilities needed to carry them out — that requires no server, no daemon, and no deployment. A skill is just a directory you can git push.
## 4. Skills vs MCP — When to Use Which
Skills and MCP are complementary but serve fundamentally different purposes. The simplest mental model: MCP gives Claude new hands; Skills give Claude new expertise.
### But can’t skills just call APIs?
Yes — and this is worth being precise about because the overlap is real.
A skill can bundle a scripts/jira-client.py that handles OAuth, manages tokens, retries on failure, and returns structured JSON. The model reads the skill’s instructions, which describe exactly how to call the script:
## Available scripts
- `python3 scripts/jira-client.py get-issue --key <ISSUE_KEY>` — returns issue JSON
- `python3 scripts/jira-client.py create-comment --key <ISSUE_KEY> --body <TEXT>` — posts a comment

The model is perfectly capable of reasoning about this interface from the instructions. It knows the flag names, the expected values, and the script’s capabilities — because the skill told it. For simple and moderate API usage, this works well and is often the better choice because it’s simpler to set up than an MCP server.
So when does MCP actually earn its complexity? Three situations:
1. Harness-level validation (catching errors before execution)
When the model calls a script via Bash, the harness sees one parameter: a command string. If the model hallucinates a flag name (--issue-key instead of --key), the error surfaces after the script runs and returns stderr. The model then has to parse the error, understand what went wrong, and retry — burning a full agentic loop iteration.
With MCP, the tool’s JSON Schema is registered with the harness. The Zod layer validates the input before the call reaches the server:
// MCP: harness catches this BEFORE execution
tool_use: { name: "mcp_jira_get_issue", input: { issueKey: 123 } }
// → Zod error: "issueKey must be a string" — returned instantly, no execution
// Skill script: error surfaces AFTER execution
tool_use: { name: "Bash", input: { command: "python3 scripts/jira-client.py get-issue --key" } }
// → Script runs, fails with "error: --key requires an argument", model parses stderr
This matters at scale. If the model makes ten tool calls per task, even a 5% error rate means a wasted iteration every other task. Pre-execution validation eliminates an entire class of errors.
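A minimal demonstration of the harness-side check, with a hypothetical schema for the get-issue example:

```typescript
import { z } from "zod";

// Hypothetical schema the Jira MCP server would publish for get-issue.
const getIssueSchema = z.object({ issueKey: z.string() });

const parsed = getIssueSchema.safeParse({ issueKey: 123 });
if (!parsed.success) {
  // Rejected before any process or network call; the error goes straight
  // back to the model as a tool_result, costing no execution.
  console.log(parsed.error.issues[0].message); // "Expected string, received number"
}
```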
2. Tool discovery at scale
When you have 5 scripts, the model can learn their interfaces from skill instructions. When you have 50 MCP tools across 8 servers, something changes: all MCP tool schemas are always visible in the API’s tools array. The model can browse them, compare parameters, and pick the right tool without loading any skill instructions first.
With skills, tool discovery requires loading skill instructions (Level 2) before the model even knows what’s available. For large tool ecosystems — an organization with MCP servers for GitHub, Jira, Postgres, Slack, Datadog, and more — the “always visible” property of MCP schemas is a significant advantage.
3. Cross-platform portability
An MCP server works with Claude Code, Cursor, Windsurf, Copilot, and any other MCP-compatible AI assistant. A skill script in .claude/skills/deploy/scripts/ is tied to Claude Code’s Bash tool. If your team uses multiple AI tools, MCP gives you one interface that works everywhere.
### What this means in practice
| Capability | Skill script | MCP tool | Skill workaround | Verdict |
|---|---|---|---|---|
| Model reasoning | Reads interface from instructions | Reads JSON Schema from tools array | N/A — both work | Draw |
| Input validation | Errors surface at runtime | Zod rejects before execution | Script validates its own args before calling the API | Draw — both prevent the bad call; MCP is marginally faster |
| Discovery (5 tools) | Skill descriptions cover it | Schemas in tools array | N/A — both work | Draw |
| Discovery (50+ tools) | Must load skill instructions | All schemas always visible | Rich Level 1 descriptions or a “catalog” skill | Slight MCP edge — but skill catalogs close the gap |
| Authentication | Env vars, token cache | Server manages OAuth/refresh | Script handles tokens itself | Draw |
| Persistent state | Fresh process each call | Server holds connections | Sidecar daemon via Unix socket | Draw — but the sidecar is an MCP server without the protocol |
| Cross-platform | Tied to Claude Code | Any MCP-compatible assistant | Ship scripts with adapter wrappers per platform | MCP wins — one interface vs N adapters |
The real decision: skills can do almost everything MCP does, but the workarounds add up. A sidecar daemon for persistence, a catalog skill for discovery, adapter wrappers for portability — at some point you’ve built an MCP-equivalent system without the standardized protocol. MCP’s value isn’t any single capability; it’s that one protocol solves all of these at once.
### Comparison
| Dimension | Skills | MCP |
|---|---|---|
| What it provides | Procedural knowledge + utility scripts — how to do something | Typed, authenticated connectivity — the ability to do something reliably |
| Analogy | An SOP manual with utility scripts attached | A typed SDK for an external system |
| Implementation | Markdown instructions + bundled scripts (`SKILL.md` + `scripts/`) | Client-server architecture via JSON-RPC |
| Runs as | Injected instructions; bundled scripts run via Bash | Persistent external process (MCP server) |
| API calls | Yes — via `curl`, Python, etc. in shell scripts (untyped) | Yes — via typed, schema-validated tool definitions |
| Token cost | Very low (Level 1 always; Level 2+ on demand) | Higher (full tool schemas always exposed) |
| Requires infrastructure | No — just a directory you can `git push` | Yes — an MCP server process must be running |
| Tool control | Can restrict available tools via `allowed-tools` | No built-in tool restrictions |
| Shareable | Via git (project skills in `.claude/skills/`) | Via server deployment or npm packages |
| Best for | Workflows, runbooks, scripts, encoding judgment | Reliable interfaces to APIs, databases, SaaS platforms |
### Can Skills Completely Replace MCP?
Yes. If you look closely at the architecture of the Claude Code harness, every capability that MCP provides can be completely replaced by a well-architected Skills implementation.
1. Replacing Pre-execution Validation
Instead of relying on the harness’s Zod layer, a skill script can implement robust internal validation before making any API calls. For example, `python3 scripts/billing.py charge --amount 100 --currency USD` can validate that `--amount` is positive and `--currency` is a valid ISO code using argparse or pydantic before hitting the billing API. The functional result is nearly identical: the costly call never happens. The difference is that validation runs in the script process rather than the harness process, surfacing errors to the model via stdout/stderr, which costs one loop iteration on the retry but prevents the bad call just as reliably — matching the “Draw” verdict in the table above. A sketch of the same guard follows below.
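The same guard sketched in TypeScript, to stay in one language for this article’s examples (`billing.ts` and its flags are hypothetical stand-ins for the Python script above):

```typescript
// scripts/billing.ts — hypothetical skill script with internal validation.
import { z } from "zod";

const argsSchema = z.object({
  amount: z.coerce.number().positive(),     // reject zero/negative charges
  currency: z.string().regex(/^[A-Z]{3}$/), // ISO-4217-style code
});

// Parse "--amount 100 --currency USD" style flags from argv.
const raw: Record<string, string> = {};
for (let i = 2; i < process.argv.length; i += 2) {
  raw[process.argv[i].replace(/^--/, "")] = process.argv[i + 1];
}

const parsed = argsSchema.safeParse(raw);
if (!parsed.success) {
  // Fail fast on stderr: the costly billing call never happens.
  console.error(
    parsed.error.issues.map((i) => `${i.path.join(".")}: ${i.message}`).join("\n"),
  );
  process.exit(1);
}
// ...proceed to call the billing API with parsed.data
```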
2. Replacing Tool Discovery at Scale

You can replace MCP’s always-visible tool schemas with a “catalog” skill. The Level 1 metadata (name + description) is always in context, so a rich description serves as a discovery mechanism:
---
name: infra-tools
description: |
Infrastructure CLI tools:
- query-db: Run SQL queries against staging/production Postgres
- deploy: Deploy services to staging or production
- metrics: Query Datadog metrics for the last N hours
- slack-notify: Post messages to Slack channels
---

When managing 50+ tools, a catalog skill lists all available scripts. Because Level 1 descriptions are tiny compared to full JSON Schema definitions, this approach is actually more context-efficient than loading 50 full MCP schemas into the harness at startup.
3. Replacing Persistent Connections

MCP servers hold persistent connections (database pools, WebSockets, long-lived sessions). Skills can achieve the same architecture by talking to a sidecar daemon: you run the daemon in the background to hold the persistent connections, and the skill’s Bash scripts communicate with it via Unix sockets or localhost HTTP:
# scripts/db-query.sh
# Talks to a persistent sidecar instead of opening a new connection each time
curl -s --unix-socket /tmp/db-proxy.sock \
-X POST -d "{\"sql\": \"$1\", \"params\": $2}" \
  http://localhost/query

This transforms the skill from a stateless script into an interface for a stateful microservice, matching MCP’s persistence capability. A sketch of such a sidecar follows below.
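The sidecar itself can be small. A sketch in Node/TypeScript, assuming the `pg` driver for the pool and the socket path from the script above (the `/query` route and payload shape are hypothetical):

```typescript
// db-proxy.ts — hypothetical sidecar holding a persistent connection pool.
import * as http from "http";
import { Pool } from "pg"; // the pool outlives individual skill invocations

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

const server = http.createServer(async (req, res) => {
  if (req.method !== "POST" || req.url !== "/query") {
    res.writeHead(404).end();
    return;
  }
  let body = "";
  for await (const chunk of req) body += chunk;
  const { sql, params } = JSON.parse(body) as { sql: string; params?: unknown[] };
  try {
    const result = await pool.query(sql, params);
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify(result.rows));
  } catch (err: any) {
    res.writeHead(500).end(JSON.stringify({ error: err.message }));
  }
});

// Listen on the Unix socket that scripts/db-query.sh targets.
server.listen("/tmp/db-proxy.sock");
```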
4. Replacing Cross-Platform Portability

While MCP defines a standard JSON-RPC protocol across tools like Cursor and Windsurf, Python and Bash scripts are inherently portable themselves. To support multiple AI assistants, you ship your scripts with thin adapter wrappers (e.g., a Cursor extension that shells out to your Python script, or a Windsurf plugin that does the same). The core logic stays in the script, agnostic to the specific AI agent running it.
The Architecture of a Full Replacement: if you want to bypass the complexity of deploying and managing MCP servers, you can build a complete equivalent from Skills + Scripts + Sidecars + Catalogs. This means writing validation logic and managing daemon processes yourself, but it offers maximum flexibility: you work entirely with standard scripts and bash commands, fully decoupled from the JSON-RPC spec of the Model Context Protocol.
### When to use Skills
- You need procedural guidance — a repeatable workflow with specific steps, conditions, and rules.
- You want to encode judgment — “if the PR touches the payments module, always run the fraud-detection test suite.”
- You want consistency — the same workflow applied identically across sessions without re-explaining it.
- You’re making one-off API calls — a quick `curl` in a script is simpler than standing up an MCP server.
- You’re optimizing for context — skills load just-in-time, keeping the baseline context footprint minimal.
### How they compose
The most powerful workflows stack Skills on top of MCP:
- MCP provides the connection — e.g., an MCP server exposes your JIRA API.
- A Skill provides the methodology — e.g., a `review-pr` skill says: “First use the JIRA MCP to fetch the linked ticket. Then read the changed files. Then check for breaking changes against our API compatibility guidelines. Finally, post a review comment.” A hypothetical SKILL.md for this composition is sketched below.
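A sketch of that SKILL.md, following the format shown earlier (the `mcp_jira_get_issue` tool name assumes a Jira MCP server registered as in section 2):

```markdown
---
name: review-pr
description: Review a pull request against our API compatibility guidelines
allowed-tools: [ReadFile, Grep, Bash, mcp_jira_get_issue]
---
## Steps
1. Use `mcp_jira_get_issue` to fetch the ticket linked in the PR description.
2. Read the changed files and compare them against the ticket's acceptance criteria.
3. Check for breaking changes against our API compatibility guidelines.
4. Post a review comment summarizing your findings.
```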
## 5. Putting It All Together
With all the layers defined, here is how the harness bootstraps and runs:
async function main() {
// 1. Build the tool registry
const registry = new ToolRegistry();
registry.register(ReadFileTool);
registry.register(BashTool);
registry.register(EditFileTool);
registry.register(SpawnAgentTool);
registry.register(UseSkillTool);
// 2. Connect MCP servers (if configured)
await registerMCPTools(
{ name: "postgres", url: "http://localhost:3001/mcp" },
registry,
);
// 3. Discover skills (personal + project)
const skillRegistry = new SkillRegistry();
await skillRegistry.discoverSkills([
    path.join(os.homedir(), ".claude", "skills"), // personal — requires `import * as os from "os"`
path.join(process.cwd(), ".claude", "skills"), // project
]);
// 4. Configure permissions
const permissions = new PermissionSystem("default", {
denyPatterns: [/rm\s+-rf\s+\//, /curl.*\|.*sh/],
allowedPaths: [process.cwd()],
});
// 5. Register hooks
const hooks = new HookRunner();
hooks.register({
event: "PreToolUse",
command: `bash -c 'if echo "$TOOL_INPUT" | grep -q "node_modules"; then echo "BLOCKED"; exit 1; fi'`,
});
// 6. Initialize skill-aware context manager
const contextManager = new SkillAwareContextManager(
process.cwd(),
200_000,
skillRegistry,
);
// 7. Create the agent loop and run
const agent = new AgentLoop(registry, permissions, hooks, contextManager);
const result = await agent.run("Deploy the app to staging");
// The agent will autonomously discover the 'deploy' skill from metadata,
// load its instructions via UseSkill, and follow the steps.
console.log(result);
}
main().catch(console.error);

## 6. Architectural Lessons
Stepping back, the Claude Code harness teaches several generalizable lessons about building agentic systems:
### The model is not the product
Only ~2% of Claude Code’s codebase is “AI-related” in the sense of prompt engineering or model interaction. The remaining 98% is operational infrastructure: state management, safety, tool execution, context optimization. If you are building an agentic system, expect a similar ratio.
### Distributed systems patterns apply
The harness is effectively a distributed system with a single worker (the LLM) and multiple services (the tools):
| Pattern | Harness analogue |
|---|---|
| Worker pool | Sub-agents |
| Service interface | Tool registry |
| Middleware | Hooks |
| Log rotation | Context compaction |
| Configuration management | CLAUDE.md |
| Circuit breaker | Reactive compact + retry |
If you have experience building distributed systems, you already have the mental models needed to reason about agentic architectures.
### Safety is infrastructure, not a feature
The permission system, hooks, and schema validation are not bolted-on safety features — they are load-bearing infrastructure that the entire execution model depends on. The deny-first design, deterministic hooks, and layered gates are what make it safe to give an LLM write access to your codebase.
### Statelessness is a feature, not a bug
The model’s statelessness is often framed as a limitation, but Claude Code leverages it as a feature. Because every API call is independent, the harness can:
- Compact the context without side effects — the model doesn’t “notice” missing history.
- Fork sessions — two users can branch from the same conversation and diverge.
- Resume sessions — the harness reconstructs context from persisted state; the model doesn’t need to “wake up.”
The harness transforms a liability (no memory) into a capability (flexible state management).
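A sketch of what fork and resume reduce to under statelessness (the persistence path and file format are hypothetical):

```typescript
// Because the model is stateless, a "session" is just its message array.
// Forking is a copy; resuming is a read. There is no model-side state to migrate.
import * as fs from "fs/promises";

async function persistSession(id: string, history: Message[]): Promise<void> {
  await fs.writeFile(`.claude/sessions/${id}.json`, JSON.stringify(history));
}

async function forkSession(id: string): Promise<Message[]> {
  const raw = await fs.readFile(`.claude/sessions/${id}.json`, "utf-8");
  return JSON.parse(raw) as Message[]; // an independent copy, free to diverge
}
```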
## Conclusion
Claude Code is a masterclass in the unglamorous but essential work of building agentic infrastructure. The agentic loop is simple; the tool registry is modular; the permission system is layered; the context management is multi-staged; and the extensibility surfaces (hooks, MCP, skills, sub-agents) are designed for growth without touching the core loop.
The real insight is architectural: the intelligence is in the model, but the reliability is in the harness. If you’re building systems that give LLMs agency over real-world environments, the harness is where most of your engineering effort should go.