Don't Build Agents, Build Skills (the Lesson Anthropic Teaches in 16 Minutes)

Two Anthropic engineers explain why skills beat custom agents. Summary, analysis and practical lessons from Barry Zhang and Mahesh Murag's talk.

Tags: claude, anthropic, agents, skills, claude-code, architecture, ai

There’s a 16-minute talk that’s worth more than months of trial and error with AI agents. Barry Zhang and Mahesh Murag, two Anthropic engineers who built Claude’s Skills system, make a simple observation: we’re building agents where we should be building skills.

No marketing. No hype. Real lessons from people who actually build.

Here’s what they say, why it matters, and what it changes for you.

The Problem: Agents Are Smart but Incompetent

Barry Zhang drops a killer analogy. You need to file your taxes. You have a choice between:

  • Mahesh: 300 IQ, mathematical genius, but has never seen a tax form in his life
  • Barry: Tax professional with 15 years of experience

You pick Barry every time. And that’s exactly the problem with AI agents today.

LLMs have the reasoning. They have the connectivity (MCP, tools, APIs). But they don’t have domain expertise. They don’t know how YOUR team does things. They don’t know YOUR conventions. They don’t retain what they learned yesterday.

Result: you spend your time repeating the same instructions, correcting the same mistakes, and rebuilding agents from scratch for every new domain.

The Solution: Skills, Not Agents

What is a skill, concretely? It’s a folder of files that packages procedural know-how. Markdown, scripts, assets, metadata. That’s it.

Not a complex framework. Not an API. A folder that anyone can create, version with Git, and share.
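To make "a folder of files" concrete, here is a minimal Python sketch that scaffolds one. The SKILL.md file name is the one mentioned later in this article. The `create_skill` helper, the folder layout, the header fields and the example skill are assumptions for illustration, not an official template.

```python
from pathlib import Path
import tempfile


def create_skill(root: Path, name: str, description: str, instructions: str) -> Path:
    """Scaffold a hypothetical skill folder: SKILL.md plus a scripts directory."""
    skill_dir = root / name
    (skill_dir / "scripts").mkdir(parents=True, exist_ok=True)

    # Metadata sits at the top of SKILL.md as a small header block.
    # The field names are an assumption for this sketch.
    skill_md = (
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"
        "---\n\n"
        f"{instructions}\n"
    )
    (skill_dir / "SKILL.md").write_text(skill_md, encoding="utf-8")
    return skill_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        skill = create_skill(
            Path(tmp),
            name="slide-formatting",
            description="Format slide decks to the team template.",
            instructions="1. Run scripts/format_slides.py on the deck.\n2. Report what changed.",
        )
        # List what ended up in the folder.
        for path in sorted(skill.rglob("*")):
            print(path.relative_to(skill))
```

Because the result is plain files, it can be committed to Git and reviewed like any other change.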

The fundamental difference from a custom agent:

| | Custom agent | Skill |
| --- | --- | --- |
| Cost | $500K per domain | 10-20% of a custom agent |
| Who creates it | AI engineers | Anyone (even non-technical) |
| Reusable | No, coupled to infra | Yes, portable everywhere |
| Maintenance | Full rebuild if it changes | Update the file |
| Scalability | 1 agent = 1 domain | 1 agent + N skills = N domains |

The 3-Layer Architecture

The speakers describe an emerging 3-layer stack:

1. The agent loop: Manages context and token flow. The generic “brain.”

2. The runtime environment: File system access and code execution. The “hands.”

3. External connectivity: MCP servers connecting to tools and data. The “senses.”

And skills? They’re the application layer. Hundreds or thousands of skills available to a single agent, loaded into context only when needed.

It’s the same pattern as in the history of computing:

  • Models = Processors: Powerful but limited alone
  • Agent runtime = Operating system: Orchestrates resources
  • Skills = Applications: Where millions of people solve concrete problems

The Key Concept: Progressive Disclosure

This is where the design gets smart. A skill isn’t loaded entirely into the agent’s context. The system uses progressive disclosure:

  1. First, only the skill’s metadata is visible (name, description, when to use it)
  2. The agent decides if it needs this skill for the current task
  3. If needed, the full content (scripts, instructions) is loaded

Result: the context window stays lean even with hundreds of available skills. The agent loads only what it needs, when it needs it.
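The three steps can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Claude implements it: the header format, the `SkillMeta` class and the keyword-overlap `select` function are all hypothetical, and in a real agent the model itself would choose from the descriptions.

```python
from dataclasses import dataclass
from pathlib import Path
import tempfile


@dataclass
class SkillMeta:
    name: str
    description: str
    path: Path


def read_metadata(skill_md: Path) -> SkillMeta:
    """Read only the header block of SKILL.md, stopping before the body."""
    fields = {}
    with skill_md.open(encoding="utf-8") as f:
        if f.readline().strip() != "---":
            raise ValueError(f"{skill_md} has no metadata header")
        for line in f:
            if line.strip() == "---":
                break  # end of metadata; the body is not read here
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return SkillMeta(fields["name"], fields["description"], skill_md)


def discover(skills_root: Path) -> list[SkillMeta]:
    """Step 1: collect lightweight metadata for every skill."""
    return [read_metadata(p) for p in sorted(skills_root.glob("*/SKILL.md"))]


def _keywords(text: str) -> set[str]:
    """Lowercased words longer than three characters, punctuation stripped."""
    return {w.strip(".,").lower() for w in text.split() if len(w.strip(".,")) > 3}


def select(task: str, catalog: list[SkillMeta]) -> list[SkillMeta]:
    """Step 2: naive stand-in for the agent's decision (keyword overlap)."""
    wanted = _keywords(task)
    return [m for m in catalog if wanted & _keywords(m.description)]


def load_full(meta: SkillMeta) -> str:
    """Step 3: only now pull the full content into context."""
    return meta.path.read_text(encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name, desc in [
            ("slide-formatting", "Format slides to the team template."),
            ("tax-filing", "Prepare quarterly tax filings."),
        ]:
            (root / name).mkdir()
            (root / name / "SKILL.md").write_text(
                f"---\nname: {name}\ndescription: {desc}\n---\n\nFull instructions for {name}.\n",
                encoding="utf-8",
            )
        catalog = discover(root)
        for meta in select("format the slides for Monday", catalog):
            print(meta.name)
            print(load_full(meta))
```

The point of the split is cost: `discover` touches a few header lines per skill, while `load_full` runs only for the skills the task actually needs.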

Code as the Universal Interface

The other major insight from the talk: code is the universal interface for agent execution.

Give an LLM file system access and code execution, and a single agent loop can handle everything: data analysis, API calls, automation, without custom builds for each use case.

Scripts in skills are self-documenting and modifiable, and they drift far less than static instructions. A Python script that formats a slide runs the same way the 100th time as the first. A natural-language instruction can be reinterpreted differently on every run.

The Concrete Example: Slide Formatting

Barry shares a real case. Claude was writing the same Python script every time to format slides. Every new session, it rewrote everything from scratch. Same result, same effort, zero memory.

The solution: create a skill that stores the script. Now Claude loads the skill, runs the existing script, and moves on. Consistent, efficient, zero redundancy.
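As a rough illustration of "load the skill, run the existing script", here is a Python sketch. The talk's actual script is not shown in this article, so `FORMAT_SCRIPT` is a hypothetical stand-in that just normalizes slide titles, and `run_skill_script` is an assumed helper, not part of any Anthropic API.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the stored formatting script: it normalizes
# slide titles read from stdin, one per line, and skips blank lines.
FORMAT_SCRIPT = '''\
import sys
for line in sys.stdin:
    title = " ".join(line.split())
    if title:
        print(title.title())
'''


def run_skill_script(script: Path, input_text: str) -> str:
    """Run a script shipped with a skill and return its stdout."""
    result = subprocess.run(
        [sys.executable, str(script)],
        input=input_text,
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Assumed layout: <skill>/scripts/<script>.py
        script = Path(tmp) / "slide-formatting" / "scripts" / "format_slides.py"
        script.parent.mkdir(parents=True)
        script.write_text(FORMAT_SCRIPT, encoding="utf-8")

        raw = "  quarterly   results \n\nnext steps\n"
        first = run_skill_script(script, raw)
        second = run_skill_script(script, raw)
        print(first, end="")
        print("identical across runs:", first == second)
```

The script is written once and executed as-is afterwards, so nothing is regenerated from session to session.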

That’s the power of skills: turning repetitive patterns into reusable know-how.

The Emerging Ecosystem

Within 5 weeks of launch, thousands of skills were created. Three categories emerged:

Foundational skills: General or domain-specific capabilities. Document manipulation, scientific research, bioinformatics.

Partner skills: Companies integrating their products. Browserbase for web automation, Notion for workspace understanding.

Enterprise skills: Organization-specific. Fortune 100 companies use them as organizational playbooks. Developer productivity teams standardize coding conventions.

And the most surprising: non-technical people are creating skills. Finance, legal, recruiting, accounting. A skill can be as simple as a well-written SKILL.md file.

What This Changes for Builders

1. Stop Rebuilding Agents

If you have 5 custom agents for 5 different domains, you have 5 codebases to maintain. With the skills approach, you have 1 generic agent + 5 skills: one codebase to maintain, plus five folders of files to keep up to date.

2. Package Your Expertise, Not Your Infra

The value isn’t in the agent loop (everyone uses the same one). The value is in the domain expertise you package as skills. That’s your moat.

3. Let Non-Technical People Contribute

Your domain experts know the processes better than any AI engineer. Give them a skill template and let them package their know-how directly. That is far more efficient than having a dev translate their expertise secondhand.

4. Think Composability

A single skill solves a problem. Multiple composed skills solve a workflow. The agent orchestrates, skills execute. It’s the same logic as microservices, but for expertise.
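Here is a deliberately simplified Python sketch of that idea. It reduces each skill to a function from state to state, which is an assumption made for illustration. The skill names and `run_workflow` are hypothetical, and a real agent would decide the plan itself rather than receive it as a list.

```python
from typing import Callable

# In this sketch a "skill" is reduced to a function from state to state.
Skill = Callable[[dict], dict]


def summarize_sales(state: dict) -> dict:
    """Hypothetical skill: compute a total from raw figures."""
    return {**state, "total": sum(state["sales"])}


def draft_slide(state: dict) -> dict:
    """Hypothetical skill: turn the summary into a slide title."""
    return {**state, "slide": f"Total sales: {state['total']}"}


SKILLS: dict[str, Skill] = {
    "summarize-sales": summarize_sales,
    "draft-slide": draft_slide,
}


def run_workflow(plan: list[str], state: dict) -> dict:
    """The orchestrator: apply the named skills in order."""
    for name in plan:
        state = SKILLS[name](state)
    return state


if __name__ == "__main__":
    result = run_workflow(["summarize-sales", "draft-slide"], {"sales": [120, 80, 50]})
    print(result["slide"])  # prints "Total sales: 250"
```

Each skill stays small and testable on its own, and the workflow is just the order in which they are applied.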

5. Let the Agent Generate Its Own Skills

Anthropic’s long-term vision: Claude creates and refines its own skills over time. Day 1, it’s generic. Day 30, it knows your project, your conventions, your patterns. Knowledge accumulates, not in a growing prompt, but in skills that get refined.

My Take

This talk confirms what we experience daily with Claude Code. Skills (custom commands, specialized sub-agents, CLAUDE.md files) are what make the difference between a dev who “uses Claude” and a dev who multiplies their productivity tenfold with Claude.

The mistake most devs make: they think “agent” when they should think “skill.” They build complex systems when they should package simple know-how.

The talk is 16 minutes long. It’s the best time investment you’ll make this week if you work with AI agents.


Summary and analysis of the talk “Don’t Build Agents, Build Skills Instead” by Barry Zhang and Mahesh Murag (Anthropic), presented at the AI Engineering Code Summit.

Pierre Rondeau

Developer and indie builder. I build products and automations with AI. Creator of Claude Hub.
