Don't Build Agents, Build Skills (the Lesson Anthropic Teaches in 16 Minutes)
Two Anthropic engineers explain why skills beat custom agents. Summary, analysis and practical lessons from Barry Zhang and Mahesh Murag's talk.
There’s a 16-minute talk that’s worth more than months of trial and error with AI agents. Barry Zhang and Mahesh Murag, two Anthropic engineers who built Claude’s Skills system, make a simple observation: we’re building agents where we should be building skills.
No marketing. No hype. Real lessons from people who actually build.
Here’s what they say, why it matters, and what it changes for you.
The Problem: Agents Are Smart but Incompetent
Barry Zhang drops a killer analogy. You need to file your taxes. You have a choice between:
- Mahesh: 300 IQ, mathematical genius, but has never seen a tax form in his life
- Barry: Tax professional with 15 years of experience
You pick Barry every time. And that’s exactly the problem with AI agents today.
LLMs have the reasoning. They have the connectivity (MCP, tools, APIs). But they don’t have domain expertise. They don’t know how YOUR team does things. They don’t know YOUR conventions. They don’t retain what they learned yesterday.
Result: you spend your time repeating the same instructions, correcting the same mistakes, and rebuilding agents from scratch for every new domain.
The Solution: Skills, Not Agents
What is a skill, concretely? It’s a folder of files that packages procedural know-how. Markdown, scripts, assets, metadata. That’s it.
Not a complex framework. Not an API. A folder that anyone can create, version with Git, and share.
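Here's a minimal sketch of what such a folder could look like, built with nothing but Python's standard library. The skill name `slide-formatter`, the `scripts/format_slides.py` file, and the wording of the instructions are all made up for illustration. The only parts taken from the talk are the idea of a `SKILL.md` file and metadata that carries a name and a description.

```python
from pathlib import Path
import tempfile

# Hypothetical SKILL.md: metadata (name, description) on top, instructions below.
SKILL_MD = """\
---
name: slide-formatter
description: Apply the team's slide formatting rules. Use when asked to format a deck.
---

# Slide formatter

1. Run `scripts/format_slides.py` on the deck.
2. Report which slides were changed.
"""


def create_skill(root: Path) -> Path:
    """Create an illustrative skill folder under `root` and return its path."""
    skill_dir = root / "slide-formatter"
    (skill_dir / "scripts").mkdir(parents=True)
    (skill_dir / "SKILL.md").write_text(SKILL_MD, encoding="utf-8")
    # Placeholder script; a real skill would hold the actual formatting logic.
    (skill_dir / "scripts" / "format_slides.py").write_text(
        "print('formatting slides...')\n", encoding="utf-8"
    )
    return skill_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        skill = create_skill(Path(tmp))
        # List every file and folder in the skill, relative to its root.
        for path in sorted(skill.rglob("*")):
            print(path.relative_to(skill))
```

Everything in there is plain text on disk, which is why it can be reviewed in a pull request and versioned like any other file.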
The fundamental difference from a custom agent:
| | Custom Agent | Skill |
|---|---|---|
| Cost | $500K per domain | 10-20% of a custom agent |
| Who creates it | AI engineers | Anyone (even non-technical) |
| Reusable | No, coupled to infra | Yes, portable everywhere |
| Maintenance | Full rebuild if it changes | Update the file |
| Scalability | 1 agent = 1 domain | 1 agent + N skills = N domains |
The 3-Layer Architecture
The speakers describe an emerging 3-layer stack:
1. The agent loop: Manages context and token flow. The generic “brain.”
2. The runtime environment: File system access and code execution. The “hands.”
3. External connectivity: MCP servers connecting to tools and data. The “senses.”
And skills? They’re the application layer. Hundreds or thousands of skills available to a single agent, loaded into context only when needed.
The parallel with computing history is direct:
- Models = Processors: Powerful but limited alone
- Agent runtime = Operating system: Orchestrates resources
- Skills = Applications: Where millions of people solve concrete problems
The Key Concept: Progressive Disclosure
This is where the design gets smart. A skill isn’t loaded entirely into the agent’s context. The system uses progressive disclosure:
- First, only the skill’s metadata is visible (name, description, when to use it)
- The agent decides if it needs this skill for the current task
- If needed, the full content (scripts, instructions) is loaded
Result: the context window stays lean even with hundreds of available skills. The agent loads only what it needs, when it needs it.
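The two stages above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Claude implements it: the function names (`discover`, `catalog`, `load_body`) are hypothetical, the metadata parser only handles simple `key: value` lines between `---` markers, and the decision to load a skill is left to the caller, where a real agent would let the model decide.

```python
from dataclasses import dataclass
from pathlib import Path
import tempfile


@dataclass
class SkillMeta:
    name: str
    description: str
    path: Path  # location of the skill's SKILL.md


def read_frontmatter(skill_md: Path) -> dict[str, str]:
    """Parse simple `key: value` lines between the leading `---` markers."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def discover(skills_root: Path) -> list[SkillMeta]:
    """Stage 1: collect metadata only; skill bodies are not put in context."""
    found = []
    for skill_md in sorted(skills_root.glob("*/SKILL.md")):
        meta = read_frontmatter(skill_md)
        if "name" in meta and "description" in meta:
            found.append(SkillMeta(meta["name"], meta["description"], skill_md))
    return found


def catalog(skills: list[SkillMeta]) -> str:
    """The text the agent always sees: one short line per skill."""
    return "\n".join(f"- {s.name}: {s.description}" for s in skills)


def load_body(skill: SkillMeta) -> str:
    """Stage 2: read the full instructions once this skill is chosen."""
    text = skill.path.read_text(encoding="utf-8")
    # Drop the frontmatter: everything up to and including the second `---`.
    parts = text.split("---", 2)
    return parts[2].strip() if len(parts) == 3 else text


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        skill_dir = Path(tmp) / "tax-filing"
        skill_dir.mkdir()
        (skill_dir / "SKILL.md").write_text(
            "---\nname: tax-filing\ndescription: File taxes.\n---\n\nStep 1: gather forms.\n",
            encoding="utf-8",
        )
        skills = discover(Path(tmp))
        print(catalog(skills))       # cheap: always in context
        print(load_body(skills[0]))  # expensive: only when needed
```

The cost of an unused skill is one catalog line, which is what lets the list grow to hundreds of entries without crowding the context window.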
Code as the Universal Interface
The other major insight from the talk: code is the universal interface for agent execution.
Give an LLM file system access and code execution, and a single agent loop can handle everything: data analysis, API calls, automation, without custom builds for each use case.
Scripts in skills are self-documenting, modifiable, and far more consistent than prose instructions. A Python script that formats a slide runs the same way the hundredth time as the first. A natural language instruction can be reinterpreted differently on every run.
The Concrete Example: Slide Formatting
Barry shares a real case. Claude was writing the same Python script every time to format slides. Every new session, it rewrote everything from scratch. Same result, same effort, zero memory.
The solution: create a skill that stores the script. Now Claude loads the skill, runs the existing script, and moves on. Consistent, efficient, zero redundancy.
That’s the power of skills: turning repetitive patterns into reusable know-how.
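To make the slide example concrete, here's a rough sketch of the "store the script once, run it every time" pattern. The script itself is a stand-in I made up: it tidies slide titles passed as JSON rather than touching a real presentation file, and `run_skill_script` is a hypothetical helper, not part of any Anthropic API. The point is only that the stored file, not a freshly written one, does the work.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stored script: reads slides as JSON on stdin,
# collapses extra whitespace in each title, and title-cases it.
FORMAT_SCRIPT = '''\
import json, sys

slides = json.load(sys.stdin)
for slide in slides:
    slide["title"] = " ".join(slide["title"].split()).title()
json.dump(slides, sys.stdout)
'''


def run_skill_script(script: Path, slides: list[dict]) -> list[dict]:
    """Run the stored script in a subprocess and return its JSON output."""
    result = subprocess.run(
        [sys.executable, str(script)],
        input=json.dumps(slides),
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits with an error
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Written once, as part of the skill; reused on every later run.
        script = Path(tmp) / "format_slides.py"
        script.write_text(FORMAT_SCRIPT, encoding="utf-8")

        deck = [{"title": "  quarterly   results "}]
        print(run_skill_script(script, deck))
        print(run_skill_script(script, deck))  # same input, same output
```

Because the logic lives in a file, fixing a formatting rule means editing that file once instead of hoping the next session rewrites the script the same way.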
The Emerging Ecosystem
Within 5 weeks of launch, thousands of skills were created. Three categories emerged:
Foundational skills: General or domain-specific capabilities. Document manipulation, scientific research, bioinformatics.
Partner skills: Companies integrating their products. Browserbase for web automation, Notion for workspace understanding.
Enterprise skills: Organization-specific. Fortune 100 companies use them as organizational playbooks. Developer productivity teams standardize coding conventions.
And the most surprising: non-technical people are creating skills. Finance, legal, recruiting, accounting. A skill can be as simple as a well-written SKILL.md file.
What This Changes for Builders
1. Stop Rebuilding Agents
If you have 5 custom agents for 5 different domains, you have 5 codebases to maintain. With the skills approach, you have 1 generic agent + 5 skills: one codebase to maintain instead of five, plus five folders of files.
2. Package Your Expertise, Not Your Infra
The value isn’t in the agent loop (everyone uses the same one). The value is in the domain expertise you package as skills. That’s your moat.
3. Let Non-Technical People Contribute
Your domain experts know the processes better than any AI engineer. Give them a skill template and let them package their know-how. It’s far more efficient than having a dev translate their expertise secondhand.
4. Think Composability
A single skill solves a problem. Multiple composed skills solve a workflow. The agent orchestrates, skills execute. It’s the same logic as microservices, but for expertise.
5. Let the Agent Generate Its Own Skills
Anthropic’s long-term vision: Claude creates and refines its own skills over time. Day 1, it’s generic. Day 30, it knows your project, your conventions, your patterns. Knowledge accumulates, not in a growing prompt, but in skills that get refined.
My Take
This talk confirms what we experience daily with Claude Code. Skills (custom commands, specialized sub-agents, CLAUDE.md files) are what make the difference between a dev who “uses Claude” and a dev who multiplies their productivity by 10 with Claude.
The mistake most devs make: they think “agent” when they should think “skill.” They build complex systems when they should package simple know-how.
The talk is 16 minutes long. It’s the best time investment you’ll make this week if you work with AI agents.
Summary and analysis of the talk “Don’t Build Agents, Build Skills Instead” by Barry Zhang and Mahesh Murag (Anthropic), presented at the AI Engineer Code Summit.
Pierre Rondeau
Developer and indie builder. I build products and automations with AI. Creator of Claude Hub.