Agent Teams: When Sub-Agents Aren't Enough
I’ve written a lot about sub-agents on this blog. How to build them, when to split work into them, where the boundaries should be. Sub-agents changed how I use Claude Code—instead of one overwhelmed AI doing everything, I had a team of specialists.
But they couldn’t talk to each other.
That always bugged me. I’d have a code reviewer that found something, and I’d have to manually relay the findings to the implementation agent. Or I’d have two agents researching different parts of a problem, and there was no way for them to share what they’d found without me playing middleman.
Anthropic recently shipped Opus 4.6, and with it comes something called agent teams. It’s still experimental—research preview, they call it—but I’ve been poking at it, and I think this is a pretty big deal.
What’s Different from Sub-Agents
Sub-agents are isolated workers. You give them a task, they do it, they report back to you. That’s it. They can’t talk to each other, can’t coordinate, can’t share findings. Every piece of communication goes through the main agent.
Agent teams are different. You spawn a group of agents that can message each other directly. One agent can tell another “hey, I just changed the database schema, update your API endpoints.” They have a shared task list where work items can have dependencies—so agent B automatically unblocks when agent A finishes the migration it was waiting on.
There’s also a team lead (whichever agent creates the team), but it’s not a strict hierarchy. You can message individual teammates directly. They can message each other. It’s peer-to-peer, not hub-and-spoke.
The mental model shift is real. Sub-agents are like delegating tasks to contractors who each work alone. Agent teams are more like… an actual team.
Turning It On
This is still experimental, so it’s disabled by default. You’ve got two options:
Set an environment variable:
```shell
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
```
Or add it to your settings.json:
```json
{
  "experimental": {
    "agentTeams": true
  }
}
```
Once it’s on, you just describe what you want in natural language. Something like “create a team with one agent working on the API, one on the frontend, and one writing tests.” Claude figures out the rest.
How Teams Actually Work
Under the hood, there’s a tool called TeammateTool that handles all the coordination. You don’t need to know the internals to use it, but I found it helpful to understand what’s happening.
When a team spins up, it creates a config file and message directories under ~/.claude/teams/. Each teammate gets their own context window—completely independent. They load your CLAUDE.md automatically, but they don’t inherit the conversation history from the lead. If there’s context they need, it has to go in the prompt when they’re spawned.
The shared task list is the main coordination mechanism. Tasks can be pending, in progress, or completed, and they can have dependencies on other tasks. When a dependency finishes, the blocked task unblocks automatically. It’s simple but it works.
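To make the unblocking behavior concrete, here's a toy Python sketch of how a shared task list with dependencies behaves. This is my own illustration of the mechanic, not Claude Code's actual implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    deps: set = field(default_factory=set)
    status: str = "pending"  # pending -> in_progress -> completed

class TaskList:
    def __init__(self):
        self.tasks = {}

    def add(self, name, deps=()):
        self.tasks[name] = Task(name, set(deps))

    def ready(self):
        # A task is unblocked once every one of its dependencies is completed
        return [t.name for t in self.tasks.values()
                if t.status == "pending"
                and all(self.tasks[d].status == "completed" for d in t.deps)]

    def complete(self, name):
        self.tasks[name].status = "completed"

todo = TaskList()
todo.add("migration")
todo.add("api-update", deps=["migration"])

print(todo.ready())        # → ['migration']; api-update is still blocked
todo.complete("migration")
print(todo.ready())        # → ['api-update']; it unblocked automatically
```

No agent has to poll anyone or ask permission. Finishing a task is itself the signal.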
Communication happens through two channels: direct messages to a specific teammate, or broadcasts to everyone (which should be used sparingly—think breaking changes that affect the whole team).
Display Modes
There are two ways to see what your team is doing:
In-process mode runs all teammates inside your main terminal. You use Shift+Up/Down to select a teammate and type to message them directly. This works everywhere.
Split panes give each teammate its own terminal pane so you can watch everyone’s output simultaneously. This is nicer but requires tmux or iTerm2. Doesn’t work in VS Code’s integrated terminal, which is a bit annoying.
By default it picks automatically based on your environment. If you’re already in tmux, you get split panes. Otherwise it’s in-process.
Patterns I’ve Been Trying
The documentation describes a few orchestration patterns. I haven’t used all of them extensively yet, but here’s what stood out.
The Swarm
Leader creates a team and a list of tasks. Workers self-assign from the queue. This is great for work that’s embarrassingly parallel—think migrating a bunch of files, or running different types of analysis on a codebase. The workers are interchangeable, and the task queue handles coordination.
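The shape of the swarm is just a shared work queue with interchangeable consumers. A minimal Python analogy (again, my illustration of the pattern, not how agent teams are implemented):

```python
import queue
import threading

# Embarrassingly parallel work: each file migration is independent,
# so any worker can pick up any item.
tasks = queue.Queue()
for f in ["a.py", "b.py", "c.py", "d.py", "e.py", "f.py"]:
    tasks.put(f)

done = []
lock = threading.Lock()

def worker():
    # Workers self-assign: pull from the queue until it's empty
    while True:
        try:
            f = tasks.get_nowait()
        except queue.Empty:
            return
        with lock:
            done.append(f)  # stand-in for the actual migration work

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads: t.start()
for t in threads: t.join()

print(sorted(done))  # → ['a.py', 'b.py', 'c.py', 'd.py', 'e.py', 'f.py']
```

The key property: nobody assigns work to anybody. The queue is the coordinator.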
Pipeline
Agent A finishes something, which unblocks Agent B, which unblocks Agent C. Sequential processing with automatic handoffs. I can see this being useful for something like: generate database migration, then update the API layer, then update the frontend types.
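The handoff mechanics look like a chain of blocking waits. Here's a sketch of that three-stage example (my own analogy for the pattern, using threads standing in for agents):

```python
import threading

log = []
migration_done = threading.Event()
api_done = threading.Event()

def agent_a():
    log.append("A: generate database migration")
    migration_done.set()          # finishing A unblocks B

def agent_b():
    migration_done.wait()         # B is blocked until A finishes
    log.append("B: update API layer")
    api_done.set()                # finishing B unblocks C

def agent_c():
    api_done.wait()               # C is blocked until B finishes
    log.append("C: update frontend types")

# Start them in the "wrong" order on purpose; the dependencies
# still force the correct sequence.
threads = [threading.Thread(target=f) for f in (agent_c, agent_b, agent_a)]
for t in threads: t.start()
for t in threads: t.join()

print(log)  # → A, then B, then C, regardless of start order
```

Each stage only cares about its upstream dependency, not the overall plan.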
Debate
This one’s interesting. Multiple agents get the same task but approach it differently. The lead picks the best solution. I haven’t tried it yet, but the idea of getting competing architecture proposals from different agents is appealing.
Watchdog
One agent does the work, another monitors it. The watcher can trigger rollback if something goes wrong. For critical operations where you want a safety check built into the process.
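The essence of the pattern is a checkpoint plus an independent verifier. A hypothetical sketch (names and health check are mine, invented for illustration):

```python
import threading

# Worker makes a risky change; watchdog verifies it and rolls back on failure.
state = {"checkpoint": "v1", "current": "v1", "healthy": True}
work_done = threading.Event()

def worker():
    state["current"] = "v2"
    state["healthy"] = False   # simulate the change breaking something
    work_done.set()

def watchdog():
    work_done.wait()           # a real watcher would poll health checks
    if not state["healthy"]:
        state["current"] = state["checkpoint"]  # trigger the rollback

threads = [threading.Thread(target=f) for f in (watchdog, worker)]
for t in threads: t.start()
for t in threads: t.join()

print(state["current"])  # → v1, rolled back
```

The point is separation of concerns: the worker never has to judge its own output, which is exactly the failure mode you're guarding against.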
Where I Think This Shines
Based on what I’ve seen so far, a few use cases feel natural:
Code review from multiple angles. Spawn three reviewers—one focused on security, one on performance, one on test coverage. They each review and share findings. Way more thorough than a single pass.
Feature work across layers. One agent on the API, one on the database migration, one on the frontend. They coordinate schema changes and types without you relaying messages between them.
Debugging competing hypotheses. Two agents investigating different theories about why something’s broken. They can share what they’ve ruled out so nobody’s duplicating effort.
The Gotchas
I’d be lying if I said it was all smooth. Some things to know:
Token costs multiply fast. A five-agent team burns roughly five times the tokens of a single session. That's not surprising, but it adds up. Don't spin up a team for something a single agent can handle.
File conflicts are real. Two agents editing the same file will cause problems. You need to break work up so each agent owns different files. This requires thinking about boundaries upfront—which, if you’ve read my earlier posts, you know I have strong opinions about.
No session resumption. If your session dies, /resume doesn’t restore the teammates. The lead might try to message agents that don’t exist anymore. This is probably the most painful limitation right now.
Teammates don’t inherit your conversation. They get CLAUDE.md but not the chat history. Anything important needs to go in the spawn prompt. I’ve already been burned by this—spawning an agent that was missing context I’d discussed ten minutes earlier with the lead.
One team at a time. You can’t run multiple teams from the same session. You have to clean up the current team before starting a new one.
No nesting. Teammates can’t spawn their own teams. Sub-agents within teams might still work (haven’t tested this thoroughly), but you can’t go recursive with the team structure.
Agent Teams vs. Sub-Agents: When to Use Which
I don’t think agent teams replace sub-agents. They solve different problems.
Use sub-agents when:
- The work is focused and independent
- You just need results reported back
- Token cost matters (sub-agents are cheaper)
- The task is quick and well-defined
Use agent teams when:
- Agents need to share findings with each other
- Work spans multiple parts of the codebase that interact
- You want competing perspectives on the same problem
- Coordination between agents matters more than raw speed
My sub-agent setup for code review still makes sense for a quick review of a single PR. But if I’m doing a deep architectural review where security implications connect to performance implications connect to test coverage, that’s where a team of agents talking to each other would be more effective.
Things I’m Still Figuring Out
It’s still early days. I don’t have all the answers yet. Some open questions I’m thinking about:
How granular should team composition be? Is three agents the sweet spot, or does it depend entirely on the task? I suspect there’s a point of diminishing returns where adding more agents creates more coordination overhead than it saves.
How do you handle context compaction across a team? Opus 4.6 also introduced automatic context compaction (summarizes older context when you’re approaching the limit), and each teammate compacts independently. But if important shared context gets compacted differently by different agents, could they end up out of sync?
What’s the right way to structure CLAUDE.md when you know teams will be reading it? Should there be team-specific instructions?
I’ll probably write more about this as I get more experience with it. For now, I’m cautiously optimistic. The gap between “sub-agents that can’t talk to each other” and “agents that actually coordinate” is significant, and this feels like the right direction.
Getting Started
If you want to try it yourself, the barrier is low. Enable the experimental flag, start a Claude Code session, and describe the team you want. Start small—two agents working on different files in the same feature. See how the coordination feels.
Don’t start by spawning a huge team of agents. That’s the kind of mistake I would make, and I’m telling you not to.
One team. Two agents. A real task. See what happens.