What Is Centralized vs. Decentralized Agent Coordination?
The Architecture Behind AI Teamwork
When most people imagine AI getting smarter, they picture a single, monolithic intelligence becoming more powerful. But there’s another path: multiple AI agents working together as a team. It sounds promising in theory. In practice, AI teams often perform worse than a single agent working alone.
In December 2025, a landmark study tested 180 different configurations of AI agent teams to answer a deceptively simple question: when do multiple agents actually help, and when do they just get in each other’s way? The answer turns out to depend heavily on a choice most people never think about—how the agents are coordinated.
The Central Question That Nobody Was Asking
AI researchers have been building multi-agent systems for years, but they’ve largely been flying blind. Some teams boost performance by 80%. Others crater by 70%. The problem? No one had systematically studied why.
The research paper “Towards a Science of Scaling Agent Systems” changed that. Rather than building yet another clever multi-agent system and hoping it works, the researchers stepped back and asked a more fundamental question: what organizational principles determine whether agent teamwork succeeds or fails?
Two Ways to Organize an AI Team
Simply put, when you have multiple AI agents working together, someone needs to coordinate them. There are two fundamentally different approaches.
Centralized coordination works like a traditional corporate hierarchy. One agent acts as the “manager,” understanding the big picture, breaking down the work, assigning tasks to worker agents, and synthesizing their results. The worker agents don’t talk to each other—they only communicate with the central coordinator. Think of it like a hub-and-spoke airline network, where everything flows through a central hub.
Decentralized coordination works like a group of colleagues on equal footing. Every agent can communicate directly with every other agent. There’s no boss. Agents share information peer-to-peer, negotiate who does what, and figure things out collaboratively. It’s more like a mesh network where every node can talk to every other node.
Each approach has a radically different information flow. In centralized systems, all communication flows through the coordinator, which makes it a bottleneck. In decentralized systems, the communication burden grows quadratically: n agents have n(n-1)/2 possible pairwise connections, so five agents already have ten.
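The link counts behind this comparison come from simple counting. Here is a minimal sketch, assuming each pair of agents that can talk counts as one link and that the coordinator is itself one of the agents. The function names are illustrative and do not come from the paper.

```python
def centralized_links(n_agents: int) -> int:
    """Hub-and-spoke: each worker talks only to the coordinator."""
    # One of the n agents is the coordinator; the rest are workers.
    return max(n_agents - 1, 0)


def decentralized_links(n_agents: int) -> int:
    """Full mesh: every pair of agents can talk directly."""
    return n_agents * (n_agents - 1) // 2


for n in (3, 5, 10):
    print(n, centralized_links(n), decentralized_links(n))
```

With five agents this gives 4 links in the hub-and-spoke layout and 10 in the mesh, and the gap widens as the team grows.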
When Centralized Coordination Dominates
The research revealed a striking pattern: centralized coordination shines on tasks that can be broken into independent, parallel subtasks.
Financial reasoning is a perfect example. Imagine analyzing a company’s quarterly report to predict stock performance. You need to evaluate revenue trends, assess debt levels, analyze competitive positioning, and review management commentary. These analyses can happen simultaneously. A central coordinator can assign each worker agent a specific analysis, then synthesize their findings into a final recommendation.
On the Finance-Agent benchmark, centralized coordination improved performance by 80.8% compared to a single agent. That’s not a typo—teams were nearly twice as effective as individuals.
Why does this work so well? Because the task has natural parallelism with minimal interdependencies. One agent can analyze revenue while another reviews debt, and they don’t need to coordinate mid-task. The central orchestrator simply needs to collect their outputs and integrate them.
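The fan-out-and-synthesize pattern described here can be sketched in a few lines. This is an illustration under stated assumptions, not the setup used in the study: `run_worker` is a hypothetical stand-in for a call to an LLM agent, and the "synthesis" step is just a join.

```python
from concurrent.futures import ThreadPoolExecutor


def run_worker(subtask: str, report: str) -> str:
    # Hypothetical worker: a real system would call an LLM agent here.
    return f"{subtask}: analyzed {len(report)} characters"


def coordinate(report: str, subtasks: list[str]) -> str:
    """Fan independent subtasks out to workers, then merge their outputs."""
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        # Workers never talk to each other; results return only to the coordinator.
        findings = list(pool.map(lambda s: run_worker(s, report), subtasks))
    # Synthesis step: a plain join here; a real coordinator would reason over findings.
    return "\n".join(findings)


SUBTASKS = [
    "revenue trends",
    "debt levels",
    "competitive positioning",
    "management commentary",
]
summary = coordinate("Q3 revenue rose while debt fell.", SUBTASKS)
print(summary)
```

Because no worker depends on another worker’s output, the only synchronization point is the final merge, which is what keeps coordination overhead low on this kind of task.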
When Decentralized Coordination Wins
But not all tasks fit neatly into independent boxes. Some require constant adaptation based on emerging information.
Web navigation exemplifies this. When AI agents browse the web to research a topic, they don’t know ahead of time what they’ll find. One agent might discover that a company has been acquired, which completely changes what information other agents should be looking for. That real-time information sharing is crucial.
On web navigation tasks, decentralized coordination improved performance by 9.2%, while centralized coordination managed only 0.2%. The difference comes down to adaptability.
In a decentralized setup, if Agent A discovers something important while browsing, it can immediately tell Agent B, C, and D to adjust their search strategies. In a centralized system, Agent A would have to report back to the coordinator, who would then need to reassign tasks to the other agents—adding latency and reducing responsiveness.
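The extra latency can be made concrete by counting message hops. The sketch below models each architecture as a graph and uses breadth-first search to find the fewest hops between two worker agents. The agent names and the helper functions `star`, `mesh`, and `hops` are hypothetical, chosen only for illustration.

```python
from collections import deque


def star(workers, hub="coordinator"):
    """Centralized topology: every worker is linked only to the hub."""
    graph = {hub: set(workers)}
    for w in workers:
        graph[w] = {hub}
    return graph


def mesh(agents):
    """Decentralized topology: every agent is linked to every other agent."""
    return {a: set(agents) - {a} for a in agents}


def hops(graph, src, dst):
    """Breadth-first search for the fewest message hops from src to dst."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in graph[node] - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    raise ValueError("destination is unreachable")


agents = ["A", "B", "C", "D"]
print("centralized:", hops(star(agents), "A", "B"))
print("decentralized:", hops(mesh(agents), "A", "B"))
```

In the star topology Agent A’s discovery needs two hops to reach Agent B (via the coordinator), while in the mesh it needs one. In practice each hop through the coordinator also involves a model call, so the real latency gap is larger than the hop count alone suggests.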
The web is unpredictable. Decentralized teams handle unpredictability better because they can adapt locally without waiting for central approval.
The Dark Side: When Teams Make Everything Worse
Here’s the uncomfortable truth the research uncovered: on many tasks, multi-agent systems—regardless of coordination strategy—perform significantly worse than a single agent.
Sequential reasoning tasks are the clearest example. Imagine working through a complex logic puzzle where each step depends on the previous one. Adding more agents doesn’t help—it actively hurts. On these tasks, even the best multi-agent approaches degraded performance by 39-70%.
Why? Three reasons.
First, there’s coordination overhead. Agents spend computational resources explaining plans to each other instead of actually solving the problem. When a task doesn’t benefit from parallelism, this overhead is pure waste.
Second, capability saturation sets in. If a single strong agent can already solve a task correctly 45% of the time or more, adding more agents rarely helps. You’re essentially adding mediocre contributors to a team that already has an expert.
Third, and most concerning, is error amplification. When agents work without anyone checking their outputs, errors multiply: if Agent A makes a mistake and Agent B builds on it, the final answer can be catastrophically wrong. The research found that independent agents amplify errors by a factor of 17.2. Centralized coordination reduces this to 4.4 by having one agent review all outputs, but the problem doesn’t disappear; it just becomes more manageable.
The Tool-Coordination Trade-off
One of the most surprising findings was what the researchers called the “tool-coordination trade-off.”
AI agents don’t just think—they use tools. They can run calculations, query databases, call APIs, or search the web. These tool calls consume computational budget. So does inter-agent communication.
On tool-heavy tasks, the budget you spend on coordination is budget you can’t spend on tool usage. And for many tasks, running one more database query is far more valuable than having agents explain their plans to each other.
This creates a zero-sum trade-off. Financial analysis tasks benefited from coordination because agents didn’t need many tool calls—they mostly reasoned about data that was already provided. Web navigation tasks needed both tools and coordination, making the choice more nuanced. Sequential reasoning tasks needed neither parallelism nor tool calls, making any coordination pure overhead.
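A small worked example shows why the trade-off is zero-sum under a fixed budget. All numbers below (the budget, the cost per coordination message, the cost per tool call) are made-up assumptions for illustration, not figures from the study.

```python
def tool_calls_left(budget: int, coordination_msgs: int,
                    msg_cost: int, tool_cost: int) -> int:
    """Tool calls still affordable after paying for coordination from a fixed budget."""
    remaining = budget - coordination_msgs * msg_cost
    return max(remaining, 0) // tool_cost


# Hypothetical costs in tokens: 500 per coordination message, 400 per tool call.
solo = tool_calls_left(budget=10_000, coordination_msgs=0, msg_cost=500, tool_cost=400)
team = tool_calls_left(budget=10_000, coordination_msgs=8, msg_cost=500, tool_cost=400)
print(solo, team)  # 25 15
```

Under these assumed costs, eight coordination messages cost the team ten tool calls. Whether that is a good trade depends on whether the coordination is worth more than the ten extra queries.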
A Common Misconception About Agent Architecture
Many people assume that “more agents” automatically means “more intelligence,” similar to how more GPUs mean faster training. This is wrong.
Intelligence doesn’t scale like compute. Adding a second agent doesn’t give you “twice the intelligence” any more than adding a second human to a team doubles productivity.
What matters is task structure. If your task has parallelizable subtasks with minimal interdependencies, more agents can help. If your task requires sequential reasoning or if a single capable agent can already solve it well, additional agents are likely to reduce performance, not improve it.
The architecture—centralized versus decentralized—determines how well agents can leverage parallelism while minimizing coordination costs. But if the task fundamentally doesn’t benefit from parallelism, no coordination architecture will save you.
What This Means in Practice
The research derived a predictive model that can recommend the optimal coordination strategy for 87% of unseen configurations, based on a few key task properties.
Here’s a simplified decision framework for practitioners:
Use centralized coordination when:
The task can be divided into independent subtasks
Subtasks don’t generate information that changes other subtasks mid-execution
You need strong error control (one agent reviews all outputs)
Example: financial analysis, report generation, multi-source data collection
Use decentralized coordination when:
The task requires real-time adaptation based on emerging information
Agents need to negotiate or share discoveries dynamically
The environment is unpredictable or changes during execution
Example: web research, dynamic planning, exploration tasks
Use a single agent when:
The task requires sequential reasoning where each step depends on the previous
A capable single agent already succeeds about 45% of the time or more
The task is tool-heavy and coordination would consume valuable budget
Example: complex math problems, step-by-step logic puzzles, most code generation
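The framework above can be written down as a small rule-based function. This is a simplified sketch of the checklist in this article, not the predictive model from the paper. The `Task` fields, the order of the checks, and the fallback to a single agent are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    sequential: bool              # each step depends on the previous one
    single_agent_success: float   # baseline success rate of one agent, 0.0 to 1.0
    tool_heavy: bool              # coordination would eat into the tool budget
    needs_live_adaptation: bool   # discoveries change what other agents should do
    decomposable: bool            # splits into independent subtasks


def recommend(task: Task) -> str:
    """Return a coordination strategy using the article's rules of thumb."""
    # Assumed priority: rule out multi-agent setups first.
    if task.sequential or task.single_agent_success >= 0.45 or task.tool_heavy:
        return "single agent"
    if task.needs_live_adaptation:
        return "decentralized"
    if task.decomposable:
        return "centralized"
    # Assumed default when no rule applies: avoid coordination overhead.
    return "single agent"


finance = Task(False, 0.20, False, False, True)
web_research = Task(False, 0.20, False, True, False)
logic_puzzle = Task(True, 0.30, False, False, False)
for name, task in [("finance", finance), ("web", web_research), ("puzzle", logic_puzzle)]:
    print(name, "->", recommend(task))
```

Checking the single-agent conditions first reflects the article’s point that no coordination architecture helps when the task does not benefit from parallelism. A different ordering would be a reasonable design choice for tasks that mix several of these properties.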
The Bigger Picture
For years, the AI community has treated agent coordination as an implementation detail—something you figure out after deciding to use multiple agents. This research inverts that relationship.
Coordination architecture isn’t a detail. It’s a fundamental choice that determines whether your system will work at all.
The right coordination strategy can nearly double performance. The wrong one can reduce it by 70%. And sometimes, the right answer is to not use multiple agents in the first place.
As AI systems become more agentic and handle more complex workflows, this matters more. A customer service AI might route different aspects of a problem to specialized agents (centralized). A research AI might have agents exploring different information sources and sharing discoveries (decentralized). A coding AI might use a single agent for sequential logic but spawn multiple agents to run tests in parallel (hybrid).
The key insight is that these aren’t arbitrary choices—they’re engineering decisions that should be grounded in task properties, not developer preference or fashionable architecture trends.
What Comes Next
This research provides the first quantitative framework for predicting when multi-agent systems will outperform single agents. But it’s just the beginning.
Current limitations remain. The study evaluated four benchmarks across three LLM families. Real-world tasks are more diverse. The framework also assumes fixed computational budgets, but in production, you might be willing to spend more if it improves quality.
There’s also the question of hybrid approaches. What if you use centralized coordination for some subtasks and decentralized for others? The research tested a hybrid architecture, but there’s a vast space of hybrid designs to explore.
Most importantly, the research focused on task completion accuracy. But production systems care about latency, cost, and reliability too. An architecture that’s 5% more accurate but three times slower might be the wrong choice in practice.
The path forward is clear: treat coordination architecture as a first-class design decision, measurable and predictable, rather than an afterthought. As agents move from research demos to production systems handling real work, these architectural choices will determine what works and what fails.
References and Further Reading
Towards a Science of Scaling Agent Systems (arXiv:2512.08296) - The primary research paper
Google Research Blog: When and Why Agent Systems Work - Accessible overview from the research team
Paper Discussion on Hugging Face - Community discussion and additional context
Finance-Agent, BrowseComp-Plus, PlanCraft, and Workbench benchmarks - Evaluation datasets used in the study


