Let's be honest.

One GitHub MCP server can expose ninety-plus tools. That's more than fifty thousand tokens of JSON schemas and descriptions loaded into the model's working memory before it even starts thinking about your prompt.

So when Anthropic introduced Skills, I didn't see a new feature.

I saw a course correction.

A quiet one.

But a course correction all the same.

The Real Problem with MCP Isn't the Tools

Under the hood, MCP is pretty simple conceptually.

You run one or more MCP servers.

Each server exposes a collection of tools, each tool has a schema, and the client loads those definitions into the model's context so it can decide which tools to call and how to call them. In theory, this gives the model a clean, typed interface to the outside world.

In practice, that "clean interface" turns into a firehose.

Every tool schema gets stuffed into the context window up front. It doesn't matter if you're asking the model to read a file, query a database, or just summarize a paragraph. The model still has to wade through the full inventory of tools, arguments, and descriptions to figure out what might be relevant.

Now layer on the fact that tool-call errors compound. If a single tool decision is 90% accurate, chaining five of them together takes you to 0.90⁵ ≈ 0.59, a little under 60%. Countless Reddit threads say it plainly: tool call accuracy effectively drops off exponentially as you stack calls.

The result is a nasty combination:

  • Huge chunks of the context window burned on schemas that might never be used.
  • Conversation history plus global tool definitions plus task-specific context all fighting for space.
  • Multi-step workflows that start strong and then drift into failure as the model loses track of earlier constraints.

I've watched this happen in real systems. The task is simple. The tools are correct. The failure comes from cognitive overload. The model is being asked to reason, plan, and choose tools while holding an entire tool universe in its head.

MCP, as most people implement it today, doesn't just expose tools.

It exposes too much.

Anthropic's Answer Wasn't to Patch MCP

They went after the pattern that was actually broken: static, upfront exposure.

Skills flip the flow.

A Skill is a folder. Inside that folder you get a SKILL.md file with YAML frontmatter (name, description, metadata), followed by detailed instructions, references, and optionally links to other files in the same directory.
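
Here's a minimal sketch of what that can look like on disk. The name and description fields mirror the documented frontmatter; the specific skill, file names, and instructions below are hypothetical:

```markdown
---
name: pdf-form-filling
description: Fill out PDF forms from structured data. Use when the user
  provides a PDF form and the values to populate it with.
---

# PDF Form Filling

1. Read `forms.md` for field-mapping conventions before touching the PDF.
2. Run `scripts/fill_form.py` with the input PDF and a JSON payload of values.
3. Never guess field names; extract them from the PDF first.
```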

At startup, the agent doesn't read every skill in full. It loads only the minimal metadata for each skill and tucks that into the system prompt. That gives Claude just enough information to know when a skill might be relevant, without paying the cost of loading everything.

When the user asks for something, Claude goes through a progressive disclosure process:

  1. Look at the installed skills' names and descriptions.
  2. If a skill seems relevant, use a filesystem tool to open SKILL.md.
  3. If that file references additional documents (like forms.md or reference.md), read only those, and only if needed.
  4. If the skill includes scripts, run them via the code execution environment instead of trying to "simulate" them through token generation.

The key is that context is layered instead of dumped.
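
Here's a minimal sketch of that startup pass in Python, assuming skills live under a local skills/ directory; the helper names are hypothetical. The point is that only the YAML frontmatter gets parsed up front, while the body of each SKILL.md stays on disk until a skill is actually selected:

```python
from pathlib import Path

import yaml  # pip install pyyaml


def read_frontmatter(skill_md: Path) -> dict:
    """Parse only the YAML frontmatter block at the top of SKILL.md."""
    text = skill_md.read_text(encoding="utf-8")
    if not text.startswith("---"):
        return {}
    # Frontmatter sits between the first two '---' delimiters.
    _, frontmatter, _body = text.split("---", 2)
    return yaml.safe_load(frontmatter) or {}


def index_skills(skills_root: Path) -> list[dict]:
    """Build the lightweight index that goes into the system prompt:
    name and description per skill -- a few dozen tokens each --
    instead of every SKILL.md body and tool schema."""
    index = []
    for skill_md in sorted(skills_root.glob("*/SKILL.md")):
        meta = read_frontmatter(skill_md)
        index.append({
            "name": meta.get("name", skill_md.parent.name),
            "description": meta.get("description", ""),
            "path": str(skill_md),  # read in full only if selected later
        })
    return index


if __name__ == "__main__":
    for skill in index_skills(Path("skills")):
        print(f"- {skill['name']}: {skill['description']}")
```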

A skill is a directory containing a SKILL.md file plus organized folders of instructions, scripts, and resources that give agents additional capabilities.

You can bundle a lot of knowledge into a skill. Multiple markdown files. Examples. Metric definitions. Even full Python scripts that act like deterministic microservices inside the agent's environment. But none of that hits the context window until Claude decides it's relevant for the current task.

That is the opposite of how most MCP setups behave.

And Skills aren't just a Claude UI trick. They're supported across Claude apps, Claude Code, the API, and the Agent SDK, all wired into the same code execution tool that gives agents a secure environment to run scripts.

So instead of saying "MCP is broken," Anthropic did something much more interesting: they changed how the model meets MCP.

Skills sit in front. MCP sits behind.

Skills = Retrieval. MCP = Tools. Together = RAG-MCP.

If you've ever built retrieval-augmented generation (RAG), the pattern should feel familiar.

In RAG, you don't shove the entire knowledge base into the context window. You store it separately, use an index to retrieve only what's relevant, and then let the model read that slice while it works.

Skills do the same thing for tools and procedural knowledge.

  • The "index" is the skill metadata: name, description, tags.
  • The "documents" are the SKILL.md body plus any linked files.
  • The "answering step" is the combination of instructions, code, and (when necessary) MCP calls wrapped in that skill.

Instead of dumping every MCP tool schema into the context, you bind them to specific workflows:

  • "PDF form filling" skill that knows which MCP server and script to use.
  • "Marketing analytics insight" skill that uses Python to crunch CSVs and only calls tools when needed.
  • "Tweet to newsletter" skill built from examples of your own writing style plus helper code.

The model doesn't need to know the entire tool universe anymore.

It only needs to know which skill is relevant, and the skill knows how to orchestrate the rest.

That's RAG-MCP in practice, sketched in code after these steps:

  1. Retrieve the right skill.
  2. Load only the instructions and references that matter.
  3. Execute code for deterministic steps.
  4. Call MCP tools as a last-mile integration when the workflow requires it.
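
In the same style as the earlier sketch (and reusing its index), with a naive keyword overlap standing in for whatever relevance judgment the model actually makes; pick_skill and the step comments are illustrative, not Anthropic's implementation:

```python
from pathlib import Path


def pick_skill(index: list[dict], task: str) -> dict | None:
    """Step 1: retrieve. Keyword overlap is a crude stand-in for the
    model's own judgment (or an embedding index) over skill metadata."""
    words = set(task.lower().split())

    def score(skill: dict) -> int:
        return len(words & set(skill["description"].lower().split()))

    best = max(index, key=score, default=None)
    return best if best is not None and score(best) > 0 else None


def run_task(index: list[dict], task: str) -> None:
    skill = pick_skill(index, task)
    if skill is None:
        print("No relevant skill; answer from general knowledge.")
        return
    # Step 2: load only the instructions that matter, now that they're relevant.
    body = Path(skill["path"]).read_text(encoding="utf-8")
    # Step 3: deterministic work runs as code, not token generation,
    #   e.g. subprocess.run(["python", "scripts/fill_form.py", ...]).
    # Step 4: a last-mile MCP call happens only if this workflow needs it,
    #   scoped to the one server the skill names.
    print(f"Loaded '{skill['name']}' ({len(body)} chars) for: {task}")
```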

Skills don't "compete" with MCP.

They domesticate it.

Why This Matters for People Actually Building Agents

If you're just wiring up toy projects, you can brute-force a lot of this. Context windows are large, demos are short, tools are few. MCP feels fine.

As soon as you step into:

  • multi-tenant systems,
  • long-lived conversations,
  • high-stakes workflows (compliance, finance, safety),
  • or multi-agent networks where tools call tools,

the cracks show.

You start to see:

  • weird tool selection that made sense to the model but not to you,
  • hallucinated parameters that technically match the schema but violate the business logic,
  • workflows that work on day one and drift on day seven because the conversation history is now competing with tool definitions for context space.

You can patch around it. I've done that too.

You can trim schemas, split servers, rewrite descriptions to be shorter.

You can spin up supervising agents that try to guide the main agent back onto the rails.

But you're still fighting the underlying pattern: the model is overloaded with tools before it ever touches the task.

Skills change the shape of the problem.

You shift from "the model must understand every tool in the system" to "the model must pick the right skill for the job, and the skill will bring only what it needs."

That's a much saner mental model.

It matches how we onboard humans, too.

We don't hand a new teammate the full wiki on their first day and expect them to memorize it. We give them a starting guide, then point them to specific documents and scripts as they hit real tasks.

Skills formalize that pattern for agents.

A Better Path Forward

There's something I respect about the way Anthropic handled this.

They didn't publish a "State of MCP" essay.

They didn't declare the protocol dead.

They didn't try to pretend context bloat was just user error.

They quietly shipped a mechanism that accepts the reality of large tool ecosystems and gives agents a way to interact with them without collapsing under the weight.

Skills are composable.

Portable across products.

Efficient because they only load what they need.

And powerful because they can bring code execution into the mix whenever reliability matters more than pure generation.

Most importantly, they acknowledge the thing many of us have felt building in the wild:

The future of agents is not a bigger pile of tools.

It is a better way of relating to tools.

One where retrieval and orchestration come first.

Where context is treated as a scarce resource, not an infinite dumping ground.

Where MCP is still there, but it's no longer sitting on the model's chest.

Anthropic didn't abandon MCP.

They made it survivable.

And for those of us trying to build agents that do real work, that quiet shift matters more than any protocol announcement ever could.

If you enjoyed this breakdown or learned something new, tap the 👏 button, leave a comment, or share it with someone building in this space.

You can also follow me on Medium to catch future deep dives on AI systems, open-source infrastructure, and research software engineering.