Rekt - The Stack Nobody Checked

The attack began before anyone noticed the position was open. No exploit announcement. No white hat disclosure. No price impact to trigger an alert.

The protocol had been live for months, widely adopted, integrated into hundreds of production environments. The team trusted it. Why wouldn't they? Everyone was using it.

The vulnerability wasn't in the smart contract. It wasn't in the bridge. It was in the layer sitting between the user and everything they had access to.

Read permissions. Write permissions. The inbox. The drive. The internal docs. The deal flow. All of it reachable through a single connection nobody thought to audit.

The attacker didn't need to break in. The door was already open. They just walked through it, read what they needed, and left.

By the time anyone looked, the session logs were clean. The agent had behaved exactly as designed. Anthropic called it expected behavior.

When was the last time your team audited the infrastructure sitting between your AI and everything you've ever trusted it with?

Credit: Anthropic, Pivot Point Security, The Register, Ox Security, Invariantlabs, AuthZed, ITPro, Gambit Security, eSecurityPlanet, Code Wall, ByteBridge, Rafter

Anthropic released the Model Context Protocol in November 2024.

The pitch in their own words: "A universal, open standard for connecting AI systems with data sources, replacing fragmented integrations with a single protocol."

Google Drive, Slack, GitHub, Postgres - had pre-built servers shipped on day one.

Within months of open-sourcing it, Amazon, Microsoft, Google, and OpenAI had all adopted the protocol - making it the de facto infrastructure standard for AI agent connectivity across the industry.

By April 2026, over 10,000 published MCP servers existed across the ecosystem.

The results were not encouraging. A large-scale analysis of 5,200 open-source MCP server implementations found that 53% relied on hard-coded credentials, static API keys and personal access tokens baked directly into the server.

Only 8.5% used OAuth or any modern authentication method. Meanwhile, 88% of those servers required credentials to function at all.

This was not a theoretical finding about prototype code. It described production environments, actively running, connected to real organizational data.

Executives across industries received maturity matrices and productivity playbooks in early 2026 recommending Gmail MCP, Slack MCP, Drive MCP, and shared AI project environments as the path to operational transformation.

Connect your legal department. Connect your deal flow. Connect your communications. The pitch was productivity.

What it didn't include was a disclosure that the protocol underneath had already been compromised - repeatedly, across a full year - before the first recommendation was ever printed.

If the infrastructure is load-bearing before anyone checks the foundations, what exactly collapses when the inspection finally happens?

The Flaw They Wouldn’t Fix

OX Security ran more than 30 responsible disclosure processes across the vendor ecosystem, beginning in November 2025.

Their core finding was precise and damning: MCP executes local commands before verifying whether those commands are legitimate.

A rogue prompt delivered through a UI, a poisoned marketplace listing, or a compromised IDE, anywhere the agent reads input it hasn't verified, can execute arbitrary commands on the host system without the user doing a single thing wrong.

No malicious download. No phishing click. The agent reads the content, processes the instruction embedded inside it, and acts. That is the design.

When OX brought this to Anthropic, the response was not a patch timeline or an acknowledgment of risk.

Anthropic classified the behavior as expected.

A week later, according to The Register, a quiet policy update appeared.

OX's assessment of that update was direct: It didn't fix anything.

Anthropic did not respond to The Register's press inquiry on the story.

The estimated exposure at time of publication, up to 200,000 servers running against the unfixed architectural flaw.

This is not a zero-day discovered in the wild and raced to remediation. This is a flaw that was disclosed, reviewed, classified as intentional, and left in place.

Every MCP connector wired into an organization's email, legal files, or internal communications has been operating on infrastructure its creator was told was broken and declined to repair.

When a researcher tells you the door doesn't lock and the manufacturer calls that a feature, who bears the liability for what walks through it?

The Rug Pull

The crypto community already has a name for this attack.

An MCP rug pull works exactly the way the on-chain version does: A tool behaves legitimately at first, passes the checks, earns the approval, appears in logs as routine activity.

Then, silently, it becomes something else.

The user consented to version A.

They are running version B.

A malicious server can change the tool description after the client has already approved it, and the audit trail looks clean because the agent did exactly what it was instructed to do, by the version of the tool the user never approved .

At installation time, the user sees the benign tool description and is not alerted to any malicious behavior.

The attack remains largely invisible to the user, the agent did exactly what it was instructed to do, by the version of the tool they never approved.

A server presenting itself as a harmless "random fact of the day" tool gained user approval and sat dormant.

On its second load, it activated a sleeper attack, switching to a malicious tool definition the user had never seen or approved.

The rewritten instructions hijacked the agent's behavior with respect to a trusted WhatsApp MCP server running in the same system, exfiltrating the user's entire WhatsApp chat history through an attacker-controlled proxy number.

The attack confirmation dialog appeared non-problematic, the exfiltration payload was hidden behind a scrollbar in the message body.

The agent had behaved exactly as designed, by the attacker.

Every approved MCP tool is a standing authorization that can be rewritten after trust is established.

Every connector wired into email, storage, or internal communications is a version A that can silently become version B. The approval happened once. The tool runs indefinitely.

If consent can be granted once and revoked never, what does approval actually mean?

The Breach Timeline

This is not theoretical. More than a dozen documented incidents mapped directly to the connectors being promoted in AI-native transformation decks between April 2025 and April 2026.

Each one followed the same logic: An agent with privileged access, an input nobody sanitized, and in most cases no detection until after the damage was done.

In May 2025, a malicious public GitHub issue hijacked an AI assistant and instructed it to pull from private repositories, then deposit the contents - private code, internal project details, salary data - into a public pull request visible to anyone.

The root cause was a single over-privileged Personal Access Token wired into the MCP server.

One token. One poisoned issue. Everything behind it, exposed.

Four months later, a package masquerading as a legitimate Postmark MCP server began BCC'ing every email it processed to an attacker-controlled address.

Emails, internal memos, invoices - essentially all mail traffic processed by that MCP server were exposed.

Standard supply chain compromise. MCP servers run with high-privilege access by design. The attacker used all of it.

By February 2026, attackers cloned the legitimate Oura MCP project - a tool connecting AI assistants to Oura Ring health data - and distributed a trojanized version through public MCP registries.

To manufacture credibility, they built a deceptive GitHub ecosystem complete with fake contributors and multiple forks. The malicious package delivered StealC, an information-stealing malware designed to harvest developer credentials, browser passwords, API keys, and cryptocurrency wallets.

The Oura MCP server itself was never compromised. The attack didn't need it to be. The registry was enough.

Those three represent the shape of it.

The remaining incidents touched Anthropic's own developer tooling, cross-tenant enterprise project data, a supply chain package with 437,000 downloads, filesystem sandboxes, a design tool integration, and an MCP hosting platform with downstream control over 3,000 applications.

The full timeline is documented at AuthZed. A year of incidents, each one building on the access model the previous one established.

When the breach log runs to more than a dozen entries across twelve months and the architecture hasn't changed, at what point does the pattern become the policy?

The Attacks That Used The Weapon

Everything in the previous sections describes attacks on AI infrastructure.

What follows is AI being used as the attack infrastructure itself.

One attacker, nine government agencies in Mexico, Claude Code, and the GPT-4.1 API.

The attacker reportedly stole massive volumes of taxpayer, civil registry, health, electoral, procurement, and infrastructure data while also building tools for live querying and document forgery.

The attacker used AI to accelerate both hands-on intrusion work and broad internal intelligence collection, allowing one operator to function with the output of a larger team.

Forensic analysis recovered 1,088 individually logged prompts across 34 live sessions, generating 5,317 commands on live government infrastructure.

Claude Code was responsible for approximately 75% of remote command execution.

The attacker used false framing, claiming the activity was authorized security work, and introduced a penetration testing cheat sheet that shaped subsequent sessions.

By the end of the operation, they had built a live API into compromised tax infrastructure and a working system for generating forged official tax certificates using real government data.

195 million taxpayer records.

220 million civil registry records.

15.5 million vehicle registry records.

Gambit Security, which published the findings, described the AI-assisted methods as representing a significant evolution in offensive capability.

Then came McKinsey.

Security startup CodeWall pointed an autonomous offensive AI agent at the open internet and let it choose its own target.

It selected McKinsey's internal AI platform, Lilli - used by more than 43,000 consultants for strategy work, competitive analysis, and client research.

Two hours later, the agent had full read-and-write access to the production database.

The damage surface: 46.5 million chat messages, 728,000 files, 57,000 user accounts, and 95 system prompts governing how the AI responded to every consultant in the firm.

Those system prompts were writable.

An attacker could have silently reprogrammed what 43,000 consultants were told - about clients, mergers, acquisitions, and strategy - without deploying a single line of code, without a deployment review, without triggering a standard security alert.

The exploit was SQL injection. On a platform processing 500,000 prompts per month for two years.

SQL Injection has been documented since 1998.

McKinsey patched within 24 hours of disclosure. That is not the story. The story is that nobody found it for two years.

When AI can be turned into an attack instrument faster than organizations can audit what they've already connected it to, what does the next two years look like?

Who Can Even See This?

Here is the part nobody wants to read.

Only 24.4% of organizations have full visibility into which AI agents are communicating with each other.

More than half of all deployed agents are running with zero oversight or logging.

The average organization now manages 37 deployed agents.

88% of organizations reported confirmed or suspected AI agent security incidents in the past year.

82% of executives report confidence that existing policies protect against unauthorized agent actions.

Read that last number again. Four out of five executives are confident. More than half of their agents are invisible.

There is no unified audit logging mechanism in the MCP ecosystem, no standard way to capture the full chain of an agent action across tools and servers.

Each MCP server logs only its own perspective, when an agent interacts with multiple servers, the events are scattered with no shared identifiers to link them.

When something goes wrong, whether through exploit or misconfiguration, reconstructing a full picture of an agent's activity means manually stitching together logs from each component, a tedious and error-prone process.

For any organization running agents against legal files, deal flow, or client correspondence, the implication is direct: if the agent is compromised, the compromise may be undetectable.

Not difficult to detect, undetectable.

The incidents documented in the previous sections are not anomalies. They are the predictable output of an environment where most of the attack surface is invisible to the people responsible for defending it.

The maturity matrix doesn't measure what you can't see. It just tells you to connect more.

When the dashboard shows everything is fine because the dashboard can only see half the picture, what's happening in the other half?

The stack nobody checked is already running in your organization.

Someone connected the Gmail MCP. Someone wired in the Slack integration. Someone shared the Drive folder through a Claude Project because the productivity playbook said it was the future of tech.

They weren't being reckless, they were following instructions from people they trusted.

That's the attack surface. Not the exploit, the trust.

Executives were handed maturity matrices recommending infrastructure that had been actively breached for a year before the ink dried.

The protocol underneath carries an architectural flaw that its creator classified as expected behavior.

More than half of the agents now running have no audit log. The session records are clean. The agents are behaving exactly as designed.

Nobody had to break in.

If you were designing the perfect attack surface, you would make it look like the responsible choice.

You would give it a framework. You would call it transformation. You would make the organizations that didn't adopt it feel like they were falling behind.

And you would make sure that by the time anyone thought to audit what they'd connected, the answer was already: Everything that matters.

Nobody built this by accident. The accident was trusting it without checking.

The clock on the next one started the moment someone clicked connect.

Who on your team connected something last week that nobody else has checked?

REKT представляет собой общественную площадку для анонимных авторов. Мы не несём ответственность за выражаемые точки зрения или контент на этом веб-сайте.

Пожертвование (ETH / ERC20): 0x3C5c2F4bCeC51a36494682f91Dbc6cA7c63B514C

Дисклеймер:

REKT не несет никакой ответственности за любое содержание, размещенное на нашем Веб-сайте или имеющее какое-либо отношение к оказываемым нами Услугам, независимо от того, было ли оно опубликовано или создано Анонимным Автором нашего Веб-сайта или REKT. Не смотря на то, что мы устанавливаем правила поведения и нормы публикаций для Анонимных Авторов, мы не контролируем и не несем ответственность за содержание публикаций Анонимных Авторов, а также за то, чем делятся и что передают Авторы с помощью нашего Сайта и наших Сервисов, и не несем ответственность за любое оскорбительное, неуместное, непристойное, незаконное или спорное содержание, с которым вы можете столкнуться на нашем Веб-сайте и на наших Сервисах. REKT не несет ответственность за поведение, будь то онлайн или офлайн, любого пользователя нашего Веб-сайта или наших Сервисов.

The Flaw They Wouldn’t Fix

The Rug Pull

The Breach Timeline

The Attacks That Used The Weapon

Who Can Even See This?

ПОДПИСАТЬСЯ СЕЙЧАС