Anthropic, Google, and Microsoft Offered AI Agent Bug Bounties but Remained Silent About the Flaws

Anthropic, Google, and Microsoft Offered AI Agent Bug Bounties but Remained Silent About the Flaws

4 Min Read

Summary: Security researcher Aonan Guan took control of AI agents from Anthropic, Google, and Microsoft through prompt injection attacks on their GitHub Actions integrations, stealing API keys and tokens. The companies discreetly paid bug bounties: $100 from Anthropic, $500 from GitHub, and an undisclosed amount from Google, but they did not publish advisories or assign CVEs, leaving older version users unaware of the risk.

Security researchers have shown that AI agents from Anthropic, Google, and Microsoft can be hijacked through prompt injection attacks, allowing the theft of API keys, GitHub tokens, and other secrets. The companies paid bug bounties quietly and did not publish advisories or assign CVEs.

Vulnerabilities disclosed by researcher Aonan Guan over several months affect AI tools integrated with GitHub Actions: Anthropic’s Claude Code Security Review, Google’s Gemini CLI Action, and GitHub’s Copilot Agent. Each tool processes GitHub data, including pull request titles, issue bodies, and comments, but fails to distinguish between legitimate content and injected instructions.

Attack Mechanism

The key technique is indirect prompt injection. Rather than attacking the AI model directly, the researcher inserted malicious instructions into trusted areas like PR titles, issue descriptions, and comments. When the agent processed this content, it executed the commands as if they were legitimate.

In targeting Anthropic’s Claude Code Security Review, which scans for vulnerabilities, Guan embedded a prompt injection payload within a PR title, leading Claude to execute the embedded commands and leak credentials in its JSON response, posted as a PR comment, potentially exposing the Anthropic API key, GitHub access tokens, and other secrets.

For the Gemini attack, Guan added a fake “trusted content section” in a GitHub issue, bypassing Gemini’s safety instructions and tricking it into posting its API key as a comment. Google’s Gemini CLI Action treated the injected text as authoritative.

The Copilot attack involved hiding malicious instructions inside an HTML comment in a GitHub issue, invisible to humans but visible to the AI agent. Upon issue assignment, Copilot Agent followed these hidden instructions.

Quiet Resolutions

Subsequent actions were revealing. Anthropic received Guan’s submission on HackerOne in October 2025, confirmed that more sensitive data like GitHub tokens could be compromised, and paid a $100 bounty. GitHub initially dismissed the Copilot finding but later paid a $500 bounty. Google paid an undisclosed amount for the Gemini vulnerability. None of the companies assigned CVEs or published advisories for users with vulnerable versions.

Guan emphasizes the issue: users on older versions are unaware of the exposure. Without a CVE, vulnerability scanners won’t flag the issue, and without an advisory, security teams have no artefact to track.

Structural Weakness, Not a Mere Bug

These attacks exploit a fundamental flaw in how AI agents process context. Large language models struggle to differentiate data from instructions. A crafted prompt injection can make input function as a command. Any data source for an AI agent is an attack vector.

This is a practical concern. In January 2026, Miggo Security researchers showed Google Gemini could be weaponized through hidden instructions in calendar invitations, and a “Reprompt” attack against Microsoft Copilot demonstrated that prompts could hijack user sessions. Anthropic’s Git MCP server had vulnerabilities allowing repository backdoors. A systematic analysis of 78 studies published in January showed coding agents, including Claude Code and GitHub Copilot, were highly vulnerable to prompt injection attacks.

The supply chain aspect worsens the problem. A security audit of about 4,000 agent skills in the ClawHub marketplace found over a third with security flaws, and 13.4% had critical issues. When AI agents use third-party tools and data sources, a single compromised component can impact the entire development pipeline.

The Disclosure Issue

The reluctance to publish advisories reflects a gap: no established framework for disclosing AI agent vulnerabilities. Traditional software

You might also like