Meta Halts AI Data Work Following Breach That Jeopardizes Training Secrets

Summary: Meta has halted its partnership with Mercor, a $10 billion AI data firm, after a supply chain attack exposed highly confidential AI training methods. The breach, carried out through a tainted release of the LiteLLM open-source library, has prompted investigations at OpenAI and other AI labs, along with a class action lawsuit covering more than 40,000 individuals.

Last month, when hackers sabotaged a commonly used open-source library, they didn’t just swipe personal data. Wired reports that they may have obtained blueprints for constructing some of the world’s most formidable AI models.

Meta ceased its collaboration with Mercor, a San Francisco-based AI data company producing custom training datasets for top AI firms, after a cyberattack exposed critical details about how the company and potentially some of its other clients train their models. The suspension is indefinite, causing anxiety within an industry that has invested billions in keeping proprietary methods secret.

The Overlooked Force

Mercor might not be widely known, but it plays a crucial role in the AI sector. Founded in 2023 by Brendan Foody, Adarsh Hiremath, and Surya Midha, three high school friends from the Bay Area, the company employs networks of contractors across various fields to deliver proprietary training data for AI labs. Clients include Meta, OpenAI, Anthropic, and Google.

Mercor’s growth has been extraordinary even by Silicon Valley standards. In October 2025, it secured a $350 million Series C funding round, valuing it at $10 billion and making the founders the youngest self-made billionaires at 22. By September 2025, its annual revenue reached $500 million, up from $100 million six months earlier, positioning it as one of the AI supply chain’s most valuable private firms. That prominence, however, has now made it a target.

A Cascade of Exposures

The attack on Mercor originated upstream. Wiz, Snyk, and Datadog Security Labs reveal that a group dubbed TeamPCP compromised the CI/CD pipeline of LiteLLM, a popular open-source Python library, to insert malware. With 97 million monthly downloads, LiteLLM connects applications to AI services and is used in 36% of cloud environments.

TeamPCP initially targeted Trivy, a security scanner, to acquire LiteLLM maintainer credentials. On 27 March 2026, they used these credentials to release two malicious LiteLLM versions, 1.82.7 and 1.82.8, on PyPI. These were available for about 40 minutes before removal.
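Teams that pulled LiteLLM during that roughly 40-minute window can check whether an installed copy matches either known-bad release. A minimal sketch using the standard library (the `is_compromised` helper is illustrative, not an official remediation tool):

```python
from importlib import metadata

# The two releases TeamPCP pushed to PyPI before they were removed.
COMPROMISED_VERSIONS = {"1.82.7", "1.82.8"}

def is_compromised(package: str, bad_versions: set[str]) -> bool:
    """Return True if the installed version of `package` is a known-bad release."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False  # package not installed: nothing to flag
    return installed in bad_versions

if __name__ == "__main__":
    if is_compromised("litellm", COMPROMISED_VERSIONS):
        print("WARNING: a compromised LiteLLM release is installed")
    else:
        print("No known-bad LiteLLM release found in this environment")
```

A version check alone is not sufficient, since lock files, container images, and CI caches built during the window may still carry the bad release.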

The payload was complex. Version 1.82.7 embedded base64-encoded malware in its proxy server code, which executed as soon as the library was imported. Version 1.82.8 went further, dropping a malicious `.pth` path configuration file that Python runs at the start of every interpreter process. Both versions harvested environment variables, API keys, SSH keys, cloud credentials (AWS, Google Cloud, Azure), Kubernetes configurations, CI/CD secrets, and database credentials, and exfiltrated them to models.litellm[.]cloud.
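For teams rotating secrets after an exposure of this kind, a first triage step is simply enumerating which environment variables in a given shell or CI job look like credentials of the classes reported stolen. A hedged sketch (the name patterns are assumptions, not an exhaustive or official list):

```python
import re

# Name patterns covering the credential classes reported stolen:
# cloud keys, API keys, CI/CD secrets, and database credentials.
SENSITIVE_PATTERNS = [
    r"AWS_", r"GOOGLE_", r"AZURE_", r"KUBE",
    r"API_?KEY", r"TOKEN", r"SECRET", r"PASSWORD", r"_KEY$",
]

def flag_sensitive(env: dict[str, str]) -> list[str]:
    """Return names of environment variables matching any sensitive pattern."""
    rx = re.compile("|".join(SENSITIVE_PATTERNS), re.IGNORECASE)
    return sorted(name for name in env if rx.search(name))

if __name__ == "__main__":
    import os
    for name in flag_sensitive(dict(os.environ)):
        print(name)  # print names only; never echo the values themselves
```

Anything this flags in an environment that imported a compromised release should be treated as exposed and rotated.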

Mercor, acknowledging it was “one of thousands” affected, reported that the breach exposed roughly 4 TB of data. Court documents and hacker claims state that the stolen data includes 939 GB of platform source code, a 211 GB user database, and nearly 3 TB of video interviews and identity verification documents. This may involve the full names and Social Security numbers of over 40,000 Mercor contractors and clients.

The Most Valued Secrets

The exposure of personal data is concerning, but what alarms Meta and other AI labs is a different type of data entirely.

Because Mercor sits inside the data pipelines of multiple AI firms at once, the breach may have exposed data selection criteria, labeling protocols, and training strategies that companies have spent millions developing. Datasets can be rebuilt; the methodologies behind them are far harder to reproduce, and that difficulty is precisely what makes them a competitive advantage. Wired reports that multiple AI labs are investigating exactly what may have been compromised.

OpenAI, a Mercor client, is examining the incident but hasn’t paused current projects. Anthropic, which raised $3 billion in early 2026 and is rapidly expanding, has yet to comment. Google, which maintains similar data vendor relationships, is also assessing the breach’s impact.

This incident underscores a structural risk the AI industry has largely overlooked: frontier labs are only as secure as the data vendors and open-source libraries they depend on.
