He acknowledged it was “partly” accurate that model distillation had been used to improve xAI’s models.
In a California federal courtroom on Thursday, Elon Musk testified that his AI startup, xAI, has used OpenAI’s models to improve its own. At issue is model distillation, a common industry practice in which a larger AI model acts as a “teacher,” transferring its knowledge to a smaller “student” model. Companies often use the process legitimately, training smaller models from their own larger ones. But smaller AI labs sometimes exploit the technique to mimic the performance of stronger competitors’ models.
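The teacher/student transfer described above is commonly implemented with a distillation loss: the student is trained to match the teacher's temperature-softened output distribution. A minimal sketch (function names and the NumPy formulation are illustrative, not taken from any company's actual pipeline):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Softened probabilities; a higher temperature flattens the distribution,
    exposing more of the teacher's 'dark knowledge' about wrong classes."""
    z = logits / temperature
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence KL(teacher || student) on softened outputs.
    Minimizing this trains the student to imitate the teacher."""
    p = softmax(np.asarray(teacher_logits, dtype=float), temperature)
    q = softmax(np.asarray(student_logits, dtype=float), temperature)
    return float(np.sum(p * (np.log(p) - np.log(q))))

# A student whose logits already match the teacher incurs ~zero loss;
# a mismatched student incurs a positive loss it can minimize by training.
teacher = np.array([3.0, 1.0, 0.2])
print(distillation_loss(teacher, teacher))          # ~0.0
print(distillation_loss(teacher, np.zeros(3)) > 0)  # True
```

In practice this term is combined with an ordinary cross-entropy loss on ground-truth labels; the controversy arises when the "teacher" logits (or sampled outputs) come from a competitor's model rather than one's own.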
Asked about his understanding of model distillation, Musk described it as using one AI model to train another. Pressed on whether xAI had used OpenAI’s technology in this way, Musk was evasive, saying “generally all the AI companies” do so. Asked whether that meant yes, he replied, “Partly.”
Musk added, “It is standard practice to use other AIs to validate your AI.”
Model distillation’s popularity, and the controversy around it, has grown as more AI labs adopt the practice. Its legality is often murky, hinging on individual companies’ terms of service. In recent years, OpenAI and Anthropic have accused Chinese firms of using distillation techniques: OpenAI has publicly raised concerns about DeepSeek, while Anthropic has named DeepSeek, Moonshot, and MiniMax. Google, meanwhile, has taken steps to block “distillation attacks,” which it considers a form of intellectual property theft that violates its terms of service.
Anthropic addressed the subject in a blog post, stating, “Distillation is a widely used and legitimate training method. Frontier AI labs routinely distill their own models to create smaller, cheaper versions for their customers. Yet, distillation can also serve illicit purposes: competitors might use it to gain advanced capabilities from other labs with significantly less time and cost than required to develop them on their own.”
