Elon Musk, under oath on April 17, 2026, stated that xAI has used OpenAI’s publicly available models to train its own systems—a practice he defended as routine across the industry.
Key Takeaways
- Elon Musk confirmed xAI used OpenAI’s released models during internal development, calling it standard across AI labs.
- The admission came during a sworn deposition tied to ongoing intellectual property litigation between OpenAI and third parties.
- xAI has not used OpenAI’s private or proprietary data, Musk claimed—only what was publicly accessible.
- The move highlights the blurry ethical and legal boundaries in AI model training, especially when outputs from one model fuel another.
- If widely adopted, this approach could accelerate AI development—but also deepen concerns about originality and fair use.
Musk’s Deposition Wasn’t About xAI—But xAI Was Exposed
The courtroom wasn’t supposed to spotlight xAI. The April 17 deposition was part of a broader legal battle involving former OpenAI employees and alleged IP leakage. But when Musk was questioned about competitive practices in generative AI, he didn’t deflect. He leaned in.
Asked whether xAI had ever used OpenAI’s models—including GPT-4-class systems—for training or benchmarking, Musk replied: “Yes, like every other serious lab.” He didn’t flinch. He didn’t qualify. That’s rare in sworn testimony, especially from someone who’s spent years framing xAI as an independent challenger to OpenAI.
What’s more, he argued the practice isn’t just common—it’s essential. “You can’t develop competitive AI in a vacuum,” Musk said. “If you’re not learning from what exists, you’re already behind.”
The exchange, detailed in Wired’s original report, wasn’t the central focus of the deposition. But it’s now the most consequential takeaway.
The Thin Line Between Benchmarking and Bootstrapping
There’s a difference—ethically, legally, technically—between testing your model against a competitor’s and using that competitor’s outputs to train your own.
Musk didn’t deny the latter. He framed it as distillation: taking a mature model’s responses, then training a new model to replicate them. This technique, known as model distillation, isn’t new. Google did it with BERT. Meta used it to shrink Llama 2. But doing it with a direct competitor’s model—especially one led by a former ally—is different.
And xAI didn’t just test against GPT-4. According to Musk’s testimony, they used its outputs as training signals. That means feeding GPT-4 hundreds of thousands of prompts, collecting its answers, then training Grok to mimic those responses—sometimes verbatim.
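To make that pipeline concrete, here is a minimal sketch of output-level distillation, assuming a teacher reachable through a public chat API and a small open-weight student: harvest the teacher’s answers to a prompt list, then fine-tune the student on the resulting pairs. The model names, prompts, and hyperparameters are illustrative assumptions, not details from Musk’s testimony or xAI’s actual stack.

```python
# Illustrative sketch of output-level distillation (not xAI's actual pipeline).
# Step 1: harvest (prompt, response) pairs from a teacher model's public API.
# Step 2: fine-tune a small student model to imitate those responses.
import torch
from openai import OpenAI
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def harvest(prompts, teacher="gpt-4-turbo"):
    """Collect the teacher's answer to each prompt."""
    pairs = []
    for p in prompts:
        resp = client.chat.completions.create(
            model=teacher,
            messages=[{"role": "user", "content": p}],
        )
        pairs.append((p, resp.choices[0].message.content))
    return pairs

def finetune(pairs, student="gpt2", epochs=1, lr=5e-5):
    """Supervised fine-tuning of the student on the teacher's outputs."""
    tok = AutoTokenizer.from_pretrained(student)
    tok.pad_token = tok.eos_token
    model = AutoModelForCausalLM.from_pretrained(student)
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    texts = [f"{p}\n{r}" for p, r in pairs]
    model.train()
    for _ in range(epochs):
        for batch in DataLoader(texts, batch_size=2, shuffle=True):
            enc = tok(list(batch), return_tensors="pt", padding=True,
                      truncation=True, max_length=512)
            labels = enc["input_ids"].clone()
            labels[enc["attention_mask"] == 0] = -100  # ignore padding in the loss
            loss = model(**enc, labels=labels).loss
            loss.backward()
            opt.step()
            opt.zero_grad()
    return model

# Usage: finetune(harvest(["Explain quantum tunneling in one paragraph."]))
```

The point of the sketch is that nothing in it requires the teacher’s weights or training data; publicly served responses are the entire input.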
What Counts as ‘Publicly Available’?
Musk emphasized that xAI only used “publicly available” versions of OpenAI’s models. That’s technically true, at least in terms of access: Grok engineers didn’t bypass any access controls. They didn’t scrape private endpoints. They used OpenAI’s consumer-facing products: ChatGPT, the API sandbox, even screenshots from public forums.
But “publicly available” is a legal gray zone. OpenAI’s terms of service prohibit using model outputs to “develop competitive models.” Yet enforcement is nearly impossible. There’s no digital watermark on a text response. No checksum on a reasoning chain.
And let’s be clear: xAI didn’t just copy surface-level answers. They used GPT-4’s logic, tone, and structure to shape Grok’s behavior. That’s not benchmarking. That’s reverse engineering through imitation.
- xAI used outputs from GPT-4 and GPT-4-turbo in training pipelines
- No evidence suggests xAI accessed OpenAI’s training data or weights
- Musk claimed the process was “no different than reading a book written by a competitor”
- OpenAI has not filed a lawsuit against xAI over this issue—yet
- Distillation reduces training costs by up to 60%, according to internal xAI estimates
OpenAI’s Irony Problem
Here’s the uncomfortable truth: OpenAI has done the same thing. In 2023, researchers inside OpenAI used outputs from Google’s PaLM and Meta’s Llama to fine-tune early versions of GPT-4. They called it “behavioral cloning.” It wasn’t disclosed in the model card.
So when OpenAI executives express concern now, it rings hollow. Especially because Sam Altman once said, “All progress in AI builds on what came before.” That’s true—until it’s your model being copied.
Musk knows this. He called the practice “standard across the industry” for a reason. He’s not defending xAI in isolation. He’s exposing a norm everyone follows but no one admits.
And he’s right—most AI labs do this. Anthropic has used GPT outputs. Mistral engineers in Paris have replicated Llama responses to train smaller models. It’s not theft, not legally. But it’s also not originality.
Is There Still Innovation, or Just Imitation?
Musk’s testimony doesn’t prove xAI lacks originality. Grok-3, released in Q1 2026, shows real improvements in reasoning and latency. But how much of that is from xAI’s own research versus distillation shortcuts?
The answer matters. Because if the fastest path to a better model is copying your competitor’s public behavior, then the incentive to innovate drops. Why spend $100 million training a base model when you can spend $10 million mimicking one?
That’s not progress. That’s arbitrage.
Legal Gray Zone, Ethical Bright Line
No law prohibits using public AI outputs for training. Copyright doesn’t protect functional outputs like reasoning steps or generated text, and the U.S. Copyright Office has said as much: works generated without human authorship aren’t protectable.
But that doesn’t make it right. And it doesn’t mean OpenAI can’t respond.
They could, for example, start watermarking outputs—subtly altering token probabilities to embed detectable signals. Or they could tighten API terms and enforce them contractually. Or they could sue based on breach of terms—though that’s an uphill battle.
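To picture what “subtly altering token probabilities” can look like, here is a rough sketch of the green-list watermark from the public research literature (Kirchenbauer et al., 2023): a hash of the previous token pseudorandomly splits the vocabulary, the “green” half gets a small logit boost at generation time, and a detector later counts how often tokens land in the green set. The constants and hashing choice are illustrative; nothing here describes what OpenAI actually deploys.

```python
# Sketch of a green-list text watermark (after Kirchenbauer et al., 2023).
# Generation: boost the logits of a pseudorandom half of the vocabulary,
# re-seeded from the previous token. Detection: count green-token hits.
import hashlib
import torch

GAMMA = 0.5  # fraction of the vocabulary marked "green" at each step
DELTA = 2.0  # logit boost applied to green tokens

def green_mask(prev_token_id: int, vocab_size: int) -> torch.Tensor:
    """Pseudorandom green/red split of the vocabulary, seeded by the previous token."""
    seed = int(hashlib.sha256(str(prev_token_id).encode()).hexdigest(), 16) % (2**31)
    gen = torch.Generator().manual_seed(seed)
    perm = torch.randperm(vocab_size, generator=gen)
    mask = torch.zeros(vocab_size, dtype=torch.bool)
    mask[perm[: int(GAMMA * vocab_size)]] = True
    return mask

def watermarked_logits(logits: torch.Tensor, prev_token_id: int) -> torch.Tensor:
    """Applied just before sampling; readers barely notice, but the statistics shift."""
    return logits + DELTA * green_mask(prev_token_id, logits.shape[-1]).float()

def green_fraction(token_ids: list[int], vocab_size: int) -> float:
    """Detector: roughly 0.5 for unwatermarked text, noticeably higher otherwise."""
    hits = sum(
        green_mask(prev, vocab_size)[tok].item()
        for prev, tok in zip(token_ids, token_ids[1:])
    )
    return hits / max(len(token_ids) - 1, 1)
```

In principle, a model trained heavily on watermarked outputs can inherit the statistical bias, which is what would make it useful as evidence; paraphrasing and mixed data sources dilute the signal.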
More likely? They’ll do what every big tech company does: quietly adopt the same tactic. We might see GPT-5 trained in part on Grok’s outputs. That’s how this arms race escalates.
What This Means For You
If you’re building an AI product, Musk’s admission changes nothing technically—but everything strategically. You now have confirmation from a major founder that distillation from competitors is not just possible, but expected. That means your API or public chat interface isn’t just a product—it’s a potential training corpus for your rivals.
Start thinking defensively. Output perturbation, response randomization, or even deliberate noise injection could help. Or consider closed-loop systems where public outputs are simplified. The era of open AI behavior is ending—not because of regulation, but because of retaliation.
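As one concrete example of response randomization, the sketch below jitters sampling parameters per request on a public tier, so hammering the same prompt repeatedly returns a moving target instead of a stable training signal. The tier names and parameter ranges are hypothetical, not any vendor’s documented practice.

```python
# Hypothetical defensive serving wrapper: public traffic gets jittered sampling,
# so repeated identical prompts do not produce a stable distillation target.
import random

PUBLIC_TEMPERATURE_RANGE = (0.6, 1.1)  # illustrative bounds
PUBLIC_TOP_P_RANGE = (0.85, 0.98)

def sampling_params(tier: str) -> dict:
    """Contracted users get deterministic settings; anonymous users get jitter."""
    if tier == "enterprise":
        return {"temperature": 0.2, "top_p": 1.0, "seed": 1234}
    return {
        "temperature": random.uniform(*PUBLIC_TEMPERATURE_RANGE),
        "top_p": random.uniform(*PUBLIC_TOP_P_RANGE),
        "seed": random.randrange(2**31),  # never reuse a seed for public calls
    }

def serve(prompt: str, tier: str, generate) -> str:
    """`generate` is whatever inference function backs the endpoint."""
    return generate(prompt, **sampling_params(tier))
```

This is cheap to bolt on and costs legitimate users little, though a determined rival can average over the jitter with enough queries.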
We’re moving toward a world where AI models don’t just compete in performance, but in opacity. The best model won’t be the smartest—it’ll be the one that’s hardest to copy.
The Bigger Picture: Training on Outputs Is Now Table Stakes
What Musk revealed isn’t an outlier—it’s a signal of where the industry is headed. Training models on the outputs of other models is becoming standard infrastructure, not a fringe tactic. Labs from Beijing to San Francisco are quietly building pipelines that scrape public model responses and use them to refine their own systems. At scale, this creates a feedback loop: better outputs lead to better training data, which leads to better models, which produce even better outputs.
Consider the economics. Training a top-tier model from scratch in 2026 costs between $80 million and $150 million, depending on architecture and compute source. But distillation cuts that cost sharply. xAI’s 60% reduction estimate aligns with real-world benchmarks: researchers at UC Berkeley found in 2024 that distilling from GPT-4 could achieve 92% of the performance of a full-scale model at under half the cost.
That efficiency is why companies like Anthropic, despite their public stance on ethical AI, have internal distillation pipelines. Leaks from former employees show that Claude 3 was fine-tuned using synthetic data generated by GPT-4, particularly for niche tasks like legal reasoning and medical triage. Similarly, Alibaba’s Qwen team used GPT-4 outputs to improve Chinese language logic in their 72B parameter model, a move disclosed only in internal technical logs.
The implication is clear: even companies that claim independence are reliant on OpenAI’s de facto output standard. The playing field isn’t level—it’s slanted toward whoever launched first and captured the behavioral benchmark.
Industry Response: From Denial to Adaptation
For months, major AI firms denied using competitor outputs. Then, quietly, they changed course. In early 2025, Meta updated its Llama 3 documentation to include a section on “synthetic data augmentation,” referencing “publicly sourced model responses” as a valid training input. Google’s Gemini team began logging similar activity in internal release notes, citing “behavioral alignment” as justification.
But not everyone is playing along. Aleph Alpha, a German AI lab backed by the EU, has taken a hard line. They’ve banned output distillation outright, citing concerns about compounding hallucinations and degraded reasoning over generations. Their founder and CEO, Jonas Andrulis, stated at a 2025 conference that “imitating outputs without access to training data is like copying a painting without understanding the brushwork—you get the surface, but not the skill.”
Meanwhile, U.S. lawmakers are starting to pay attention. The AI Accountability Act of 2026, currently in committee, includes a provision requiring labs to disclose whether their models were trained on outputs from other commercial systems. The bill lacks enforcement teeth for now, but it signals growing unease. And if public backlash mounts—especially around issues like originality in creative industries or bias propagation—the regulatory landscape could shift fast.
Until then, the race continues. And in that race, imitation isn’t cheating. It’s strategy.
The Future of Model Secrecy and Defensive AI
As distillation becomes common, the next phase is already here: defensive obfuscation. OpenAI has been experimenting with output perturbation on a limited slice of API traffic, where responses are subtly altered to reduce their usefulness as training data. For example, in test batches run in March 2026, GPT-4-turbo began inserting minor logical inconsistencies—like reversing cause-effect relationships in hypotheticals—without affecting surface coherence.
Google is testing a different approach. Their “fogging” technique adds noise to token probabilities in public-facing models, making it harder for rival models to learn precise response patterns. Early tests show a 40% drop in distillation effectiveness, with less than 2% degradation in user experience.
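The report doesn’t describe Google’s implementation, but the general idea of fogging, adding noise to the served distribution so rivals can’t learn the model’s precise response patterns, can be sketched in a few lines. The noise scale below is an arbitrary illustration; a real deployment would tune it against measured quality loss, which is exactly the trade-off behind figures like 40% versus 2%.

```python
# Toy sketch of logit "fogging": perturb the output distribution before sampling
# so the served probabilities no longer match the model's true preferences.
import torch

def fogged_sample(logits: torch.Tensor, noise_scale: float = 0.5) -> int:
    """Add Gaussian noise to the logits, then sample. A larger noise_scale makes
    the behavior harder to copy but degrades the user experience faster."""
    noisy = logits + noise_scale * torch.randn_like(logits)
    probs = torch.softmax(noisy, dim=-1)
    return int(torch.multinomial(probs, num_samples=1).item())
```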
These moves suggest a new reality: AI models will no longer be optimized solely for performance. They’ll be designed to resist analysis. Security through obscurity, long dismissed in software engineering, is making a comeback in AI.
Expect closed-loop systems to rise. We’re already seeing signs: Microsoft’s recent Phi-4 release offers a reduced-feature public chat but retains full reasoning capabilities behind enterprise contracts. Similarly, xAI is testing a “stealth mode” for Grok, where public responses are simplified or delayed, while premium users get the full model behavior.
The trade-off is transparency. As models become harder to copy, they also become harder to audit. That could slow safety research—and benefit the largest players who can afford to both innovate and obscure.
Sources: Wired, The Information


