
OpenAI Just Launched GPT-5.6 Sol Under Government Lockdown – and the Real Story Is Who Gets the Keys


Bala Kumar, Senior Software Engineer

OpenAI dropped GPT-5.6 Sol this week, and on paper it is a beast. Terminal-Bench 2.1? 88.8 percent. The Ultra variant? 91.9. Claude Mythos 5 sits at 88.0. Fable 5, the model the US government already yanked off the market, trails at 84.3. On ExploitBench, Sol matches Mythos Preview while burning roughly a third of the output tokens. On GeneBench v1, it jumps from GPT-5.5's 22 percent to 30 percent, again with fewer tokens.

The numbers are not subtle. OpenAI built a flagship that outcodes Anthropic's best and does it cheaper per task. But here is the punchline: you probably cannot use it.

The Government Is Now the Gatekeeper

The US government is restricting Sol to "select partners" through the API and Codex. This is not OpenAI's choice. The same administration already pulled Anthropic's Fable 5. Now it is OpenAI's turn, and OpenAI is not happy about it.

"We don't believe this kind of government access process should become the long-term default," the company said. "It keeps the best tools from users, developers, enterprises, cyber defenders, and global partners who need them."

That is a fascinating line to read from the company that once argued GPT-2 was "too dangerous" to release fully. Seven years ago, OpenAI pioneered the playbook of withholding models for safety. Now the government is running that playbook against them, and OpenAI is the one calling it unsustainable.

The New Tier System

GPT-5.6 introduces a layered naming scheme that looks a lot like Claude's. The generation number (5.6) is fixed. The tiers - Sol, Terra, Luna - are permanent performance classes that can evolve independently.

| Tier | Input (per 1M tokens) | Output (per 1M tokens) | Positioning |
| --- | --- | --- | --- |
| Sol | $5.00 | $30.00 | Flagship, matches or beats Mythos 5 |
| Terra | $2.50 | $15.00 | Matches GPT-5.5 at half the cost |
| Luna | $1.00 | $6.00 | Budget tier |

On top of these tiers, there is a "max" mode for deeper reasoning and an "ultra" mode that farms complex tasks out to sub-agents running in parallel. The pricing is aggressive. Sol undercuts the trend of each new generation getting more expensive, which matters when Chinese models are eating market share on cost alone.
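
To make the per-token prices concrete, here is a minimal sketch of the cost arithmetic for a single task. The prices are the ones listed for each tier; the token counts (20,000 input, 150,000 output) are illustrative assumptions, and the function name is hypothetical, not part of any OpenAI SDK.

```python
# Prices in USD per 1M tokens, as listed for each tier in this article.
PRICES = {
    "Sol": {"input": 5.00, "output": 30.00},
    "Terra": {"input": 2.50, "output": 15.00},
    "Luna": {"input": 1.00, "output": 6.00},
}


def task_cost(tier, input_tokens, output_tokens):
    """Return the USD cost of one task at the listed per-1M-token prices."""
    price = PRICES[tier]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Assumed workload: 20,000 input tokens and 150,000 output tokens per task.
for tier in PRICES:
    print(f"{tier}: ${task_cost(tier, 20_000, 150_000):.2f}")
```

Under those assumed token counts, output tokens dominate the bill, which is why a model that finishes a task with fewer output tokens can be cheaper per task even at a higher list price.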

What the Benchmarks Actually Say

OpenAI's own charts tell a nuanced story. Yes, Sol leads on Terminal-Bench 2.1. But on ExploitBench, Mythos 5 still leads at around 80 percent full-chain exploit success, while Sol and Mythos Preview sit lower at roughly 150,000 output tokens. OpenAI frames Sol as a cybersecurity defender, not an attacker - it found bugs and exploitation primitives in Chromium and Firefox tests but never produced an autonomous full-chain exploit. The company says Sol remains below its own "Cyber Critical" threshold.

On ExploitGym, a UC Berkeley benchmark, all three GPT-5.6 models improve as reasoning effort scales up. That suggests there is headroom for more compute. Claude numbers for this benchmark are not available yet.

The Speed Play

In July, Sol goes live on Cerebras hardware at up to 750 tokens per second. That is not a small number. If you are running agentic coding workflows or security analysis at scale, throughput matters as much as accuracy. OpenAI is clearly building for production use, not just leaderboard bragging.
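
As a rough illustration of what that rate means in wall-clock terms, the sketch below converts a token count into generation time. It assumes the full 750 tokens per second is sustained for the whole response, which an "up to" figure does not guarantee, and the 150,000-token workload is an illustrative assumption. The helper name is hypothetical.

```python
def generation_seconds(output_tokens, tokens_per_second):
    """Return the time in seconds to emit output_tokens at a constant rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Assumed workload: one long agentic run emitting 150,000 output tokens.
seconds = generation_seconds(150_000, 750)
print(f"{seconds:.0f} seconds ({seconds / 60:.1f} minutes)")
```

At a sustained 750 tokens per second, that assumed run finishes generating in a little over three minutes, not counting network latency, tool calls, or queueing.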

What This Actually Means

Here is the uncomfortable truth: the best model in the world is irrelevant if you cannot access it. OpenAI has built a model that beats Claude Mythos 5 on coding, matches Mythos Preview on security, and does it with better token efficiency. But the release is government-controlled. "Select partners" only. The same gatekeeping Fable 5 just suffered.

This is the new normal for frontier AI. Not safety review boards. Not red-teaming disclosures. Direct government control over which companies, which researchers, which defenders get access to the most capable systems. OpenAI called it unsustainable. They are probably right. But they are also releasing anyway, which tells you everything about how much leverage a single company has against a national security determination.

The models keep getting better. The access keeps getting narrower. And the gap between what exists and what you can use is becoming the defining story of 2026.

Source: OpenAI, THE DECODER