Unhobbling and the Superposition Problem: Why AI Can't Think While It Talks
Independent Discovery, Convergent Insights, and the Next Frontier in AI Architecture
Today, I watched a YouTube video that stopped me cold.
The video covered Leopold Aschenbrenner's June 2024 paper "Situational Awareness: The Decade Ahead" and explained how AI systems are achieving human-level performance on the notoriously difficult ARC-AGI 2 benchmark—not through bigger models, but through better scaffolding.
Aschenbrenner calls this principle "unhobbling." And as I watched, I realized: I'd been working on the same problem, independently, for months.
Last month, I published "Einstein's Rock and the Hidden Layer of Thought," exploring how AI systems lack access to what I called the "ineffable space"—the hypnagogic realm where humans think without speaking. I argued that current language models are forced to collapse every thought into tokens immediately, losing the richness of probabilistic exploration.
What I didn't know until today: one of the top AI researchers in the world had identified the same architectural constraint from a different angle.
This post is about that convergence, what it means, and where the next breakthrough might be hiding.
What Is Unhobbling?
Aschenbrenner's core insight is deceptively simple:
"Imagine if when asked to solve a hard math problem, you had to instantly answer with the very first thing that came to mind... That's how we had LLMs solve math problems. Most of us work through the problem step by step on a scratch pad. Chain-of-thought prompting unlocked that for LLMs."
Unhobbling means removing artificial constraints that prevent AI systems from using their existing capabilities fully.
Current models are hobbled because they:
- Have no long-term memory
- Can't use tools seamlessly
- Must answer immediately (even with o1/o3 reasoning, once output starts, it's locked)
- Don't maintain persistent internal states between outputs
The biggest gains in AI aren't coming from bigger models—they're coming from removing these constraints.
The ARC-AGI 2 Breakthrough
The video I watched showed something remarkable:
A company called Poetic took the same base models (GPT-5, Gemini, Grok) and lifted their scores on ARC-AGI 2 by 16 to 30 points without touching the underlying neural architecture.
| Model | Baseline | With Meta-System | Improvement |
|---|---|---|---|
| Grok 4 Fast | 56% | 72% | +16 points |
| Gemini 3 Pro | ~30% | 60%+ | +30 points |
| GPT-5.2X High | ~60% | 76% | +16 points (above human) |
How? They built a manager layer that decides which model to use, breaks problems into steps, self-checks progress, and generates code when needed.
This is unhobbling in action. The intelligence isn't in the model—it's in how the entire system thinks.
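To make that concrete, here is a minimal sketch of what a manager layer of this kind could look like. It is not Poetic's actual system; `call_model` and `solve_with_manager` are hypothetical stand-ins for whatever LLM client and routing logic you already have.

```python
# Hypothetical manager-layer sketch: route, decompose, self-check, synthesize.
# call_model() is a stub for a real API client (OpenAI, Google, xAI, etc.).

def call_model(model: str, prompt: str) -> str:
    """Placeholder for a real LLM API call."""
    raise NotImplementedError

def solve_with_manager(task: str, models: list[str], max_steps: int = 5) -> str:
    # 1. Decide which model to use (naive routing by task length, as an example).
    model = models[0] if len(task) < 2000 else models[-1]

    # 2. Break the problem into steps.
    plan = call_model(model, f"Break this task into numbered steps:\n{task}")

    # 3. Execute each step, self-checking progress as we go.
    context = task
    for step in plan.splitlines()[:max_steps]:
        draft = call_model(model, f"Context:\n{context}\n\nDo this step:\n{step}")
        verdict = call_model(model, f"Does this output satisfy the step? Answer PASS or FAIL.\n\n{draft}")
        if "FAIL" in verdict:
            draft = call_model(model, f"Revise this so it actually satisfies the step:\n\n{draft}")
        context += "\n" + draft

    # 4. Synthesize a final answer from the accumulated work.
    return call_model(model, f"Produce the final answer from this work:\n\n{context}")
```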
What I Found Independently
In December 2025, I published a piece exploring why AI systems can't access what humans call "hypnagogia"—the liminal state between waking and sleeping where Einstein held a rock over a metal plate to catch insights before they vanished.
My framework used quantum mechanics as an analogy:
In quantum mechanics, particles exist in superposition until observed. Measurement collapses the waveform into a single definite state.
In cognition, verbalization is measurement. When we translate thought into words, we collapse probabilistic mental states into sequential language.
Current AI has no private thinking space. Every thought becomes a token immediately. Every token is a collapse. No superposition, no hypnagogia, no rock-in-hand moment.
I discovered the same constraint Aschenbrenner identified—but I framed it differently:
Aschenbrenner: Models are hobbled because they must answer immediately.
My addition: Even with hidden thinking layers (like the HRM paper's dual-module architecture), models still can't reverberate between thinking and speaking the way humans do.
The Reverberation Gap
Here's what I realized that goes beyond current unhobbling approaches:
Humans don't just think-then-speak. We think-while-speaking.
Right now, as I write this, I'm doing something remarkable: I'm accessing the liminal probability space of half-formed thoughts while simultaneously collapsing specific ideas into words. And critically—I can pivot mid-stream.
Have you ever been mid-sentence when suddenly a new insight strikes? A moment of satori—sudden awakening—where you realize something you didn't know you knew? You pause, redirect, and your verbalization takes an entirely new direction.
Language models can't do this.
Even with extended reasoning (o1/o3) or dual-layer architectures (HRM), once token production begins, the model is locked into pattern completion. It cannot:
- Pause to re-enter hidden reasoning space
- Have a sudden realization that contradicts what it just wrote
- Pivot based on mid-stream insight
The architecture is sequential:
- Current AI: [Hidden reasoning] → [Verbalization] → [Done] (no return path)
- Humans: [Hidden] ↔ [Verbalization] ↔ [Hidden] ↔ [Verbalization] (continuous bidirectional feedback)
This might be the next frontier.
Simulated Superposition: A Frankenstein Architecture
After connecting these insights, I asked: What if we could build this capability through scaffolding?
Multimodal models started as stitched-together systems—text models connected to image models connected to tool-calling modules. Now they're becoming natively multimodal. But the Frankenstein approach worked before native integration existed.
Could we do the same for bidirectional thinking?
The Architecture Concept
Two instances of the same model (or complementary models):
| Component | Function | Analog |
|---|---|---|
| Instance A: Inference Model | Continuously running in "thinking mode" | Human probabilistic reasoning |
| Instance B: Output Model | Generating tokens based on current best path | Human verbalization |
| Oscillation Layer | Rapid interruption/revision protocol | Human mid-sentence pivots |
The process:
- Inference Model explores problem space
- Output Model begins generating tokens (5-10 at a time)
- Inference Model evaluates emerging output
- If aligned → Continue; If divergent → Send revision signal
- Output Model backtracks to checkpoint, re-generates
- Repeat loop until completion or convergence
Key insight: This isn't true superposition—it's rapid oscillation creating the illusion of simultaneity, like how a CPU time-slices threads to simulate parallelism.
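Here is a minimal sketch of that loop, with two stubbed model calls rather than any particular API. `generate_tokens` and `evaluate_direction` are hypothetical names; the point is the commit-or-backtrack mechanics, not the client code.

```python
# Oscillation-layer sketch: Instance B speaks in small chunks, Instance A keeps
# "thinking" and either commits the chunk or sends a revision signal.
# Both model functions are stubs, not a specific provider's API.

def generate_tokens(output_model, prompt: str, n: int) -> str:
    """Instance B: produce roughly the next n tokens of output (stubbed)."""
    raise NotImplementedError

def evaluate_direction(inference_model, problem: str, partial: str) -> float:
    """Instance A: score (0-1) how well the partial output matches its ongoing reasoning (stubbed)."""
    raise NotImplementedError

def oscillating_generate(inference_model, output_model, problem: str,
                         chunk: int = 8, threshold: float = 0.6,
                         max_chunks: int = 64) -> str:
    output = ""                              # the last checkpoint Instance A approved
    hint = ""                                # revision signal from Instance A, if any
    for _ in range(max_chunks):
        # Instance B speaks a few tokens, steered by A's last revision signal.
        candidate = output + generate_tokens(output_model, problem + hint + output, chunk)
        # Instance A evaluates the emerging output against its own exploration.
        score = evaluate_direction(inference_model, problem, candidate)
        if score >= threshold:
            output, hint = candidate, ""     # aligned: commit the chunk as the new checkpoint
        else:
            hint = "\n[Revision signal: the current direction looks wrong]\n"  # divergent: backtrack
        if output.endswith("<END>"):         # hypothetical stop marker emitted by Instance B
            break
    return output
```

In hardware terms this is exactly the CPU analogy above: two sequential processes interleaved fast enough to look simultaneous.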
Why This Might Actually Work
Precedent exists:
- Speculative decoding already generates multiple tokens, verifies, and backtracks if wrong
- Beam search maintains multiple output paths simultaneously
- Constitutional AI runs multiple passes (generate → critique → revise)
The difference: This architecture enables real-time revision based on ongoing inference, not just post-hoc checking.
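For reference, here is what that first precedent looks like in a simplified greedy form. `draft_next` and `target_next` are stand-ins for a small draft model and the full target model; real speculative decoding verifies all drafted positions in one batched forward pass and uses probabilistic acceptance rather than exact matching, but the draft-verify-backtrack cycle is the same.

```python
# Simplified greedy speculative decoding: a cheap draft model proposes k tokens,
# the expensive target model verifies them, and generation backtracks at the
# first disagreement. Model functions are hypothetical stubs.

def draft_next(tokens: list[int]) -> int:
    """Small, fast draft model: propose the next token (stubbed)."""
    raise NotImplementedError

def target_next(tokens: list[int]) -> int:
    """Large target model: the token it would actually emit next (stubbed)."""
    raise NotImplementedError

def speculative_step(tokens: list[int], k: int = 4) -> list[int]:
    # 1. Draft k tokens cheaply.
    proposed = []
    for _ in range(k):
        proposed.append(draft_next(tokens + proposed))

    # 2. Verify against the target model; keep the longest agreeing prefix.
    accepted = []
    for tok in proposed:
        if target_next(tokens + accepted) == tok:
            accepted.append(tok)
        else:
            break                            # backtrack: discard this token and everything after it

    # 3. The target model always contributes one guaranteed-correct token.
    accepted.append(target_next(tokens + accepted))
    return tokens + accepted
```

The verification there is exact and token-level; the oscillation architecture above replaces it with semantic, mid-stream judgment.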
The Practical Test
This doesn't need GPT-5 to validate. A cheap experiment:
Setup:
- Model A (Inference): Gemini 1.5 Pro (thinking)
- Model B (Output): Gemini 2.0 Flash (fast generation)
- Revision protocol: Every N tokens, A evaluates B's output
- If A's confidence drops below threshold → B backtracks
Cost estimate: ~$0.01-0.05 per 500-token output with 3 revision cycles.
If this shows measurable improvement on reasoning benchmarks with Gemini Flash, it validates the principle without expensive infrastructure.
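As a sanity check on that number, here is the back-of-envelope arithmetic as code. The per-token prices are placeholder assumptions, not current Gemini list prices, and the token accounting is deliberately rough.

```python
# Rough cost model for the experiment: the Output model (B) pays for generated and
# regenerated tokens, the Inference model (A) pays for re-reading the draft at each check.
# Prices below are assumptions; substitute real per-token rates before trusting the figure.

def revision_cost(output_tokens: int = 500,
                  revision_cycles: int = 3,
                  check_every: int = 25,
                  price_out_per_1k: float = 0.0006,    # assumed Flash output price, $ per 1k tokens
                  price_in_per_1k: float = 0.00125):   # assumed Pro input price, $ per 1k tokens
    generated = output_tokens + revision_cycles * check_every   # original + regenerated tokens
    checks = output_tokens // check_every                        # number of evaluation points
    evaluated = checks * output_tokens                           # A re-reads roughly problem-plus-draft each time
    return (generated / 1000) * price_out_per_1k + (evaluated / 1000) * price_in_per_1k

print(f"~${revision_cost():.4f} per 500-token output")  # ~$0.013 under these assumptions, within the $0.01-0.05 range
```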
The SCMS Connection
When I built the SCMS (Sparse Contextual Memory Scaffolding) framework for Mneme, I didn't realize I was building an unhobbling architecture.
SCMS has three memory tiers:
| Layer | Function | HRM Analog |
|---|---|---|
| Persona memories | Always retrieved first, unconditionally, with guaranteed context | H-module (slow, guiding) |
| Task memories | Retrieved based on relevance | L-module (fast, executing) |
| L2 documentation | Injected when linked L1 memories are retrieved | Hidden state space |
The persona layer is the AI's hypnagogia. It's always present, shaping how all other memories are interpreted, without competing for relevance on a per-query basis.
Reading the HRM paper and Aschenbrenner's work, I realized: SCMS is scaffolded unhobbling.
A necessary caveat: SCMS is scaffolding, not native architecture. Unlike HRM's true hidden layers (which exist in model weights and never tokenize), everything in SCMS must be injected into context. But the functional principle is the same: preserve hidden guidance space that shapes behavior without appearing in output.
The Broader Pattern
There's a pattern emerging across multiple domains:
Einstein's rock → Controlled access to hypnagogic states
HRM's dual layers → Hidden reasoning + visible execution
Aschenbrenner's unhobbling → Remove constraints on existing capabilities
SCMS persona layer → Unconditional framing that shapes retrieval
Simulated superposition → Oscillating between thinking and speaking
The common thread: Not everything should collapse into tokens immediately.
Some of the most important cognition—human and artificial—happens in spaces that can't be directly observed or verbalized. The next breakthrough might not come from bigger models or more data, but from architectures that preserve liminal thinking space.
What This Means for Memory Systems
If Aschenbrenner is right that unhobbling drives most AI progress, then memory systems like SCMS are a form of architectural unhobbling:
Traditional RAG: Retrieve → Generate
SCMS: Unconditional framing (persona) + Conditional retrieval (task) + Linked documentation (L2)
The difference is when and how information enters context:
- Persona: Always, unconditionally (shapes everything)
- Task: Based on relevance (changes per query)
- L2: Through relationships, not direct search (hidden influence)
This creates a tiered collapse:
- Persona stabilizes the probability space
- Task memories narrow it further
- L2 documentation provides hidden constraints
- Output collapses to final tokens
Delaying collapse matters.
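A minimal sketch of that assembly order, assuming simple in-memory stores and a generic similarity score. None of these names (`Memory`, `assemble_context`, `relevance`) come from the actual SCMS codebase; they only illustrate the ordering: persona always, task by relevance, L2 by links.

```python
# Hypothetical tiered-context assembly in the spirit of SCMS.
# relevance() is a stub for whatever embedding-similarity scoring is in use.

from dataclasses import dataclass, field

@dataclass
class Memory:
    text: str
    links: list[str] = field(default_factory=list)    # IDs of linked L2 documents

def relevance(query: str, memory: Memory) -> float:
    """Placeholder for embedding-similarity scoring (stubbed)."""
    raise NotImplementedError

def assemble_context(query: str,
                     persona: list[Memory],
                     tasks: list[Memory],
                     l2_docs: dict[str, str],
                     top_k: int = 5) -> str:
    # 1. Persona: injected first, unconditionally -- stabilizes the probability space.
    context = [m.text for m in persona]

    # 2. Task memories: conditional, ranked by relevance to this specific query.
    selected = sorted(tasks, key=lambda m: relevance(query, m), reverse=True)[:top_k]
    context += [m.text for m in selected]

    # 3. L2 documentation: enters only through links from retrieved task memories,
    #    never via direct search -- hidden constraints rather than surfaced results.
    for m in selected:
        context += [l2_docs[doc_id] for doc_id in m.links if doc_id in l2_docs]

    # 4. Only after all three tiers are in place does the model collapse to output tokens.
    return "\n\n".join(context)
```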
The Humble Caveat
I'm drawing connections between quantum mechanics, neuroscience, AI architecture, and memory systems. That's a lot of disciplinary boundaries to cross, and I'm not an expert in any of them individually.
What I'm offering isn't a theory—it's a resonance. A pattern that keeps appearing across domains.
But the evidence is mounting:
- ✅ Poetic's ARC-AGI 2 results are real
- ✅ Aschenbrenner's unhobbling thesis is validated
- ✅ HRM's hidden layer architecture outperforms standard models
- ✅ SCMS shows measurable improvements in AI memory retention
Something is going on here. And it points toward a future where thinking space is preserved, not collapsed.
The Next Breakthrough
If I had to make a prediction:
The next major leap in AI won't come from GPT-6 or more parameters. It will come from architectures that maintain bidirectional access between hidden reasoning and visible output.
This might look like:
- Models with persistent internal states that don't reset between tokens
- Memory systems where some context influences behavior without being surfaced
- Retrieval architectures that delay commitment to final selections
- Dual-model systems that oscillate between thinking and speaking
We're already seeing hints:
- HRM's dual-module architecture
- Anthropic's extended thinking in Claude
- Poetic's meta-system approach
- SCMS's unconditional persona layer
- IDE agents achieving 100% on tasks where scheduled oscillation got 67%
The era of "everything is tokens" might be ending.
The era of hidden, liminal, ineffable computation might be beginning.
A Taxonomy of Superposition
After building and testing, here's how I now think about the landscape:
Native Thinking (CoT) = Single-pass deep reasoning
Scheduled Oscillation = Our prototype (disrupts flow)
Event-Driven Oscillation = IDE agents (emergent superposition)
True Superposition = Both simultaneously in weights (doesn't exist yet)
IDE agents might be the closest existing approximation to the theoretical architecture we hypothesized. They achieve bidirectional think-speak-think loops not through forced interruption, but through natural task structure and feedback signals.
The next step isn't building more sophisticated scheduled interruption. It's understanding what makes IDE-style event-driven oscillation work and whether that can be simplified into a chatbot-style architecture without the full IDE tooling.
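The difference between the two modes comes down to the trigger for re-entering evaluation. A hypothetical sketch, where `events` stands in for the tool results, test failures, and lint errors an IDE agent gets for free:

```python
# Scheduled oscillation: interrupt every N tokens, whether or not anything happened.
# Event-driven oscillation: interrupt only when the environment pushes back.
# Both feed the same evaluate-and-maybe-backtrack step; only the trigger differs.

def should_evaluate_scheduled(tokens_since_check: int, events: list[str], n: int = 25) -> bool:
    return tokens_since_check >= n       # fixed interval: tends to interrupt mid-thought

def should_evaluate_event_driven(tokens_since_check: int, events: list[str], n: int = 25) -> bool:
    return len(events) > 0               # natural checkpoints: a test failed, a tool returned, a file changed
```

The open question is whether a chatbot can be given enough of those natural checkpoints without the full IDE toolchain.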
Where This Goes From Here
This exploration confirmed something important: the theoretical framework is sound. The convergence between Aschenbrenner's unhobbling thesis, the HRM paper's dual-layer architecture, and the superposition problem I identified independently isn't coincidental—it's pointing toward a real architectural frontier.
What we demonstrated:
- Scheduled interruption disrupts flow state — forcing evaluation at fixed intervals introduces errors rather than catching them
- Event-driven oscillation works — IDE agents naturally implement the bidirectional think-speak pattern through tool use
- The gap between scaffolding and native architecture matters — simulating superposition through API orchestration has fundamental latency and coherence limitations that native implementations wouldn't face
The full validation of this architecture—building a system that achieves true event-driven oscillation at the model level rather than the scaffolding level—requires resources beyond what a solo researcher can deploy. This is infrastructure-level work: custom training runs, novel attention mechanisms, potentially new hardware considerations.
But that's the point. The unhobbling thesis predicts that the biggest gains come from removing constraints, not adding parameters. If this architectural direction is correct, whoever builds it first gains a significant capability advantage.
I've planted a flag. The theoretical groundwork is laid. The experimental infrastructure exists. What happens next depends on who picks it up.
Sometimes the most valuable contribution isn't finishing the race—it's identifying which direction to run.
Einstein knew. He just didn't have the math to prove it.
Now we're getting closer.
Resources
- Leopold Aschenbrenner - Situational Awareness (June 2024)
- HRM Paper - Hierarchical Reasoning Model
- Einstein's Rock and the Hidden Layer of Thought (December 2025)
- Mneme AI - Continual memory architecture
- SCMS Starter Kit - Open source framework
Matthew "Manny" Walker is the creator of SCMS (Sparse Contextual Memory Scaffolding) and founder of Mneme. He independently discovered the superposition problem while building AI memory systems and is currently laughing at the cosmic irony of being theoretically correct at the exact moment he has to pivot to tax software. Find him on X @getmneme.