Your STR diagnostic can produce maximally stable readings while stably measuring something that does not exist. We introduce Structured Connectivity Coherence (SCC) — the legitimacy certificate for the entire STR framework.

From RN-001 through RN-006, we've showcased a series of remarkable capabilities of Soft Topological Return (STR): detecting hallucination trajectory divergence, driving closed-loop inference control, and tripping a topological circuit breaker within ~10 steps.

But there's a serious question we've never directly addressed: how do you know the STR values aren't deceiving you?

We uncovered the most insidious trap: a set of points drawn i.i.d. from a Gaussian mixture, arranged in arbitrary order — with zero dynamics — produces maximally stable high STR measurements (convergence CV ≈ 0). If you rely solely on "measurement stability" to judge signal trustworthiness, you will be perfectly deceived.
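
One way to see how this trap can arise is sketched below. It is a minimal illustration, not the STR definition from the research notes: `soft_recurrence` is a hypothetical stand-in that averages a Gaussian kernel over pairwise distances, and the mixture parameters and kernel width are arbitrary assumptions. Any score built only from pairwise geometry cannot see time order, so it returns the same reading for every ordering of the same points.

```python
import numpy as np

def soft_recurrence(points, sigma=1.0):
    """Mean Gaussian-kernel similarity over all distinct pairs (hypothetical STR proxy)."""
    diffs = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=-1)
    kernel = np.exp(-sq_dist / (2.0 * sigma ** 2))
    n = len(points)
    # Exclude the diagonal: every point trivially "recurs" to itself.
    return (kernel.sum() - n) / (n * (n - 1))

rng = np.random.default_rng(0)
# i.i.d. draws from a two-component Gaussian mixture: no dynamics at all.
centers = np.array([[0.0, 0.0], [4.0, 0.0]])
labels = rng.integers(0, 2, size=300)
points = centers[labels] + 0.5 * rng.standard_normal((300, 2))

# Re-measure the same point set under 20 arbitrary orderings.
scores = np.array([soft_recurrence(points[rng.permutation(300)]) for _ in range(20)])
cv = scores.std() / scores.mean()
print(f"mean score = {scores.mean():.3f}, CV across orderings = {cv:.2e}")
```

The coefficient of variation here is zero up to floating-point rounding, which is exactly the problem: the reading is perfectly stable and says nothing about whether any dynamics exist.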

We rigorously classify diagnostic failures into two types: Type I (measurement-correctable) and Type II (structurally intrinsic). Type I means you misconfigured the instrument — adjust the parameters and it works. Type II means the underlying system has no recurrence structure whatsoever — no amount of tuning will extract signal from a phantom.

Chaotic systems (e.g., the Lorenz attractor) constitute an independent boundary case: structure exists, but is incompatible with fixed-scale measurement.

From this analysis, a structural property naturally emerges: when the system's trajectory maintains a persistent connected component under its own dynamics — we term this Structured Connectivity Coherence (SCC) — recurrence measurements become trustworthy.
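
To make the contrast concrete, here is a rough sketch of a temporal connectivity check. It is not the SCC condition from RN-007, which is not spelled out in this summary: `temporal_coherence`, the threshold `eps`, and the synthetic orbit are all hypothetical choices made for illustration. The check asks how often consecutive steps of a trajectory stay close to each other, and compares an ordered orbit with the same points shuffled.

```python
import numpy as np

def temporal_coherence(traj, eps):
    """Fraction of consecutive steps that stay within distance eps (hypothetical SCC proxy)."""
    steps = np.linalg.norm(np.diff(traj, axis=0), axis=1)
    return float(np.mean(steps <= eps))

rng = np.random.default_rng(1)
t = np.linspace(0.0, 6.0 * np.pi, 600)
# A trajectory with real dynamics: three laps around a circle with small noise.
orbit = np.column_stack([np.cos(t), np.sin(t)]) + 0.01 * rng.standard_normal((600, 2))
# The same points with time order destroyed: identical geometry, no dynamics.
shuffled = orbit[rng.permutation(600)]

eps = 0.1  # assumed connectivity scale
ordered_score = temporal_coherence(orbit, eps)
shuffled_score = temporal_coherence(shuffled, eps)
print("ordered :", ordered_score)
print("shuffled:", shuffled_score)
```

Both point sets have identical pairwise geometry, so a purely geometric recurrence score could not tell them apart. Only a check that follows the time order separates the ordered orbit from the shuffled one.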

Core conclusion: Stability ≠ Validity. Your instrument can produce maximally stable readings while stably measuring something that does not exist.

Dive into the Type I/II classification and SCC validity conditions: see RN-007 on the Trajectory Observatory.

X-006

Stop Micro-Managing LLMs. You Need a "Topological Circuit Breaker"

Mainstream AI alignment methods are trapped in a physical dead end: trying to "micro-manage" intelligence at the token level. By monitoring the topology of the current instead of the water droplets, we can instantly trip a circuit breaker the moment a hallucination begins.

Mainstream AI alignment methods (like Process Reward Models or token-level Chain-of-Thought monitoring) are trapped in a physical dead end: trying to "micro-manage" intelligence at the token level. This is as computationally doomed as trying to predict ocean currents by tracking the Brownian motion of individual water molecules. It faces extreme observational uncertainty.

We learned this from a "failed" experiment. In early tests, we attempted to deploy the "Information-Directed Engine" on highly microscopic tactical search tasks (formal theorem proving). The result was complete failure. At the micro-scale of single-step deduction, the signal-to-noise ratio is abysmal, and the reasoning trajectory is dominated by Markovian noise. Forcing information-theoretic interventions at this scale actually disrupted the model's highly efficient greedy pattern matching.

Yet why did the exact same engine achieve SOTA results on brutally complex codebase repairs (SWE-bench, RN-005)? The answer: cognitive control is strictly "Scale-Dependent" and works only within its "Domain of Validity".

Microscopic, short-range predictions (writing a line of code, proposing a math tactic) must be delegated to the greedy autoregressive generation of the LLM. However, macroscopic, long-range decisions (when to forage for information, when to break an infinite loop) must be governed by an external dynamical control engine.

If you can't micro-manage, how do you prevent hallucinatory collapse? Today, we introduce a fundamentally new runtime safeguard: The Topological Circuit Breaker.

We don't interfere with the "water droplets"; instead, we monitor the topological geometry of the "current". We run a Cognitive Lyapunov Function in the background, based on a rolling-window Soft Topological Return (STR). The moment an LLM's thought trajectory gets trapped in a limit cycle (a repetition loop) or diverges (a hallucination), the derivative of this function exhibits a sharp level shift within just ~10 time steps.
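
The rolling-window idea can be sketched in a few lines. This is a simplified stand-in, not the Cognitive Lyapunov Function from RN-006: it assumes a scalar monitoring signal is already available at each step, and it flags a level shift by comparing the mean of the latest window with the mean of the window before it. The names `make_breaker`, `window`, and `threshold`, and the synthetic signal, are all hypothetical.

```python
from collections import deque

def make_breaker(window=10, threshold=0.25):
    """Return a step function that trips when the rolling mean shifts by more than threshold."""
    recent = deque(maxlen=window)      # the most recent `window` readings
    previous = deque(maxlen=window)    # the `window` readings before those

    def step(value):
        if len(recent) == window:
            previous.append(recent[0])  # the oldest recent reading ages out
        recent.append(value)
        if len(previous) < window:
            return False                # not enough history yet
        shift = abs(sum(recent) / window - sum(previous) / window)
        return shift > threshold

    return step

# Synthetic monitoring signal with a regime change at step 40.
signal = [0.8] * 40 + [0.2] * 40
step = make_breaker(window=10, threshold=0.25)
tripped_at = next((i for i, v in enumerate(signal) if step(v)), None)
print("breaker tripped at step", tripped_at)
```

On this synthetic signal the breaker trips a few steps after the regime change and inside one window length. In a real agent loop, the `True` return would be the point at which generation is halted and control is handed back to the outer engine.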

Fascinatingly, our ablation studies show that simply applying a temporal window (Scale-Selectivity) gives this topological monitoring a 10x boost in separating genuine reasoning from noisy hallucinations.

Instead of waiting for the model to finish generating paragraphs of nonsense, the system instantly trips the "circuit breaker", physically severing the generation stream. It then suspends the Agent, forcing it back into "Epistemic Foraging" to gather new evidence and reshape its state space.

Intelligence is not merely a product of massive parameter counts; it is an emergent property of controlled dynamical systems operating at the correct scale.

Dive deep into the Intervention Uncertainty Law and rolling-window STR regime tracking: see RN-006 on the Trajectory Observatory.

X-005

Agents Don't Need Prompts, They Need Physics

The ReAct paradigm is a greedy trap. Cognitive control emerges naturally when you optimize for task utility combined with information gain, forcing the agent to forage when uncertainty is high.

Everyone building AI Agents is making the same mistake: trying to teach a model how to think using prompts (like "Think step by step"). This is as foolish as trying to teach water to flow downward using verbal instructions. You don't teach it; you give it a gravitational field.

The biggest failure of LLM agents is "Premature Exploitation". When they see a bug, they greedily jump straight into writing code, rather than running tests, reading logs, or foraging for information. This macro-level greed is the exact same physical phenomenon as micro-level trajectory collapse (hallucination).

Today, we are making the ReAct paradigm obsolete by introducing an Information-Directed Engine and Entropy-Gated Control into the Agent.

The Agent no longer blindly follows a prompt; it maximizes an information-theoretic objective:
Objective = Task Utility + Information Gain

When state entropy (uncertainty) is sky-high, the engine "forces" the Agent to suspend execution and engage in "Epistemic Foraging" (exploration).
As information accumulates, system entropy drops. When uncertainty falls below a critical threshold (Entropy-Gating), the Agent automatically switches to "Execution" mode.
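
A minimal sketch of the gating logic follows, under stated assumptions: the agent's uncertainty is represented as a discrete belief distribution, information gain is taken to be the reduction in Shannon entropy, and the gate value of 1 bit is arbitrary. The function names and the example beliefs are hypothetical and do not come from RN-005.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete belief distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def choose_mode(belief, gate=1.0):
    """Forage while uncertainty is above the gate, execute once it falls below."""
    return "forage" if entropy(belief) > gate else "execute"

def objective(task_utility, belief_before, belief_after):
    """Objective = Task Utility + Information Gain (gain assumed to be entropy reduction)."""
    info_gain = entropy(belief_before) - entropy(belief_after)
    return task_utility + info_gain

# Hypothetical beliefs over four candidate bug locations.
uniform = [0.25, 0.25, 0.25, 0.25]   # 2 bits of uncertainty: keep exploring
peaked = [0.9, 0.05, 0.03, 0.02]     # under 1 bit: safe to act
print(choose_mode(uniform), choose_mode(peaked))  # forage execute
```

With this framing, an exploratory action that earns no immediate task utility can still score well, because moving the belief from `uniform` to `peaked` counts as information gain.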

The timing of "when to look" and "when to act" emerges naturally from information theory. We ran a decisive test on the brutal SWE-bench Lite:

Equipped with the Information-Directed Engine, Claude Sonnet 4.6 achieved 36.6% Pass@1 (a 22% relative gain over standard ReAct). For the smaller Gemini 3.1 Flash Lite, Pass@1 rose by 166%. Interestingly, its bare "Greedy" baseline beat ReAct by 2x, further exposing the limitations of static prompting paradigms in complex tasks.

The intelligence ceiling of an agent is not dictated by the parameter count of the base model, but by its "Epistemic Architecture".

From micro residual stream control (RN-004) to macro agentic entropy-gated control (RN-005), we are proving one unified truth: Cognition is not a property of models — it is a property of controlled dynamical systems. See RN-005.

X-000

Intelligence as a Controlled Dynamical System

Intelligence is not just a parameter scaling problem. It's a physics problem. We are observing the trajectories of cognition to map out the thermodynamic laws governing reasoning and active inference.

In 2026, something strange happened in the AI world. Anthropic built Claude Mythos, and then held it back. No flashy product launch as a new flagship, no enterprise API tier. Instead, it was deployed as an internal capability profile to dramatically extend Claude's performance over long time horizons. Tasks that used to collapse at 3 hours suddenly held for 10, then a full day.

Everyone was asking: how much smarter is it? But almost no one asked: why would a company that built something this powerful choose not to release it?

Here's a possibility that keeps me up at night: maybe it didn't change the system's single-step ceiling, but its long-range "trajectory". When you enter territory where the control layer hasn't caught up, shipping raw capability without the architecture to contain it isn't innovation — it's detonation.

We've been watching the same pattern across every major agent framework. Give a frontier model a 20-second task: brilliant. Give it a 3-hour task: it drifts, planning dissolves, context contaminates itself, hallucinations compound, and the search space collapses into local loops. Intelligence kept scaling. Stability? It didn't.

Think of it this way: we kept upgrading the engine — from 100 horsepower to 1,000. But we forgot to upgrade the brakes, the steering, the suspension. Then we wonder: why does the car flip at high speed? Memory, planning, tool calling, orchestration logic — the existing scaffolds were all designed for weaker models. Models that needed help, models that could be corralled.

Before: model capability < scaffold capacity.
Now: model capability > scaffold capacity.

The smarter they get, the harder they are to control. Not because intelligence is failing, but because the "control layer" is failing. We are used to measuring intelligence as a static property: benchmarks, accuracy, pass rates. But what if the kind of intelligence that matters for autonomous agents isn't a property at all, but trajectory stability?

Then the most urgent question isn't whether the model can produce a more correct next token. It's whether the model can maintain a coherent path through a long, uncertain, branching problem space without degrading.

If this is right, the next scaling law may not be model scaling. It may be control scaling. Not bigger models, but better trajectory orchestration, uncertainty gating, state purification, and phase control.

We've been optimizing the atom. Maybe the real object we need to see is the orbit.

Living Theory

v0.3 · Core Framework

Trajectory Physics of Intelligence

Intelligence reframed as a trajectory problem rather than a parameter problem. A system's capability is bounded by the topological stability of its generative manifold. Recurrence structure carries non-redundant information invisible to output-level metrics.

Active

v0.2 · Control Theory

Information-Directed Control & Phase Transitions

Control is not about predicting the next token, but steering the entire trajectory. Thermodynamic principles applied to trajectory-level autonomy scaling.

Active

v0.1 · Concepts

Future Thickness & Question Ecology

Advanced systems protect open futures rather than minimizing error. The Markov blanket of a civilization is not isolation, but a high-entropy generation zone.

Drafting

Open Questions

Can trajectory stability explain autonomy scaling?
Does recurrence predict long-horizon collapse?
What is the Markov blanket of a civilization?

Research Queue & Open Responses
Add a perturbation. Challenge this trajectory. What blind spot am I missing?
Please format your responses using one of the following prefixes to enter the Pending Research Queue:

Observation: What unexpected dynamics did you see?
Challenge: Could this be a memory scaling artifact instead of trajectory control?
Related idea: What stranger adjacency does this connect to?
Open question: What new door does this observation open?

wutai@haelio.cc