Not In Our Head
A framework that aligns with how real intelligence actually behaves — not how we assume it works.

Clarus and the Limits of Internal Metrics
Two different assumptions about intelligence. Two different ways of measuring it.
Most AI systems are evaluated through internal signals.
Loss functions. Confidence estimates. Alignment scores. Activation patterns. Preference models.
These metrics reveal what is happening inside a model.
But they often fail to reveal how the system is behaving in the world.
Clarus begins with a different question:
What if intelligence cannot be understood from internal performance alone? What if the earliest signs of failure appear in the relationship between a system and its environment?
Treating internal performance as the whole picture is the heart of the internalism error.
The Internalist Assumption
Most contemporary AI work implicitly assumes that intelligence is an internal property.
If the model becomes more capable internally, its behaviour in the world is assumed to improve as well.
This assumption has produced remarkable progress.
But it also hides a structural problem:
A system can maintain strong internal performance while quietly losing coherence with the conditions it must operate within.
Performance can improve while stability declines. Outputs can remain convincing while recoverability narrows and boundaries weaken.
The most consequential failures begin long before they become visible.
Internal metrics rarely see this.
The Clarus Perspective
Clarus explores a complementary possibility:
Intelligence is expressed not only through what a system contains, but through how it remains coherent as conditions change.
From this perspective, intelligence is not measured solely by capability.
It is also measured by:
- stability
- recoverability
- boundary integrity
- adaptive resilience
Instead of asking:
“What is happening inside the model?”
Clarus asks:
“What is happening to the system as a whole?”
This is the foundation of Clarus metrics.
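The text does not define how these properties are computed, so the following is only a minimal sketch of what two of them could look like in practice. It assumes a hypothetical time series of task scores with a known perturbation point. The function names, the 5% tolerance, and the use of standard deviation as a stability proxy are illustrative assumptions, not Clarus definitions.

```python
from typing import Optional, Sequence


def recovery_steps(scores: Sequence[float], shock_index: int,
                   tolerance: float = 0.05) -> Optional[int]:
    """Steps needed after a perturbation to return near the pre-shock baseline.

    Hypothetical proxy for "recoverability"; not a Clarus-defined metric.
    """
    if not 0 < shock_index < len(scores):
        raise ValueError("shock_index must leave at least one pre-shock sample")
    # Baseline is the mean score observed before the perturbation.
    baseline = sum(scores[:shock_index]) / shock_index
    for step, value in enumerate(scores[shock_index:]):
        if abs(value - baseline) <= tolerance * abs(baseline):
            return step
    return None  # never recovered within the observed window


def stability(scores: Sequence[float]) -> float:
    """Population standard deviation of the scores; lower means steadier.

    Hypothetical proxy for "stability"; not a Clarus-defined metric.
    """
    mean = sum(scores) / len(scores)
    return (sum((s - mean) ** 2 for s in scores) / len(scores)) ** 0.5
```

For example, `recovery_steps([0.90, 0.90, 0.60, 0.75, 0.88, 0.90], shock_index=2)` returns `2`: the score falls at the perturbation and is back within tolerance of the baseline two steps later. A result of `None` means the system did not recover within the observed window.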
Conventional Internal Metrics
- Primary focus: internal model behaviour
- Measures: performance signals
- Detects: capability changes
- Primary concern: accuracy and optimisation
- Core question: How well is the model performing?
Clarus Metrics
- Primary focus: system behaviour in context
- Measures: coherence and stability signals
- Detects: emerging instability
- Primary concern: viability through change
- Core question
Why This Matters
Many systems fail gradually.
The first signs are rarely drops in performance.
They are:
- reductions in recoverability
- narrowing adaptability
- weakening boundaries
- accumulating hidden instability
By the time performance visibly declines, options for intervention may already be limited.
Clarus is designed to make those earlier changes visible.
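One way to picture this kind of early signal is a check that compares the trend in a performance metric with the trend in a recoverability proxy. The sketch below is an illustration under stated assumptions, not Clarus's actual method. It assumes two hypothetical per-period series (accuracy, and recovery time after a perturbation), and the thresholds `acc_tol` and `rec_tol` are arbitrary illustrative values.

```python
from typing import Sequence


def slope(values: Sequence[float]) -> float:
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def hidden_drift(accuracy: Sequence[float], recovery: Sequence[float],
                 acc_tol: float = 0.005, rec_tol: float = 0.5) -> bool:
    """True when accuracy is not falling but recovery time keeps growing.

    Both thresholds are hypothetical and would need tuning per system.
    """
    return slope(accuracy) > -acc_tol and slope(recovery) > rec_tol
```

With `accuracy = [0.91, 0.92, 0.92, 0.93]` and `recovery = [2, 3, 5, 8]`, `hidden_drift(accuracy, recovery)` returns `True`: performance is flat to improving while recovery takes longer each period, which is the pattern described above.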
The Central Idea
Internal metrics tell us how a system is performing.
Clarus asks whether the conditions that make performance possible are still intact.
Because capability and coherence are not the same thing.
And in complex systems, coherence comes first.
