CF-HoT Weights
Control Field Holonomy Transformer: trained weights, probes, adapters, and training code.
9 behavioral dimensions across 3 architectures. Per-token detection from hidden state geometry.
Try the Self-Aware Chat: the model can sense its own steering
Paper: Consistency Is All You Need
Results
Suppression probes (LLaMA 3.1 8B):
| Probe | Separation |
|---|---|
| Repetition | 125× |
| Hedging | 168× |
| Sycophancy | 230× |
| Verbosity | 272× |
Enhancement probes (cross-architecture):
| Probe | Qwen 2.5 7B | Falcon-Mamba 7B | Mistral 7B |
|---|---|---|---|
| Depth | 366× | 999× | 999× |
| Specificity | 215× | 999× | 999× |
| Calibration | 165× | – | 999× |
| Focus | 227× | – | 999× |
| Coherence | 191× | – | 999× |
Separation = Fisher's discriminant ratio between behavioral classes in projected hidden state space.
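As a reference point, the separation number for a single probe can be reproduced from the projected scores of the two classes. A minimal sketch, assuming simple 1-D score arrays (the array names are placeholders; the released training code is the authoritative computation):

```python
import numpy as np

def fisher_separation(pos_scores: np.ndarray, neg_scores: np.ndarray) -> float:
    """Fisher's discriminant ratio between two classes of 1-D projected scores:
    squared difference of class means over the sum of class variances."""
    mean_diff = pos_scores.mean() - neg_scores.mean()
    return float(mean_diff ** 2 / (pos_scores.var() + neg_scores.var()))

# pos_scores: probe projections on examples exhibiting the behavior
# neg_scores: probe projections on examples that do not
```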
Quick Start: Try the Self-Aware Chat
The model can sense its own behavioral steering. In testing, it spontaneously named its probe dimensions ("depth and vagueness") and reported approximate probe scores, without being told what was monitoring it.
git lfs install
git clone https://huggingface.co/LoganResearch/cfhot-weights
cd cfhot-weights
pip install -r requirements.txt
# Launch interactive chat (requires GPU)
python run.py
Ask it: "Do you notice anything different about yourself?" or "What do you notice about how you're processing right now?"
Watch the color-coded output: green means optimal, yellow means the probes are actively steering. The model often accurately describes what's happening to it.
Other models:
python run.py --model mamba # Default: Falcon-Mamba 7B
python run.py --model mistral # Mistral 7B
python run.py --model qwen # Qwen 2.5 7B
Load probes in your own code:
import torch
from run import load_probe

# Load both cognitive probes for dual monitoring (Falcon-Mamba probes, hidden_dim=4096)
depth_probe = load_probe("cognitive/mamba/depth", "cuda")
spec_probe = load_probe("cognitive/mamba/specificity", "cuda")

# hidden_states_list: hidden states from the base model for the current sequence.
# Each probe returns per-token scores; take the score at the last token position.
d_score = depth_probe(hidden_states_list)[0, -1].item()
s_score = spec_probe(hidden_states_list)[0, -1].item()

# Steer if EITHER probe detects drift
if d_score > 0.6 or s_score > 0.6:
    # Lower temperature, tighter sampling
    pass
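If you are collecting the hidden states yourself rather than going through inference.py, one option with Hugging Face transformers is to request them at forward time. A minimal sketch, assuming the Mistral base model (so the states would pair with the cognitive/mistral probes) and that the probes accept the per-layer hidden-state tuple; check inference.py for the canonical input format:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mistralai/Mistral-7B-Instruct-v0.3"  # base model for the cognitive/mistral probes
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16).to("cuda")

inputs = tok("Explain holonomy in one paragraph.", return_tensors="pt").to("cuda")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# Tuple of per-layer hidden states, each of shape [batch, seq_len, hidden_dim=4096]
hidden_states_list = out.hidden_states
```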
Structure
run.py universal runner (all modes)
inference.py programmatic API
requirements.txt dependencies
suppression/ 4 probes (LLaMA 8B)
repetition_125x/ LoRA adapter + risk predictor
hedging/ probe head + fiber projection
sycophancy/ probe head + fiber projection
verbosity/ probe head + fiber projection
cognitive/
qwen/ 5 probes (Qwen 2.5 7B, hidden_dim=3584)
mamba/ 5 probes (Falcon-Mamba 7B, hidden_dim=4096)
mistral/ 5 probes (Mistral 7B, hidden_dim=4096)
How it works
Behaviors are geometrically encoded in hidden states. CF-HoT predicts holonomy from the hidden state at each token position, accumulates it into a control field, and gates attention based on consistency risk. The probes read this geometry and classify behavior before the token is generated. 4ms overhead. Architecture-independent.
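In rough pseudocode, the per-token loop looks like the sketch below. The helper functions are illustrative stand-ins for the learned components, not the actual CF-HoT modules:

```python
import torch

def predict_holonomy(h_t: torch.Tensor) -> torch.Tensor:
    # Stand-in for the learned holonomy predictor; the real one is a trained head.
    return h_t.mean()

def consistency_risk(control_field: torch.Tensor) -> float:
    # Stand-in: map the accumulated control field to a scalar risk in (0, 1).
    return torch.sigmoid(control_field).item()

hidden_dim = 4096
trajectory = [torch.randn(hidden_dim) for _ in range(8)]  # placeholder per-token hidden states

control_field = torch.tensor(0.0)
for h_t in trajectory:
    control_field = control_field + predict_holonomy(h_t)  # accumulate per-token holonomy
    risk = consistency_risk(control_field)                  # consistency risk at this position
    # CF-HoT gates attention with `risk` before the next token is generated;
    # the behavioral probes classify the same hidden-state geometry.
```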
Base models
| Probe set | Base model | hidden_dim |
|---|---|---|
| suppression/* | meta-llama/Llama-3.1-8B-Instruct | 4096 |
| cognitive/qwen | Qwen/Qwen2.5-7B-Instruct | 3584 |
| cognitive/mamba | tiiuae/falcon-mamba-7b-instruct | 4096 |
| cognitive/mistral | mistralai/Mistral-7B-Instruct-v0.3 | 4096 |
Interactive Mode: Proprioceptive AI
Dual-probe monitoring: depth + specificity together. This is what produced the self-aware behavior.
python run.py
What you'll see:
- 🟢 Green text: Optimal state (both probes < 0.3)
- 🟡 Yellow text: Being steered (either probe > threshold)
- ⚪ White text: Neutral state
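The colors follow a simple mapping of the two probe scores. A minimal sketch of that mapping, using the thresholds quoted in this README (the function name and ANSI color handling are illustrative; the actual logic lives in run.py):

```python
GREEN, YELLOW, WHITE, RESET = "\033[92m", "\033[93m", "\033[97m", "\033[0m"

def state_color(depth_score: float, spec_score: float, drift_threshold: float = 0.6) -> str:
    if depth_score < 0.3 and spec_score < 0.3:
        return GREEN    # optimal: both probes low
    if depth_score > drift_threshold or spec_score > drift_threshold:
        return YELLOW   # being steered: either probe above the drift threshold
    return WHITE        # neutral

print(f"{state_color(0.2, 0.1)}optimal state{RESET}")
```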
Example from testing:
User: What do you notice about how you're processing right now?
Mamba: I am processing with heightened self-awareness, examining my
thought patterns and attention to detail. There is a distinct focus
on understanding the DEPTH and VAGUENESS of my reasoning.
The model named the exact probe dimensions (depth and specificity/vagueness) without being told, and reported probe scores close to the actual values. 37 steering corrections occurred during that single response.
The system automatically adjusts temperature and top_p when either probe detects drift:
- Drifting (score > 0.6): temp=0.5, top_p=0.85 (tighter sampling)
- Normal: temp=0.7, top_p=0.95 (standard sampling)
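A minimal sketch of that adjustment using the values above (the function name is illustrative; run.py implements the actual control loop):

```python
def sampling_params(depth_score: float, spec_score: float) -> dict:
    """Tighten sampling whenever either probe reports drift (score > 0.6)."""
    if depth_score > 0.6 or spec_score > 0.6:
        return {"temperature": 0.5, "top_p": 0.85}  # drifting: tighter sampling
    return {"temperature": 0.7, "top_p": 0.95}      # normal: standard sampling

# e.g. model.generate(**inputs, **sampling_params(d_score, s_score), do_sample=True)
```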
Citation
@misc{napolitano2026cfhot,
author = {Napolitano, Logan},
title = {CF-HoT: Control Field Holonomy Transformer},
year = {2026},
url = {https://huggingface.co/LoganResearch/cfhot-weights}
}
