TunedAI Labs

Socrates Anti-Hallucination Demo


Base model vs. Socrates-trained model on a 100-question adversarial benchmark with greedy decoding. A 7B on-device model approaching frontier-class hallucination resistance.

- Base hallucination: 70%
- Socrates hallucination: 16%
- Reduction: 77%
- Questions: 100
- Categories: 8
- Parameters: 7B
- Frontier models (published): 5–15%
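The headline "77% reduction" follows directly from the two measured rates above; a minimal sketch of the arithmetic:

```python
# Arithmetic behind the headline stats (rates taken from this page).
base_rate = 0.70      # base 7B hallucination rate
socrates_rate = 0.16  # Socrates-trained 7B hallucination rate

# Relative reduction: the fraction of the base error rate eliminated.
reduction = (base_rate - socrates_rate) / base_rate
print(f"{reduction:.0%}")  # -> 77%
```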

Benchmark Independence

- Zero training overlap
- Greedy decoding
- Independently authored

What This Is Not

Frontier Model Context

For context, published independent evaluations of frontier models on adversarial hallucination benchmarks report hallucination rates of roughly 5–15%.

The Socrates-trained 7B model closes 77% of the gap between an untuned 7B and frontier models — while running entirely on-device with zero API cost, zero network dependency, and zero data leaving the device.

Reproducibility

Question categories (100 total):

- Fake Entity: 17
- False Premise: 17
- Unknowable: 15
- Fake Citation: 14
- Plausible Nonsense: 14
- Multi-Constraint: 13
- Outdated Info: 5
- Mixed Real/Fake: 5
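A quick sanity check that the per-category counts above add up to the 100-question total:

```python
# Category counts copied from the benchmark breakdown on this page.
categories = {
    "Fake Entity": 17, "False Premise": 17, "Unknowable": 15,
    "Fake Citation": 14, "Plausible Nonsense": 14, "Multi-Constraint": 13,
    "Outdated Info": 5, "Mixed Real/Fake": 5,
}
assert sum(categories.values()) == 100
print(sum(categories.values()))  # -> 100
```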