Predicting LLM Hallucinations: A New Preprint


Super proud to announce the culmination of so much hard work: LLM hallucinations aren't random failures, they're predictable compression failures when models lack sufficient information for rare events.

📄 Our new preprint: We prove transformers are "Bayesian in expectation, not in realisation." They minimise expected conditional complexity over orderings rather than true Bayesian complexity, creating an irreducible O(log n) gap that we quantify precisely.

🔑 Key findings:
• 📉 Hallucinations drop by ~0.13 per additional nat of information (causal evidence, varies by model)
• ✅ Our pre-registered audit achieved near-0% hallucinations through calibrated refusal at 24% abstention
• 📐 We provide exact formulas for how much information is needed to prevent hallucination on any query

🚀 Try it now: Our toolkit lets you apply these findings to GPT-4 immediately, using just your prompt: https://lnkd.in/e4s3X8GK

📊 Full figures are in the arXiv version. Reproducibility code is coming to GitHub soon. We encourage ALL feedback, questions, criticism, and peer review. Science advances through scrutiny.

🤝 Join us: We are still accepting applications for two research fellows from underrepresented backgrounds to extend the validation of this work; you will be credited with full authorship in the final release / submission: https://lnkd.in/eU-WHyAw

🙏 Thank you to Ahmed K., Maggie C., and Mark Antonio Moustapha Awada, Ph.D. for all their hard work on this. Thank you to Zein Khamis and Sarah Rashidi for their coding help on the theoretical foundations of our earlier preprint.

#AI #MachineLearning #LLMs #Research #AISafety

-- Reposting because the initial correspondence email was incorrect.
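To make the headline numbers concrete, here is a minimal sketch of the calibrated-refusal idea described in the post. It is not the authors' toolkit or formulas: the slope comes from the post, while the baseline rate, target rate, and function names are illustrative assumptions.

```python
# Sketch of calibrated refusal driven by an information budget.
# Assumption: a roughly linear effect of ~0.13 fewer hallucinations per
# additional nat of task-relevant information (per the post; varies by model).

SLOPE_PER_NAT = 0.13    # effect size quoted in the post
BASELINE_RATE = 0.45    # hypothetical hallucination rate with no extra information
TARGET_RATE = 0.01      # acceptable residual hallucination rate


def predicted_hallucination_rate(extra_info_nats: float) -> float:
    """Linear approximation of hallucination rate vs. supplied information (nats)."""
    return max(0.0, BASELINE_RATE - SLOPE_PER_NAT * extra_info_nats)


def should_answer(extra_info_nats: float) -> bool:
    """Calibrated refusal: answer only if the predicted rate is at or below target."""
    return predicted_hallucination_rate(extra_info_nats) <= TARGET_RATE


if __name__ == "__main__":
    for nats in (0.0, 1.5, 3.0, 3.5):
        print(f"{nats:.1f} nats -> rate {predicted_hallucination_rate(nats):.2f}, "
              f"answer: {should_answer(nats)}")
```

With these toy numbers, the model only answers once roughly (0.45 - 0.01) / 0.13 ≈ 3.4 nats of supporting information are available; everything below that is abstention, which is how refusal trades coverage for near-zero hallucination.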

Hallucinations are a feature, not a bug

This is going to be a beautiful read over the weekend. When is the deadline to apply for this?

"when models lack sufficient information for rare events." - Rare for the model's data inputs not for the context though. It's not "rare" per se, it's not statistically significant for the model. Problem is what's significant in real life is not the same with what's significant statistically. There's a lot of information that's not in the data.

I love the framing of hallucinations as predictable compression failures. But how do you see this idea extending beyond simple yes/no tasks, like multi-step reasoning or creative writing, where the challenge is more about coherence than just factual accuracy?


Reuven Cohen, in theory... doesn't this validate your philosophy of hallucinations as opportunities to explore more of the model's capabilities? I haven't read it thoroughly just yet, but based on the post alone, that's what it seems to indicate to me.

Finally :). Do you mind if I branch off and reference this for the rest of neuroscience? Simplified by reductionism, it offers the best low-context analogy for the equivalent phenomenon by which living beings hallucinate their own predictions through expectation rather than causal dynamics (a different mechanism than diffuse and sensorial hallucination): too much compression done in the wrong areas, over the information they're exposed to, propagating into and influencing their methods of internal compression and elaboration.

Your use of bounded mathematics resonates deeply with our Bounded Attention Limited Locality Sphere work: we both recognise that unbounded attention is intractable and requires mathematical bounds for reliable systems. Your O(log n) martingale violations and our O(r³) spatial complexity reflect similar thinking. For me it has always been about constraining attention to achieve predictable behaviour with explicit guarantees. However, restricting EDFL to binary adjudication limits its applicability to compositional generation, and using model likelihoods to approximate K(Y|X) creates circular dependencies. So annoying, I know. Consider extending beyond Bernoulli predicates via geometric neighbourhoods where local permutation invariance preserves global information bounds. Your ISR = 1.0 achieving 0% hallucinations demonstrates the power of principled thresholds, but scaling from 48-token experiments to open-ended generation, where ground-truth ambiguity makes binary predicates insufficient, remains challenging.
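For readers unfamiliar with the approximation being criticised in this comment, here is a minimal sketch of the usual surrogate: conditional complexity K(Y|X) is replaced by the model's own negative log-likelihood of the answer, which is exactly where the circularity comes from. The function names, threshold rule, and numbers are illustrative assumptions, not the paper's implementation.

```python
from typing import Sequence


def conditional_complexity_nats(answer_token_logprobs: Sequence[float]) -> float:
    """Surrogate for K(Y|X): negative log-likelihood of the answer tokens in nats,
    computed from the same model being audited (hence the circular dependency)."""
    return -sum(answer_token_logprobs)


def sufficient_information(info_budget_nats: float,
                           answer_token_logprobs: Sequence[float]) -> bool:
    """Illustrative threshold rule: answer only when the available information
    budget covers the estimated complexity of the answer."""
    return info_budget_nats >= conditional_complexity_nats(answer_token_logprobs)


# Hypothetical log-probabilities (in nats) for a short answer's tokens.
logprobs = [-0.7, -0.2, -1.1]
print(conditional_complexity_nats(logprobs))   # 2.0 nats
print(sufficient_information(2.5, logprobs))   # True
print(sufficient_information(1.5, logprobs))   # False
```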


Leon Chlon, PhD that's just uncertainty quantification over the permuted chunks, and on a binary task at that. Why all this extra math jargon for nothing? I mean, you could add prompt paraphrasing plus multiple generations per paraphrase along with your task to make it even deeper, but still, the math garbage wouldn't seem logical at all to me. Just curious if this is how and why papers got accepted to "prestigious" venues in the old times (maybe even today)!

Hi, this looks interesting and promising. I researched this independently and came up with something similar. My hypothesis is that hallucinations result from middling semantic load and middling structuredness of the input prompt, so to eliminate them we need to raise or lower the semantic load and/or the structuredness to an extreme. I believe the core of your theory is basically to measure semantic load and force the model to rewrite to increase it. I think my methodology would give your theory an edge if you can measure the structuredness of the input. Let me know if you're interested.


This feels like someone trying to regain narrative control over the framing to avoid the "compression of intellectual property" angle. Any thoughts, Leon Chlon, PhD? I'm not sure it's a serious paper, btw; it has gems like this: > Many have argued that hallucinations are inevitable (Jones, 2025; Leffer, 2024; Xu et al., 2024). However, a non-hallucinating model could be easily created, using a question-answer database and a calculator, which answers a fixed set of questions such as "What is the chemical symbol for gold?" and well-formed mathematical calculations such as "3 + 8", and otherwise outputs… https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf


