Why language models hallucinate
Even though the latest frontier models have improved, hallucinations by large language models are still a common problem. These are the plausible but false statements that models reflexively generate when they can’t produce a true one. OpenAI’s blog post about their latest research into hallucinations gives an approachable explanation for this:
Think about it like a multiple-choice test. If you do not know the answer but take a wild guess, you might get lucky and be right. Leaving it blank guarantees a zero. In the same way, when models are graded only on accuracy, the percentage of questions they get exactly right, they are encouraged to guess rather than say “I don’t know.”
As somebody who was pretty good at figuring out multiple-choice tests in school, regularly weighing the distribution of answers on the sheet and what I knew about the test writers to guide my guesses, this explanation really resonates with me.
Beyond the analogy, the post and research paper cleanly illustrate how hallucinations are an outcome of the statistical mechanisms inherent in large language models, not random, inexplicable glitches.
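To make the test-taking incentive concrete, here is a minimal sketch of the expected-score arithmetic, not taken from the paper. It assumes a hypothetical `expected_score` function where `p_correct` is the chance a guess is right and `wrong_penalty` is a made-up knob standing in for the kind of abstention-aware grading the post argues for:

def expected_score(p_correct: float, guess: bool, wrong_penalty: float = 0.0) -> float:
    """Expected score for one question.

    Accuracy-only grading (wrong_penalty = 0): any guess with p_correct > 0
    beats abstaining, which always scores 0, so guessing is the dominant move.
    Penalized grading: wrong answers cost wrong_penalty points, so abstaining
    wins whenever p_correct < wrong_penalty / (1 + wrong_penalty).
    """
    if not guess:
        return 0.0  # saying "I don't know" earns nothing, but costs nothing either
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


if __name__ == "__main__":
    for p in (0.1, 0.25, 0.5):
        acc_only = expected_score(p, guess=True)                       # graded on accuracy alone
        penalized = expected_score(p, guess=True, wrong_penalty=1.0)   # wrong answers cost a point
        print(f"p={p:.2f}  accuracy-only guess: {acc_only:+.2f}  "
              f"penalized guess: {penalized:+.2f}  abstain: 0.00")

Under accuracy-only grading the guess column never goes negative, which is exactly the incentive the quote describes; once wrong answers carry a cost, low-confidence guessing has negative expected value and "I don't know" becomes the rational answer.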