The Need to Be Right: Humans and AI Compared. Lessons from Why Language Models Hallucinate by Lorne Epstein
- Admin
- Sep 9
- 4 min read
I read an article about AI called Why Language Models Hallucinate, based on an academic paper of the same name. So, props to Adam Tauman Kalai, Ofir Nachum, Santosh S. Vempala, and Edwin Zhang for inspiring me to write this.
It got me thinking about how both AI and human beings strive to be right, yet are frequently incorrect. There is a palpable mix of fear and excitement about emerging AI (large language models, artificial general intelligence), and I thought I could reframe this potentially harmful technology as a source of valuable lessons for us human beings.
Both humans and AI share a striking similarity: a tendency to project confidence even when wrong. In AI, this behavior shows up as “hallucinations,” where models generate plausible but false statements (Ji et al., 2023). In humans, it manifests as overconfidence, a well-documented bias characterized by the overestimation of one’s knowledge or judgments (Moore & Healy, 2008). In both cases, the underlying issue is not just the errors themselves, but the incentives that reward guessing and confidence more than humility and uncertainty.
Humans’ Need to Be Right
From childhood, humans are socialized to associate being right with being intelligent and competent. Educational systems often reward correct answers and rarely reward admissions of uncertainty (Kuhn, 2000). In professional settings, individuals who project confidence are usually perceived as more competent, regardless of whether their judgments are accurate (Anderson et al., 2012). This creates a cultural bias toward guessing and defending answers rather than admitting gaps in knowledge. If you don’t believe this is true, ask yourself when you last publicly admitted you were incorrect. For many of us, it is a rare occurrence.
The psychological payoff is strong: being right protects one’s status and identity. But the cost is equally significant: errors compound, bad decisions spread, and learning opportunities are lost (Porter & Schumann, 2018).
AI’s Struggle with “I Don’t Know”
AI models face a similar problem. They are trained and evaluated on accuracy-based benchmarks, which reward correct guesses while giving abstentions zero credit, making guessing the more attractive strategy (OpenAI, 2023). A language model is trained on vast amounts of text to predict the next word, but that text carries no explicit labels marking statements as true or false. This makes the model good at recognizing consistent patterns, such as grammar, but prone to errors on arbitrary facts. The authors of Why Language Models Hallucinate put it this way:
“it can be easier for a small model to know its limits. For example, when asked to answer a Māori question, a small model which knows no Māori can simply say ‘I don’t know’ whereas a model that knows some Māori has to determine its confidence. As discussed in the paper, being ‘calibrated’ requires much less computation than being accurate.” (Kalai, Nachum, Vempala, & Zhang, 2023)
Like humans, AI is not inherently dishonest; it is simply responding to the structure of incentives. When the scoreboard prioritizes accuracy over calibration, both humans and machines are pushed to prioritize appearing right rather than being honest about what they don’t know.
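To make the incentive problem concrete, here is a small back-of-the-envelope sketch (my own illustration, not code or an example taken from the paper): under accuracy-only grading, where a correct answer earns one point and both wrong answers and abstentions earn zero, even a low-confidence guess has a higher expected score than admitting ignorance.

```python
# A back-of-the-envelope sketch (my own illustration, not from the paper):
# accuracy-only grading gives 1 point for a correct answer and 0 points for
# both wrong answers and abstentions ("I don't know").

def expected_score_if_guessing(p_correct: float) -> float:
    """Expected score when answering with probability p_correct of being right."""
    return 1.0 * p_correct + 0.0 * (1.0 - p_correct)

def expected_score_if_abstaining() -> float:
    """Expected score when saying 'I don't know'."""
    return 0.0

for p in (0.1, 0.3, 0.5):
    print(f"confidence {p:.0%}: guess = {expected_score_if_guessing(p):.2f}, "
          f"abstain = {expected_score_if_abstaining():.2f}")

# Even a 10%-confident guess scores higher in expectation than abstaining,
# so a model graded this way is pushed to guess rather than admit uncertainty.
```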
Incentivizing Humility
The solution for both is similar. We can take this opportunity to redesign incentives to reward humility. For humans, this could mean educational and workplace systems that give partial credit for identifying uncertainty or for being partially correct. Research has shown that intellectual humility, which involves acknowledging the limits of one’s knowledge, is associated with improved learning, greater openness to opposing views, and stronger collaboration (Porter & Schumann, 2018). Negative marking systems in standardized testing already attempt to discourage blind guessing by penalizing incorrect responses more heavily than omissions (Ben-Simon & Bennett, 2007).
For AI, researchers argue that evaluation metrics should reward abstentions and penalize confident errors more heavily than uncertainty (OpenAI, 2023). Just as humans learn more when mistakes are framed as learning opportunities rather than failures, models improve when “I don’t know” is scored as better than being confidently wrong.
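As a rough illustration of what such a metric could look like (a hypothetical scoring rule of my own, not the specific evaluation any of the cited works propose), penalizing wrong answers makes abstention the rational choice whenever confidence falls below a chosen threshold:

```python
# Hypothetical penalized scoring rule (my own sketch, not a published metric):
# +1 for a correct answer, 0 for "I don't know", and -penalty for a wrong answer.

def expected_score_if_answering(p_correct: float, penalty: float) -> float:
    """Expected score of answering when the model is p_correct sure of itself."""
    return 1.0 * p_correct - penalty * (1.0 - p_correct)

PENALTY = 3.0  # with a -3 penalty, answering only pays off above 75% confidence

for p in (0.5, 0.8, 0.95):
    score = expected_score_if_answering(p, PENALTY)
    choice = "answer" if score > 0 else "say 'I don't know'"
    print(f"confidence {p:.0%}: expected score {score:+.2f} -> {choice}")

# Below the break-even confidence the expected score of answering is negative,
# so "I don't know" (worth 0) beats a confident wrong answer.
```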
The Shared Lesson
Humans and AI are both shaped by the incentives under which they are trained. If accuracy alone is rewarded, both will guess and hallucinate. If humility and partial correctness are rewarded, both will learn to say “I don’t know” when appropriate.
In the VUCA (volatile, uncertain, complex, and ambiguous) world we live in now, where information and decisions carry increasingly high stakes, the courage to admit uncertainty may be more valuable than the need to be right. Incentivizing honesty over false certainty could help both people and machines make decisions that are not only smarter but also more trustworthy and, hopefully, more human.
References
Anderson, C., Brion, S., Moore, D. A., & Kennedy, J. A. (2012). A status-enhancement account of overconfidence. Journal of Personality and Social Psychology, 103(4), 718–735.
Ben-Simon, A., & Bennett, R. E. (2007). Toward more substantively meaningful automated essay scoring. Journal of Technology, Learning, and Assessment, 6(1).
Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., ... & Fung, P. (2023). Survey of hallucination in natural language generation. ACM Computing Surveys, 55(12), 1–38.
Kuhn, D. (2000). Metacognitive development. Current Directions in Psychological Science, 9(5), 178–181.
Moore, D. A., & Healy, P. J. (2008). The trouble with overconfidence. Psychological Review, 115(2), 502–517.
OpenAI. (2023). Improving language model truthfulness through uncertainty-aware evaluation. [Research Paper].
Porter, T., & Schumann, K. (2018). Intellectual humility and openness to the opposing view. Self and Identity, 17(2), 139–162.