
Image generated with DALL·E through ChatGPT
Opinion: The AI Hallucination Epidemic, A Crisis We’re Not Ready For
Despite ongoing promises to reduce AI hallucinations, major AI tools, from ChatGPT and Perplexity to Gemini and Apple Intelligence, continue to generate false information, often with alarming consequences. Experts, including those who warn about AI risks, have fallen for fabricated content, and even advanced tools like Deep Research make up reports. The truth, it seems, remains in human hands.
Chatbots have been getting better over the years, a lot better. Yet one issue remains unsolved: the phenomenon known as "hallucinations."
Our beloved chatbots deliver brilliant-sounding answers to our queries with the determination and authority of a science-fiction Yoda, even when they are terribly wrong. And we believe them, sometimes blindly.
Scientists, experts, and even chatbot developers have been warning about hallucinations for years. Yet, while adoption spreads rapidly (OpenAI reported 400 million weekly active users just a few days ago), AI literacy hasn't kept pace.
Recent studies, court cases, and dramatic events continue to show that misinformation is even more dangerous than we realize.
It’s Worse Than We Think
At first, spotting major AI-generated errors was pretty funny, like last year's embarrassing Gemini-powered AI Overviews suggesting users add "non-toxic glue to the sauce" of a pizza recipe or recommending eating "one small rock per day." But as we regain trust in AI, the situation has escalated and become increasingly concerning.
In December, we saw Apple's AI tool "summarize" news into headlines containing fake and misleading information, including one falsely claiming that the BBC had reported Luigi Mangione shot himself. After the incident, the broadcaster filed a complaint with Apple and began researching how accurately generative AI handles news content.
The BBC's findings, published just a few days ago, revealed alarming statistics: 51% of the answers provided by popular AI chatbots contained significant issues, 13% of quotes attributed to BBC articles were altered or entirely fabricated, and 19% of answers citing BBC content introduced factual errors.
Teenagers are among the most affected groups: they often struggle to distinguish fake news from real news and are easily influenced by AI-generated content. A study published in January found that 35% of teens have been misled by fake content generated by AI models, and 22% have gone on to share fake information.
But it’s not just teenagers and distracted people falling for these hallucinations. And it’s not just Gemini or Apple Intelligence.
No AI Model Is Spared, No Industry Is Safe
The BBC's research confirms another issue: all AI models hallucinate. The study evaluated the most popular assistants, ChatGPT, Gemini, Perplexity, and Copilot, and none of them was exempt from errors. Anthropic even maintains a page addressing the issue, with suggestions on how to reduce hallucinations.
"Even the most advanced language models, like Claude, can sometimes generate text that is factually incorrect or inconsistent with the given context," the document states. Other AI companies have published similar pages with tips for avoiding fabricated content, but it isn't that easy, and the problem has remained unsolved for a long time.
Back in 2023, OpenAI announced it was working on innovative new methods to get rid of hallucinations. Spoiler alert: they are still a huge problem today.
In January 2024, over a year ago, Perplexity CEO Aravind Srinivas said the company's hallucinations were occurring primarily in unpaid accounts. "Most of the complaints are from the free version of the product," Srinivas explained, adding that the company was already bringing in more GPUs to fix the issue. Yet by October, the New York Post and Dow Jones had filed a lawsuit against Perplexity for repeatedly attributing fabricated news to their publications, and the tool the startup built for the U.S. elections was found by expert testers to contain inconsistencies, inaccurate summaries, and hallucinations.
The Hallucination Disease Is Reaching Scientific And Academic Circles
One of the biggest concerns right now is that even experts—including those warning about the risks and dangers of AI—have been falling for these hallucination-prone tools.
In December, Stanford professor Jeff Hancock, an expert on technology and misinformation, was accused of using AI to craft a court statement. Hancock had filed a 12-page declaration, containing 15 citations, in defense of Minnesota's 2023 law criminalizing election deepfakes. Two of those citations couldn't be found anywhere, because ChatGPT, the misinformation expert's preferred AI tool, had simply made them up.
Hancock, who is scheduled to teach "Truth, Trust, and Tech" this year, explained that he used OpenAI's chatbot to organize his citations, which led to the hallucinations. The researcher apologized, while standing by the substantive points of his declaration, and taught us all a valuable lesson: even the experts most knowledgeable about AI risks are susceptible to them.
Professor Hancock is not the only one to have submitted court documents containing AI-generated fabrications, of course. Another case, a lawsuit against Walmart, recently went viral because the attorneys cited fake cases generated by AI to build their argument. The issue has become so frequent in U.S. courts that the law firm Morgan & Morgan recently emailed its more than 1,000 attorneys warning them about the risks of relying on AI-generated citations, and the American Bar Association reminded its 400,000 members that attorney ethics rules also apply to AI-generated information.
Deep Research Too
One of the most popular AI tools right now is "Deep Research," designed for experts and scientists who need deeper, more complex research results. Hallucinations are not absent from this tool either, even though OpenAI's version initially required a $200-a-month Pro subscription for access.
Users on Reddit have raised concerns about this, reporting that all the popular models featuring deep research tools (Perplexity, ChatGPT, and DeepSeek) have hallucinated. Researchers and AI experts have shared similarly troubling results on other social media platforms such as X.
“The tool produced a beautifully written and argued report,” wrote one user who used OpenAI’s Deep Research tool to study math done by young people. “The only problem is that it’s all made up.”
“Deep Research made up a bunch of stats & analysis, while claiming to compile a dataset of 1000s of articles, & supposedly gather birth year info for each author from reputable sources,” shared another. “None of this is true.”
Had a sort of a funny experience with OpenAI’s Deep Research tool, which I wanted to share since I think it reveals some of the tool’s strengths and weaknesses.
— Daniel Litt (@littmath) February 18, 2025
The Truth Remains In Human Hands
Will chatbots ever stop hallucinating? AI's weak point has been evident for years. We saw it when podcasts like Planet Money tested AI-generated episodes in 2023, and we continue to see it in the most advanced models, even those aimed at expert and tech-savvy communities.
Perhaps it’s time to accept that this will remain an issue and understand that we must assume responsibility for what we create and share using AI tools.
It is concerning that, even though this seems like a well-known problem, AI risk experts themselves are falling for AI's persuasive, convincing writing. The situation becomes even more complex as adoption accelerates at full speed, outpacing digital literacy, while inconsistencies and fabricated citations multiply.
The cases where AI hallucinations get exposed tend to be those in which fact-checking is crucial, something Zuckerberg should be reminded of now that Meta has shut down its fact-checking program. This is particularly evident in courtrooms, where lawyers and judges work to verify facts and precedents, and in news media, where accuracy and source validation matter.
But what about cases where no one is scrutinizing these details? What happens in everyday, more personal contexts? Right now, millions of students are memorizing AI-generated answers for their studies, users are following AI-provided instructions to treat ailments, and others are learning about new topics, fully trusting the technology.
The consequences of this new reality we are facing are immeasurable and unpredictable, and the truth—for now—is in the hands of those who take the time to question and verify.