ChatGPT Struggles with Accurate Citations, Raising Concerns for Publishers
ChatGPT’s frequent citation errors, even with licensed content, undermine publisher trust and highlight risks of generative AI tools misrepresenting journalism.
In a Rush? Here are the Quick Facts!
- ChatGPT often fabricates or misrepresents citations, raising concerns for publishers.
- Researchers found 153 of 200 tested citations were incorrect or partially incorrect, undermining trust in ChatGPT.
- ChatGPT sometimes cites plagiarized sources, rewarding unlicensed content over original journalism.
A recent study from Columbia Journalism School’s Tow Center for Digital Journalism has cast a critical spotlight on ChatGPT’s citation practices, revealing significant challenges for publishers relying on OpenAI’s generative AI tool.
The findings suggest that publishers face potential reputational and commercial risks due to ChatGPT’s inconsistent and often inaccurate sourcing, even when licensing deals are in place.
The study tested ChatGPT’s ability to attribute quotes from 200 articles across 20 publishers, including those with licensing deals and those in litigation against OpenAI, as reported this week by Columbia Journalism Review (CJR).
Despite OpenAI’s claims of providing accurate citations, the chatbot returned incorrect or partially incorrect responses in 153 instances. Only seven times did it acknowledge its inability to locate the correct source, often opting to fabricate citations instead.
Examples include ChatGPT falsely attributing a quote from the Orlando Sentinel to Time and referencing plagiarized versions of New York Times content from unauthorized sources.
Even when publishers allowed OpenAI’s crawlers access, citations were often misattributed, for example linking to syndicated versions of articles rather than the originals.
Mat Honan, editor-in-chief of MIT Technology Review, expressed skepticism over ChatGPT’s transparency, noting that its responses could mislead users unfamiliar with AI’s limitations.
“From my perspective, I’m very familiar with chatbots’ tendencies to hallucinate and make stuff up […] But I also know that most people probably don’t know that. […] I don’t think the small disclaimers you see in these chatbots are enough,” he said, as quoted by CJR.
CJR notes that OpenAI defends its efforts, highlighting tools for publishers to manage content visibility and pledging to improve citation accuracy.
However, the Tow Center found that enabling crawlers or licensing content does not ensure accurate representation, with inconsistencies spanning both participating and non-participating publishers.
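For context, the crawler access the study refers to is typically managed through a site’s robots.txt file. As a minimal sketch of that standard mechanism (the specific rules individual publishers in the study used are not detailed here), a publisher can admit or exclude OpenAI’s GPTBot crawler like this:

```
# Allow OpenAI's GPTBot to crawl the entire site
User-agent: GPTBot
Allow: /

# Or, to block it entirely instead:
# User-agent: GPTBot
# Disallow: /
```

The Tow Center’s point is that neither setting reliably changed whether ChatGPT attributed quotes correctly.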
ChatGPT’s inaccuracies in referencing publisher content can erode trust in journalism and harm publishers’ reputations. When it misattributes or misrepresents articles, audiences may struggle to identify original sources, diluting brand recognition.
Even publishers that permit OpenAI’s crawlers or hold licensing agreements are not immune to these errors, highlighting systemic flaws. ChatGPT’s tendency to give confident answers rather than admit gaps in its knowledge misleads users and undermines transparency.
Such practices could distance audiences from credible news sources, incentivize plagiarism, and weaken the visibility of high-quality journalism. These consequences jeopardize the integrity of information-sharing and trust in digital media platforms.