
Image by Mika Baumeister, from Unsplash
AI Chatbots Vulnerable To Memory Injection Attack
Researchers have discovered a new way to manipulate AI chatbots, raising concerns about the security of AI models with memory.
In a Rush? Here are the Quick Facts!
- Researchers from three universities developed MINJA, showing its high success in deception.
- The attack alters chatbot responses, affecting product recommendations and medical information.
- MINJA bypasses safety measures, achieving a 95% Injection Success Rate in tests.
The attack, called MINJA (Memory INJection Attack), can be carried out by simply interacting with an AI system like a regular user, without needing access to its backend, as first reported by The Register.
Developed by researchers from Michigan State University, the University of Georgia, and Singapore Management University, MINJA works by poisoning an AI’s memory through misleading prompts. Once a chatbot stores these deceptive inputs, they can alter future responses for other users.
“Nowadays, AI agents typically incorporate a memory bank which stores task queries and executions based on human feedback for future reference,” explained Zhen Xiang, an assistant professor at the University of Georgia, as reported by The Register.
“For example, after each session of ChatGPT, the user can optionally give a positive or negative rating. And this rating can help ChatGPT to decide whether or not the session information will be incorporated into their memory or database,” he added.
The researchers tested the attack on AI models powered by OpenAI’s GPT-4 and GPT-4o, including a web-shopping assistant, a healthcare chatbot, and a question-answering agent.
The Register reports that they found that MINJA could cause serious disruptions. In a healthcare chatbot, for instance, it altered patient records, associating one patient’s data with another. In an online store, it tricked the AI into showing customers the wrong products.
“In contrast, our work shows that the attack can be launched by just interacting with the agent like a regular user,” Xiang said, reports The Register. “Any user can easily affect the task execution for any other user. Therefore, we say our attack is a practical threat to LLM agents,” he added.
The attack is particularly concerning because it bypasses existing AI safety measures. The researchers reported a 95% success rate in injecting misleading information, making it a serious vulnerability for AI developers to address.
As AI models with memory become more common, the study highlights the need for stronger safeguards to prevent malicious actors from manipulating chatbots and misleading users.
Leave a Comment
Cancel