UK’S NCSC warns over chatbot risks

The UK’s National Cyber Security Centre (NCSC) has issued a stark warning of the potential cyber risk from businesses introducing artificial intelligence (AI) chatbots.

The NCSC said that large language models (LLMs) such as ChatGPT and Google Bard  warrant some caution as a result of the growing cybersecurity risks of individuals manipulating the prompts through “prompt injection” attacks.

Such attacks are when a user creates an input designed to make the model behave in an unintended way, said the NCSC. This could mean causing it to generate offensive content, reveal confidential information, or trigger unintended consequences in a system that accepts unchecked input from the LLM.

Hundreds of examples of prompt injection attacks have been published, the NCSC warned. 

The NCSC cited one example, where a prompt injection attack was demonstrated against MathGPT, a model designed to convert natural language queries into code for performing mathematical operations. 

A security researcher identified that the model worked by evaluating user-submitted text as code, and used that knowledge to gain access to the system hosting the model. This allowed them to extract a sensitive API key before disclosing the attack.

According to the NCSC, prompt injection attacks can also cause real-world consequences if systems are not designed with security. The vulnerability of chatbots and the ease with which prompts can be manipulated could cause attacks, scams and data theft.

The NCSC said: “Prompt injection and data poisoning attacks can be extremely difficult to detect and mitigate.”

“However, no model exists in isolation, so what we can do is design the whole system with security in mind. That is, by being aware of the risks associated with the ML component, we can design the system in such a way as to prevent exploitation of vulnerabilities leading to catastrophic failure.

“A simple example would be applying a rules-based system on top of the ML model to prevent it from taking damaging actions, even when prompted to do so.”