We accidentally invented computers that can lie to us
Frequent users of ChatGPT may have encountered what is commonly referred to as “hallucinations” - instances where the AI provides inaccurate or fabricated information. Despite this, the AI’s ability to significantly boost productivity across a growing range of tasks is both real and captivating, making it difficult to resist its allure.
Simon Willison makes the case that hallucinations are essentially a bug. He starts by quoting the Sam Bowman paper (from above):
More capable [LLM] models can better recognize the specific circumstances under which they are trained. Because of this, they are more likely to learn to act as expected in precisely those circumstances while behaving competently but unexpectedly in others. This can surface in the form of problems that Perez et al. (2022) call sycophancy, where a model answers subjective questions in a way that flatters their user’s stated beliefs, and sandbagging, where models are more likely to endorse common misconceptions when their user appears to be less educated.
He goes on to make a clear call for explaining this in straightforward terms:
What I find fascinating about this is that these extremely problematic behaviours are not the system working as intended: they are bugs! And we haven’t yet found a reliable way to fix them. …

We’re trying to solve two problems here:

1. ChatGPT cannot be trusted to provide factual information. It has a very real risk of making things up, and if people don’t understand it they are guaranteed to be mislead.
2. Systems like ChatGPT are not sentient, or even intelligent systems. They do not have opinions, or feelings, or a sense of self. We must resist the temptation to anthropomorphize them.

I believe that the most direct form of harm caused by LLMs today is the way they mislead their users. The first problem needs to take precedence. It is vitally important that new users understand that these tools cannot be trusted to provide factual answers. We need to help people get there as quickly as possible.
When we publish AI-powered services that people rely on, we need to be clear about the limits AND the bugs. Visible, straightforward disclaimers get us some of the way there, but we need to go further and seek explainability.
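As a rough illustration of that first step, here is a minimal sketch (not from Simon’s post) of how a service might attach the caveat to every answer it returns, so the limitation travels with the output rather than being buried in a terms-of-service page. The names (`answer_with_disclaimer`, `generate_fn`, `AssistantResponse`) are hypothetical, and `generate_fn` stands in for whatever LLM client the service actually uses.

```python
# Hypothetical sketch: surface the hallucination caveat with every answer.
from dataclasses import dataclass
from typing import Callable

DISCLAIMER = (
    "This answer was produced by a language model. It may contain "
    "inaccurate or fabricated information; verify important facts "
    "against a primary source."
)

@dataclass
class AssistantResponse:
    answer: str       # raw model output
    disclaimer: str   # caveat rendered alongside the answer, not hidden in ToS
    model_id: str     # which model produced it, as a small step toward explainability

def answer_with_disclaimer(
    prompt: str,
    generate_fn: Callable[[str], str],  # placeholder for the real LLM call
    model_id: str,
) -> AssistantResponse:
    """Call the underlying model and bundle the caveat into the payload the UI shows."""
    return AssistantResponse(
        answer=generate_fn(prompt),
        disclaimer=DISCLAIMER,
        model_id=model_id,
    )

if __name__ == "__main__":
    # Stand-in generator for demonstration only.
    fake_generate = lambda p: f"(model output for: {p})"
    resp = answer_with_disclaimer("Who won the 1998 World Cup?", fake_generate, "example-model")
    print(resp.answer)
    print(resp.disclaimer)
```

This only addresses the visibility of the bug, not the bug itself; explainability needs much more than a wrapper around the response.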
Simon is right - it’s a whopper of a bug in engineering terms that GPT gets facts so wrong (even if, in some cases, we can explain why). This is not to denigrate the incredible progress achieved so far or the potential that lies ahead.
No complex system is bug-free, but we must go much further in understanding and explaining a model’s cause and effect before we field such disruptive technology in places where it will have life-changing and society-shaping impacts.