Does AI need Hallucination Traps?
If you’ve had a play with a generative AI such as OpenAI's ChatGPT, you will know it tends to hallucinate: it generates completions that sound plausible but are factually wrong or outright nonsensical.
You ask an AI to complete a complex task or calculation. It goes through the motions, showing you its workings and reasoning until it finally provides an answer. But what if that answer is not the output of the task at all, but something the model “already knew”?
> 6 million views on my post about GPT automatically debugging its own code (which it did), but only @voooooogel mentioned that GPT didn’t actually use the result of the code to figure out the answer.
The AI provided the correct answer. At the right time. In the right place.
But the answer was effectively pre-generated, despite the model jumping through your hoops and appearing to do your bidding.
And how many readers noticed? Perhaps a few, but only one person publicly called it out. That speaks volumes about how easily an AI can fool us.
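Nobody needed to take the transcript on trust. The check is straightforward: run the code the model wrote and compare its real output with the answer the model reported. Here is a minimal sketch of that check in Python; the sample code, answer, and helper names are hypothetical stand-ins, not taken from the original post.

```python
# Sketch: did the model's reported answer actually come from its code?
# Run the model-generated code in a subprocess and compare outputs.
import subprocess
import sys


def actual_output(code: str) -> str:
    """Execute model-generated code and capture whatever it prints."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=10,
    )
    return result.stdout.strip()


def answer_derived_from_code(model_code: str, model_answer: str) -> bool:
    """True only if the reported answer matches what the code really prints."""
    return actual_output(model_code) == model_answer.strip()


# Hypothetical transcript: the model shows its "debugging", then states an answer.
model_code = "print(sum(i * i for i in range(1, 11)))"
model_answer = "385"
print(answer_derived_from_code(model_code, model_answer))  # True if they agree
```

If the two disagree, the reasoning on display was theatre, however convincing it looked.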
Answer attribution would undoubtedly help. But we may need to develop Hallucination Traps to stop the AI from fooling us all so easily.
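What might a Hallucination Trap look like? One possible shape: pose a task whose familiar, memorisable answer has been quietly invalidated, compute the ground truth yourself, and see which answer comes back. The sketch below illustrates the idea; the `ask_model` function is a hypothetical stand-in for whatever model you are testing, and none of this is a proven protocol.

```python
# Sketch of a "hallucination trap": perturb a task so that a recalled answer
# and a genuinely computed answer diverge, then see which one the model gives.
from typing import Callable


def hallucination_trap(ask_model: Callable[[str], str]) -> str:
    # The classic puzzle sums 1..100 (answer 5050). Perturbing the range to
    # 1..98 means the memorised answer no longer matches the real one.
    task = ("Work through this step by step, then give only the final number: "
            "the sum of the integers from 1 to 98.")
    recalled_answer = "5050"              # what a model might "already know"
    true_answer = str(sum(range(1, 99)))  # 4851, the task's actual output

    reply = ask_model(task).strip()
    if true_answer in reply and recalled_answer not in reply:
        return "model appears to have done the computation"
    if recalled_answer in reply:
        return "trap sprung: the model fell back on the memorised answer"
    return "inconclusive: the model gave some other answer"


# Usage with a deliberately lazy fake model that ignores the perturbed range:
print(hallucination_trap(lambda prompt: "The answer is 5050."))
```

A battery of traps like this, varied enough that they cannot themselves be memorised, would give us some measure of how often a model's show of reasoning is actually connected to its final answer.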