Unit tests for prompt engineering
I’m a big fan of what I call “bad guy” unit tests for software security. These help software developers quickly identify certain classes of software security vulnerabilities. A couple of simple examples: what happens if we stuff unexpected data into a search query? Or provide a JSON array where a string is expected?
The topic of unit tests for Large Language Models (LLM) came up this past week:
Unit tests for prompt engineering. Like it or not, reliable prompt engineering is going to be a critical part of tech stacks going forward. " Unit test LLMs with LLMs Tracking if your prompt or fine-tuned model is improving can be hard. During a hackathon, @florian_jue, @fekstroem, and I built “Have you been a good bot?”. It allows you to ask another LLM to judge the output of your model based on requirements."
Two quick thoughts:
- we’re back again with one AI assessing another AI. It’s not hard to see a slew of AI governance, safety and trust products emerging.
- AIs are great for generating unit tests and can easily be prompted to generate “bad guy” ones. If you work in security, it’s time to roll your sleeves up!
Related Posts
-
Upgrade your Unit Testing with ChatGPT
Companies with proprietary source code can use public AI to generate regular and adversarial unit tests without disclosing their complete source code to said AI.
-
Secure AI Unit Testing: Have Your Cake and Eat It Too
Remember when we discussed generating unit tests without exposing your full source code to an AI? Well...
-
Error messages are the new prompts
Can error messages from software teach an AI a new skill?