How generative AI helps enforce rules within online Telegram community
@levelsio posted how generative AI helps him enforce rules within his Nomad online community on Telegram.
Every message is fed in realtime to GPT4. Estimated costs are 5USD per month (~15,000 chat messages).
Look at the rules and imagine trying to enforce them the traditional way with keyword lists:
🎒 Nomad List's GPT4-based 🤖 Nomad Bot I built can now detect identity politics discussions and immediately 🚀 nuke them from both sides
Still the #1 reason for fights breaking out
This was impossible for me to detect properly with code before GPT4, and saves a lot of time modding
I think I'll open source the Nomad Bot when it works well enough
Other stuff it detects and instantly nukes (PS this is literally just what is sent into GPT4's API, it's not much more than this and GPT4 just gets it): - links to other Whatsapp groups starting with wa.me - links to other Telegram chat groups starting with t.me - asking if anyone knows Whatsapp groups about cities - affiliate links, coupon codes, vouchers - surveys and customer research requests - startup launches (like on Product Hunt) - my home, room or apartment is for rent messages - looking for home, room or apartment for rent - identity politics - socio-political issues - United States politics - crypto ICO or shitcoin launches - job posts or recruiting messages - looking for work messages - asking for help with mental health - requests for adopting pets - asking to borrow money (even in emergencies) - people sharing their phone number
I tried with GPT3.5 API also but it doesn't understand it well enough, GPT4 makes NO mistakes
"But Craig, this is just straightforward one-shot LLM querying. It can be trivially bypassed via prompt injection so someone could self-approve their own messages"
This is all true. But I share this to encourage security people to weigh risk/reward before jumping straight to "no" just because exploitation is possible.
What's the downside risk of an offensive message getting posted in a chat room? Naturally, this will depend on the liability carried by the publishing organisation. In this context, very low.
And whilst I agree that GPT4 is harder to misdirect than GPT3.5, it's still quite trivial
Related Posts
-
Slip Through OpenAI Guardrails by Breaking up Tasks
Evading AI Guardrails: Crafting Malware with ChatGPT's Assistance
-
I will not harm you unless you harm me first
Discover early stumbles of AI-enabled Bing and what it means for the future of AI.
-
Use ChatGPT to examine every npm and PyPI package for security issues
AI-driven Socket identifies and analyzes 227 vulnerable or malicious packages in npm and PyPI repositories.