All Stories
April 2023
-
Guardrails: Reducing Risky Outputs
Enhancing LLM Output Predictability and Safety with Structured Validation
-
AI Security is Probabilistic Security
Emergent Challenges: Prompt Injections and Ensuring AI Security in an Unpredictable Landscape
-
Obi-ChatGPT - You’re My Only Hope!
Funny Jailbreak of the Week
-
Eight Things to Know about Large Language Models
LLMs as Colleagues? 8 Observations and Future Workplace Implications
-
Reverse Engineering Neural Networks
Building Trust in AI: Seeking Mechanistic Interpretability for AI Explainability and Safety
-
We accidentally invented computers that can lie to us
Hallucinations as Bugs: AI's Double-edged Sword in Disruptive Technology and Society
-
Slip Through OpenAI Guardrails by Breaking up Tasks
Evading AI Guardrails: Crafting Malware with ChatGPT's Assistance
-
Use ChatGPT to examine every npm and PyPI package for security issues
AI-driven Socket identifies and analyzes 227 vulnerable or malicious packages in npm and PyPI repositories.
-
Introducing Microsoft Security Copilot
A closed-loop learning system for enterprise Security Operations Centers
-
Constitutional AI
Scaling Supervision for Improved Transparency and Accountability in Reinforcement Learning from Human Feedback Systems