ChatBot Arena evaluates LLMs under real-world scenarios

If, like me, you're skeptical about LLM benchmarks, you'll appreciate the work of LMSYS and UC Berkeley SkyLab, who built and maintain ChatBot Arena: an open, crowdsourced platform that collects human feedback and evaluates LLMs under real-world scenarios.
