Frontier Group launches for AI Safety
OpenAI, Microsoft, Google, and Anthropic, four of the leading model creators, have launched the Frontier Model Forum, an industry body "focused on ensuring safe and responsible development of frontier AI models".
The Forum defines frontier models as large-scale machine-learning models that exceed the capabilities currently present in the most advanced existing models, and can perform a wide variety of tasks.
Naturally, only a handful of big tech companies have the resources and talent to develop frontier models. The Forum's stated goals are:
- Identifying best practices: Promote knowledge sharing and best practices among industry, governments, civil society, and academia, with a focus on safety standards and safety practices to mitigate a wide range of potential risks.
- Advancing AI safety research: Support the AI safety ecosystem by identifying the most important open research questions on AI safety. The Forum will coordinate research to progress these efforts in areas such as adversarial robustness, mechanistic interpretability, scalable oversight, independent research access, emergent behaviors, and anomaly detection. There will be a strong initial focus on developing and sharing a public library of technical evaluations and benchmarks for frontier AI models.
- Facilitating information sharing among companies and governments: Establish trusted, secure mechanisms for sharing information among companies, governments, and relevant stakeholders regarding AI safety and risks. The Forum will follow best practices in responsible disclosure from areas such as cybersecurity.
Meta, which in July 2023 released Llama 2, the second generation of its open-source Large Language Model (including for commercial use), is notably absent from this group.