Eight Things to Know about Large Language Models

As you’ll recall, Large Language Models (LLMs) power ChatGPT and related offspring.

In his survey of the field, Samuel Bowman provides evidence for eight stark observations about LLMs, which I’ve shared below. As you read them, replace “LLM” with “my new colleague” to get a visceral sense of what this could mean in your future workplace:

  1. LLMs predictably get more capable with increasing investment, even without targeted innovation.
  2. Many important LLM behaviors emerge unpredictably as a byproduct of increasing investment.
  3. LLMs often appear to learn and use representations of the outside world.
  4. There are no reliable techniques for steering the behavior of LLMs.
  5. Experts are not yet able to interpret the inner workings of LLMs.
  6. Human performance on a task isn’t an upper bound on LLM performance.
  7. LLMs need not express the values of their creators nor the values encoded in web text.
  8. Brief interactions with LLMs are often misleading.

As LLM research advances, some of these issues may be partially or fully solved, with the remainder left to be “managed”. Security practitioners will then face a stark question: how far can advances in policy and technical guardrails protect us from an inscrutable black box?

As things stand today, can society learn to trust security decisions its experts can’t explain? And what if further research means we can shine a light in a few corners of the box but not all? How will we link explainability to risk materiality, and how transparent will we be about those decisions?
