Evaluating Trust and Safety of Large Language Models


Image source: LLNL; Illustration: Adobe Stock

September 3, 2024 | Originally published by Lawrence Livermore National Laboratory (LLNL) on August 15, 2024

Amid the skyrocketing popularity of large language models (LLMs), researchers at Lawrence Livermore National Laboratory are taking a closer look at how these artificial intelligence (AI) systems perform under measurable scrutiny. LLMs are generative AI tools trained on massive amounts of data to produce text-based responses to queries. The technology has the potential to accelerate scientific research in numerous ways, from cybersecurity applications to autonomous experiments. But even if a billion-parameter model has been trained on trillions of data points, can we rely on its answers?

Two Livermore co-authored papers examining LLM trustworthiness — how a model uses data and makes decisions — were accepted to the 2024 International Conference on Machine Learning, one of the world’s most prominent AI/ML conferences.

“This technology has a lot of momentum, and we can make it better and safer,” said Bhavya Kailkhura, who co-wrote both papers.
