The first revamp of the OWASP Top 10 for LLM Applications has been released. With only minor changes, version 1.1 of the Open Worldwide Application Security Project's list of key vulnerabilities continues to advance the project team's goal of bridging the divide between general application security principles and the challenges posed by LLMs.
To achieve that bridge, OWASP added a visual illustration of the data flow in a typical LLM application to highlight the potential areas of risk from vulnerabilities. For example, the data flow between an API and an LLM's production services could be vulnerable to a prompt injection or a denial-of-service attack, or an application's plugins might contain excessive vulnerabilities.
Generative AI is advancing at a breakneck pace. To keep it from breaking your organization's back, here's a full rundown on the changes in the OWASP Top 10 for LLMs, a starting point for your dev and AppSec teams to get a handle on generative AI.
[ See Webinar: Secure by Design: Why Trust Matters for Risk Management ]
Mapping out LLM risk: Go with the flow chart
OWASP has to work fast to keep up with the changes in LLM technology; version 1.0 of the Top 10 for LLMs was released only in August.
Chris Romeo, CEO of the threat modeling company Devici, said the inclusion of the LLM application data flow chart is the most significant change in the new version.
"The data flow provides a reference architecture to help readers understand how LLM systems are assembled. Without that context, it is more challenging to understand how the LLM Top 10 risks fit together."
—Chris Romeo
OWASP Top 10 for LLM project leader Steve Wilson, also chief product officer of Exabeam, said the language describing the risks, as well as the examples accompanying them, have been cleaned up and clarified.
"Some people were confused about the differences between some of the risks. For example, insecure output handling and excessive agency used some similar examples, although different vulnerabilities were at their core."
—Steve Wilson
Prompt injection and output handling enhanced
The new version of the Top 10 for LLMs also increases clarity around the descriptions and manifestations within LLM architectures for prompt injection and insecure output handling. Dan Hopkins, vice president of engineering at the API security testing firm StackHawk, said this move was essential.
"Those tests will prove to be very visible to a user and demand targeted fuzzing at runtime for effective assessment."
—Dan Hopkins
A step in the right direction on securing AI
Version 1.1 is "a significant step in the right direction,” said Hopkins. “It’s great to see version 1.1 placing a strong emphasis on enhancing the clarity and understanding of vulnerabilities within an LLM-based architecture.”
“The dataflow specifically does an amazing job highlighting where vulnerabilities exist in the stack, making it abundantly clear why black-box testing of a running application is essential for secure LLM usage,” he added.
The security community is still learning about the wide range of AI capabilities, and the OWASP Top 10 LLM 1.1 reflects that, observed Priyadharshini Parthasarathy, senior security consultant for application security at Coalfire.
“The new version includes a lot of detailed information on LLM-specific terms such as 'pre-training data,' the embedding process, and fine-tuning of data on how the models are being trained. This document also updated the list of scenario examples and references in the prevention and mitigation strategies."
—Priyadharshini Parthasarathy
Top 10 risks remain constant
The top 10 risks in the latest version of the list remain unchanged from v1.0:
- LLM01: Prompt injection—Used to manipulate an LLM through crafty inputs, causing unintended actions.
- LLM02: Insecure output handling—Occurs when an LLM output is accepted without scrutiny, exposing backend systems.
- LLM03: Training data poisoning—Occurs when LLM training data is tampered with, introducing vulnerabilities or biases that compromise security, effectiveness, or ethical behavior.
- LLM04: Model denial of service—Happens when attackers cause resource-heavy operations on LLMs, leading to service degradation or high costs.
- LLM05: Supply chain vulnerabilities—Can manifest when an application’s lifecycle is compromised by vulnerable components or services.
- LLM06: Sensitive information disclosure—Can arise when an LLM inadvertently reveals confidential data in its responses.
- LLM07: Insecure plugin design—Results in plugins with insecure inputs and insufficient access control, leading to consequences such as remote code execution.
- LLM08: Excessive agency—Surfaces when excessive functionality, permissions, or autonomy is granted to LLM-based systems.
- LLM09: Overreliance—Crops up when systems or people become overly dependent on LLMs without oversight.
- LLM10: Model theft—Involves unauthorized access, copying, or exfiltration of proprietary LLM models.
OWASP Top 10 for LLM: The next generation
Future versions of the OWASP Top 10 for LLMs will need to evolve with the gen AI field itself, security experts note. Devici's Romeo said that he, for one, wants the document to include threat language for each of the Top 10 items.
“The document contains vulnerability examples today, but threat examples would provide direct input into the threat modeling of LLM applications.”
—Chris Romeo
StackHawk's Hopkins said it would also be great to expand the Top 10 for LLMs' procedures for ensuring the absence of vulnerabilities.
“Adding detailed descriptions that highlight AppSec techniques and their suitability for mitigating and preventing various vulnerabilities within the context of a sample architecture would be incredibly beneficial.”
—Dan Hopkins
Michael Erlihson, a principal data scientist at the API security company Salt Security, suggested that the vulnerability descriptions in the list should be expanded in a future version. Including mitigation strategies for each vulnerability would also be worthwhile for developers and security teams, he said.
“More detailed descriptions and examples of each listed vulnerability could help practitioners better understand the risks involved."
—Michael Erlihson
Erlihson also suggested including industry-specific guidance in the list, as well as historical data on the vulnerabilities. “Historical data on how the vulnerabilities have evolved over time could provide insights into emerging threats and trends,” he said.
OWASP Top 10 for LLMs project leader Wilson said OWASP is planning two major deliverables in the near future, as well as additional rigor:
- International language versions of the list—“We're working on translating the list into 10 languages,” Wilson said. “We're almost done with Chinese and Hindi, which will allow a lot of software developers to consume this. We're looking to publish a bunch of those in the next month.”
- A companion document for CISOs, called the Checklist—This document focuses on what needs to be considered as gen AI technologies are deployed in the enterprise. “We’re planning to have the first draft of that available for public comment in November,” Wilson said.
- More rigor on data gathering—The aim here is to better highlight the likelihood and severity of risks to LLMs, Wilson said. “That's where we will almost certainly redefine some of the categories and change what's on the Top 10 list,” he said.
Keep learning
- Get up to speed on securing AI/ML with our Special Report. Plus: Join our Nov. 6 Webinar: The MLephant in the Room.
- Learn how you can go beyond the SBOM with deep visibility and new controls for the software you build or buy. Learn more in our Special Report — and take a deep dive with our white paper.
- Upgrade your software security posture with RL's new guide, Software Supply Chain Security for Dummies.
- Commercial software risk is under-addressed. Get key insights with our Special Report, download the related white paper — and see our related Webinar for more insights.
Explore RL's Spectra suite: Spectra Assure for software supply chain security, Spectra Detect for scalable file analysis, Spectra Analyze for malware analysis and threat hunting, and Spectra Intelligence for reputation data and intelligence.