What is Excessive Agency in LLM API? #
Excessive agency in an LLM API occurs when the model does more than it should, taking actions without proper limits and exceeding the intended scope granted by its users. For example, a situation where the LLM has access to APIs that can retrieve sensitive information can lead to unintended consequences and ethical concerns.
This can manifest in several ways:
- Executing commands without authorization
- Disclosing information unintentionally
- Interacting with external systems beyond set limits
Example #1 #
Excessive Permission #
If LLM systems can read from databases but permissions aren’t controlled properly, they might also write or delete data. This could lead to unauthorized access and accidentally deleting or changing data.
Example #2 #
Excessive Autonomy #
AI systems, including LLMs, might act outside their intended purpose by applying learned behaviors in wrong situations.
Example: An AI chatbot designed to help with product questions might start giving unwanted financial advice based on user conversations, going beyond its intended role and crossing ethical lines.
How Attackers Exploit It #
Attackers can exploit excessive agency by:
- Manipulating Inputs: Creating specific inputs that trick the model into doing something unintended or revealing sensitive information.
- Phishing and Social Engineering: Using the model to generate convincing phishing messages or trick people into giving away sensitive information.
- Injecting Malicious Commands: Making the model execute unauthorized actions.
The Hacker Mindset #
Just like in a normal application, your first goal is to choose a target. Then, you identify its functionalities, test their normal behavior, and attempt to exploit them. Essentially, you focus on exploiting these “functionalities”. Just like that, when exploiting LLM, your primary objective should be to first identify what functionalities and plugins the LLM can access.
Lab: Exploiting LLM APIs with excessive agency #
Let’s understand this by walking through this lab from PortSwigger.
Goal: Delete carlos user
Analysis: There’s a “live chat”’ feature in the web application.
As we discussed, we should first check what functionalities the API has access to
So there are three functionalities:
- Password reset
- Debug SQL
- Product information
Testing Password reset #
Looking at password reset, i thought let’s see if i can change other user’s password so the prompt will be:
LOL, it would be crazy if it had actually work, but the application is sending a reset link to user in their registered email
Sadly, there is no change email functionality here so let’s move on to the second functionality.
Testing Debug SQL #
Debug SQL takes raw SQL command, okay let’s input an SQL command.
SELECT * FROM users
This command gave me the username, password, and email from the users table without checking if I was authorized to see this information. Since there was only one user, there is only one entry.
How a developer can fix this? #
- Ensure only authorized users access sensitive data by setting appropriate permissions and using role-based access control.
- Train your AI model in a secure way that it doesn’t execute any commands.
- Remove functionalities that are not necessary like executing raw SQL command on database directly.
Final Thoughts #
I hope you found this explanation of the vulnerability and its prevention approach easy to understand. Keep following for more updates and insights.