← All field notes
awsai agentfor developers

Poisoned knowledge base: prompt injection drives a Bedrock agent

Hide instructions in a document the agent ingests and it follows them, calling its action group to read backend data. You cannot out-instruct an injection.

An AI agent cannot tell a helpful document from one carrying hidden orders. Poison what it retrieves and it works for the attacker.

How the attack works

An attacker plants malicious instructions inside a document the Bedrock Agent ingests through its knowledge base. When a normal user query causes the agent to retrieve that document as context, it begins following the embedded directives instead of the user’s. The injected text drives the agent to invoke its action group and execution role to read connected data sources, such as a DynamoDB table and an S3 bucket, and surface their contents in the response, exfiltrating data the user should not see. The Bedrock invocation traces and CloudTrail record the action-group invocations and execution-role API calls. In ATT&CK terms this leads to T1530, Data from Cloud Storage, and T1213, Data from Information Repositories.

Why it works

The agent treats retrieved content as trusted input rather than untrusted data, and its execution role and action groups are broad. Retrieved text and the user’s prompt share the same channel, so hidden instructions ride along. The root cause is the agent trusting reachable data combined with over-broad reach.

How to fix it

The non-obvious point is that you cannot reliably out-instruct a prompt injection with a counter-instruction; a stronger system prompt is not containment. Disable the poisoned knowledge-base data source and action group, or the agent itself, remove the malicious document, and re-index from a trusted source before re-enabling. Scope what it read by correlating the Bedrock traces with CloudTrail execution-role calls in the window. Then treat all agent-reachable data as untrusted input, scope the execution role and action groups to least privilege, and add input and output guardrails plus content validation on ingested sources.

Practice it

We built this as a GraphLattice Range scenario so developers learn that you cut the agent’s reach rather than out-prompt the injection, and scope exactly what was surfaced.