Executive Summary
A tool-using agent is delegated access wrapped in language. Security cannot rely on the model deciding to behave. The system needs server-side authorization, constrained tools, prompt injection defenses, human approval for high-impact actions, and logs that explain what the agent saw, decided, and did.
Start With Permissions
The safest default is that the agent can do no more than the user or workload identity is allowed to do. Use least privilege for every tool. Scope tokens to the minimum data and action set. Validate authorization inside the tool endpoint, not only in the prompt or orchestration layer.
- Separate read-only tools from state-changing tools.
- Require confirmation for irreversible, financial, legal, or customer-facing actions.
- Keep secrets out of prompts, traces, and user-visible errors.
- Log tool name, user identity, input summary, authorization result, output summary, and correlation ID.
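The checks above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `Principal`, `TOOL_POLICY`, `AUDIT_LOG`, `call_tool`, and the tool and scope names are all hypothetical. It assumes the authorization decision is made inside the tool boundary on the server, that state-changing tools need an explicit confirmation flag, and that every decision, including a denial, is logged.

```python
import uuid
from dataclasses import dataclass

# All names here are hypothetical and exist only for illustration.

@dataclass(frozen=True)
class Principal:
    user_id: str
    scopes: frozenset  # scopes granted to the user or workload identity

# Each tool declares the scope it requires and whether it changes state.
TOOL_POLICY = {
    "lookup_invoice": {"scope": "invoices:read", "state_changing": False},
    "issue_refund": {"scope": "refunds:write", "state_changing": True},
}

AUDIT_LOG = []  # stand-in for a real append-only audit sink

def call_tool(principal, tool_name, args, confirmed=False):
    """Authorize at the tool boundary and record an audit entry for every call."""
    policy = TOOL_POLICY.get(tool_name)
    if policy is None:
        decision = "deny:unknown_tool"
    elif policy["scope"] not in principal.scopes:
        decision = "deny:missing_scope"
    elif policy["state_changing"] and not confirmed:
        decision = "deny:confirmation_required"
    else:
        decision = "allow"

    AUDIT_LOG.append({
        "correlation_id": str(uuid.uuid4()),
        "tool": tool_name,
        "user": principal.user_id,
        "input_summary": sorted(args.keys()),  # argument names only, never values
        "authorization": decision,
        "output_summary": None,  # would be filled in after the tool actually runs
    })
    return decision
```

Because the decision depends only on the principal's scopes and the confirmation flag, nothing the model writes in a prompt can change the outcome. Logging argument names rather than values is one way to keep secrets out of traces.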
Design Against Prompt Injection
OWASP identifies LLM and agentic AI risks including prompt injection, sensitive information disclosure, insecure output handling, excessive agency, and tool misuse. Treat retrieved documents, emails, webpages, and user uploads as untrusted input. Never let untrusted text silently redefine the agent's rules or expand its permissions.
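One way to apply this is to keep provenance attached to every piece of context and to derive permissions only from the server-side grant. The sketch below is an assumption-laden illustration: `ContextItem`, `TRUSTED_SOURCES`, `render_context`, `allowed_tools`, and the `<untrusted>` delimiter are hypothetical names, not part of any OWASP guidance or library. Delimiting untrusted text reduces the risk of injection but does not eliminate it, so the permission check matters more than the wrapper.

```python
from dataclasses import dataclass

# Hypothetical types and names, for illustration only.

@dataclass(frozen=True)
class ContextItem:
    source: str  # e.g. "system", "user_message", "retrieved_doc", "email"
    text: str

# Assumption: only these sources may carry instructions.
TRUSTED_SOURCES = {"system", "user_message"}

def render_context(items):
    """Label untrusted text as data so it is never merged into the instructions."""
    parts = []
    for item in items:
        if item.source in TRUSTED_SOURCES:
            parts.append(item.text)
        else:
            # Neutralize an embedded closing tag so the text cannot exit its wrapper.
            safe = item.text.replace("</untrusted>", "[removed]")
            parts.append(f"<untrusted source={item.source!r}>\n{safe}\n</untrusted>")
    return "\n".join(parts)

def allowed_tools(granted_tools, requested_tools):
    """Permissions come from the server-side grant; a request can only narrow it."""
    return sorted(set(granted_tools) & set(requested_tools))
```

For example, `allowed_tools({"search"}, {"search", "delete_account"})` yields only `["search"]`: text that asks for a tool outside the grant has no effect, whatever the model was persuaded to request.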
Use Guardrails, But Do Not Stop There
Guardrails can filter harmful content, mask sensitive data, block denied topics, check grounding, and detect policy violations. They are useful, but they are not a replacement for identity, authorization, data loss prevention, and audit controls. The model and its guardrails should each be one layer in a defense-in-depth design, not the whole of it.
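The layering can be illustrated with a small sketch. It assumes a hypothetical `respond` function in which the authorization result arrives from a separate server-side check, and in which a simple email-masking filter stands in for a real guardrail service. The regular expression is deliberately simplistic and would miss many forms of sensitive data.

```python
import re

# Illustrative pattern only; real guardrails cover far more than email addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def mask_sensitive(text):
    """Guardrail layer: mask email addresses in model output."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)

def respond(is_authorized, raw_output):
    """Apply independent layers; no single layer is trusted to catch everything."""
    # Layer 1: authorization, decided outside the model and outside the guardrail.
    if not is_authorized:
        return "Request denied."
    # Layer 2: the guardrail filters whatever the model produced.
    return mask_sensitive(raw_output)
```

The ordering is the point: an unauthorized request is refused before any output filtering runs, so a gap in the filter does not become a gap in access control.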
Confidence
Confidence score: 94/100. The recommendations are grounded in OWASP GenAI security guidance, OWASP agentic AI threat modeling, NIST AI RMF risk management concepts, and current cloud guardrail documentation.