What is OWASP LLM05 improper output handling and how do I sanitize LLM output before a browser, shell, or SQL?
OWASP LLM05 (Improper Output Handling) occurs when downstream systems trust LLM output without validation. Depending on where that output is rendered, executed, or passed to tools, the result can be XSS, SQL injection, SSRF, remote code execution (RCE), or privilege escalation. To handle LLM output safely, treat it as untrusted input and apply the control that matches each sink before the output reaches a browser, shell, or SQL database:
- Treat model output as untrusted. The fundamental control for OWASP LLM05 is to assume that all LLM output is untrusted. This mindset guides the sanitization and validation steps that follow.
- Encode/sanitize before rendering. Before displaying LLM output in a browser, encode or sanitize it for the context in which it will appear, to prevent Cross-Site Scripting (XSS) and other rendering-based attacks. A related control is redaction: sensitive information such as API keys should be removed from error messages before the LLM sees them or before they are returned to the agent.
- Use parameterized queries for SQL. When LLM output is used to construct SQL queries, use parameterized queries to prevent SQL injection. This ensures that LLM-generated text is treated as data, not executable code.
- Schema-validate tool arguments. If LLM output is passed as arguments to tools, validate those arguments against a defined schema. This is considered the cheapest and most effective runtime check, because it can interrupt an attack simply by refusing to proceed when the arguments violate the schema. Structured output, where the LLM's response conforms to a machine-readable schema (e.g., JSON), lets downstream code parse and type-check the response robustly.
- Never eval model text. Directly evaluating LLM-generated text as code (e.g., using eval in Python) should be avoided, as this can lead to Remote Code Execution (RCE).
- Implement pre-execution guards for shell commands. Before executing any shell command derived from LLM output, run guards such as regex-based dangerous-command detection and, optionally, security scanning. For high-impact actions, human approval may be required. In container environments, some approval checks may be bypassed because the container boundary is treated as sufficient isolation.
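The controls listed above could be sketched in Python roughly as follows. This is a minimal illustration using only the standard library, not a complete defense. Every name here is hypothetical and chosen for the example: the functions render_safe, find_orders, parse_tool_args, and guard_shell, the orders table, SEARCH_TOOL_SCHEMA, ALLOWED_COMMANDS, and the DANGEROUS pattern.

```python
import html
import json
import re
import shlex
import sqlite3


def render_safe(model_text: str) -> str:
    """Browser sink: escape model text for an HTML text node or quoted attribute."""
    return html.escape(model_text, quote=True)


def find_orders(conn: sqlite3.Connection, customer_name: str) -> list:
    """SQL sink: bind model text as a parameter so it is never parsed as SQL.

    Assumes a hypothetical table orders(id, customer).
    """
    cur = conn.execute("SELECT id FROM orders WHERE customer = ?", (customer_name,))
    return [row[0] for row in cur.fetchall()]


# Hypothetical schema for a hypothetical search tool: field name -> required type.
SEARCH_TOOL_SCHEMA = {"query": str, "limit": int}


def parse_tool_args(raw_json: str, schema: dict) -> dict:
    """Tool sink: parse with a data parser (never eval) and refuse schema violations."""
    try:
        args = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        raise ValueError("tool arguments are not valid JSON") from exc
    if not isinstance(args, dict) or set(args) != set(schema):
        raise ValueError("missing or unexpected fields")
    for name, expected_type in schema.items():
        # Exact type match, so a JSON boolean is not accepted where an int is required.
        if type(args[name]) is not expected_type:
            raise ValueError(f"field {name!r} must be {expected_type.__name__}")
    return args


# Hypothetical allowlist and deny pattern; a real deployment would tune both.
ALLOWED_COMMANDS = {"ls", "cat", "echo"}
DANGEROUS = re.compile(r"[;&|`$<>\n]|\b(?:rm|sudo)\b")


def guard_shell(command_text: str) -> list:
    """Shell sink: reject dangerous patterns, then allowlist the executable.

    Returns an argv list intended for subprocess.run(argv) without shell=True.
    """
    if DANGEROUS.search(command_text):
        raise PermissionError("dangerous pattern in command")
    try:
        argv = shlex.split(command_text)
    except ValueError as exc:  # e.g. unbalanced quotes
        raise PermissionError("command could not be parsed") from exc
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        raise PermissionError("command is not on the allowlist")
    return argv
```

Note that html.escape only covers HTML text and quoted-attribute contexts; output placed inside a script block, a URL, or CSS needs encoding specific to that context. Likewise, the regex guard is a coarse first filter, so the allowlist and any human-approval step for high-impact actions remain the stronger controls.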