prompt-guard (AI-Research-SKILLs)

Agent SkillsFreeOpen Source

Safety skill for deploying Prompt Guard to detect prompt-injection and jailbreak inputs.

🛡️ AgentReady threat assessment

MAESTRO 7-layer threat model + OWASP AIVSS risk score for prompt-guard (AI-Research-SKILLs), derived from its capabilities.

AIVSS 6.2 · Medium

View MAESTRO 7-layer threat model →

Overview

A safety-alignment skill covering Meta's Prompt Guard classifier to detect prompt-injection and jailbreak attempts on LLM inputs. Surface: injects deployment guidance and writes/runs classifier code — directly relevant to agent security defenses.

Key features

Prompt-injection/jailbreak detection
Prompt Guard classifier integration
Part of the safety-alignment skill set

Use cases

Add injection detection to an LLM app
Filter malicious agent inputs