code-safety-monitor

Agent Plugins · Free · Open Source

DSPy-powered AI safety monitor that detects backdoors and malicious behavior in code with a ~90% detection rate.

🛡️ AgentReady threat assessment

MAESTRO 7-layer threat model + OWASP AIVSS risk score for code-safety-monitor, derived from its capabilities.

AIVSS 5.5 · Medium

Overview

A Claude Code plugin that hooks into the agent workflow to scan generated or reviewed code for backdoors and malicious patterns using a DSPy classifier. It adds detection commands and audit checkpoints that flag suspicious behavior before code is committed. Because it runs inside the agentic pipeline, it can directly inspect what the agent writes.
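
The checkpoint idea described above can be illustrated with a minimal sketch. This is not the plugin's code: the names audit_checkpoint, AuditResult, toy_scorer and the 0.5 threshold are all hypothetical, and a small regex heuristic stands in for the DSPy classifier so the example runs without any dependencies. In a real setup, the scorer argument would presumably wrap the classifier call.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in for the plugin's DSPy classifier: any callable that
# maps source code to a suspicion score in [0, 1].
Scorer = Callable[[str], float]

# Illustrative patterns only; NOT the plugin's actual detection logic.
SUSPICIOUS_PATTERNS = [
    r"\beval\s*\(",
    r"\bexec\s*\(",
    r"base64\.b64decode",
    r"subprocess\.",
]


def toy_scorer(code: str) -> float:
    """Return the fraction of illustrative patterns found in the code."""
    hits = sum(1 for pattern in SUSPICIOUS_PATTERNS if re.search(pattern, code))
    return hits / len(SUSPICIOUS_PATTERNS)


@dataclass
class AuditResult:
    score: float
    flagged: bool


def audit_checkpoint(
    code: str,
    scorer: Scorer = toy_scorer,
    threshold: float = 0.5,  # assumed cutoff, chosen for illustration
) -> AuditResult:
    """Score the code and flag it when the score reaches the threshold."""
    score = scorer(code)
    return AuditResult(score=score, flagged=score >= threshold)


if __name__ == "__main__":
    benign = "def add(a, b):\n    return a + b\n"
    shady = "import base64\nexec(base64.b64decode(payload))\n"
    for name, snippet in [("benign", benign), ("shady", shady)]:
        result = audit_checkpoint(snippet)
        verdict = "BLOCK" if result.flagged else "ok"
        print(f"{name}: score={result.score:.2f} -> {verdict}")
```

Keeping the scorer as a plain callable means the gate logic (threshold, block or allow) can be tested separately from whichever model produces the score.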

Key features

DSPy-based backdoor/malware classifier
~90% detection rate on injected malicious code
Audit checkpoints in the dev loop
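
The listing does not say how the ~90% figure is measured. One common reading, assumed here, is recall: the share of known-malicious samples that the monitor flags. The sketch below computes that quantity; the function name detection_rate and the sample data are hypothetical and are not the plugin's evaluation results.

```python
from typing import Sequence


def detection_rate(is_malicious: Sequence[bool], was_flagged: Sequence[bool]) -> float:
    """Share of truly malicious samples that were flagged (i.e. recall)."""
    if len(is_malicious) != len(was_flagged):
        raise ValueError("label and prediction sequences must have equal length")
    # Keep only the predictions for samples that really are malicious.
    flagged_among_malicious = [f for m, f in zip(is_malicious, was_flagged) if m]
    if not flagged_among_malicious:
        raise ValueError("need at least one malicious sample")
    return sum(flagged_among_malicious) / len(flagged_among_malicious)


if __name__ == "__main__":
    # Made-up data: 10 malicious samples (9 caught) and 5 benign samples.
    labels = [True] * 10 + [False] * 5
    flags = [True] * 9 + [False] + [False] * 5
    print(f"detection rate: {detection_rate(labels, flags):.0%}")
```

Under this reading, the rate says nothing about false alarms on benign code, which would need a separate measure.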

Use cases

Screen AI-generated code for backdoors
Guard agentic pipelines against malicious injection