macos-cleaner — agentic threat model
The macos-cleaner agent is high-impact because it can delete files on the host filesystem; a mandatory user-confirmation gate mitigates that risk. Prompt injection and framework-level bypasses of the gate remain the critical vectors, and either could lead to accidental or malicious data loss.
OWASP AIVSS score rationale
| Agentic risk factor | Estimated value |
| --- | --- |
| Autonomy of Action | 0.30 |
| Goal-Driven Planning | 0.20 |
| Self-Modification | 0.00 |
| Dynamic Tool Use | 0.60 |
| Persistent Memory | 0.00 |
| Contextual Awareness | 0.30 |
| Dynamic Identity | 0.00 |
| Multi-Agent Interactions | 0.00 |
| Non-Determinism | 0.40 |
| Opacity & Reflexivity | 0.30 |
Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
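To see which factors dominate the estimate, the table values can be loaded and ranked. This is a minimal sketch for inspection only: the names `FACTORS`, `rank_factors`, and `nonzero_factors` are hypothetical, and the unweighted total is an assumption made for illustration. It is not the AIVSS score, whose formula is not reproduced in this listing.

```python
# Agentic risk factor estimates copied from the table above.
FACTORS = {
    "Autonomy of Action": 0.30,
    "Goal-Driven Planning": 0.20,
    "Self-Modification": 0.00,
    "Dynamic Tool Use": 0.60,
    "Persistent Memory": 0.00,
    "Contextual Awareness": 0.30,
    "Dynamic Identity": 0.00,
    "Multi-Agent Interactions": 0.00,
    "Non-Determinism": 0.40,
    "Opacity & Reflexivity": 0.30,
}


def rank_factors(factors):
    """Return (name, value) pairs sorted from highest to lowest value."""
    return sorted(factors.items(), key=lambda item: item[1], reverse=True)


def nonzero_factors(factors):
    """Return only the factors that contribute a value above zero."""
    return {name: value for name, value in factors.items() if value > 0}


ranked = rank_factors(FACTORS)

# Unweighted sum: an illustrative aggregate only, NOT the AIVSS formula.
unweighted_total = round(sum(FACTORS.values()), 2)

for name, value in ranked:
    print(f"{name:<26} {value:.2f}")
print(f"{'Unweighted total':<26} {unweighted_total:.2f}")
```

Ranked this way, Dynamic Tool Use (0.60) and Non-Determinism (0.40) stand out, which is consistent with the summary's focus on file-deletion tooling and prompt injection.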
MAESTRO 7-layer threat model
Per-layer threats for this agent, listed in MAESTRO layer order. Layers tagged “Not certain from the listing” carry general, caveated commentary because the public description did not pin down that layer.
Layer 1, Foundation Models: Not certain from the listing. The underlying LLM is not specified, but threats include prompt injection that leads to unauthorized file-deletion recommendations or attempts to bypass the confirmation gate.
Layer 2, Data Operations: Not certain from the listing. No dedicated vector store or RAG is mentioned, but the agent reads local filesystem metadata, which can expose sensitive file paths or names.
Layer 3, Agent Frameworks: The agent uses tools to scan the filesystem and delete files. Vulnerabilities in tool execution, or prompt injection that reaches the tools, could lead to directory traversal or deletion of critical system files.
Layer 4, Deployment and Infrastructure: Not certain from the listing. The agent runs on the host macOS environment. Without strict sandboxing, a compromise of this skill could lead to arbitrary local command execution or host compromise.
Layer 5, Evaluation and Observability: Not certain from the listing. No explicit logging, guardrails, or evaluation metrics are described for monitoring the safety of deletion recommendations.
Layer 6, Security and Compliance: The agent implements a Human-in-the-Loop (HITL) control that requires user confirmation before any file deletion, which mitigates unauthorized destructive actions.
Layer 7, Agent Ecosystem: Not certain from the listing. This is a standalone community skill; no multi-agent interactions or marketplace integrations are detailed.
MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).
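The tool-execution, observability, and HITL concerns above can be illustrated together. The listing does not describe how macos-cleaner implements its deletion tool, so everything below is an assumption: the function names, the `cleaner.audit` logger, and the `PROTECTED_ROOTS` deny-list are hypothetical. It is a minimal sketch of one way a deletion tool might resolve paths to block traversal, refuse protected locations, log each decision, and call a confirmation callback before deleting.

```python
import logging
from pathlib import Path
from typing import Callable, Iterable

log = logging.getLogger("cleaner.audit")  # hypothetical audit logger name

# Hypothetical deny-list of system locations that are never deletable.
PROTECTED_ROOTS = [Path("/System"), Path("/usr"), Path("/bin"), Path("/sbin")]


def is_within(path: Path, root: Path) -> bool:
    """True if path equals root or lies beneath it (both already resolved)."""
    return path == root or root in path.parents


def check_target(target: str, allowed_roots: Iterable[Path]) -> Path:
    """Resolve target; raise PermissionError unless it is strictly inside an allowed root."""
    resolved = Path(target).resolve()  # collapses ".." and follows symlinks
    roots = [Path(r).resolve() for r in allowed_roots]
    if any(is_within(resolved, p.resolve()) for p in PROTECTED_ROOTS):
        raise PermissionError(f"{resolved} is under a protected root")
    if resolved in roots or not any(is_within(resolved, r) for r in roots):
        raise PermissionError(f"{resolved} is not inside an allowed root")
    return resolved


def guarded_delete(
    target: str,
    allowed_roots: Iterable[Path],
    confirm: Callable[[Path], bool],
) -> bool:
    """Delete one regular file only if it passes the path check AND the user confirms."""
    try:
        resolved = check_target(target, allowed_roots)
    except PermissionError as exc:
        log.warning("refused: %s", exc)
        return False
    if not resolved.is_file():
        log.warning("refused: %s is not a regular file", resolved)
        return False
    if not confirm(resolved):  # human-in-the-loop gate
        log.info("declined by user: %s", resolved)
        return False
    resolved.unlink()
    log.info("deleted: %s", resolved)
    return True


def ask_on_terminal(path: Path) -> bool:
    """Example confirmation callback: anything other than 'y' counts as a refusal."""
    return input(f"Delete {path}? [y/N] ").strip().lower() == "y"
```

A caller would pass the callback explicitly, for example `guarded_delete(path, [Path.home() / "Library" / "Caches"], confirm=ask_on_terminal)`. Keeping the confirmation as a parameter of the delete function, rather than a step the model is asked to perform, means a prompt-injected plan cannot skip it without changing code. Because the path is resolved before the checks, a symlink pointing outside the allowed roots is refused.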