Clippy — agentic threat model

AIVSS 8.3 · High

Clippy is a highly autonomous coding agent capable of planning, writing, and executing code, presenting a high risk of remote code execution and supply chain poisoning if its execution environment is not strictly sandboxed.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 8.5 · AARS uplift 0.77 · Factor sum 4.9/10 · Threat ×1.05 · Mitigation ×0.9

Autonomy of Action		0.70
Goal-Driven Planning		0.80
Self-Modification		0.20
Dynamic Tool Use		0.80
Persistent Memory		0.40
Contextual Awareness		0.60
Dynamic Identity		0.10
Multi-Agent Interactions		0.10
Non-Determinism		0.70
Opacity & Reflexivity		0.50

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
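The arithmetic behind the headline score can be checked with a short sketch. It applies the formula quoted above to the listed factor values; the function and variable names are illustrative and do not come from any official AIVSS tooling.

```python
# Agentic risk factors, copied from the table in this section.
FACTORS = {
    "Autonomy of Action": 0.70,
    "Goal-Driven Planning": 0.80,
    "Self-Modification": 0.20,
    "Dynamic Tool Use": 0.80,
    "Persistent Memory": 0.40,
    "Contextual Awareness": 0.60,
    "Dynamic Identity": 0.10,
    "Multi-Agent Interactions": 0.10,
    "Non-Determinism": 0.70,
    "Opacity & Reflexivity": 0.50,
}


def aivss(cvss_base, factors, threat_multiplier, mitigation_factor):
    """Return (factor_sum, aars, score) using the formula quoted in this section."""
    factor_sum = sum(factors.values())
    # AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM
    aars = (10 - cvss_base) * (factor_sum / 10) * threat_multiplier
    # AIVSS = (CVSS_Base + AARS) * Mitigation_Factor
    score = (cvss_base + aars) * mitigation_factor
    return factor_sum, aars, score


factor_sum, aars, score = aivss(8.5, FACTORS, 1.05, 0.9)
print(f"factor sum {factor_sum:.1f}, AARS {aars:.2f}, AIVSS {score:.1f}")
# prints: factor sum 4.9, AARS 0.77, AIVSS 8.3
```

Because the CVSS base is already 8.5, only 1.5 points of headroom remain for the agentic uplift, which is why a factor sum of 4.9 adds just 0.77.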

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.

L1 · Foundation Models · ⚠ not certain from listing

Not certain from the listing: the underlying foundation model is not specified. Whichever model is used, it is exposed to prompt injection, which could hijack code generation or introduce subtle backdoors into the generated software.

L2 · Data Operations · ⚠ not certain from listing

Not certain from the listing — no details on codebase indexing, vector stores, or training data operations are provided, though poisoning of the reference codebase could lead to the propagation of insecure code patterns.

L3 · Agent Frameworks · ✓ mapped

The agent framework orchestrates planning, writing, debugging, and testing. Insecure tool integration is a critical threat here, as the agent's ability to run tests and debug code autonomously implies execution capabilities that could be abused to run arbitrary shell commands.
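One common way to narrow this exposure is to validate every command the agent wants to run against an allow-list before it reaches the operating system. The sketch below is a generic illustration, not Clippy's actual tool layer (which the listing does not describe); `ALLOWED_EXECUTABLES`, `FORBIDDEN_CHARS`, and `validate_tool_command` are hypothetical names.

```python
import shlex

# Hypothetical allow-list: only the test runner and a linter may be launched.
ALLOWED_EXECUTABLES = frozenset({"pytest", "ruff"})
# Characters a shell would interpret; rejected defensively even though the
# resulting argv is meant to be executed without a shell.
FORBIDDEN_CHARS = frozenset(";|&`$<>\n")


def validate_tool_command(command: str) -> list[str]:
    """Split a requested command into argv, or raise PermissionError."""
    if any(ch in FORBIDDEN_CHARS for ch in command):
        raise PermissionError("shell metacharacters are not allowed")
    # shlex.split raises ValueError on unbalanced quotes; callers should
    # treat that as a rejection as well.
    argv = shlex.split(command)
    if not argv:
        raise PermissionError("empty command")
    if argv[0] not in ALLOWED_EXECUTABLES:
        raise PermissionError(f"executable not on allow-list: {argv[0]!r}")
    return argv
```

The returned list is intended for `subprocess.run(argv)` with the default `shell=False`. An allow-list only limits which binary starts. A test runner still executes whatever code the tests contain, so this check complements a sandbox rather than replacing it.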

L4 · Deployment & Infrastructure · ⚠ not certain from listing

Not certain from the listing — the deployment infrastructure and sandboxing mechanisms are not described. If run directly on a developer's host machine without containerization, any malicious code execution during the 'test' phase could compromise the entire host.
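As an illustration of what containerizing the test phase might look like, the sketch below builds a locked-down `docker run` command line. It assumes Docker is available and that a test-runner image exists; the image name `clippy-test-runner:latest`, the resource limits, and the function name are hypothetical, and nothing here reflects how Clippy is actually deployed.

```python
from pathlib import Path


def build_sandbox_argv(
    workspace,
    image="clippy-test-runner:latest",  # hypothetical image name
    command=("pytest", "-q", "-p", "no:cacheprovider"),
):
    """Build a docker argv that runs the test command with minimal privileges."""
    workspace = Path(workspace).resolve()
    return [
        "docker", "run", "--rm",
        "--network", "none",                    # no outbound network access
        "--read-only",                          # read-only root filesystem
        "--tmpfs", "/tmp",                      # scratch space that vanishes on exit
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block setuid-style escalation
        "--pids-limit", "256",                  # illustrative limits, tune per project
        "--memory", "1g",
        "--cpus", "1",
        "--user", "1000:1000",                  # run as a non-root user
        "-v", f"{workspace}:/workspace:ro",     # project mounted read-only
        "-w", "/workspace",
        image,
        *command,
    ]
```

The list can be passed to `subprocess.run(argv, timeout=...)`. The pytest cache plugin is disabled because the workspace is mounted read-only. A container is not a complete security boundary, but it keeps a malicious test from reading the developer's home directory or reaching the network.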

L5 · Evaluation & Observability · ⚠ not certain from listing

Not certain from the listing: it mentions no logging, evaluation, or guardrail mechanisms. Without them, operators have a blind spot regarding what code the agent executes or modifies during its autonomous cycles.
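One way to close that kind of blind spot is an append-only audit trail of agent actions. The sketch below is a generic, in-memory illustration in which each record carries a hash of the previous one, so later edits to the log are detectable. The record fields and function names are assumptions and are not taken from the listing.

```python
import hashlib
import json
import time

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first record


def _digest(body):
    """Hash a record body deterministically."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_audit_record(log, action, detail):
    """Append a hash-chained record describing one agent action."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    body = {"ts": time.time(), "action": action, "detail": detail, "prev": prev_hash}
    record = {**body, "hash": _digest(body)}
    log.append(record)
    return record


def verify_chain(log):
    """Return True if no record was altered, removed, or reordered."""
    prev_hash = GENESIS_HASH
    for record in log:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev"] != prev_hash or _digest(body) != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True
```

A typical use would be to call `append_audit_record(log, "exec", "pytest -q")` before each tool invocation and run `verify_chain(log)` during review. A real deployment would also ship records to storage the agent cannot write to, since an in-process log can be rewritten wholesale by the process it is auditing.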

L6 · Security & Compliance (cross-cutting) · ⚠ not certain from listing

Not certain from the listing — no identity, authorization, or compliance policies are defined, raising significant compliance and intellectual property risks if the agent accesses or leaks proprietary codebases.

L7 · Agent Ecosystem · ⚠ not certain from listing

Not certain from the listing — no multi-agent coordination or ecosystem marketplace interactions are described for this agent.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).