Capy — agentic threat model

AIVSS 9.6 · Critical

Capy is an autonomous coding agent that can write, test, and ship code, which gives it a high-risk profile. Without explicit sandboxing or human-in-the-loop guardrails, a compromise could lead to repository-wide supply chain attacks.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM and ThM is the threat multiplier.

CVSS base 8.8 · AARS uplift 0.83 · Factor sum 6.3/10 · Threat ×1.1 · Mitigation ×1.0

Autonomy of Action		0.80
Goal-Driven Planning		0.80
Self-Modification		0.30
Dynamic Tool Use		0.80
Persistent Memory		0.60
Contextual Awareness		0.80
Dynamic Identity		0.40
Multi-Agent Interactions		0.50
Non-Determinism		0.70
Opacity & Reflexivity		0.60

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
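
As a quick check of the arithmetic, the sketch below recomputes the score from the figures listed above. It is a minimal Python illustration of the stated formula, not the official calculator; the function names `aars` and `aivss` and the dictionary layout are assumptions made for this example.

```python
# Agentic risk factors exactly as listed in the score rationale.
FACTORS = {
    "Autonomy of Action": 0.80,
    "Goal-Driven Planning": 0.80,
    "Self-Modification": 0.30,
    "Dynamic Tool Use": 0.80,
    "Persistent Memory": 0.60,
    "Contextual Awareness": 0.80,
    "Dynamic Identity": 0.40,
    "Multi-Agent Interactions": 0.50,
    "Non-Determinism": 0.70,
    "Opacity & Reflexivity": 0.60,
}


def aars(cvss_base: float, factor_sum: float, threat_multiplier: float) -> float:
    """AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM."""
    return (10 - cvss_base) * (factor_sum / 10) * threat_multiplier


def aivss(cvss_base: float, factor_sum: float,
          threat_multiplier: float, mitigation_factor: float) -> float:
    """AIVSS = (CVSS_Base + AARS) * Mitigation_Factor."""
    uplift = aars(cvss_base, factor_sum, threat_multiplier)
    return (cvss_base + uplift) * mitigation_factor


# Inputs taken from the summary line: CVSS base 8.8, threat x1.1, mitigation x1.0.
factor_sum = sum(FACTORS.values())
uplift = aars(8.8, factor_sum, 1.1)
score = aivss(8.8, factor_sum, 1.1, 1.0)

print(round(factor_sum, 1), round(uplift, 2), round(score, 1))
```

With the listed inputs this yields a factor sum of 6.3, an AARS uplift of about 0.83, and an AIVSS of about 9.6, which matches the headline score.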

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” carry general, caveated commentary because the public description did not pin down that layer.

L1 · Foundation Models ⚠ not certain from listing

Not certain from the listing: the underlying LLM is not specified, which leaves the agent exposed to standard prompt injection, model reprogramming, or adversarial inputs that could alter code generation.

L2 · Data Operations ⚠ not certain from listing

Not certain from the listing: the agent must ingest code repositories and issue trackers, which exposes it to data poisoning or sensitive-data exfiltration if it ingests malicious code or issues.

L3 · Agent Frameworks ⚠ not certain from listing

Not certain from the listing: the orchestration framework for planning and tool calling (e.g., git, bash, compilers) is unspecified, which risks insecure tool execution or command injection via malicious issue descriptions.
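
To make the command-injection concern concrete, here is a generic Python sketch. It does not describe Capy's actual tool layer, which the listing leaves unspecified; the issue title and the commands are hypothetical. It contrasts a shell string built from untrusted text with an argument list that never passes through a shell.

```python
import shlex
import subprocess
import sys

# Hypothetical attacker-controlled text pulled from an issue tracker.
issue_title = "fix typo; echo INJECTED"

# Risky: untrusted text interpolated into a shell command string.
# If this were run with shell=True, the ';' would end the first command
# and start a second one chosen by the attacker. (Built here, never run.)
unsafe_command = f"echo {issue_title}"

# If a shell string is unavoidable, shlex.quote turns the text into one literal word.
quoted_command = f"echo {shlex.quote(issue_title)}"

# Safer: pass an argument list so no shell parses the untrusted text at all.
# The child process is just the current Python interpreter echoing argv[1].
argv = [sys.executable, "-c", "import sys; print(sys.argv[1])", issue_title]
result = subprocess.run(argv, capture_output=True, text=True, check=True)
echoed = result.stdout.strip()

print(echoed)
```

In the argument-list form the whole title reaches the child process as a single argument, so the `;` is treated as data and no second command runs.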

L4 · Deployment & Infrastructure ⚠ not certain from listing

Not certain from the listing: the execution environment (sandboxing for running and testing code) is not detailed, which poses a severe risk of host compromise or lateral movement if untrusted code is executed.

L5 · Evaluation & Observability ⚠ not certain from listing

Not certain from the listing: no details are provided on guardrails, output sanitization, or monitoring of generated code before it is committed or shipped.

L6 · Security & Compliance (cross-cutting) ⚠ not certain from listing

Not certain from the listing: there is no mention of access control, commit signing, or compliance frameworks governing the agent's write access to repositories.

L7 · Agent Ecosystem ⚠ not certain from listing

Not certain from the listing: although Capy ships features in parallel, it is unclear whether it coordinates with other specialized agents. If it does, cascading failures or unauthorized multi-agent trust escalation become possible.

MAESTRO is the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).