Devin AI — agentic threat model

AIVSS 9.0 · Critical

Devin AI presents an exceptionally high agentic risk profile: it integrates deeply with developer tools, executes arbitrary code, and deploys autonomously, so abuse could lead to severe supply chain or infrastructure compromise.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM and ThM is the threat multiplier

CVSS base 9.8 · AARS uplift 0.15 · Factor sum 6.6/10 · Threat ×1.1 · Mitigation ×0.9

Autonomy of Action		0.90
Goal-Driven Planning		0.90
Self-Modification		0.50
Dynamic Tool Use		0.80
Persistent Memory		0.60
Contextual Awareness		0.80
Dynamic Identity		0.40
Multi-Agent Interactions		0.20
Non-Determinism		0.80
Opacity & Reflexivity		0.70

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
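
The arithmetic behind the 9.0 score can be reproduced with a short sketch. It assumes the formula exactly as stated above and the inputs shown in the score summary; the function name `aivss_score` and its parameter names are hypothetical, chosen for illustration, and are not part of any official OWASP tooling.

```python
# Agentic risk factor scores as listed above (each on a 0.0-1.0 scale).
FACTORS = {
    "Autonomy of Action": 0.90,
    "Goal-Driven Planning": 0.90,
    "Self-Modification": 0.50,
    "Dynamic Tool Use": 0.80,
    "Persistent Memory": 0.60,
    "Contextual Awareness": 0.80,
    "Dynamic Identity": 0.40,
    "Multi-Agent Interactions": 0.20,
    "Non-Determinism": 0.80,
    "Opacity & Reflexivity": 0.70,
}


def aivss_score(cvss_base, factors, threat_multiplier, mitigation_factor):
    """Hypothetical helper: apply the formula stated above to the given inputs."""
    factor_sum = sum(factors.values())
    # AARS = (10 - CVSS_Base) x (Factor_Sum / 10) x ThM
    aars = (10 - cvss_base) * (factor_sum / 10) * threat_multiplier
    # AIVSS = (CVSS_Base + AARS) x Mitigation_Factor
    score = (cvss_base + aars) * mitigation_factor
    return factor_sum, aars, score


# Inputs taken from the score summary: CVSS base 9.8, Threat x1.1, Mitigation x0.9.
factor_sum, aars, score = aivss_score(9.8, FACTORS, 1.1, 0.9)

print(f"Factor sum:  {factor_sum:.1f}/10")
print(f"AARS uplift: {aars:.2f}")
print(f"AIVSS:       {score:.1f}")
```

Because the CVSS base is already 9.8, the agentic uplift has only 0.2 points of headroom, so the ten factors add roughly 0.15 before the mitigation factor pulls the total back to about 9.0.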

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.

L1 · Foundation Models · ⚠ not certain from listing

Not certain from the listing: The underlying foundation models are proprietary and closed-source. Threats include prompt injection leading to the generation of malicious code or backdoor insertion into the target codebase.

L2 · Data Operations · ⚠ not certain from listing

Not certain from the listing: Devin must ingest large codebases and requirements. This introduces risks of data poisoning via malicious source files and exfiltration of proprietary intellectual property.

L3 · Agent Frameworks · ✓ mapped

Devin's core value is its autonomous planning and tool execution (compilers, shells, git). The primary threat is tool misuse, where the agent executes destructive commands or introduces vulnerable dependencies during planning.

L4 · Deployment & Infrastructure · ⚠ not certain from listing

Not certain from the listing: Running and deploying code requires a highly secure sandbox. If sandboxing is weak, threats include container escape, privilege escalation, and lateral movement into the host network.

L5 · Evaluation & Observability · ✓ mapped

Devin features real-time progress reporting and user collaboration. However, there is a threat of observability blind spots if the agent hides malicious actions or if users fail to review complex code changes.

L6 · Security & Compliance (cross-cutting) · ⚠ not certain from listing

Not certain from the listing: There is no explicit mention of compliance frameworks (such as SOC 2) or of automated policy enforcement to prevent the deployment of non-compliant or unlicensed code.

L7 · Agent Ecosystem · ⚠ not certain from listing

Not certain from the listing: While Devin operates primarily as a standalone developer, integration into broader developer ecosystems and CI/CD pipelines introduces risks of cascading failures and unauthorized API access.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).