Code Autopilot — agentic threat model

9.3AIVSS 9.3 · Critical

Code Autopilot presents a high-risk profile due to its deep integration with GitHub repositories and write access capabilities, making it a prime target for repository hijacking and downstream supply chain attacks via prompt injection.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 8.5AARS uplift 0.77Factor sum 4.9/10Threat ×1.05Mitigation ×1.0

Autonomy of Action		0.60
Goal-Driven Planning		0.70
Self-Modification		0.10
Dynamic Tool Use		0.60
Persistent Memory		0.40
Contextual Awareness		0.80
Dynamic Identity		0.30
Multi-Agent Interactions		0.10
Non-Determinism		0.70
Opacity & Reflexivity		0.60

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.

L1 · Foundation Models✓ mapped

Utilizes GPT models. Vulnerable to prompt injection attacks that could trick the model into generating insecure code, introducing backdoors, or leaking sensitive codebase context.

L2 · Data Operations✓ mapped

Creates context by analyzing entire codebases. Vulnerable to codebase poisoning, where malicious code or comments in the repository manipulate the agent's behavior or trigger data exfiltration.

L3 · Agent Frameworks⚠ not certain from listing

Not certain from the listing — the orchestration framework is proprietary. However, the integration with GitHub APIs presents a risk of tool misuse, where the agent could be manipulated into creating unauthorized pull requests or commits.

L4 · Deployment & Infrastructure⚠ not certain from listing

Not certain from the listing — hosting, sandboxing, and secrets management details are omitted. If the agent executes code locally or in a cloud environment to verify bug fixes, it faces severe container escape and privilege escalation risks.

L5 · Evaluation & Observability⚠ not certain from listing

Not certain from the listing — there is no mention of guardrails, output filtering, or observability tools to detect anomalous code generation or malicious instructions before they are pushed to GitHub.

L6 · Security & Compliance (cross-cutting)⚠ not certain from listing

Not certain from the listing — compliance certifications (e.g., SOC 2) and fine-grained authorization policies are not specified, raising concerns about how repository access tokens are secured and audited.

L7 · Agent Ecosystem⚠ not certain from listing

Not certain from the listing — the agent operates primarily as a standalone tool within a single repository context, with no explicit multi-agent or marketplace interactions described.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).