OpenAI Codex SDK — agentic threat model

AIVSS 9.9 · Critical

The OpenAI Codex SDK presents a high-risk profile due to its integration into sensitive CI/CD pipelines and GitHub workflows, where autonomous code execution and orchestration could lead to severe supply chain compromises if hijacked.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 9.8 · AARS uplift 0.15 · Factor sum 6.6/10 · Threat ×1.1 · Mitigation ×1.0

Agentic risk factor         Score
Autonomy of Action          0.80
Goal-Driven Planning        0.70
Self-Modification           0.30
Dynamic Tool Use            0.90
Persistent Memory           0.60
Contextual Awareness        0.60
Dynamic Identity            0.70
Multi-Agent Interactions    0.50
Non-Determinism             0.80
Opacity & Reflexivity       0.70

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
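
The arithmetic behind the headline score can be reproduced with a minimal sketch like the one below. It applies the formula exactly as stated above to the values shown on this page. The function and variable names are illustrative assumptions and are not part of any official AIVSS tooling. No upper cap is applied because the page does not state one.

```python
# Worked example of the AIVSS formula as written on this page.
# Names below are illustrative only; the numbers are copied from the scorecard.

FACTORS = {
    "Autonomy of Action": 0.80,
    "Goal-Driven Planning": 0.70,
    "Self-Modification": 0.30,
    "Dynamic Tool Use": 0.90,
    "Persistent Memory": 0.60,
    "Contextual Awareness": 0.60,
    "Dynamic Identity": 0.70,
    "Multi-Agent Interactions": 0.50,
    "Non-Determinism": 0.80,
    "Opacity & Reflexivity": 0.70,
}


def aivss(cvss_base, factors, threat_multiplier, mitigation_factor):
    """Return (factor_sum, aars, score) using the formula stated above."""
    factor_sum = sum(factors.values())
    # AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM
    aars = (10 - cvss_base) * (factor_sum / 10) * threat_multiplier
    # AIVSS = (CVSS_Base + AARS) * Mitigation_Factor
    score = (cvss_base + aars) * mitigation_factor
    return factor_sum, aars, score


factor_sum, aars, score = aivss(
    cvss_base=9.8,
    factors=FACTORS,
    threat_multiplier=1.1,
    mitigation_factor=1.0,
)

print(f"Factor sum:  {factor_sum:.1f}")  # 6.6
print(f"AARS uplift: {aars:.2f}")        # 0.15
print(f"AIVSS:       {score:.1f}")       # 9.9
```

Because the CVSS base is already 9.8, only 0.2 points of headroom remain, so the agentic factors raise the score by about 0.15 even though they sum to 6.6 out of 10.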

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.

L1 · Foundation Models ✓ mapped

Uses OpenAI Codex models, making it susceptible to prompt injection, adversarial code generation, and model alignment issues that could result in the generation of insecure or malicious code.

L2 · Data Operations ⚠ not certain from listing

Not certain from the listing — details on data operations, vector stores, or RAG pipelines are not specified, though thread resumption implies some state/history storage.

L3 · Agent Frameworks ✓ mapped

The SDK orchestrates Codex agents using a TypeScript library and CLI. Key threats include insecure tool integration, where the agent might execute generated code or commands without proper validation, and memory poisoning via thread continuation.

L4 · Deployment & Infrastructure ✓ mapped

Integrates directly with CI/CD pipelines, CLI, and GitHub Actions. This exposes the host environment to severe threats such as container escape, privilege escalation, and unauthorized access to repository secrets or deployment environments.

L5 · Evaluation & Observability ⚠ not certain from listing

Not certain from the listing — no explicit mention of built-in evaluation, logging, guardrails, or observability tools for monitoring agent actions.

L6 · Security & Compliance (cross-cutting) ⚠ not certain from listing

Not certain from the listing — no details provided regarding compliance certifications, built-in access controls, or security policies.

L7 · Agent Ecosystem ⚠ not certain from listing

Not certain from the listing — while it mentions orchestrating agents, details on multi-agent collaboration protocols or marketplace interactions are not provided.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).

These scores are auto-generated from public information (the agent's own listing, docs, and repository) using the canonical OWASP AIVSS formula and the MAESTRO framework. They are an estimate for guidance, not a penetration test, audit, or certification.