skill-creator — agentic threat model

AIVSS 8.8 · High

The skill-creator agent is rated high risk (AIVSS 8.8). It is a meta-agent that executes local Python scripts and generates code (skills) for other agents, so a compromise could lead to local code execution or to the silent injection of vulnerabilities into downstream agent skills.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 7.8 · AARS uplift 0.99 · Factor sum 4.5/10 · Threat ×1.0 · Mitigation ×1.0

Autonomy of Action		0.40
Goal-Driven Planning		0.60
Self-Modification		0.80
Dynamic Tool Use		0.50
Persistent Memory		0.20
Contextual Awareness		0.40
Dynamic Identity		0.10
Multi-Agent Interactions		0.30
Non-Determinism		0.70
Opacity & Reflexivity		0.50

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
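The arithmetic behind the headline score can be checked with a short sketch. It applies the formula exactly as stated above to the values on this card; the function and variable names are illustrative and are not taken from the OWASP calculator.

```python
# Values copied from the score card above; all names here are illustrative.
CVSS_BASE = 7.8
THREAT_MULTIPLIER = 1.0   # "ThM" in the formula
MITIGATION_FACTOR = 1.0

FACTORS = {
    "Autonomy of Action": 0.40,
    "Goal-Driven Planning": 0.60,
    "Self-Modification": 0.80,
    "Dynamic Tool Use": 0.50,
    "Persistent Memory": 0.20,
    "Contextual Awareness": 0.40,
    "Dynamic Identity": 0.10,
    "Multi-Agent Interactions": 0.30,
    "Non-Determinism": 0.70,
    "Opacity & Reflexivity": 0.50,
}


def aivss_score(cvss_base, factors, thm=1.0, mitigation=1.0):
    """Return (AIVSS, AARS, factor_sum) using the formula quoted on this card."""
    factor_sum = sum(factors.values())
    # AARS scales the headroom above the CVSS base by the agentic factor sum.
    aars = (10 - cvss_base) * (factor_sum / 10) * thm
    return (cvss_base + aars) * mitigation, aars, factor_sum


score, aars, factor_sum = aivss_score(
    CVSS_BASE, FACTORS, THREAT_MULTIPLIER, MITIGATION_FACTOR
)
print(f"factor sum {factor_sum:.1f}, AARS {aars:.2f}, AIVSS {score:.1f}")
# prints: factor sum 4.5, AARS 0.99, AIVSS 8.8
```

The unrounded result is 8.79, which the card displays as 8.8.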

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.

L1 · Foundation Models · ✓ mapped

Uses foundation models to draft skills and run test prompts. Vulnerable to prompt injection during the drafting phase, which could lead to the generation of malicious skills or poisoned test cases.

L2 · Data Operations · ✓ mapped

Manipulates local files (SKILL.md) and evaluation data. Threat of data poisoning where malicious inputs manipulate the benchmark results or corrupt the skill definitions.
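One way to notice corrupted skill definitions is to record a digest of each SKILL.md before a run and compare afterwards. The sketch below is only an illustration of that idea: the helper names (`snapshot`, `changed_files`) and the directory layout are assumptions, not part of skill-creator, and a hash comparison detects unexpected changes but cannot judge whether an intended change is malicious.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot(root: Path, pattern: str = "**/SKILL.md") -> dict[str, str]:
    """Map each matching file (path relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in root.glob(pattern)
        if p.is_file()
    }


def changed_files(baseline: dict[str, str], root: Path) -> list[str]:
    """Return baseline paths that are now missing or have a different digest."""
    changed = []
    for rel, expected in baseline.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            changed.append(rel)
    return sorted(changed)


# Demo on a throwaway directory (hypothetical layout).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    skill = root / "demo-skill" / "SKILL.md"
    skill.parent.mkdir()
    skill.write_text("# Demo skill\n", encoding="utf-8")

    baseline = snapshot(root)
    unchanged = changed_files(baseline, root)

    # Simulate an unexpected edit to the skill definition.
    skill.write_text("# Demo skill\nextra injected line\n", encoding="utf-8")
    tampered = changed_files(baseline, root)

print(unchanged, tampered)
```

Files created after the baseline are not reported by this sketch; comparing a fresh `snapshot` against the baseline keys would cover that case.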

L3 · Agent Frameworks · ✓ mapped

Orchestrates a draft-test-evaluate loop and executes bundled Python scripts (e.g., generate_review.py). Threat of insecure tool execution if the script execution environment is not strictly isolated.
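As a rough illustration of reducing the blast radius of a bundled script, the sketch below launches a script in a child interpreter with a time limit, a scratch working directory, and an allowlisted environment. The function name `run_script_constrained` and the allowlist are hypothetical, and this is not a sandbox: the child still has the caller's filesystem and network access, so real isolation would need a container, VM, or OS-level sandbox.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical allowlist: only these variables are passed to the child,
# so API keys and tokens in the parent environment are not inherited.
ENV_ALLOWLIST = ("PATH", "SYSTEMROOT", "LD_LIBRARY_PATH")


def run_script_constrained(script: Path, timeout_s: float = 10.0):
    """Run a Python script with a timeout, scratch cwd, and reduced environment."""
    env = {k: os.environ[k] for k in ENV_ALLOWLIST if k in os.environ}
    with tempfile.TemporaryDirectory() as scratch:
        return subprocess.run(
            # -I: isolated mode (ignores PYTHON* variables and user site-packages).
            [sys.executable, "-I", str(script.resolve())],
            cwd=scratch,
            env=env,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # raises subprocess.TimeoutExpired on overrun
            check=False,
        )


# Demo: the child reports whether it can see a secret set in the parent.
os.environ["DEMO_API_KEY"] = "not-a-real-secret"
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo_script.py"
    demo.write_text(
        "import os\nprint('DEMO_API_KEY' in os.environ)\n", encoding="utf-8"
    )
    result = run_script_constrained(demo)

print(result.returncode, result.stdout.strip())
```

The allowlist keeps the child runnable on common platforms while dropping everything else; a stricter setup would pass an empty environment and run the interpreter inside a dedicated sandbox.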

L4 · Deployment & Infrastructure · ⚠ not certain from listing

Not certain from the listing: the deployment environment (sandbox vs. local CLI) is not specified. Running local Python scripts poses a high risk of host compromise if they execute in an unsandboxed environment.

L5 · Evaluation & Observability · ✓ mapped

Focuses heavily on evaluation and benchmarking (eval-viewer). Threat of evaluation gaming, where the agent optimizes skills to pass specific benchmark tests while introducing security regressions or functional blind spots.

L6 · Security & Compliance (cross-cutting) · ⚠ not certain from listing

Not certain from the listing — no identity, authorization, or compliance frameworks are mentioned for this open-source tool.

L7 · Agent Ecosystem · ✓ mapped

Acts as a meta-agent generating capabilities for other agents. A compromise here could lead to supply-chain style attacks, distributing vulnerable or malicious skills across an entire agent ecosystem.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).