ship-mate — agentic threat model

AIVSS 9.6 · Critical

ship-mate presents a high-risk agentic profile due to its end-to-end write and execution surface, spanning code modification, PR creation, and local/CI test execution via Playwright.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM and ThM is the threat multiplier
CVSS base 8.5 · AARS uplift 1.07 · Factor sum 6.5/10 · Threat ×1.1 · Mitigation ×1.0
Autonomy of Action: 0.90
Goal-Driven Planning: 0.80
Self-Modification: 0.30
Dynamic Tool Use: 0.90
Persistent Memory: 0.20
Contextual Awareness: 0.70
Dynamic Identity: 0.50
Multi-Agent Interactions: 0.90
Non-Determinism: 0.70
Opacity & Reflexivity: 0.60

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
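
The arithmetic behind the headline score can be reproduced with a short sketch. It assumes that Factor_Sum is the plain sum of the ten factor values listed above and that ThM is the "Threat ×1.1" multiplier. The function and variable names are illustrative and are not taken from any OWASP tooling.

```python
# Illustrative check of the AIVSS arithmetic shown on this page.
# All names here are hypothetical; only the formula and the numbers
# come from the listing.

FACTORS = {
    "Autonomy of Action": 0.90,
    "Goal-Driven Planning": 0.80,
    "Self-Modification": 0.30,
    "Dynamic Tool Use": 0.90,
    "Persistent Memory": 0.20,
    "Contextual Awareness": 0.70,
    "Dynamic Identity": 0.50,
    "Multi-Agent Interactions": 0.90,
    "Non-Determinism": 0.70,
    "Opacity & Reflexivity": 0.60,
}


def aars(cvss_base: float, factor_sum: float, threat_multiplier: float) -> float:
    """AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM."""
    return (10 - cvss_base) * (factor_sum / 10) * threat_multiplier


def aivss(cvss_base: float, factor_sum: float,
          threat_multiplier: float, mitigation_factor: float) -> float:
    """AIVSS = (CVSS_Base + AARS) * Mitigation_Factor."""
    uplift = aars(cvss_base, factor_sum, threat_multiplier)
    return (cvss_base + uplift) * mitigation_factor


factor_sum = sum(FACTORS.values())
uplift = aars(8.5, factor_sum, 1.1)
score = aivss(8.5, factor_sum, 1.1, 1.0)

# Rounded the same way the page displays them.
print(round(factor_sum, 2), round(uplift, 2), round(score, 1))  # 6.5 1.07 9.6
```

With the listed inputs the uplift is 1.5 × 0.65 × 1.1 = 1.0725, so the unrounded score is 9.5725, which the page displays as 9.6.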

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” carry general, caveated commentary because the public description didn’t pin down that layer.

L1 · Foundation Models · ✓ mapped

Uses Claude Code (Anthropic Claude models) as its foundation. Risks include prompt injection via malicious user stories or codebase files, leading to unauthorized code generation or execution.

L2 · Data Operations · ⚠ not certain from listing

Not certain from the listing: the agent operates on local codebase files and user-provided story files. The listing does not mention vector databases or RAG, but the codebase context acts as the primary data ingestion layer.

L3 · Agent Frameworks · ✓ mapped

Orchestrates a multi-agent pipeline (orchestrator, architect, developer, reviewer, QA). Vulnerable to tool misuse and insecure tool integration, as the framework translates model outputs directly into file writes and PR creation.

L4 · Deployment & Infrastructure · ✓ mapped

Runs locally or in CI/CD environments to execute Playwright tests and edit code. This creates severe risks of container/host compromise or lateral movement if malicious code is injected and executed during the test phase.

L5 · Evaluation & Observability · ⚠ not certain from listing

Not certain from the listing — there is no mention of built-in guardrails, evaluation frameworks, or real-time observability to detect anomalous file modifications or malicious test scripts before execution.

L6 · Security & Compliance (cross-cutting) · ⚠ not certain from listing

Not certain from the listing — security controls depend entirely on the host environment's git/SSH permissions and Claude Code's underlying configuration. No native authentication or policy enforcement is described.

L7 · Agent Ecosystem · ✓ mapped

Features a highly integrated multi-agent pipeline (orchestrator -> architect -> developer -> reviewer -> QA). Vulnerable to cascading failures and trust abuse where a compromised developer agent bypasses the reviewer and QA agents.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).