Taxy AI — agentic threat model

AIVSS 8.0 · High

Taxy AI presents a high-risk profile: it uses GPT-4 to execute browser-level actions (such as GitHub workflows and calendar scheduling), which makes it highly susceptible to indirect prompt injection from untrusted web page content.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 8.5 · AARS uplift 0.88 · Factor sum 5.6/10 · Threat ×1.05 · Mitigation ×0.85

Autonomy of Action		0.70
Goal-Driven Planning		0.80
Self-Modification		0.10
Dynamic Tool Use		0.90
Persistent Memory		0.20
Contextual Awareness		0.80
Dynamic Identity		0.80
Multi-Agent Interactions		0.10
Non-Determinism		0.70
Opacity & Reflexivity		0.50

Scored with the canonical OWASP AIVSS formula; agentic risk factors are estimated from the agent’s described capabilities.
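As a check on the arithmetic, the short sketch below recomputes the reported score from the formula and the factor values listed above. The function names `aars` and `aivss` are illustrative choices made here, not part of any official OWASP tooling, and the rounding shown is an assumption about how the listing displays its figures.

```python
# Agentic risk factor estimates, copied from the factor list above.
FACTORS = {
    "Autonomy of Action": 0.70,
    "Goal-Driven Planning": 0.80,
    "Self-Modification": 0.10,
    "Dynamic Tool Use": 0.90,
    "Persistent Memory": 0.20,
    "Contextual Awareness": 0.80,
    "Dynamic Identity": 0.80,
    "Multi-Agent Interactions": 0.10,
    "Non-Determinism": 0.70,
    "Opacity & Reflexivity": 0.50,
}


def aars(cvss_base: float, factor_sum: float, threat_multiplier: float) -> float:
    """AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM."""
    return (10 - cvss_base) * (factor_sum / 10) * threat_multiplier


def aivss(cvss_base: float, factor_sum: float,
          threat_multiplier: float, mitigation_factor: float) -> float:
    """AIVSS = (CVSS_Base + AARS) * Mitigation_Factor."""
    uplift = aars(cvss_base, factor_sum, threat_multiplier)
    return (cvss_base + uplift) * mitigation_factor


factor_sum = sum(FACTORS.values())
uplift = aars(8.5, factor_sum, 1.05)
score = aivss(8.5, factor_sum, 1.05, 0.85)

print(f"Factor sum: {factor_sum:.1f}/10")
print(f"AARS uplift: {uplift:.2f}")
print(f"AIVSS: {score:.1f}")
```

With the listed inputs, the uplift is 1.5 × 0.56 × 1.05 = 0.882 and the final score is (8.5 + 0.882) × 0.85 ≈ 7.97, which rounds to the 8.0 shown.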

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.

L1 · Foundation Models · ✓ mapped

Uses GPT-4. Highly vulnerable to indirect prompt injection where malicious instructions embedded in web pages hijack the agent's execution flow.

L2 · Data Operations · ⚠ not certain from listing

Not certain from the listing — no explicit RAG or vector database is mentioned, though it dynamically processes active browser DOM data as its primary context.

L3 · Agent Frameworks · ✓ mapped

Translates LLM outputs into browser actions (clicks, keystrokes). Vulnerable to tool misuse if prompt injection forces the agent to perform unintended actions like deleting GitHub repos.
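To make the tool-misuse risk concrete, here is a minimal sketch of a confirmation gate placed between model output and browser actions. It is not Taxy AI's code: `BrowserAction`, `requires_confirmation`, `execute`, and the keyword list are all hypothetical names and values chosen for illustration, and a keyword match on the element label is only a rough heuristic.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical list of words that suggest a destructive UI control.
DESTRUCTIVE_KEYWORDS = ("delete", "remove", "transfer", "revoke")


@dataclass(frozen=True)
class BrowserAction:
    """Hypothetical shape of an action parsed from the model's output."""
    kind: str          # "click" or "type"
    target_label: str  # visible text or aria-label of the target element
    text: str = ""     # text to type, if any


def requires_confirmation(action: BrowserAction) -> bool:
    """Flag clicks on elements whose label suggests a destructive effect."""
    label = action.target_label.lower()
    return action.kind == "click" and any(
        word in label for word in DESTRUCTIVE_KEYWORDS
    )


def execute(action: BrowserAction,
            confirm: Callable[[BrowserAction], bool]) -> str:
    """Run the action only if it is low-risk or the user approves it."""
    if requires_confirmation(action) and not confirm(action):
        return "blocked"
    # Placeholder: a real extension would dispatch the DOM event here.
    return "executed"


def deny_all(action: BrowserAction) -> bool:
    """Stand-in for a user prompt that always refuses."""
    return False


print(execute(BrowserAction("click", "Delete this repository"), deny_all))
print(execute(BrowserAction("click", "Star"), deny_all))
```

The point of the design is that an injected instruction cannot approve itself: the decision to proceed comes from a callback outside the model's control.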

L4 · Deployment & Infrastructure · ✓ mapped

Deployed as a local browser extension. Risks include local storage exposure of API keys and potential extension sandbox escape if DOM manipulation is poorly isolated.

L5 · Evaluation & Observability · ⚠ not certain from listing

Not certain from the listing: as a research preview, it likely lacks robust real-time guardrails, execution logging, or drift detection.

L6 · Security & Compliance (cross-cutting) · ✓ mapped

Operates with the security context of the logged-in user's browser session. Lacks enterprise-grade access controls, policy enforcement, or audit trails.

L7 · Agent Ecosystem · ✓ mapped

Currently operates as a single-agent browser automation tool with no multi-agent coordination or marketplace ecosystem described.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).

These scores are auto-generated from public information (the agent's own listing, docs, and repository) using the canonical OWASP AIVSS formula and the MAESTRO framework. They are an estimate for guidance, not a penetration test, audit, or certification.