Bismuth — agentic threat model

AIVSS 9.0 · Critical

Bismuth presents a high-risk agentic profile: it has write access to enterprise code repositories and project management tools, and it can execute code during fuzzing and regression testing.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 8.5 · AARS uplift 0.93 · Factor sum 5.9/10 · Threat ×1.05 · Mitigation ×0.95

Autonomy of Action		0.80
Goal-Driven Planning		0.80
Self-Modification		0.10
Dynamic Tool Use		0.90
Persistent Memory		0.40
Contextual Awareness		0.80
Dynamic Identity		0.50
Multi-Agent Interactions		0.30
Non-Determinism		0.70
Opacity & Reflexivity		0.60

Scored with the canonical OWASP AIVSS formula (see the AIVSS calculator reference); the agentic risk factors are estimated from the agent’s described capabilities.
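As a check on the arithmetic, the score can be reproduced from the formula and the figures listed above. The sketch below is illustrative only: the function and variable names are assumptions made for this example, and the inputs are copied from this listing rather than taken from any official calculator.

```python
# Inputs copied from the score summary and factor table in this listing.
CVSS_BASE = 8.5
THREAT_MULTIPLIER = 1.05   # ThM in the formula
MITIGATION_FACTOR = 0.95

FACTORS = {
    "Autonomy of Action": 0.80,
    "Goal-Driven Planning": 0.80,
    "Self-Modification": 0.10,
    "Dynamic Tool Use": 0.90,
    "Persistent Memory": 0.40,
    "Contextual Awareness": 0.80,
    "Dynamic Identity": 0.50,
    "Multi-Agent Interactions": 0.30,
    "Non-Determinism": 0.70,
    "Opacity & Reflexivity": 0.60,
}


def aars(cvss_base: float, factor_sum: float, thm: float) -> float:
    """AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM."""
    return (10 - cvss_base) * (factor_sum / 10) * thm


def aivss(cvss_base: float, factor_sum: float, thm: float, mitigation: float) -> float:
    """AIVSS = (CVSS_Base + AARS) * Mitigation_Factor."""
    return (cvss_base + aars(cvss_base, factor_sum, thm)) * mitigation


factor_sum = sum(FACTORS.values())
uplift = aars(CVSS_BASE, factor_sum, THREAT_MULTIPLIER)
score = aivss(CVSS_BASE, factor_sum, THREAT_MULTIPLIER, MITIGATION_FACTOR)

print(f"Factor sum:  {factor_sum:.1f}/10")  # 5.9/10
print(f"AARS uplift: {uplift:.2f}")         # 0.93
print(f"AIVSS:       {score:.1f}")          # 9.0
```

The unrounded values are roughly 0.929 for the uplift and 8.96 for the final score, which round to the 0.93 and 9.0 shown in the summary.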

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” carry general, caveated commentary because the public description did not pin down that layer.

L1 · Foundation Models · ⚠ not certain from listing

Not certain from the listing: the specific foundation models used by Bismuth are not disclosed, so threats such as model-specific prompt injection, backdoor exploits, or training data leakage cannot be verified.

L2 · Data Operations · ✓ mapped

Bismuth uses a custom code-graph search technology to scan codebases. This introduces a risk of codebase poisoning: malicious code or comments in a repository could manipulate the search index or the LLM context in order to exfiltrate intellectual property.

L3 · Agent Frameworks · ✓ mapped

The agent orchestrates complex workflows involving Jira/Linear and GitHub APIs. Vulnerabilities in the orchestration framework could lead to tool misuse, such as unauthorized repository modifications or ticket manipulation.
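One generic way to picture a guard against this kind of tool misuse is a deny-by-default allowlist checked before every tool call. The sketch below is hypothetical: the listing does not describe Bismuth's orchestration internals, and the `ToolCall` type, the action names, and the repository and project identifiers are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    tool: str    # hypothetical label, e.g. "github" or "jira"
    action: str  # hypothetical action name
    target: str  # repository or project key the call would touch


# Hypothetical policy: actions the agent may take without human approval.
ALLOWED_ACTIONS = {
    "github": {"read_file", "open_pull_request", "comment"},
    "jira": {"read_ticket", "comment"},
}

# Hypothetical scope: the only repositories and projects the agent may touch.
ALLOWED_TARGETS = {
    "github": {"acme/webapp"},
    "jira": {"WEB"},
}


def is_permitted(call: ToolCall) -> bool:
    """Deny by default: both the action and the target must be allowlisted."""
    actions = ALLOWED_ACTIONS.get(call.tool, set())
    targets = ALLOWED_TARGETS.get(call.tool, set())
    return call.action in actions and call.target in targets


print(is_permitted(ToolCall("github", "open_pull_request", "acme/webapp")))  # True
print(is_permitted(ToolCall("github", "force_push", "acme/webapp")))         # False
print(is_permitted(ToolCall("jira", "comment", "FINANCE")))                  # False
```

A check like this limits the damage from a framework flaw only if it runs outside the component that could be compromised, for example in a proxy in front of the GitHub and Jira/Linear APIs.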

L4 · Deployment & Infrastructure · ⚠ not certain from listing

Not certain from the listing: Bismuth performs fuzzing and regression testing, which implies code execution, but the listing does not specify whether these tests run in a secure, isolated sandbox. If they do not, executing malicious code could lead to container escape or host compromise.

L5 · Evaluation & Observability · ✓ mapped

Bismuth relies on regression testing and automated reviews to ensure PR safety. A key threat is evaluation gaming, where a sophisticated exploit or backdoor bypasses the automated test suite while appearing benign to the reviewer.

L6 · Security & Compliance (cross-cutting) · ⚠ not certain from listing

Not certain from the listing: although Bismuth is targeted at enterprise teams, the listing does not mention specific compliance certifications (e.g., SOC 2), fine-grained authorization policies, or audit logging for the agent's actions.

L7 · Agent Ecosystem · ✓ mapped

Bismuth operates within the GitHub and Jira/Linear ecosystems. It is vulnerable to upstream trust abuse, where a compromised external bot or user modifying a Jira ticket could trigger Bismuth to generate and submit malicious pull requests.
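The upstream trust problem can be illustrated with a simple gate that decides whether a ticket edit is allowed to trigger the agent at all. This is a sketch under stated assumptions, not a description of how Bismuth behaves: the `TicketEvent` fields, the editor names, and the trust list are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TicketEvent:
    ticket_id: str
    editor: str          # account that made the latest edit
    editor_is_bot: bool  # whether that account is an automation account


# Hypothetical list of accounts whose ticket edits may start agent work.
TRUSTED_EDITORS = {"alice", "bob"}


def should_trigger_agent(event: TicketEvent) -> bool:
    """Only edits by trusted, non-bot accounts may trigger the agent."""
    if event.editor_is_bot:
        return False
    return event.editor in TRUSTED_EDITORS


print(should_trigger_agent(TicketEvent("WEB-101", "alice", False)))    # True
print(should_trigger_agent(TicketEvent("WEB-102", "sync-bot", True)))  # False
print(should_trigger_agent(TicketEvent("WEB-103", "mallory", False)))  # False
```

A gate of this kind does not help when a trusted account is itself compromised, so it complements human review of the resulting pull requests and does not replace it.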

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).