Hebbia AI — agentic threat model
Hebbia AI carries a high-consequence risk profile because it works directly with sensitive financial and legal enterprise data. Its emphasis on transparency, citations, and enterprise-grade security controls partly offsets that risk.
OWASP AIVSS score rationale
| Agentic risk factor | Score |
| --- | --- |
| Autonomy of Action | 0.70 |
| Goal-Driven Planning | 0.80 |
| Self-Modification | 0.10 |
| Dynamic Tool Use | 0.50 |
| Persistent Memory | 0.60 |
| Contextual Awareness | 0.80 |
| Dynamic Identity | 0.20 |
| Multi-Agent Interactions | 0.50 |
| Non-Determinism | 0.50 |
| Opacity & Reflexivity | 0.30 |
Scored with the canonical OWASP AIVSS formula (see the AIVSS calculator reference); the agentic risk factors are estimated from the agent’s described capabilities.
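As a rough illustration of how the ten factor scores listed above can be read together, the sketch below sums them into a single aggregate on a 0 to 10 scale and reports the mean and the highest-scoring factors. This is an assumption made for illustration: the listing does not reproduce the canonical AIVSS formula, which may combine further inputs that are not given here, so the aggregate is not the AIVSS score itself. The names `FACTORS` and `aggregate_factors` are hypothetical.

```python
# Agentic risk factor scores as listed above (each on a 0.0-1.0 scale).
FACTORS = {
    "Autonomy of Action": 0.70,
    "Goal-Driven Planning": 0.80,
    "Self-Modification": 0.10,
    "Dynamic Tool Use": 0.50,
    "Persistent Memory": 0.60,
    "Contextual Awareness": 0.80,
    "Dynamic Identity": 0.20,
    "Multi-Agent Interactions": 0.50,
    "Non-Determinism": 0.50,
    "Opacity & Reflexivity": 0.30,
}


def aggregate_factors(factors):
    """Return (sum, mean) of the factor scores, rounded to 2 decimals.

    Illustrative aggregate only; NOT the canonical AIVSS formula.
    """
    total = sum(factors.values())
    return round(total, 2), round(total / len(factors), 2)


if __name__ == "__main__":
    total, mean = aggregate_factors(FACTORS)
    print(f"sum={total} mean={mean}")
    # Highest-scoring factors first; ties keep their listed order.
    top = sorted(FACTORS.items(), key=lambda kv: kv[1], reverse=True)[:3]
    for name, score in top:
        print(f"{name}: {score:.2f}")
```

With the scores as listed, the ten factors sum to 5.0 out of a possible 10, with Goal-Driven Planning and Contextual Awareness (0.80 each) contributing the most.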
MAESTRO 7-layer threat model
Per-layer threats for this agent. Layers tagged “not certain from the listing” carry general, caveated commentary because the public description did not pin down that layer.
Layer 1: Foundation Models
Not certain from the listing. Hebbia likely relies on proprietary or third-party frontier foundation models to support its claimed 'infinite context window'. The primary threat is prompt injection, which could hijack multi-step workflows or produce misaligned outputs in sensitive financial and legal contexts.
Layer 2: Data Operations
Ingests large structured and unstructured datasets in multiple formats. Key threats include exfiltration of highly confidential enterprise documents, knowledge-base poisoning, and unauthorized access across tenant boundaries.
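The cross-tenant access threat named above can be made concrete with a small sketch of tenant-scoped retrieval, in which every lookup is filtered by the caller's tenant before any matching happens. This is a minimal illustration under stated assumptions, not a description of how Hebbia stores or retrieves documents; `Document`, `retrieve_for_tenant`, and the in-memory list used as an index are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical document record tagged with its owning tenant."""
    doc_id: str
    tenant_id: str
    text: str


def retrieve_for_tenant(index, tenant_id, query):
    """Return documents containing `query`, restricted to one tenant.

    The tenant check runs on every document, so a query can never
    return another tenant's content even if the text matches.
    """
    needle = query.lower()
    return [
        doc
        for doc in index
        if doc.tenant_id == tenant_id and needle in doc.text.lower()
    ]


if __name__ == "__main__":
    index = [
        Document("a1", "acme", "Q3 revenue forecast"),
        Document("b1", "globex", "Q3 revenue memo"),
    ]
    for doc in retrieve_for_tenant(index, "acme", "revenue"):
        print(doc.doc_id)
```

The design point is that the tenant filter is part of the retrieval function itself rather than something each caller must remember to apply.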
Layer 3: Agent Frameworks
Orchestrates complex, multi-step workflows and end-to-end tasks through its 'Matrix' interface. Threats include insecure tool integration, workflow bypass, and manipulation of the agent's planning logic during execution.
Layer 4: Deployment and Infrastructure
Not certain from the listing. Hebbia is likely deployed as a SaaS platform with enterprise-grade hosting. Threats include container escape, insecure API endpoints, and insufficient sandboxing during heavy document parsing and execution.
Layer 5: Evaluation and Observability
Provides strong observability through 'total transparency' and a spreadsheet-like interface that delivers answers with direct citations. This reduces opacity, but evaluation gaming and undetected drift in automated workflows remain risks.
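One way direct citations can support detection of drift is an automated check that each quoted passage really appears in the document it cites. The sketch below is a minimal illustration of that idea, assuming citations arrive as dictionaries with `doc_id` and `quote` keys and that source texts are available in a plain dictionary; these shapes and the function names are hypothetical and are not taken from Hebbia's product.

```python
def normalize(text):
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())


def verify_citations(citations, sources):
    """Return the citations whose quote is not found in the cited source.

    citations: list of {"doc_id": str, "quote": str}  (hypothetical shape)
    sources:   dict mapping doc_id -> full document text
    """
    unsupported = []
    for citation in citations:
        quote = normalize(citation["quote"])
        source_text = normalize(sources.get(citation["doc_id"], ""))
        # An empty quote or a missing source counts as unsupported.
        if not quote or quote not in source_text:
            unsupported.append(citation)
    return unsupported


if __name__ == "__main__":
    sources = {"10k": "Revenue grew 12% year over year."}
    citations = [
        {"doc_id": "10k", "quote": "revenue grew 12%"},
        {"doc_id": "10k", "quote": "revenue fell"},
    ]
    for bad in verify_citations(citations, sources):
        print("unsupported:", bad["quote"])
```

Exact substring matching is deliberately strict: it flags paraphrased citations as unsupported, which suits a review queue but would need loosening for answers that summarize rather than quote.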
Layer 6: Security and Compliance
Explicitly claims 'enterprise-grade security' tailored to finance, law, and Fortune 500 companies. This suggests access controls, audit logging, and compliance frameworks, though the listing does not detail specific certifications.
Layer 7: Agent Ecosystem
Allows users to build and run multiple AI agents to complete tasks. Threats include cascading failures across interdependent workflows and unauthorized cross-agent data sharing within the enterprise tenant.
MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).