Relari (YC W24) — agentic threat model

AIVSS 7.4 · High

Relari acts as a diagnostic and evaluation layer for AI applications. Its primary security risks are 'evaluation gaming', where compromised metrics mask underlying agent vulnerabilities, and the exposure of sensitive production logs during online monitoring.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 6.5 · AARS uplift 0.91 · Factor sum 2.6/10 · Threat ×1.0 · Mitigation ×1.0

Autonomy of Action		0.20
Goal-Driven Planning		0.20
Self-Modification		0.10
Dynamic Tool Use		0.30
Persistent Memory		0.20
Contextual Awareness		0.40
Dynamic Identity		0.10
Multi-Agent Interactions		0.20
Non-Determinism		0.50
Opacity & Reflexivity		0.40

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
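The arithmetic behind the headline score can be checked with a short sketch. It plugs the CVSS base, factor values, and multipliers listed above into the stated formula; the function and variable names are illustrative and not part of any official AIVSS tooling.

```python
# Values copied from the score rationale above.
CVSS_BASE = 6.5
THREAT_MULTIPLIER = 1.0   # ThM
MITIGATION_FACTOR = 1.0

FACTORS = {
    "Autonomy of Action": 0.20,
    "Goal-Driven Planning": 0.20,
    "Self-Modification": 0.10,
    "Dynamic Tool Use": 0.30,
    "Persistent Memory": 0.20,
    "Contextual Awareness": 0.40,
    "Dynamic Identity": 0.10,
    "Multi-Agent Interactions": 0.20,
    "Non-Determinism": 0.50,
    "Opacity & Reflexivity": 0.40,
}


def aivss(cvss_base, factors, thm, mitigation):
    """Return (factor_sum, aars, score) using the formula quoted above."""
    factor_sum = sum(factors.values())
    # AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM
    aars = (10 - cvss_base) * (factor_sum / 10) * thm
    # AIVSS = (CVSS_Base + AARS) * Mitigation_Factor
    score = (cvss_base + aars) * mitigation
    return factor_sum, aars, score


factor_sum, aars, score = aivss(
    CVSS_BASE, FACTORS, THREAT_MULTIPLIER, MITIGATION_FACTOR
)
print(f"Factor sum {factor_sum:.1f}/10, AARS uplift {aars:.2f}, AIVSS {score:.1f}")
# prints: Factor sum 2.6/10, AARS uplift 0.91, AIVSS 7.4
```

With a threat multiplier and mitigation factor of 1.0, the score is simply the CVSS base plus the agentic uplift: 6.5 + 0.91 = 7.41, shown as 7.4.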

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.

L1 · Foundation Models · ⚠ not certain from listing

Not certain from the listing — Relari likely utilizes third-party foundation models to generate synthetic data and run custom evaluators. These models are susceptible to prompt injection, which could corrupt evaluation metrics or synthetic test-set generation.

L2 · Data Operations · ✓ mapped

Relari generates synthetic test-sets and processes user feedback to train custom evaluators. Threats include data poisoning of the synthetic generation pipeline and the potential leakage of sensitive production data ingested for online monitoring.

L3 · Agent Frameworks · ⚠ not certain from listing

Not certain from the listing — Relari orchestrates evaluations using 'Agent Contracts'. If the orchestration framework is vulnerable, malicious test cases could lead to insecure tool execution or state manipulation during simulation runs.

L4 · Deployment & Infrastructure · ⚠ not certain from listing

Not certain from the listing — As an API-driven platform, Relari requires secure hosting and secrets management to protect user API keys and connection endpoints to the applications under test.

L5 · Evaluation & Observability · ✓ mapped

This is Relari's core layer. It provides 30+ metrics and online monitoring. The primary threat is evaluation gaming, where developers or adversarial agents optimize prompts to bypass Relari's metrics without resolving underlying safety issues.

L6 · Security & Compliance (cross-cutting) · ⚠ not certain from listing

Not certain from the listing — Compliance risks exist if production logs ingested for continuous evaluation contain PII or PHI, requiring robust data masking and access control policies that are not detailed in the public directory.
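As an illustration of the kind of data masking mentioned above, the following minimal sketch redacts two common identifier types from a log line before it is sent to any external evaluation service. It is an assumption for illustration only: it does not reflect Relari's API or internal handling, the function name `mask_log_line` is hypothetical, and the two regular expressions cover only simple email addresses and US-style SSNs, not PII or PHI in general.

```python
import re

# Illustrative patterns only; real deployments need far broader coverage
# (names, addresses, phone numbers, medical record numbers, and so on).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_log_line(line: str) -> str:
    """Replace matched identifiers with fixed placeholder tokens."""
    line = EMAIL.sub("[EMAIL]", line)
    line = SSN.sub("[SSN]", line)
    return line


masked = mask_log_line("user alice@example.com ssn 123-45-6789")
print(masked)
# prints: user [EMAIL] ssn [SSN]
```

Masking at the point of log export, before ingestion, keeps raw identifiers out of the monitoring pipeline entirely; pattern-based redaction alone is not a substitute for access controls.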

L7 · Agent Ecosystem✓ mapped

Relari defines 'Agent Contracts' to test complex agentic applications. A threat exists if simulated multi-agent interactions in Relari's test environment fail to capture emergent rogue behaviors that subsequently manifest in production.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).

These scores are auto-generated from public information (the agent's own listing, docs, and repository) using the canonical OWASP AIVSS formula and the MAESTRO framework. They are an estimate for guidance, not a penetration test, audit, or certification.