What are the seven NIST AI RMF trustworthiness characteristics and how do they map to agent controls?

Accepted Answer

The NIST AI RMF identifies seven characteristics of trustworthy AI: valid and reliable; safe; secure and resilient; accountable and transparent; explainable and interpretable; privacy-enhanced; and fair, with harmful bias managed. These characteristics are integrated into organizational practices through controls spread across the RMF's four functions (Govern, Map, Measure, Manage).

Here's how these characteristics map to agent controls:

Valid & Reliable, Safe, and Secure & Resilient: These are addressed by evaluating AI system security and resilience, including adversarial robustness, prompt-injection resistance, and abuse resistance. These evaluations map to OWASP LLM01 (prompt injection) and LLM04 (model denial of service in the 2023 OWASP LLM Top 10; renumbered LLM10, Unbounded Consumption, in the 2025 list). Reliability engineering principles, such as graceful degradation and failures that are visible rather than silent, also contribute to these characteristics.
Accountable & Transparent: This is supported by mechanisms to log decisions and trace AI behavior. For agents, this means logging every AI decision with a strict decision_json.
Explainable: The provided excerpts do not detail explainability controls explicitly, but transparency and decision logging contribute to it, since a logged decision gives reviewers something concrete to inspect.
Privacy-Enhanced: This characteristic is addressed by policies that manage risks from third-party models, datasets, and tools, including data privacy concerns such as leakage of sensitive or PII training and context data. This maps to OWASP LLM02 (Sensitive Information Disclosure in the 2025 OWASP LLM Top 10; LLM06 in the 2023 list).
Fair: The provided sources do not explicitly detail controls for "fairness" as a trustworthiness characteristic.
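
To make the "strict decision_json" idea under Accountable & Transparent more concrete, here is a minimal Python sketch of schema-checked decision logging. The source names decision_json but does not specify its fields, so the field names (agent_id, action, rationale, inputs_digest), the added logged_at timestamp, and the function name build_decision_record are all hypothetical. The record is rejected with an error instead of being silently repaired, which is one way to keep failures visible.

```python
import json
from datetime import datetime, timezone

# Hypothetical schema: the source mentions a strict decision_json
# but does not define its fields, so these names are assumptions.
REQUIRED_FIELDS = {
    "agent_id": str,
    "action": str,
    "rationale": str,
    "inputs_digest": str,
}


def build_decision_record(decision: dict) -> str:
    """Validate a decision against the strict schema and return one JSON log line."""
    missing = set(REQUIRED_FIELDS) - set(decision)
    unknown = set(decision) - set(REQUIRED_FIELDS)
    if missing or unknown:
        # Fail loudly: a malformed record is an error, not something to patch up.
        raise ValueError(
            f"decision_json rejected: missing={sorted(missing)}, unknown={sorted(unknown)}"
        )
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(decision[name], expected_type):
            raise ValueError(
                f"decision_json rejected: {name} must be {expected_type.__name__}"
            )
    record = dict(decision)
    # Timestamp is added by the logger, not supplied by the agent.
    record["logged_at"] = datetime.now(timezone.utc).isoformat()
    return json.dumps(record, sort_keys=True)
```

An agent loop could call build_decision_record before executing each action and append the returned line to an append-only log. Rejecting unknown keys is a design choice in this sketch: it keeps every log line comparable, at the cost of requiring a schema change whenever a new field is needed.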
