How do I handle data retention and right-to-erasure for documents in a vector store?

Accepted Answer

To handle data retention and right-to-erasure for documents in a vector store, implement deletion semantics that propagate through all derived stores, and maintain a comprehensive data inventory. This addresses the OWASP LLM Top 10 risk of "Right-to-erasure failures" (L2, L6).

Implement deletion workflows that propagate to derived data: When a user requests deletion, remove copies of their data from all memory stores, embeddings, summaries, fine-tuning data, and logs. This is a core part of "retention and right-to-erasure" within Data & Memory Governance.

Maintain a per-user data inventory: Keep a continuously updated map of what personal and sensitive data exists, where it is located, how it flows through the agent system, who has access, and how long it persists. This inventory is crucial for compliance and incident response, and it is what lets a deletion request find every copy.

Make architectural choices that minimize data proliferation: Design systems to reduce the number of copies of personal data, so that fewer places have to be reached when data is deleted.

Document non-deletable data: If certain data cannot be deleted, document this explicitly and give users notice.

Treat vector databases as containing original text for access control: For security purposes, assume that vector databases contain the original text, because embedding inversion attacks can recover much of the source text from stored embeddings.

Ensure classification inheritance: Any data derived from classified inputs, such as embeddings or summaries, must inherit at least the classification of its inputs, so that security properties are maintained throughout the data lifecycle.
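The deletion workflow and per-user inventory described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference implementation: `DerivedStore`, `DataInventory`, `ErasureReport`, and the store names are hypothetical in-memory stand-ins, and a real system would call the delete APIs of its actual vector database, summary cache, and log pipeline instead.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DerivedStore:
    """Hypothetical stand-in for one store (vector index, summaries, logs)."""
    name: str
    deletable: bool = True  # False models data that cannot be erased
    records: Dict[str, str] = field(default_factory=dict)  # record_id -> payload


@dataclass
class ErasureReport:
    """Outcome of one erasure request, usable for user notice and audits."""
    user_id: str
    deleted: Dict[str, List[str]] = field(default_factory=dict)
    retained: Dict[str, List[str]] = field(default_factory=dict)


class DataInventory:
    """Per-user map of which record ids live in which store (assumed design)."""

    def __init__(self, stores: List[DerivedStore]) -> None:
        self.stores = {store.name: store for store in stores}
        self._index: Dict[str, Dict[str, Set[str]]] = {}

    def register(self, user_id: str, store_name: str,
                 record_id: str, payload: str) -> None:
        """Write a record and note in the inventory that it belongs to user_id."""
        self.stores[store_name].records[record_id] = payload
        self._index.setdefault(user_id, {}).setdefault(store_name, set()).add(record_id)

    def locate(self, user_id: str) -> Dict[str, List[str]]:
        """Return every known location of this user's data."""
        return {name: sorted(ids) for name, ids in self._index.get(user_id, {}).items()}

    def erase_user(self, user_id: str) -> ErasureReport:
        """Propagate a deletion request to every store listed in the inventory."""
        report = ErasureReport(user_id)
        for store_name, ids in self._index.get(user_id, {}).items():
            store = self.stores[store_name]
            for record_id in sorted(ids):
                if store.deletable:
                    store.records.pop(record_id, None)
                    report.deleted.setdefault(store_name, []).append(record_id)
                else:
                    # Non-deletable data is recorded so it can be documented.
                    report.retained.setdefault(store_name, []).append(record_id)
        # The inventory keeps tracking only what could not be removed.
        remaining = {name: set(ids) for name, ids in report.retained.items()}
        if remaining:
            self._index[user_id] = remaining
        else:
            self._index.pop(user_id, None)
        return report


# Example usage with three hypothetical stores.
vectors = DerivedStore("vector_index")
summaries = DerivedStore("summaries")
audit = DerivedStore("audit_log", deletable=False)
inventory = DataInventory([vectors, summaries, audit])

inventory.register("user-1", "vector_index", "doc-1#chunk-0", "embedding of chunk 0")
inventory.register("user-1", "summaries", "doc-1#summary", "summary of doc-1")
inventory.register("user-1", "audit_log", "evt-9", "user-1 uploaded doc-1")
inventory.register("user-2", "vector_index", "doc-7#chunk-0", "embedding of chunk 0")

report = inventory.erase_user("user-1")
```

In this sketch the inventory is the single place that knows where a user's data lives, so a deletion request never has to search each store. Records in the store marked non-deletable end up in `report.retained`, which is the list that would need to be documented and disclosed to the user.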
