Pixtral 12B 24.09 — agentic threat model
Pixtral 12B is a raw multimodal foundation model with low inherent agentic risk, but its high capabilities in vision-to-code and large context processing present potential downstream risks if integrated into autonomous agent frameworks without strict sandboxing.
OWASP AIVSS score rationale
| Autonomy of Action | 0.10 | |
| Goal-Driven Planning | 0.20 | |
| Self-Modification | 0.00 | |
| Dynamic Tool Use | 0.10 | |
| Persistent Memory | 0.00 | |
| Contextual Awareness | 0.60 | |
| Dynamic Identity | 0.00 | |
| Multi-Agent Interactions | 0.00 | |
| Non-Determinism | 0.70 | |
| Opacity & Reflexivity | 0.80 |
Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
MAESTRO 7-layer threat model
Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.
As a multimodal model (text + vision), Pixtral 12B is highly susceptible to L1 threats such as adversarial prompt injection, visual jailbreaks (e.g., hiding malicious instructions in images), and model reprogramming. Its open-source nature (Apache 2.0) mitigates model stealing threats but increases the risk of malicious fine-tuning or downstream exploitation.
Not certain from the listing — The listing describes the model's 128K context window and OCR capabilities, but does not specify any default RAG pipelines, vector databases, or training data operations, which would be defined by the deploying entity.
Not certain from the listing — Pixtral 12B is a raw foundation model and does not include a built-in agent framework, memory management, or tool-calling orchestration. Downstream developers must implement these, introducing potential framework-level vulnerabilities.
Not certain from the listing — As an open-source model, deployment and infrastructure (hosting, sandboxing, API security) are entirely dependent on the user's environment. There are no built-in infrastructure controls specified.
Not certain from the listing — While the model's performance is benchmarked (MMMU, MathVista), the listing does not mention any runtime evaluation, guardrails, logging, or observability features to detect drift or anomalous inputs.
Not certain from the listing — There is no mention of compliance certifications (such as SOC2, ISO 27001, or EU AI Act alignment) or built-in identity and access management policies in the model's directory listing.
Not certain from the listing — The model is a single-agent foundation model and does not natively participate in a multi-agent ecosystem or marketplace, meaning ecosystem-level threats are not applicable out-of-the-box.
MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).
These scores are auto-generated from public information (the agent's own listing, docs, and repository) using the canonical OWASP AIVSS formula and the MAESTRO framework — an estimate for guidance, not a penetration test, audit, or certification. See the scoring methodology. Are you the vendor? Factual corrections are free.