
Apple Ferret-UI
A multimodal AI model for enhanced understanding and interaction with mobile user interfaces.
🛡️ AgentReady threat assessment
MAESTRO 7-layer threat model + OWASP AIVSS risk score for Apple Ferret-UI, derived from its capabilities.
AIVSS 9.2 · Critical
Overview
Apple's Ferret-UI is a multimodal large language model (MLLM) designed to understand and interact with mobile user interfaces (UIs). It combines referring (describing a UI element at a given screen region), grounding (locating the element that matches a description), and reasoning. Together these let it identify UI elements such as icons and text, understand their spatial relationships, and carry out tasks based on that understanding. Ferret-UI aims to improve user interactions by supporting device control through natural language commands, which could enhance accessibility and automation in mobile applications.
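To make the referring and grounding capabilities mentioned above more concrete, here is a minimal Python sketch of the two operations over a list of UI elements. It is not Ferret-UI's API: the `UIElement` type, the `refer`, `ground`, and `is_above` functions, and the `SCREEN` data are all hypothetical, and a hand-written element list stands in for what a model would infer from a screenshot.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical types for illustration only; this is not Ferret-UI's API.
Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass(frozen=True)
class UIElement:
    kind: str   # e.g. "icon", "text", "button"
    label: str  # human-readable description of the element
    box: Box    # bounding box on the screen


def refer(elements: List[UIElement], x: int, y: int) -> Optional[UIElement]:
    """Referring: given a screen location, return the element at that point."""
    for el in elements:
        left, top, right, bottom = el.box
        if left <= x < right and top <= y < bottom:
            return el
    return None


def ground(elements: List[UIElement], query: str) -> List[Box]:
    """Grounding: given a description, return the boxes of matching elements."""
    q = query.lower()
    return [el.box for el in elements if q in el.label.lower()]


def is_above(a: UIElement, b: UIElement) -> bool:
    """A simple spatial relation: a's bottom edge is at or above b's top edge."""
    return a.box[3] <= b.box[1]


# Hand-written stand-in for elements a model might detect in a screenshot.
SCREEN = [
    UIElement("text", "Settings title", (20, 40, 300, 80)),
    UIElement("icon", "Wi-Fi icon", (20, 100, 60, 140)),
    UIElement("button", "Wi-Fi toggle", (260, 100, 320, 140)),
]
```

In this sketch, `refer(SCREEN, 30, 110)` resolves a tap location to the Wi-Fi icon, while `ground(SCREEN, "wi-fi")` goes the other way and returns the boxes of both Wi-Fi elements. A real model would produce these answers from pixels and free-form language rather than from substring matching on labels.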
Key features
- multimodal AI
- user interface understanding
- mobile automation
- accessibility
- natural language processing
Use cases
- Enhancing virtual assistants' ability to navigate and control mobile applications.
- Improving accessibility features by providing detailed descriptions of on-screen elements.
- Automating complex tasks within mobile apps through natural language commands.
- Facilitating app testing and usability studies by understanding UI layouts.