Link to the announcement below:
Gemini Robotics-ER 1.5
- Gemini Robotics 1.5 – Google DeepMind's most capable vision-language-action (VLA) model. It turns visual information and instructions into motor commands for a robot to perform a task. The model thinks before taking action and shows its process, which helps robots assess and complete complex tasks more transparently. It also learns across embodiments, accelerating skill learning.
- Gemini Robotics-ER 1.5 – Google DeepMind's most capable vision-language model (VLM) for embodied reasoning. It reasons about the physical world, natively calls digital tools, and creates detailed, multi-step plans to complete a mission. The model now achieves state-of-the-art performance across spatial understanding benchmarks.
------------------------------
Todor Kostov
Director
------------------------------