Google unveils Gemini model that lets robots function without an internet connection
Google has created a new Gemini model designed specifically for robots that does not need an internet connection to function. The “Gemini Robotics On-Device” model, according to the tech giant, is an efficient on-device robotics model that delivers “general-purpose dexterity” and fast task adaptation. It builds on the Gemini Robotics VLA (vision-language-action) model released in March, which extended Gemini 2.0’s multimodal reasoning and real-world understanding to physical actions. The on-device design benefits latency-sensitive applications and keeps robots robust in environments with intermittent or no connectivity.
In addition, Google is offering developers the Gemini Robotics SDK. They can use the SDK to evaluate Gemini Robotics On-Device on their own tasks and environments, trial the model in Google’s MuJoCo physics simulator, and adapt it to new domains with only a few demonstrations (as little as 50 to 100). Developers can get the SDK by signing up as a trusted tester with Google.
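For context, here is a minimal sketch of what stepping a scene in MuJoCo looks like. This uses only the open-source mujoco Python package, not the Gemini Robotics SDK itself (whose API is restricted to trusted testers); the scene and values are illustrative assumptions, not anything shipped by Google.

```python
# Minimal MuJoCo example: load a trivial scene and simulate one second.
# Real robot models would come as full MJCF/URDF files, not this toy box.
import mujoco

XML = """
<mujoco>
  <worldbody>
    <geom type="plane" size="1 1 0.1"/>
    <body pos="0 0 1">
      <freejoint/>
      <geom type="box" size="0.1 0.1 0.1"/>
    </body>
  </worldbody>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)

# Advance the physics until one second of simulated time has elapsed.
while data.time < 1.0:
    mujoco.mj_step(model, data)

# For a free joint, qpos[2] is the body's height above the plane.
print(f"Box height after {data.time:.2f}s: {data.qpos[2]:.3f} m")
```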
Google’s new Gemini model for robots: What it can do and how it works
Google describes Gemini Robotics On-Device as a lightweight robotics foundation model designed for bi-arm robots, allowing them to perform sophisticated dexterous manipulation with minimal additional computing resources.
It builds on Gemini Robotics’ capabilities and enables rapid experimentation, fine-tuning for new tasks, and low-latency local inference. The company also asserts that the model generalises strongly across visual, semantic, and behavioural challenges. It can follow natural-language instructions and complete complex tasks such as unzipping bags or folding clothes while running directly on the robot.
In Google’s testing, Gemini Robotics On-Device outperformed other on-device models, particularly in challenging, out-of-distribution, and multi-step conditions. It can be fine-tuned with just 50 to 100 demonstrations, allowing it to be adapted to a wide variety of tasks, as the sketch below illustrates.
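To make the idea concrete, here is a minimal, hypothetical sketch of few-shot behavioural cloning, the general technique of fitting a policy to a small set of demonstrations. Nothing here is the actual Gemini Robotics fine-tuning API; the PyTorch model, data shapes, and hyperparameters are illustrative assumptions only.

```python
# Hypothetical few-shot behavioural cloning on ~50-100 demonstrations.
# Shapes are placeholders: 128-dim observations, 14-dim bi-arm actions.
import torch
import torch.nn as nn
import torch.nn.functional as F

num_demos = 64
observations = torch.randn(num_demos, 128)  # stand-in for recorded sensor states
actions = torch.randn(num_demos, 14)        # stand-in for demonstrated actions

# Tiny stand-in policy head; a real VLA model is far larger and multimodal.
policy = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 14))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

for epoch in range(100):
    optimizer.zero_grad()
    # Regress the policy's predicted actions onto the demonstrated ones.
    loss = F.mse_loss(policy(observations), actions)
    loss.backward()
    optimizer.step()

print(f"final imitation loss: {loss.item():.4f}")
```

The point of the sketch is that the adaptation signal is just a small supervised dataset of demonstrations, which is why 50 to 100 examples can be enough to specialise a pretrained policy.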
The model was first trained on ALOHA robots, but it also performed well when adapted to the bi-arm Franka FR3 and the Apollo humanoid robot, which were able to fold clothing and assemble belts. This is the first VLA model Google has made available for on-device fine-tuning, allowing developers to deploy powerful robotics capabilities without relying on the cloud.