Google Brings On-Device GenAI to Chrome, Chromebook Plus, and Pixel Watch via LiteRT-LM
LiteRT-LM already powers the wide deployment of Gemini Nano and Gemma models inside Google products.

Google has announced that its LiteRT-LM inference engine is now powering on-device generative AI across Chrome, Chromebook Plus, and Pixel Watch. Running models directly on the device, rather than through cloud calls, enables faster responses and greater privacy.
The engine already underpins the wide deployment of Gemini Nano and Gemma models inside Google products. The new release opens access to developers, offering a preview C++ interface so they can build custom, high-performance AI pipelines for their own applications.
"Powering on-device gen AI at scale. Introducing LiteRT-LM, our production inference framework powering built-in features in Chrome, Chromebook Plus, and Pixel Watch. Now, developers can use the same framework to build with our latest on-device open models, like Gemma 3n, Gemma 3…"
— Google for Developers (@googledevs) October 2, 2025
The engine is part of Google's AI Edge stack, sitting between LiteRT (the low-level runtime that executes models) and higher-level APIs such as the MediaPipe LLM Inference API, Chrome Built-in AI APIs, and Android AICore, so developers can choose the level of control they need.
"You can already leverage the high-level APIs such as the MediaPipe LLM Inference API, Chrome Built-in AI APIs, and Android AICore to run LLMs on-device, but now, for the first time, we are providing the underlying C++ interface (in preview) of our LiteRT-LM engine," Google said in a blog post.
LiteRT-LM stands out with its modular, cross-platform architecture. It supports CPU, GPU, and NPU backends and scales across Linux, Android, macOS, Windows, and even Raspberry Pi. It also enables multiple features to share a single base model using lightweight LoRA adapters, as sketched below.
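The adapter-sharing design can be illustrated with the same assumed types: one engine holds the base weights once, while per-feature sessions attach their own lightweight adapters. The SetLoraAdapter call here is a hypothetical illustration of the concept, not a documented method.

```cpp
// Illustrative sketch: several features sharing one base model via
// LoRA adapters. Reuses the assumed types from the sketch above;
// SetLoraAdapter is hypothetical, shown only to convey the idea.
#include "litert_lm/engine.h"  // assumed header

void BuildFeatureSessions(litert::lm::Engine& engine) {
  // Summarization feature: base model plus a small task adapter.
  auto summarize_cfg = litert::lm::SessionConfig::CreateDefault();
  summarize_cfg.SetLoraAdapter("summarize.lora");  // hypothetical
  auto summarizer = engine.CreateSession(summarize_cfg);

  // Smart-reply feature: the same base weights with a different
  // adapter, so the large model is resident in memory only once.
  auto reply_cfg = litert::lm::SessionConfig::CreateDefault();
  reply_cfg.SetLoraAdapter("smart_reply.lora");    // hypothetical
  auto replier = engine.CreateSession(reply_cfg);

  // Each session now serves its feature independently while the
  // engine amortizes load time and memory across both.
}
```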