Google is positioning Gemma 4 as a bridge between open and proprietary AI ecosystems, giving developers flexibility to build locally or scale via cloud infrastructure.
Google has unveiled Gemma 4, its latest family of open AI models, designed to deliver high-end reasoning and agentic capabilities while running on hardware ranging from smartphones to enterprise GPUs.
What is Gemma 4?
Gemma 4 is Google’s newest generation of open models built using the same research stack as its proprietary Gemini models. Unlike large closed systems, Gemma is designed to be lightweight and deployable across a wide range of hardware setups.
What models are being launched and how do they perform?
The Gemma 4 family is available in four variants:
- Gemma 4 E2B (on-device)
- Gemma 4 E4B (on-device)
- Gemma 4 26B (Mixture-of-Experts)
- Gemma 4 31B (Dense)
Google claims the larger models rank among the top open models globally, outperforming systems with significantly more parameters. The 26B MoE model activates only a fraction of its parameters during inference, improving speed and efficiency, while the 31B Dense model focuses on maximum output quality.
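The efficiency gain from a Mixture-of-Experts layer comes from routing each token to only a few "expert" sub-networks rather than the whole model. A minimal, illustrative sketch of top-k expert routing is below; the expert count and scores are made up for demonstration, and Gemma 4's actual routing mechanism is not described in the article.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(gate_scores, k=2):
    """Select the top-k experts for one token and renormalise their weights.

    Returns a list of (expert_index, weight) pairs. Only these k experts
    would run a forward pass for this token; the rest stay inactive.
    """
    topk = sorted(range(len(gate_scores)),
                  key=lambda i: gate_scores[i], reverse=True)[:k]
    weights = softmax([gate_scores[i] for i in topk])
    return list(zip(topk, weights))


# Example: 8 experts, only 2 activated per token
gate_scores = [0.1, 2.0, -0.5, 1.5, 0.0, 0.3, -1.0, 0.7]
selected = route_token(gate_scores, k=2)
```

Because only `k` of the experts run per token, compute per inference step scales with the active parameters, not the total parameter count, which is why an MoE model can be faster than a dense model of the same nominal size.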
How is Gemma 4 optimised for edge and efficiency?
A key shift with Gemma 4 is its push toward “intelligence-per-parameter.” Instead of simply scaling up model size, the models are optimised to deliver higher performance with lower compute requirements.
The smaller E2B and E4B models are built for on-device use cases such as smartphones, IoT devices and embedded systems. These models support multimodal inputs, including audio, images and video, and are designed to run offline with low latency.
What are the key capabilities?
Gemma 4 introduces several upgrades aimed at developers and enterprises, including stronger reasoning, agentic capabilities and multimodal input support.
What does the open licence mean for developers?
Gemma 4 is released under the Apache 2.0 licence, allowing commercial use, modification and deployment without restrictive conditions.
The models are available through platforms like Hugging Face, Kaggle and Ollama, and support a wide range of tools including Transformers, vLLM and llama.cpp. Developers can fine-tune the models locally or deploy them via cloud platforms.
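For local deployment via Transformers, loading and prompting the model follows the standard Hugging Face pattern. The sketch below is illustrative: the checkpoint name `google/gemma-4-e2b-it` is an assumption, not a confirmed model id, and an actual run requires downloading the weights.

```python
def generate_locally(prompt, model_id="google/gemma-4-e2b-it"):
    """Generate a completion from a local Gemma checkpoint.

    NOTE: the default model_id is a hypothetical example; substitute the
    real checkpoint name from Hugging Face once published.
    """
    # Lazy imports so the function can be defined without the heavy
    # dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # device_map="auto" places layers on GPU if available, else CPU.
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

The same checkpoint can typically be served through vLLM for higher-throughput inference, or converted to GGUF for llama.cpp and Ollama, per each tool's own documentation.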