BT Explainer: Google’s Gemma 4 could put powerful AI on your phone and laptop

Gemma 4 is released under the Apache 2.0 licence, allowing commercial use, modification and deployment without restrictive conditions.

Google is positioning Gemma 4 as a bridge between open and proprietary AI ecosystems, giving developers flexibility to build locally or scale via cloud infrastructure.
Business Today Desk
Apr 3, 2026 (updated Apr 3, 2026, 1:22 PM IST)

Google has unveiled Gemma 4, its latest family of open AI models, designed to deliver high-end reasoning and agentic capabilities while running across devices from smartphones to enterprise GPUs.

What is Gemma 4?

Gemma 4 is Google’s newest generation of open models built using the same research stack as its proprietary Gemini models. Unlike large closed systems, Gemma is designed to be lightweight and deployable across a wide range of hardware setups.

The company is positioning the models as a bridge between open and proprietary AI ecosystems, giving developers flexibility to build locally or scale via cloud infrastructure.

What models are being launched and how do they perform?

The Gemma 4 family is available in four variants:

  • Effective 2B (E2B)  
  • Effective 4B (E4B)  
  • 26B Mixture of Experts (MoE)  
  • 31B Dense

Google claims the larger models rank among the top open models globally and outperform systems with significantly more parameters. The 26B MoE model activates only a fraction of its parameters during inference, improving speed and efficiency, while the 31B Dense model focuses on maximum output quality.
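
To illustrate what "activates only a fraction of its parameters" means, here is a minimal sketch of top-k expert routing, the general idea behind Mixture of Experts layers. It is not Gemma 4's actual architecture: the number of experts, the value of `k`, the router scores and the toy experts are all hypothetical, chosen only to make the mechanism visible.

```python
import math

def softmax(scores):
    """Convert raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    # Renormalise so the weights of the chosen experts sum to 1.
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and blend their outputs."""
    chosen = route_top_k(router_scores, k)
    output = sum(weight * experts[i](x) for i, weight in chosen)
    return output, [i for i, _ in chosen]

# Hypothetical toy experts: expert n simply multiplies its input by n + 1.
experts = [lambda x, scale=n + 1: scale * x for n in range(8)]
# Hypothetical router scores for a single token.
router_scores = [0.1, 2.0, -1.0, 0.3, 1.5, -0.5, 0.0, 0.2]

output, used = moe_forward(3.0, experts, router_scores, k=2)
print(f"experts run: {used} of {len(experts)}")
```

In this toy setup only two of the eight experts do any work for the token, which is why an MoE model can carry a large total parameter count while spending far less compute per token than a dense model of the same size.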

How is Gemma 4 optimised for edge and efficiency?

A key shift with Gemma 4 is its push toward “intelligence-per-parameter.” Rather than simply increasing model size, the models are optimised to deliver higher performance with lower compute requirements.

The smaller E2B and E4B models are built for on-device use cases such as smartphones, IoT devices and embedded systems. These models support multimodal inputs, including audio, images and video, and are designed to run offline with low latency.
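
A rough back-of-the-envelope calculation shows why parameter count matters for on-device use. The sketch below estimates the memory needed for model weights alone, under loudly stated assumptions: the parameter counts are taken at face value from the model names (the real stored sizes of the "Effective" models may differ), the bit widths are common quantisation choices rather than anything Google has specified, and activations and caches are ignored.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory for model weights only, in GiB (2**30 bytes)."""
    return num_params * bits_per_param / 8 / 2**30

# Assumption: nominal parameter counts read off the variant names.
nominal_params = {
    "E2B": 2e9,
    "E4B": 4e9,
    "31B Dense": 31e9,
}

# Assumption: 16-bit weights versus a hypothetical 4-bit quantised copy.
for name, params in nominal_params.items():
    full = weight_memory_gib(params, 16)
    quantised = weight_memory_gib(params, 4)
    print(f"{name}: ~{full:.1f} GiB at 16-bit, ~{quantised:.1f} GiB at 4-bit")
```

On these assumptions a 2-billion-parameter model needs a few gigabytes or less for its weights, which is within reach of a modern phone, while a 31-billion-parameter model calls for workstation or server hardware.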

What are the key capabilities?

Gemma 4 introduces several upgrades aimed at developers and enterprises:

  • Advanced reasoning: Improved performance in multi-step logic and instruction-following tasks  
  • Agentic workflows: Native support for function calling, structured outputs and system instructions  
  • Code generation: Enables local, offline coding assistants  
  • Multimodal support: Handles images, video and audio inputs  
  • Long context windows: Up to 256K tokens for larger models  
  • Language support: Trained across more than 140 languages
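
To make "function calling" and "structured outputs" concrete, here is a minimal sketch of the application side of an agentic workflow: the model emits a JSON object naming a tool, and the program validates and runs it. The JSON shape, the `get_weather` tool and the `dispatch` helper are hypothetical illustrations, not Gemma 4's actual output format or any library's API.

```python
import json

def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would query a weather service."""
    return f"Weather lookup requested for {city}"

# Registry of tools the application is willing to run, with required arguments.
TOOLS = {
    "get_weather": {"func": get_weather, "required": {"city": str}},
}

def dispatch(model_output: str) -> str:
    """Parse a structured tool call from the model, validate it, and run it."""
    call = json.loads(model_output)
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name = call.get("name")
    args = call.get("arguments", {})
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    spec = TOOLS[name]
    for key, expected_type in spec["required"].items():
        if not isinstance(args.get(key), expected_type):
            raise ValueError(f"missing or invalid argument: {key!r}")
    # Pass only the declared arguments so stray keys cannot reach the tool.
    return spec["func"](**{key: args[key] for key in spec["required"]})

# Assumed example of what a model might emit when asked about the weather.
model_output = '{"name": "get_weather", "arguments": {"city": "Delhi"}}'
print(dispatch(model_output))
```

Because the output is structured rather than free text, the application can reject malformed or unexpected calls before anything runs, which is what makes this pattern practical for local, offline assistants.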

What does the open licence mean for developers?

Gemma 4 is released under the Apache 2.0 licence, allowing commercial use, modification and deployment without restrictive conditions.

The models are available through platforms like Hugging Face, Kaggle and Ollama, and work with a wide range of tools including Transformers, vLLM and llama.cpp. Developers can fine-tune the models locally or deploy them via cloud platforms.
