Google I/O 2026 introduces company's new AI video model.
Google I/O 2026 kicked off with announcements about Google's artificial intelligence (AI) progress across apps, services, and platforms. During the keynote, Demis Hassabis, the CEO of Google DeepMind, announced “Gemini Omni,” a new family of multimodal models for AI video generation and editing that builds on the company's expertise in world models.
The Gemini Omni model supports conversational editing, allowing users to edit characters, backgrounds, and other elements using voice commands. According to Hassabis, the long-term goal for Omni is to generate any type of output from any kind of input. The first version, Gemini Omni Flash, is set to launch this summer.
Gemini Omni: What it is and how it works
Gemini Omni is an AI video generation model that can "create anything from any input," combining images, audio, video, and text. Google says the model has a deeper understanding of physics, culture, history, and science, which allows it to generate more context-aware and realistic outputs.
Google CEO Sundar Pichai said, "When we first announced Gemini, it was our first AI model to be natively multimodal."
"We knew that training it on a combination of text, code, audio, images, and video would give it a deeper understanding of the world. With world models, AI is moving from predicting text to simulating reality. Gemini Omni is the next step in that direction," he added during the keynote.
Google is rolling out Gemini Omni Flash, the first model in the Omni family, to the Gemini app, Google Flow, and YouTube Shorts starting today. The company further highlighted that users can edit videos through natural conversation while keeping characters and other elements consistent.
With Gemini Omni, users can shoot a regular video and then use AI prompts to reshape the scene, visuals, or action. "Your video becomes a starting point for something you never could have filmed yourself," Google explained. "Edit the action, add in new characters or objects, or transform a moment into something unexpected. Change the environment, angle, style or even specific details."
The company also revealed that the model can create AI-generated digital avatars of users based on their voice and appearance. Users can then generate videos in which an AI version of them speaks or appears on screen without recording every scene manually.
Lastly, all videos created with Gemini Omni will include SynthID, Google’s invisible digital watermarking system, which is designed to help identify and verify AI-generated content even though viewers cannot see the marker in the video.