Google has taken the wraps off its new AI model, Gemini. According to Google, the model is designed to behave in more human-like ways than other models and outperforms them in tasks like understanding, summarising, reasoning, coding, and planning. It comes in three versions: Pro, Ultra, and Nano. The Pro version is already available, and the Ultra version will be released early next year.

Gemini Availability

Currently, Google has integrated the new Gemini Pro with its chatbot Bard, a direct competitor of ChatGPT. For now, you can only have text-based interactions with Gemini-powered Bard, but Google has promised support for other modalities "soon". The update is available in 170 countries and territories, but it is limited to English.

What is Gemini?

Gemini is a large language model (LLM) developed by Google’s DeepMind division. It’s designed to compete with other AI systems like OpenAI’s ChatGPT and possibly outperform them.

Key Features of Gemini

Multimodal Capabilities

Gemini is designed from the ground up to be multimodal, integrating text, images, and other data types. This could allow for more natural conversational abilities. Google showcased these abilities in a demonstration in which a person conversed with the model over video, showing it different objects in real time.

Use of Tools and APIs

Gemini is one of the “next-generation multimodal models” that will utilise Pathways, Google’s new AI infrastructure. This hints that Gemini could be among the largest language models created to date, although Google has not confirmed its size.

Different Sizes and Capabilities

Gemini is a “series of models” that will be made available in different sizes and capabilities. It may utilise memory, fact-checking against sources like Google Search, and improved reinforcement learning to enhance accuracy and reduce hazardous hallucinations.

Impact of Gemini

Gemini is expected to have a significant impact on the AI industry. It’s Google’s most powerful AI model yet and, according to Google, outperforms OpenAI’s GPT-4. It powers applications and devices like the Bard chatbot and the Pixel 8 Pro. Google claims it is one of the first models to be built as a multimodal LLM from the ground up, which should make interaction more natural and "human-like".

