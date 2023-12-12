A team at Google has proposed an artificial intelligence project that can tell users about their lives using data from their photos and searches, CNBC reported. The program is named Project Ellmann, after the biographer and literary critic Richard David Ellmann. Google would use its new LLM, Gemini, to ingest data such as search results and to spot patterns in a user’s photos, creating an AI chatbot that can “answer previously impossible questions”.

According to the report, the project aims to be “Your Life Story Teller”. Google has not revealed whether this capability will be added to the Google Photos app or to other products, but the technology is expected to enhance its existing offerings. According to a company blog post, Google Photos has over 1 billion users and over 4 trillion photos and videos.

According to the report, a Google product manager presented Project Ellmann alongside Gemini at an internal summit. He proposed that the project take a bird’s-eye approach to a person’s life story. The presentation stated, “We can’t answer tough questions or tell good stories without a bird’s-eye view of your life”.

It added, “We crawl through your photos, looking at their tags and locations to identify a meaningful moment. When we step back and understand your life in its entirety, your overarching story becomes clear.” The presentation also gave examples, such as recognising that 10 years have passed since a user’s graduation. One slide read, “It’s exactly 10 years since he graduated and is full of faces not seen in 10 years, so it’s probably a reunion.”

According to the presentation, the system would draw context from biographies, previous moments and photos, describing images more deeply than “just pixels with labels and metadata”. It would also be able to identify a series of moments, such as a user’s university years.

Google recently announced Gemini, its “most capable” large language model (LLM), which the company says can outperform OpenAI’s GPT-4. The model is multimodal, which means it can process and analyse data beyond text, such as photos, audio and video.

