Google is expanding its visual search app Lens with the ability to answer questions about your surroundings in near real time.
English-speaking Android and iOS users with the Google app installed can now start recording a video through Lens and ask questions about objects of interest in the video.
Lou Wang, head of product management at Lens, said the feature uses a “tailor-made” Gemini model to understand the video and relevant questions. Gemini is Google's family of AI models and powers a number of products across the company's portfolio.
“Let’s say you want to learn more about some interesting fish,” Wang said in a press briefing. “(Lens will) produce a chart that explains why they swim in circles, along with other resources and helpful information.”
To access Lens' new video analysis feature, you'll need to join Google's Search Labs program and opt in to the experimental AI Overviews and More features in Labs. In the Google app, holding down your smartphone's shutter button activates Lens' video recording mode.
If you ask a question while recording a video, Lens will link to an answer supplied by AI Overviews, the feature in Google Search that uses AI to summarize information from the web.
According to Wang, Lens uses AI to determine which frames in a video are most “interesting” and salient – and, most importantly, relevant to the question being asked – and uses those to “ground” the answer from AI Overviews.
“All of this comes from observing how people are currently trying to use things like Lens,” Wang said. “If you lower the barrier to asking these questions and help people satisfy their curiosity, people will naturally pick up on it.”
The introduction of video for Lens follows a similar feature that Meta unveiled last month for its Ray-Ban Meta AR glasses. Meta plans to bring real-time AI video capabilities to the glasses, allowing wearers to ask questions about their surroundings (e.g., “What kind of flower is this?”).
OpenAI has also announced a feature that lets its Advanced Voice Mode tool understand video. Eventually, Advanced Voice Mode – a premium feature of ChatGPT – will be able to analyze video in real time and take that context into account when responding to you.
Google appears to have beaten both companies to the punch – except for the fact that Lens is asynchronous (you can't chat with it in real time), and assuming the video feature works as advertised. We weren't shown a live demo during the press briefing, and Google has a history of overpromising when it comes to the capabilities of its AI.
In addition to video analysis, Lens can now search with images and text in one go. English-speaking users, including those not enrolled in Labs, can launch the Google app, press and hold the shutter button to take a photo, then ask a question out loud.
Finally, Lens is getting new e-commerce-specific features.
Starting today, when Lens on Android or iOS detects a product, it will display details about it, including price and offers, brand, reviews, and stock. Product identification works on uploaded and newly taken photos (but not videos) and is initially limited to select countries and certain shopping categories, including electronics, toys, and beauty.
“Let’s say you saw a backpack and you like it,” Wang said. “You can use Lens to identify that product, and you can immediately see details you might be wondering about.”
There is also an advertising component. According to Google, the results page for products identified by Lens will also show “relevant” shopping ads with options and prices.
Why put advertising in Lens? According to Google, around 4 billion Lens searches each month are related to shopping. For a tech giant whose lifeblood is advertising, the opportunity is too lucrative to pass up.