
Google I/O 2024 – Here are the AI highlights that Google revealed

Google's I/O 2024 event kicked off on Tuesday with announcements of several new AI products and developments.

OpenAI may have tried to outshine Google by releasing GPT-4o on Monday, but the Google I/O 2024 keynote was still packed with exciting announcements.

Here's a look at the standout AI advances, new tools, and prototypes Google is experimenting with.

Ask Photos

Google Photos, Google's photo storage and sharing service, will be searchable with natural language queries via Ask Photos. Users can already search for specific objects or people in their photos, but Ask Photos takes this to the next level.

Sundar Pichai, CEO of Google, showed how you could use Ask Photos to recall your car's license plate or get a recap of a child's progress in learning to swim.

Powered by Gemini, Ask Photos understands the context across images and can extract text, create highlight compilations, or answer questions about saved photos.

With more than 6 billion images uploaded to Google Photos every day, Ask Photos needs a large context window to be useful.

Gemini 1.5 Pro

Pichai announced that Gemini 1.5 Pro with a 1M-token context window will be available to Gemini Advanced users. That's roughly 1,500 pages of text, hours of audio, or a full hour of video.

Developers can join a waitlist to try Gemini 1.5 Pro with an impressive 2M-token context window, which is coming to general availability soon. According to Pichai, this is the next step on Google's journey toward its ultimate goal of infinite context.

Gemini 1.5 Pro has also seen a performance boost in translation, reasoning, and coding, and is now truly multimodal, with the ability to analyze uploaded video and audio.
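For developers, these long-context multimodal features are exposed through the Gemini API. Below is a minimal sketch using the google-generativeai Python SDK; the file name lecture.mp4, the API key placeholder, and the polling interval are illustrative assumptions, not details confirmed in the keynote.

```python
# Minimal sketch: asking Gemini 1.5 Pro about an uploaded video.
# Assumes the google-generativeai SDK; "lecture.mp4" is a hypothetical file.
import time

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Upload the video through the File API so it can be referenced in a prompt.
video = genai.upload_file(path="lecture.mp4")

# Video uploads are processed asynchronously; poll until the file is ready.
while video.state.name == "PROCESSING":
    time.sleep(5)
    video = genai.get_file(video.name)

model = genai.GenerativeModel("gemini-1.5-pro-latest")
response = model.generate_content(
    [video, "Summarize the key points covered in this video."]
)
print(response.text)
```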

Google Workspace

The expanded context and multimodal capabilities make Gemini especially useful when integrated with Google Workspace.

Users can ask Gemini natural language questions about their emails. The demo showed a parent asking for a summary of recent emails from their child's school.

Gemini will also be able to extract highlights from Google Meet meetings up to an hour long and answer questions about them.

NotebookLM – Audio Overview

Google released NotebookLM last year. It lets users upload their own notes and documents, which NotebookLM then becomes an expert on.

This makes it incredibly useful as a research guide or tutor, and Google demonstrated an experimental upgrade called Audio Overview.

Audio Overview takes the input source documents and generates an audio discussion based on their content. Users can join the conversation, using their voice to ask NotebookLM questions and steer the discussion.

There's no word yet on when Audio Overview will be released, but it could be a great help to anyone who needs a tutor or a sounding board to work through a problem.

Google also announced LearnLM, a new family of models based on Gemini and fine-tuned for learning and education. LearnLM will make NotebookLM, YouTube, Search, and other educational tools more interactive.

The demo was very impressive, but some of the missteps from Google's original Gemini launch videos already seem to have crept into this event as well.

AI agents and Project Astra

Pichai says AI agents powered by Gemini will soon be able to perform everyday tasks on our behalf. Google is developing prototype agents that can work across platforms and browsers.

The example Pichai gave was a user instructing Gemini to return a pair of shoes: the agent then works through several emails to find the relevant details, logs the return with the online store, and schedules a courier pickup.

Demis Hassabis introduced Project Astra, Google's prototype conversational AI assistant. The demonstration of its multimodal capabilities gave a glimpse of a future where an AI answers questions in real time based on live video and remembers details from footage it has already seen.

Hassabis said some of these features will roll out later this year.

Generative AI

Google also gave us a look at the generative AI tools for images, music, and video that the company is working on.

Google introduced Imagen 3, its most advanced image generator. It reportedly follows the details of sophisticated prompts more accurately and delivers more photorealistic images.

Hassabis said Imagen 3 is Google's “best model yet for rendering text, which has been a challenge for image generation models.”

Music AI Sandbox is an AI music generator designed as a professional collaborative music creation tool rather than a full-song generator. It looks like a great example of how AI could be used to make good music with a human driving the creative process.

Veo is Google's video generator, which turns text, image, or video prompts into minute-long 1080p clips. It also accepts text prompts to make edits to a video. Will Veo be as good as Sora?

Google will also roll out its SynthID digital watermarking for text, audio, images, and video.

Trillium

All of these new multimodal capabilities demand a lot of computing power to train the models. Pichai introduced Trillium, the sixth generation of Google's Tensor Processing Units (TPUs). Trillium delivers more than four times the compute performance of the previous TPU generation.

Trillium will be available to Google Cloud customers later this year, and Google Cloud will also offer NVIDIA's Blackwell GPUs in early 2025.

AI search

Google will integrate Gemini into its search platform as it moves toward using generative AI to answer queries.

With AI Overviews, a search query returns a comprehensive answer compiled from multiple online sources. This makes Google Search more of a research assistant than just a way to find a website that might have the answer.

Gemini lets Google Search use multi-step reasoning to break down complex, multi-part questions and return the most relevant information from multiple sources.

Gemini's video understanding will soon allow users to query Google Search with a video.

This is useful for Google Search users, but it is likely to send significantly less traffic to the sites Google sources its information from.

Gemini 1.5 Flash

Google announced a lightweight, cheaper, and faster model called Gemini 1.5 Flash. According to Google, the model is “optimized for narrower or high-frequency tasks where the model's responsiveness is most important.”

Gemini 1.5 Flash costs $0.35 per million tokens, far less than the $7 per million you'd pay to use Gemini 1.5 Pro.
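As a back-of-the-envelope comparison, here's a quick sketch of what a given workload would cost on each model at those quoted per-million-token prices; the 50-million-token workload is an invented example.

```python
# Rough cost comparison based on the per-million-token prices quoted above.
PRICE_PER_M_TOKENS = {
    "gemini-1.5-flash": 0.35,  # USD per million tokens
    "gemini-1.5-pro": 7.00,
}

def estimate_cost(model: str, tokens: int) -> float:
    """Estimated cost in USD for processing a given number of tokens."""
    return tokens / 1_000_000 * PRICE_PER_M_TOKENS[model]

# Hypothetical workload: 50 million tokens per month.
tokens = 50_000_000
for model in PRICE_PER_M_TOKENS:
    print(f"{model}: ${estimate_cost(model, tokens):,.2f}")
# gemini-1.5-flash: $17.50
# gemini-1.5-pro: $350.00
```

At twenty times cheaper, Flash looks like the obvious choice for high-volume tasks that don't need Pro's full reasoning ability.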

Each of these developments and new products deserves its own article. We'll post updates as more information becomes available or once we're able to try them for ourselves.
