Google has announced three new models in its Gemini family and made them available as experimental versions to gather feedback from developers.
The release continues Google's iterative approach, rather than jumping straight to Gemini 2.0. The experimental models are improved versions of Gemini 1.5 Pro and Gemini 1.5 Flash, along with a brand-new, smaller Gemini 1.5 Flash-8B.
Logan Kilpatrick, product lead at Google, said Google releases experimental models “to gather feedback and get our latest updates into the hands of developers. What we learn from experimental launches determines how we release models more broadly.”
Google says the updated Gemini 1.5 Pro is a major improvement over the previous version, with stronger coding capabilities and better handling of complex prompts. Gemini 1.5 models are designed to handle extremely long contexts and can retrieve and process fine-grained information from at least 10 million tokens. However, the experimental models are limited to 1 million tokens.
Gemini 1.5 Flash is the lower-cost, low-latency model designed to handle large-scale tasks and summarize long contexts of multimodal input. In initial testing of the experimental versions, the improved Pro and Flash models climbed the LMSYS leaderboard.
Chatbot Arena Update⚡!
Latest Gemini (Pro/Flash/Flash-8b) results are now live, with over 20,000 community votes!
Highlights:
– New Gemini-1.5-Flash (0827) makes a huge jump, climbing from 23rd to 6th place in the overall ranking!
– New Gemini-1.5-Pro (0827) shows strong progress in coding and math over … https://t.co/6j6EiSyy41 pic.twitter.com/D3XpU0Xiw2— lmsys.org (@lmsysorg) August 27, 2024
Gemini Flash 8B
When Google updated the Gemini 1.5 technical report earlier this month, it presented some of the Google DeepMind team's early work on an even smaller, 8-billion-parameter variant of the Gemini 1.5 Flash model.
The multimodal experimental Gemini 1.5 Flash-8B model is now available for testing, with benchmark results showing it outperforming Google's lightweight Gemma 2 9B model and Meta's much larger Llama 3 70B.
The idea behind Gemini 1.5 Flash-8B is to offer an extremely fast, very low-cost model that retains multimodal capabilities. Google says it “can power intelligent agents at scale and enable real-time interactions with a big user base.” Flash-8B is “intended for everything from high-volume multimodal use cases to long-context summarization tasks.”
Developers looking for a lightweight, low-cost, fast multimodal model with a 1M-token context window will likely be more interested in Gemini 1.5 Flash-8B than in the improved Flash and Pro models. Those seeking more advanced models will be wondering when to expect Google to release Gemini 1.5 Ultra.