Google debuts Gemini 1.5 Flash, a fast multimodal model
Google is announcing the release of Gemini 1.5 Flash, a small multimodal model built for scale and designed to tackle narrow, high-frequency tasks. It comes with a “breakthrough” one million token context window and is available today in public preview through the Gemini API in Google AI Studio.
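For developers who want to try it, the minimal sketch below shows one way to call the model through the Gemini API, assuming the google-generativeai Python SDK and an API key from Google AI Studio; the key placeholder and the "gemini-1.5-flash" model identifier are illustrative, so check AI Studio for the exact model names available to your account.

```python
# Minimal sketch: calling Gemini 1.5 Flash through the Gemini API.
# Assumes the google-generativeai Python SDK and an API key from Google AI Studio.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key created in Google AI Studio

# "gemini-1.5-flash" is the model identifier assumed here; verify against the model list in AI Studio.
model = genai.GenerativeModel("gemini-1.5-flash")

response = model.generate_content("Summarize the key differences between a CPU and a GPU.")
print(response.text)
```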
However, Gemini 1.5 Flash isn’t the only model supporting such a large number of tokens. Its counterpart, Gemini 1.5 Pro, which debuted in February, will see its context window double from one million to two million tokens. Developers interested in that expansion will have to sign up for a waitlist.
There are some notable differences between Gemini 1.5 Flash and Gemini 1.5 Pro. The former is optimized for output speed, while the latter is a heavier model that performs comparably to Google’s large Gemini 1.0 Ultra. Josh Woodward, Google’s vice president of Google Labs, says developers should use Gemini 1.5 Flash for quick tasks where low latency matters. Gemini 1.5 Pro, he explains, is geared toward “more general or complex, often multi-step reasoning tasks.”
Developers now have a wider selection of AI models to choose from, rather than a one-size-fits-all approach. Not every app requires the same data and AI capabilities, and having variations can shape how users experience an AI-powered service. The appeal is that Google has found a way to bring a state-of-the-art AI model to developers while accelerating its performance. The biggest downside is that it isn’t trained on as much data as some developers may want; in that case, the next option is to move up to Gemini 1.5 Pro.
Google’s model lineup spans the spectrum, from the lightweight Gemma and Gemma 2 to Gemini Nano, Gemini 1.5 Flash, Gemini 1.5 Pro, and Gemini 1.0 Ultra. “Developers can move between the different sizes, depending on the use case. That’s why it’s got the same multimodal input abilities, the same long context, and, of course, runs as well in the same sort of backend,” Woodward points out.
The new model was revealed 24 hours after one of Google’s biggest AI competitors, OpenAI, unveiled GPT-4o, a multimodal LLM that will be available to all users and includes a desktop app.
Both Gemini 1.5 models are available in public preview in more than 200 countries and territories worldwide, including the European Economic Area, the UK, and Switzerland.


