Google I/O 2026: Gemini 3.5 Flash, Omni Video, Spark Background Agents, and the Antigravity Agent Stack

Published on 20.05.2026

AI & AGENTS

Google I/O 2026: Gemini 3.5 Flash, Omni (NanoBanana for Video), Spark (background agents), and Antigravity 2.0

TLDR: Google I/O 2026 was less about a single model announcement and more about Google assembling a coherent agent execution platform. Gemini 3.5 Flash is the fast agentic engine, Antigravity is the orchestration and deployment substrate, Gemini Omni adds multimodal generation, and Spark extends it all to a 24/7 cloud-based personal agent. The story Google told this year was infrastructure, not just capability.

Summary: The number that anchors this I/O announcement is 3.2 quadrillion tokens processed per month. A year ago it was 480 trillion. That's a seven times increase, and it tells you something important about the scale at which Google is now operating and what they need their models to do. Gemini 3.5 Flash is the answer to that operational reality: a model built for high-throughput agentic workloads, not for maximizing a benchmark score, with a one-million-token context window, 65,000 token maximum output, and four thinking levels that let you dial cost and depth per task. It's generally available today across Gemini, Search AI Mode, the API, Android Studio, and enterprise surfaces. The benchmark numbers are strong — it beats Gemini 3.1 Pro on Terminal-Bench 2.1, GDPval-AA, and MCP Atlas, and it runs four times faster than comparable frontier models, reportedly up to twelve times faster within the Antigravity serving environment.

The third-party picture is worth paying attention to. Artificial Analysis puts Gemini 3.5 Flash at an Intelligence Index score of 55, up nine points over Gemini 3 Flash, with a hallucination rate that dropped 31 percentage points on their setup. They also note it's 5.5 times more expensive to run than Gemini 3 Flash and 75% more expensive than Gemini 3.1 Pro on their benchmark suite. The Arena data is more straightforward: number nine in Text Arena, number nine in Code Arena for Frontend, with a 1507 score that's a 70-point jump over Gemini 3 Flash. The performance gain is real. The cost compression story is less clean than Google's marketing implies, and several observers noted this directly — the "Flash" label increasingly means something closer to what would have been called a high-end product model in prior generations. Whether that matters to you depends on whether you were counting on Flash meaning "cheap workhorse."

Antigravity is the piece that deserves more attention than it got in the headline coverage. Google is no longer presenting agents as a thin wrapper around a chat interface. Antigravity 2.0 is a desktop app with multi-agent orchestration built in, there's an Antigravity CLI, an SDK, and Managed Agents in the Gemini API that give you a hosted Linux sandbox supporting Bash, Python, and Node, with file access, browsing, custom skills defined in markdown, and repository and GCS mounts — all from a single API call. Google's marquee demo built a functioning operating system in twelve hours using 93 parallel sub-agents, 15,000-plus model requests, 2.6 billion tokens, and under a thousand dollars in API credits. Stage-managed benchmark or not, that architectural picture — many fast agents running in parallel over many hours — is what Google wants developers building toward.

Gemini Omni extends the model family into multimodal generation, starting with video. The framing is not "video generation model" but rather a system that merges Gemini's reasoning and world knowledge with Google's generative media capabilities, supporting text, image, audio, and video inputs and producing video edits and generation in return. Paid Gemini users get it today in the app and Flow. YouTube Shorts and Create get it this week for free. APIs come in the coming weeks. The editing consistency demonstrations drew positive responses, particularly the ability to maintain scene and character coherence across multi-turn conversational edits — which is the hard part that most video generation systems still fail at. Some observers were less enthusiastic about the interface design. The more durable observation is that Omni positions Google as competing on world grounding and physical understanding, not just text-centric intelligence.

Gemini Spark is the 24/7 personal agent piece: it runs on dedicated Google Cloud virtual machines, continues working while your device is closed, and checks with you before taking major actions. The Search redesign is arguably the most consequential consumer announcement — Google is moving Search from retrieval and ranking toward background agentic monitoring with persistent tasks, real-time signal synthesis, and on-the-fly generation of custom visual tools and simulations using Antigravity and 3.5 Flash. That's a meaningful shift in what Search is. Andrej Karpathy joining Anthropic was the day's most engaged tweet, which says something about where community attention actually was during I/O.

Key takeaways:

  • Gemini 3.5 Flash is GA with 1M context, four thinking levels, and benchmark numbers that outperform Gemini 3.1 Pro on agentic tasks — but third parties find it 75% costlier than 3.1 Pro on end-to-end suites
  • Antigravity 2.0 reframes Google's agent story as execution infrastructure: desktop app, CLI, SDK, and hosted sandboxes via a single API call
  • Gemini Omni extends into multimodal generation starting with video, with a world-model framing rather than a pure content-generation framing
  • Google Search is shifting from retrieval to background agentic monitoring with generated UI tools — a structural change in what Search does

Why do I care: The Antigravity Managed Agents API is the announcement I'd spend time with this week. Getting a hosted Linux sandbox with browsing, file access, and custom skills from a single API call removes a substantial amount of infrastructure you'd otherwise build yourself. The economics are worth watching carefully though — if 3.5 Flash is 75% more expensive than 3.1 Pro on real workloads, the cost accumulates fast in agent loops. The architecture is right. The pricing transparency needs work before you commit to it for production agentic pipelines.

[AINews] Google I/O 2026: Gemini 3.5 Flash, Omni (NanoBanana for Video), Spark (background agents), and Antigravity 2.0