Google's Always-On Agent Is Only as Good as Your Brief

Published on 21.05.2026

AI & AGENTS

Google's Always-On Agent Is Only as Good as Your Brief

TLDR: Google I/O put a persistent cloud agent called Gemini Spark front and center, capable of running tasks around the clock while you sleep. The problem is not whether the model is powerful enough. It is whether the person giving it instructions knows how to define work clearly.

Google spent two hours at I/O telling you the work will get done while you do something else. Gemini Spark runs on its own cloud machine, chewing through email, monitoring projects and doing research under your direction. The search box got its biggest redesign in 25 years. There are new audio glasses, a faster default model in Gemini 3.5 Flash, and a reshuffled pricing tier: a new $100-per-month Ultra plan sitting below a top tier that dropped from $250 to $200. Big day. The bit the keynote skipped is the part that actually matters.

Everything announced points the same direction. AI that reasons on its own, acts in the background, and strings tools together without you approving every individual step. Spark is plugged into Gmail, Docs and outside apps via MCP, and it does not clock off. On stage, the demo task is clean. The inbox is tidy. The goal fits in one sentence. Your actual work is messier than that.

Here is what the keynote will never sell you: a background agent does not fail because the model is weak. Gemini 3.5 Flash is fast, and a stronger Pro version lands in June. Agents fail because the instructions given to them were never properly defined. A thin brief returns thin output, and now that output arrives at 3am while you sleep, looking polished and confident and full of wrong answers.

Most people are genuinely bad at handing work to a human assistant. The brief is half a sentence. The actual constraints live inside their own head. Nobody specified what done looks like. You correct it after the fact and call that managing. An agent makes this habit expensive in a specific way: a person senses a half-baked instruction and asks a clarifying question. An agent takes the half-baked instruction and runs with it, on a schedule, with complete confidence. Feed it a fuzzy task and it returns a tidy pile of wrong work by morning.

The people who will actually pull value from Spark are the ones who already write tight briefs. If you cannot tell a new hire what good output looks like in plain words, a cloud machine acting on your behalf will not save you. It will scale your vagueness and bill you monthly for the privilege. The discipline required is boring. It involves writing out inputs, listing steps in order, describing the exact shape of the output you want, and defining the single line that tells you the job is done. Run it as a plain prompt first, watch where it guesses wrong, fix the brief rather than the tool, and only automate once it produces something you would sign your name to without edits. Google showed the automation first and the definition never. Do it the other way around.

As for the access question: a background agent with standing permission to your mail, your documents, your calendar and third-party tools is a fundamentally different animal from a chatbot you paste text into. Decide in advance what it may act on without asking you, what it must surface to you before proceeding, and how you review its output before it leaves the building. Treat it like a junior with keys to the office, not a faster search bar.

Key takeaways:

  • Gemini Spark is a persistent 24/7 cloud agent wired into Workspace and third-party tools via MCP, with new pricing starting at $100 per month.
  • The model quality is not the bottleneck. Poorly defined tasks produce confidently wrong output at scale, and automation makes that problem arrive faster.
  • Before enabling any background agent, write a clear brief for one repeating task, test it manually until it produces work you would sign off on, then automate. Not the other way around.
  • Grant agent access deliberately: define what it can act on autonomously, what requires your approval, and how you audit its output before it reaches anyone else.

Why do I care: As someone who builds and ships frontend systems, the thing that should make you stop is not the pricing or even the feature list. It is the access model. An agent with persistent, broad permissions to your workspace is an architectural decision, not a subscription upgrade. The same rigor I would apply to reviewing what a third-party SDK can touch in a production app applies here. The keynote made it look effortless because it demoed a clean task against a clean inbox. Define your tasks precisely first, then let the agent run. That order is not optional.

Google sold you an employee who never sleeps. Now what