Working 28 Hours While AI Agents Clock 90: A Real-World Productivity Nowcast

Last Week I Worked 28 Hours. My Agents Put in Another 90.

TLDR: A solo operator shares a detailed week-by-week breakdown of how AI agents handled the bulk of his work across three dedicated workspaces, achieving a claimed 4.2x productivity multiplier over 2023 levels. The piece then pivots to a sharp critique of Anthropic's labor market findings, arguing that a "slow freeze" in hiring is already underway.

Summary:

This is a fascinating piece because it does something most AI productivity articles do not: it shows receipts. The author describes running three separate AI workspaces in Obsidian, each configured with different instructions, context, Skills, and MCP servers. The primary workhorse is Claude Opus 4.6 running in Claude Code, handling what the author calls "busywork" across content creation, agency operations, and research. On top of that, he layers in Perplexity AI for internet research, Google's Nano Banana Pro for image generation, Apollo for lead data, and four additional AI models for a filmmaking side project. That is eight different AI models in active use during a single work week.

The most concrete example is the lead generation workflow. The author needed to pre-screen 6,000 companies for cold outreach and do deep research on 250 of them, covering lines of business, tech stacks, and platform strategies. His AI agents handled most of this autonomously, generating 250 personalized company reports in about two minutes of runtime. He is honest that it took roughly six hours over preceding weeks to build and refine that pipeline, which is an important detail many AI productivity stories conveniently leave out. The upfront investment in tooling is real and non-trivial.

Where the piece gets genuinely interesting is the second half, where the author pushes back on Anthropic's recent economic research paper. Anthropic introduced a metric called "observed exposure" and concluded there is "no systematic increase in joblessness for highly exposed workers." The author counters with hard numbers: US job openings have dropped to 6.5 million, the lowest since 2017 outside the pandemic. Monthly job additions in 2026 are down 71% year-over-year. The quit rate sits at 2.0%, well below pre-pandemic levels, meaning people are staying in jobs because there is nowhere to go. In Europe, youth unemployment remains at 15.1% EU-wide, with Spain at 23.5% and Finland at 22.4%.

He cites a January 2026 Harvard Business Review survey finding that 60% of companies have reduced headcount in anticipation of AI productivity gains, while only 2% made cuts based on actual, proven results. That is a striking gap. Companies are cutting based on vibes, not evidence.

The article also includes a rapid-fire news roundup covering the Anthropic-Pentagon standoff, GPT-5.4 hitting 50% on FrontierMath, three Chinese labs shipping frontier-competitive models in a single week, and the W3C's WebMCP standard for browser-to-agent communication.

Now, here is what deserves pushback. The 4.2x productivity claim is self-reported and based on the author's own estimate of how long tasks would have taken in 2023. That is inherently unreliable. Memory is a terrible benchmark. We tend to overestimate how hard things used to be and underestimate the learning curve we have already climbed. A more rigorous approach would involve actual time-tracking data from 2023, which the author does not provide.

There is also something the author dances around but never directly addresses: what is the quality of those 250 auto-generated company reports? Generating them in two minutes is impressive from a throughput perspective, but if the reports contain hallucinated details about a prospect's tech stack or mischaracterize their business, the outreach built on top of them could do more harm than good. Speed without accuracy is just faster failure.

The piece also has a coaching program pitch embedded in it, which is worth noting. The author has a financial incentive to make AI productivity look as dramatic as possible, since he sells a program helping executives build their own AI operating systems. That does not invalidate his observations, but it does mean you should calibrate accordingly.

What is genuinely missing from this analysis is any discussion of failure modes. Which tasks did the AI agents get wrong? How much time was spent correcting agent mistakes? Every practitioner working with AI agents at this intensity encounters hallucinations, context window limitations, and tool-calling failures. The absence of any mention of these suggests a selective retelling.

Key takeaways:

Running multiple AI workspaces with specialized MCP server configurations can create meaningful productivity gains, but requires significant upfront setup investment
The 4.2x productivity multiplier claim is self-reported and should be taken as directional rather than precise
Lead generation and research tasks see the largest speed-ups because they are fundamentally information retrieval and synthesis, which is where current AI models excel
The labor market data paints a more concerning picture than Anthropic's "observed exposure" research suggests, with companies cutting preemptively based on expectations rather than evidence
60% of companies reducing headcount in anticipation of AI gains while only 2% have proven results represents a potentially dangerous expectations gap
Three Chinese labs shipping frontier-competitive models in one week signals that the model selection landscape is widening and getting cheaper
WebMCP from Google, Microsoft, and W3C could become a foundational standard for how AI agents interact with the web

Last week I worked 28 hours. My agents put in another 90.