Nvidia Exits AI Lab Investments, GPT-5.4 Gets a Million-Token Window, and Anthropic Hits $20B Revenue

Published on 05.03.2026

AI & AGENTS

Nvidia Pulls Back from OpenAI and Anthropic as IPOs Approach

TLDR: Nvidia is unwinding its investment positions in both OpenAI and Anthropic as both companies prepare for IPOs, ending what had been a roughly $40 billion strategic bet. This signals a shift from Nvidia being an investor-partner to purely a hardware supplier in the AI ecosystem.

Summary:

This is one of those moves that looks boring on the surface but tells you everything about where we are in the AI infrastructure cycle. Nvidia pouring money into AI labs made sense when those labs were unproven and Nvidia needed to ensure demand for its GPUs. Now that demand is not exactly a problem -- every major tech company is fighting over GPU allocations -- Nvidia no longer needs to subsidize its own customers.

The IPO angle is worth scrutinizing. When OpenAI and Anthropic go public, their investor relationships become subject to entirely different regulatory and disclosure requirements. Nvidia sitting on the cap table of its two largest customers while also being their primary hardware supplier creates the kind of conflict-of-interest story that would dominate every S-1 risk section. Getting out ahead of that is pragmatic, not dramatic.

What deserves more critical attention is what this means for the "vertical integration" thesis that Silicon Valley has been pushing. The narrative was that the AI stack would consolidate -- chip makers would invest in labs, labs would build applications, and the whole thing would be one vertically integrated machine. Nvidia stepping back suggests the opposite: the AI industry is actually disaggregating into distinct layers, much like the rest of enterprise tech eventually does.

The $40 billion figure also warrants some context. Nvidia's market cap is well north of $2 trillion at this point. These investments, while large in absolute terms, were always more strategic positioning than financial exposure. The real question is whether OpenAI and Anthropic will continue to get preferential GPU access without the investment relationship, or whether this opens the door for AMD and custom silicon to make further inroads.

Key takeaways:

  • Nvidia is divesting from OpenAI and Anthropic ahead of their anticipated IPOs
  • The move reduces regulatory and conflict-of-interest complications for all three companies
  • It signals the AI industry is maturing into distinct, separable layers rather than vertically integrating
  • GPU demand is strong enough that Nvidia no longer needs investment relationships to guarantee sales
  • Watch whether hardware access terms change for these labs post-divestiture

The AI Break - Original Newsletter


GPT-5.4: One Million Token Context Window and Extreme Reasoning Mode

TLDR: OpenAI is reportedly preparing GPT-5.4 with an extreme reasoning mode and a 1-million-token context window, more than doubling the current limits. This would represent a significant leap in what LLMs can process in a single session, but practical utility at that scale remains an open question.

Summary:

A million-token context window is the kind of spec that sounds revolutionary until you think about what you would actually do with it. The current practical limit for most users is somewhere around 200K tokens with Claude and GPT-4-class models, and even at that scale, retrieval accuracy degrades toward the middle of the context (the well-documented "lost in the middle" problem). Doubling to a million tokens does not magically solve this -- it potentially makes it worse unless the architecture has fundamentally changed.

That said, there are legitimate use cases where massive context windows matter. Entire codebases can be loaded for analysis, full legal document sets can be processed without chunking, and long-running agent sessions can maintain complete conversation history. The question is whether the accuracy at 1M tokens is good enough to be useful, or whether it is just a marketing number. OpenAI has been competing with Google on context length (Gemini has offered 1M+ for a while), so competitive pressure is clearly a factor here.

The "extreme reasoning mode" language is interesting because it suggests a further push toward chain-of-thought and multi-step inference. If this is a system-2-style reasoning capability baked into the model, it could genuinely change what kinds of problems LLMs can tackle -- multi-hop reasoning, complex planning, and extended mathematical proofs. But "extreme" is doing a lot of marketing work in that description, and we should wait for benchmarks before getting excited.

The practical implication for developers is that context window management strategies may need to evolve. RAG pipelines were partly a workaround for limited context windows. If context windows become large enough and cheap enough, some RAG use cases might be better served by simply stuffing everything into the prompt. But cost matters -- processing a million tokens is not free, and latency scales with context size.

Key takeaways:

  • GPT-5.4 reportedly targets a 1-million-token context window, a competitive response to Google Gemini
  • An "extreme reasoning mode" suggests deeper chain-of-thought capabilities
  • Larger context windows do not automatically mean better retrieval -- lost-in-the-middle remains a concern
  • Some RAG patterns may become unnecessary if large-context inference becomes cost-effective
  • Wait for independent benchmarks before assuming this changes your architecture

Google NotebookLM Generates Cinematic Video Overviews from Documents

TLDR: Google NotebookLM now uses Gemini 3 and Veo 3 to generate cinematic video overviews from your uploaded documents, essentially turning your notes and research into narrated visual content. It is an impressive demo of multimodal AI but raises questions about depth versus polish.

Summary:

NotebookLM has been one of Google's more interesting AI products, largely because it found a real use case that people actually care about: making sense of large document collections. The audio overview feature (turning docs into podcast-style conversations) was genuinely useful and surprisingly well-received. Video is the logical next step, and combining Gemini 3 for understanding with Veo 3 for generation is technically impressive.

The concern here is the same one that applies to all AI-generated video content: polish is not the same as accuracy. Cinematic visuals can make superficial content feel authoritative. If you feed NotebookLM a set of documents with conflicting information or nuanced arguments, the video output will likely smooth over those complexities in favor of a clean narrative. That is fine for summaries but potentially misleading for research or decision-making.

From a technical standpoint, this is a meaningful integration of multiple AI capabilities -- document understanding, summarization, script generation, voice synthesis, and video generation -- into a single pipeline. That kind of multimodal orchestration is where Google's infrastructure advantage becomes apparent. Running Gemini 3 and Veo 3 in sequence for a single user request is computationally expensive, and few companies besides Google can offer that at scale without charging enterprise pricing.

The practical implication for knowledge workers is that the barrier to creating professional-looking content from raw research has dropped to near zero. The question is whether this accelerates understanding or just accelerates the production of content that looks like understanding.

Key takeaways:

  • NotebookLM combines Gemini 3 and Veo 3 to turn documents into narrated video summaries
  • Builds on the already-popular audio overview feature
  • Multimodal pipeline integration is technically impressive and hard to replicate outside Google's infrastructure
  • Risk of polish masking shallow analysis -- cinematic does not mean accurate
  • Useful for presentations and knowledge sharing, but verify the output against source material

Anthropic Approaching $20 Billion Annual Revenue Run Rate

TLDR: Anthropic is reportedly closing in on a $20 billion annual revenue run rate as Claude gains users at the expense of ChatGPT. This would make it one of the fastest-growing enterprise software companies in history, though the revenue-to-cost ratio in AI remains the elephant in the room.

Summary:

Twenty billion dollars in annual revenue is a staggering number for a company that barely existed five years ago. To put it in perspective, it took Salesforce about 18 years to hit that mark. Anthropic is doing it in roughly four. Even accounting for the fact that AI is operating in a bubble-adjacent environment with inflated expectations, the underlying demand is clearly real.

The framing that users are "leaving ChatGPT behind" deserves some skepticism, though. The AI assistant market is not zero-sum. Most power users have subscriptions to multiple services and switch between them based on task requirements. Claude tends to outperform on long-form analysis and coding tasks, while ChatGPT maintains advantages in multimodal capabilities and ecosystem breadth. What is more likely happening is that the total market is expanding and Anthropic is capturing a disproportionate share of the growth, not necessarily stealing existing ChatGPT users.

The more interesting question is profitability. Revenue is one thing; margin is another. Training and inference costs for frontier models are enormous, and Anthropic has been open about the fact that it is not yet profitable. At $20 billion in revenue, the company is almost certainly still burning through capital at a significant rate given the compute infrastructure required to serve Claude at scale. The path to profitability requires either dramatic cost reductions in inference (which is happening but slowly) or price increases (which would slow growth).

The broader industry signal here is that enterprise adoption of AI is accelerating faster than consumer adoption. Anthropic's revenue growth is heavily driven by API usage and enterprise contracts rather than consumer subscriptions. This supports the thesis that AI's near-term economic impact will be primarily through business productivity tools rather than consumer applications.

Key takeaways:

  • Anthropic approaching $20B annual revenue run rate, making it one of the fastest-growing software companies ever
  • Growth driven more by market expansion than direct ChatGPT displacement
  • Enterprise API usage and contracts are the primary revenue driver
  • Profitability remains uncertain given massive compute infrastructure costs
  • Signals that enterprise AI adoption is outpacing consumer adoption

Google Search AI Mode Adds Canvas for Drafting and Editing

TLDR: Google Search now includes a Canvas feature within AI Mode that lets users draft and edit writing and code directly within search results. This blurs the line between search engine and productivity tool, positioning Google to capture more of the user workflow.

Summary:

This is Google continuing to execute on the strategy of making search the starting point for work, not just information retrieval. By adding a Canvas for drafting and editing directly within search results, Google is attempting to collapse the workflow that currently requires switching between a search engine, a text editor, and an AI assistant into a single interface.

The strategic logic is sound but aggressive. If users start drafting documents and writing code inside Google Search, they have less reason to open VS Code, Google Docs, or even ChatGPT for those tasks. Google is essentially competing with its own products (Docs, Colab) to prevent users from leaving Search for competitor AI tools. That internal cannibalization risk tells you how seriously Google takes the threat from ChatGPT and Claude as alternative starting points for knowledge work.

For developers, the code editing capability is worth watching. If Google can offer a competent code drafting experience within search -- leveraging Gemini for generation and providing an inline editing surface -- it could capture a meaningful portion of the "quick script" and "code snippet" use cases that currently drive traffic to Stack Overflow, GitHub Copilot, or ChatGPT. The quality of the code generation will determine whether this is useful or just a gimmick.

The broader implication is that the traditional boundaries between search, AI assistants, and productivity tools are dissolving. Every major tech company is racing to become the single interface where users start and finish their work. The winner will likely be determined not by which AI model is best, but by which workflow integration is most seamless.

Key takeaways:

  • Google Search AI Mode now includes inline document and code drafting via Canvas
  • Strategic move to prevent users from leaving Search for competitor AI tools
  • Risks cannibalizing Google's own products (Docs, Colab) but prevents external competition
  • Code drafting quality will determine whether this captures developer mindshare
  • Signals the collapse of boundaries between search, AI, and productivity tools

AI Investment Roundup: OpenAI's $110B Round, Ayar Labs, and MatX

TLDR: OpenAI closed a historic $110 billion funding round at a $730 billion valuation, Ayar Labs raised $500 million for silicon-photonic chip interconnects, and MatX raised $500 million for AI training chips claiming 10x performance over Nvidia GPUs.

Summary:

The numbers in this week's AI investment activity are almost numbing. OpenAI's $110 billion round -- with $50 billion from Amazon, $30 billion from SoftBank, and $30 billion from Nvidia -- is the largest private tech funding round in history. At a $730 billion pre-money valuation, OpenAI is being valued like a public company that has already won the market. The question that nobody seems to be asking loudly enough is: what does OpenAI need $110 billion for? The answer is compute infrastructure, and the scale of that need tells you something important about how expensive it is to stay at the frontier.

Ayar Labs is an interesting bet on a different part of the AI stack. Silicon-photonic chips that replace copper interconnects with optical connectivity address a genuine bottleneck in AI data centers: moving data between chips and between racks fast enough to keep the GPUs busy. As models get larger and training becomes more distributed, interconnect bandwidth becomes a limiting factor. AMD and Nvidia both participating as investors suggests this is not speculative -- it is addressing a known pain point.

MatX claiming 10x better LLM training performance versus Nvidia GPUs is the kind of claim that needs extraordinary evidence. Custom silicon for specific workloads can absolutely outperform general-purpose GPUs, and we have seen this with Google's TPUs. But "10x" is aggressive, and the real test will be whether MatX can deliver at scale and with the software ecosystem that makes Nvidia's CUDA so dominant. Leopold Aschenbrenner's involvement adds credibility from a research perspective, and Jane Street's investment adds financial sophistication, but hardware startups have a long history of impressive benchmarks that fail to translate into production deployments.

Key takeaways:

  • OpenAI's $110B round at $730B valuation is the largest private tech funding ever
  • The sheer capital requirements signal that frontier AI is extraordinarily expensive to build
  • Ayar Labs targets a real infrastructure bottleneck with optical interconnects for AI data centers
  • MatX's 10x performance claim over Nvidia needs independent verification at scale
  • The AI hardware ecosystem is diversifying beyond Nvidia, but CUDA's software moat remains formidable

Tradeoffs: Custom AI chips offer potential performance gains but face massive adoption barriers. Nvidia's dominance is not just hardware -- it is the CUDA ecosystem, developer tooling, and library support built over two decades. Any competitor needs to either match that ecosystem or offer such overwhelming performance advantages that customers will invest in porting their software stacks.

External Links (1)