This Week In React #277: TanStack RSC, React2DoS CVE, Vercel WebStreams 10x Faster, MUI v9, and the Vertical Codebase

Published on 15.04.2026

React Server Components Your Way: TanStack's Take

TLDR: TanStack Start treats React Server Components as plain data streams rather than a server-owned component tree. You fetch, cache, and render them on your terms, using existing tools like TanStack Query or TanStack Router.

Summary: The TanStack team has been deliberately slow to ship RSC support, and this post explains why the wait was worth it. Their core insight is that RSCs don't need to become the new center of gravity for your application architecture. In Next.js App Router, the server owns the tree and you opt into client interactivity. In TanStack Start, RSCs are just React Flight streams that you can fetch from anywhere on the server and decode wherever makes sense on the client. The client owns the tree. The server provides data.

This distinction matters more than it sounds. When RSCs are "just streams," they compose with everything you already have. TanStack Query manages them with cache keys, stale time, and background refetching. TanStack Router caches them in its loader system alongside other loader data, giving you the snappy back-button behavior for free. GET server functions can even be cached at the CDN layer since they're just HTTP responses with standard cache headers.
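
To make the "cache keys and stale time" idea concrete, here is a minimal sketch of caching a fetched payload by key. It is not TanStack Query's API: `createStaleCache`, the fake clock, and `fetchPost` are hypothetical names used only for illustration.

```typescript
// Hypothetical sketch, not TanStack Query's API. It only shows the idea of
// caching a fetched payload (for example, a Flight stream) by key with a stale time.
type Entry<T> = { value: Promise<T>; fetchedAt: number };

function createStaleCache<T>(staleTimeMs: number, now: () => number = Date.now) {
  const entries = new Map<string, Entry<T>>();

  return {
    // Returns the cached promise while it is fresh; otherwise calls `fetcher` again.
    get(key: string, fetcher: () => Promise<T>): Promise<T> {
      const hit = entries.get(key);
      if (hit && now() - hit.fetchedAt < staleTimeMs) {
        return hit.value;
      }
      const value = fetcher();
      entries.set(key, { value, fetchedAt: now() });
      return value;
    },
  };
}

// Usage with a fake clock and a fake fetcher standing in for a server function.
let clock = 0;
let fetchCount = 0;
const fetchPost = (): Promise<string> => {
  fetchCount += 1;
  return Promise.resolve(`payload #${fetchCount}`);
};

const postCache = createStaleCache<string>(1_000, () => clock);
postCache.get("post:42", fetchPost); // miss: fetches
postCache.get("post:42", fetchPost); // fresh hit: no fetch
clock = 1_500;
postCache.get("post:42", fetchPost); // stale: fetches again
```

A real query cache also handles background refetching, errors, and invalidation; the sketch only covers the key plus stale-time part.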

The performance numbers from migrating tanstack.com are honest in a way that most RSC evangelism isn't. Blog post pages dropped 153KB of gzipped JavaScript and Lighthouse scores moved from 52 to 74. Total Blocking Time went from 1,200ms to 260ms. But landing pages that are already dominated by an interactive UI shell saw flat or slightly worse results. The team is explicit that RSCs are a good fit for content-heavy, dependency-heavy pages and a weak fit for dashboards and app sessions.

What's genuinely new here is Composite Components: a pattern where the server renders UI while leaving typed slots open for the client to fill with components the server doesn't know about. The server positions a footer slot for post actions; the client passes a fully interactive PostActions component that uses hooks, state, and context. The server never sees it. The client assembles the final tree. That's a composition model most RSC frameworks don't even attempt. Worth experimenting with.

Key takeaways:

  • TanStack RSC is client-first: RSC output is data you fetch, not a server-owned tree you build around
  • RSC streams compose with TanStack Query and Router with zero special RSC modes
  • Performance gains are real on content-heavy pages; interactive-heavy pages see little benefit
  • Composite Components let the client fill server-rendered slots with its own components
  • TanStack Start intentionally avoids 'use server' actions because of the attack surface they open up

Why do I care: This is the RSC architecture I've wanted to exist. The TanStack approach removes the "you must reorganize your whole application around server components" tax that Next.js App Router imposes. If you're evaluating frameworks for a new project or a significant refactor, this is the alternative worth understanding.

React Server Components Your Way | TanStack Blog


React2DoS (CVE-2026-23869): A New DoS Vulnerability in RSC Flight Protocol

TLDR: Imperva disclosed a high-severity denial-of-service vulnerability in React Server Components that uses the Flight protocol's Map/Set deserialization to trigger quadratic computation from small payloads. Update to React 19.2.5 now.

Summary: This is the third significant security issue in the React Server Components Flight protocol in a short period, following React2Shell and CVE-2026-23864. Each one has found a different way to abuse the custom serialization format that RSC uses to stream data between server and client.

React2DoS works by exploiting how the Flight protocol deserializes Maps and Sets from client-controlled input. The key discovery is that the consumed flag, which prevents re-computation of the same reference, only gets set when a reference resolves successfully. A Map reference whose construction fails can therefore be repeated many times in the same payload without triggering the deduplication guard. The researchers then went further and discovered a quadratic version: prepend valid Map entries before the failing one, and the Map constructor iterates over all of them before failing. The more valid entries you prepend, the more expensive each repeated failure becomes, so total work grows with the number of entries times the number of repeated references.

The comparison with the previous CVE is stark. CVE-2026-23864 required a megabyte-scale payload to exhaust CPU for several seconds. React2DoS achieves the same CPU exhaustion with tens of kilobytes. At hundreds of kilobytes, it outperforms the previous attack by several orders of magnitude. A single request can trigger computation lasting minutes.

The fix is simple in hindsight: set the consumed flag before the constructor call, not after. React 19.2.5 patches this, with backports to 19.0.5 and 19.1.6. TanStack Start is unaffected because it doesn't parse Flight data from the client at all.
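
The ordering problem is easier to see in a toy model. The sketch below is not React's Flight parser: `MapChunk`, `resolveMapRef`, and the `work` counter are hypothetical stand-ins that only mimic the shape of the bug described above (a deduplication flag set after a constructor that can throw) and of the fix (setting it before).

```typescript
// Toy model only: this is NOT React's Flight parser.
type MapChunk = { entries: Array<[string, string] | "invalid">; consumed: boolean };

let work = 0; // counts entries iterated, as a stand-in for CPU time

function buildMap(chunk: MapChunk): Map<string, string> {
  const result = new Map<string, string>();
  for (const entry of chunk.entries) {
    work += 1;
    if (entry === "invalid") throw new Error("bad entry");
    result.set(entry[0], entry[1]);
  }
  return result;
}

function resolveMapRef(chunk: MapChunk, markBeforeConstruct: boolean): void {
  if (chunk.consumed) return; // deduplication guard
  if (markBeforeConstruct) chunk.consumed = true; // patched ordering
  try {
    buildMap(chunk);
    chunk.consumed = true; // vulnerable ordering: only reached on success
  } catch {
    // In this model a failed reference is simply skipped and parsing continues.
  }
}

function simulate(validEntries: number, repeats: number, patched: boolean): number {
  work = 0;
  const entries: MapChunk["entries"] = [];
  for (let i = 0; i < validEntries; i++) entries.push([`k${i}`, "v"]);
  entries.push("invalid"); // the entry that makes construction fail
  const chunk: MapChunk = { entries, consumed: false };
  for (let i = 0; i < repeats; i++) resolveMapRef(chunk, patched);
  return work;
}

const vulnerableWork = simulate(1_000, 1_000, false); // 1,001,000 entry visits
const patchedWork = simulate(1_000, 1_000, true); // 1,001 entry visits
```

In this model, doubling both the number of entries and the number of repeated references quadruples the work in the vulnerable ordering, while the patched ordering stays linear in the number of entries.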

Key takeaways:

  • CVE-2026-23869 affects React Server Components 19.2.4 and below
  • The attack requires only tens of kilobytes to cause multi-minute computation on the server
  • Update to React 19.2.5, 19.1.6, or 19.0.5 immediately
  • The pattern: custom serialization formats are fertile ground for DoS and RCE attacks
  • Architectures that avoid parsing client-controlled Flight data are inherently safer

Why do I care: Three CVEs in the RSC Flight protocol in quick succession is a pattern worth taking seriously. The Flight protocol is complex custom serialization code, and complex parsers accumulate vulnerabilities. If you're running server functions that accept client input and haven't patched, this is your reminder to do it today.

React2DoS (CVE-2026-23869): When the Flight Protocol Crashes at Takeoff | Imperva


Vercel's fast-webstreams: Making WebStreams 10x Faster for Server Rendering

TLDR: Vercel built a drop-in replacement for Node.js WebStreams that eliminates per-chunk Promise allocations, achieving 10x throughput improvement for piped streams and 14.6x for the exact byte-stream pattern React Server Components uses.

Summary: This is genuinely impressive engineering work, and the writeup is one of the clearest performance deep dives I've read this year. The problem is well-defined: the WHATWG Streams API (ReadableStream, WritableStream, TransformStream) is the web standard, and Next.js uses it for server rendering. But on the server, it's significantly slower than Node.js's older streaming primitives because it allocates objects and traverses Promise chains for every chunk flowing through a pipeline.

The concrete numbers are shocking. Piping 1KB chunks through a native WHATWG stream runs at 630 MB/s. The same operation through Node.js's older pipeline runs at 7,900 MB/s. That's a gap of more than 12x, almost entirely from Promise and object allocation overhead. For React Flight's specific byte stream pattern, native WebStreams manage 110 MB/s. The fast-webstreams library gets 1,600 MB/s on the same pattern: 14.6x faster.

The implementation is clever without being magic. When you pipe between fast streams, the library doesn't start piping immediately. It records upstream links and waits until pipeTo is called at the end of the chain. At that point, it walks the entire chain, collects the underlying Node.js stream objects, and issues a single pipeline call. One function call, zero Promises per chunk. For native fetch response bodies, a deferred resolution technique wraps the native stream in a fast shell so the optimization still applies.
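
The deferred-piping idea can be sketched without any real stream machinery. Everything below is illustrative: `LazyStream`, `Stage`, and `runPipeline` are hypothetical names, not the fast-webstreams API, and `runPipeline` merely stands in for a single Node.js pipeline call.

```typescript
// Hypothetical sketch of deferred piping; not the fast-webstreams API.
type Stage = (chunk: string) => string;

let pipelineCalls = 0;

// Stand-in for one pipeline call over all collected stages.
function runPipeline(source: string[], stages: Stage[]): string[] {
  pipelineCalls += 1;
  return source.map((chunk) => stages.reduce((acc, stage) => stage(acc), chunk));
}

class LazyStream {
  private readonly source: string[] | null;
  private readonly stage: Stage | null;
  private readonly upstream: LazyStream | null;

  private constructor(source: string[] | null, stage: Stage | null, upstream: LazyStream | null) {
    this.source = source;
    this.stage = stage;
    this.upstream = upstream;
  }

  static from(source: string[]): LazyStream {
    return new LazyStream(source, null, null);
  }

  // Records the upstream link only; no data moves yet.
  pipeThrough(stage: Stage): LazyStream {
    return new LazyStream(null, stage, this);
  }

  // Walks the recorded chain back to the source, then runs everything at once.
  pipeTo(sink: string[]): void {
    const stages: Stage[] = [];
    let node: LazyStream = this;
    while (node.upstream !== null) {
      if (node.stage !== null) stages.unshift(node.stage);
      node = node.upstream;
    }
    sink.push(...runPipeline(node.source ?? [], stages));
  }
}

const output: string[] = [];
LazyStream.from(["a", "b"])
  .pipeThrough((c) => c.toUpperCase())
  .pipeThrough((c) => `<${c}>`)
  .pipeTo(output);
// output is ["<A>", "<B>"] and runPipeline was called exactly once
```

The point of the design is that intermediate `pipeThrough` calls cost nothing per chunk; the chain is only materialized when the terminal `pipeTo` arrives.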

The right long-term fix is in Node.js itself, and work is already happening there. The Vercel team contributed ideas upstream, and there's already a PR in Node.js applying two of the fast-webstreams optimizations to native WebStreams. The library is available as experimental-fast-webstreams on npm if you're hitting streaming performance limits in your own framework.

Key takeaways:

  • WHATWG WebStreams' per-chunk Promise overhead is the primary performance bottleneck in RSC streaming
  • fast-webstreams achieves 10x improvement by using a single Node.js pipeline call instead of per-chunk Promises
  • The library passes 1,100 out of 1,116 Web Platform Tests, matching native compliance
  • The long-term fix belongs in Node.js itself; upstream work is already in progress
  • RSC streaming was 14.6x faster with this library on the React Flight byte stream pattern

Why do I care: This directly explains why framework-level streaming performance varies so much across different Next.js versions and deployment environments. Understanding that the bottleneck is Promise allocation overhead per chunk, not the rendering work itself, changes how you think about SSR performance optimization.

We Ralph Wiggumed WebStreams to make them 10x faster - Vercel


Material UI and MUI X v9: Synchronized Versions and AI-Native Components

TLDR: MUI v9 re-aligns Material UI and MUI X on the same major version number for the first time since their split in 2023, adds NumberField, Menubar, Scheduler, and Chat components, and ships a new MUI Console for managing AI assistant licenses.

Summary: The version synchronization story is the most interesting part of this release for teams that run both libraries. When MUI X was on v6 and Material UI was on v7, every upgrade required checking compatibility matrices and managing separate migration guides. That friction accumulated over time. With v9, both libraries share the same major version, and upgrades across the stack should be straightforward to communicate and coordinate. Material UI skips v8 entirely to land on v9 alongside MUI X.

The new components are worth noting separately. NumberField and Menubar fill gaps that have forced developers to reach for outside solutions or build from scratch. The Scheduler alpha is for resource-aware calendars and timelines, which is a category that has historically required heavy third-party dependencies or significant custom work. The Chat component is designed as an embeddable conversational UI with streaming support, not to be confused with MUI Recipes (formerly MUI Chat), which is the AI code generation tool that produces MUI components.

The Data Grid AI Assistant moving to general availability with the MUI Console backing it is a bigger deal than the release notes suggest. Previously, rolling out AI-assisted grid features in production required support tickets for API keys. The Console centralizes license management, API keys, and billing in one place. That's the kind of operational friction removal that makes the difference between "we're experimenting with this" and "this is production."

There's also a pricing change: MUI X Pro and Premium move to application-based licensing, and priority support becomes Enterprise-only. Worth reading the full announcement before renewing existing licenses.

Key takeaways:

  • Material UI v7 → v9 (skips v8) to synchronize with MUI X v9
  • New alpha components: Scheduler and Chat for resource management and conversational UI
  • NumberField and Menubar arrive as installable npm packages
  • MUI Console makes AI assistant features operationally viable in production
  • Pricing change: Pro/Premium move to application-based licensing

Why do I care: If you're on both libraries in a large codebase, the version sync removes real maintenance burden. The Scheduler filling the "complex calendar" gap is useful to know about for the next time a product request comes in for resource planning views.

Introducing Material UI and MUI X v9 - MUI


The Uphill Climb of Making GitHub Diff Lines Performant

TLDR: GitHub's team reduced INP from 450ms to 100ms and cut JavaScript heap usage by 50% on large pull requests by simplifying their React diff line architecture from 13 components per line to 2, and adding TanStack Virtual for p95+ pull requests.

Summary: This is a case study in what "performance-focused React" actually looks like in production, and it's full of specific numbers that make the lessons stick. The original architecture had up to 15 DOM elements and 13 React components per diff line, with 20+ event handlers per line. On a pull request with 10,000 lines, that adds up to 183,000 rendered components and 200,000 DOM nodes, pushing the JavaScript heap past 250MB in some cases.

The v2 architecture brought that down to 2 components per line by eliminating thin wrapper components that existed to share code between split and unified diff views. This introduced some code duplication, and the team made that trade consciously: the result is simpler and faster, and at this scale the duplication costs less than the abstraction tax.

Event handling is another area where the numbers compound badly. v1 attached 5-6 event handlers per component, meaning a single diff line carried 20+ handlers. v2 consolidates this into a single top-level handler that uses data attributes to determine which lines are affected. This is event delegation, the same pattern that React's synthetic event system uses internally, and it scales predictably.
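
A rough sketch of a single delegated handler follows. It uses minimal stand-ins for DOM nodes so it runs outside a browser; the `data-line` attribute name and the helper names are assumptions, not GitHub's actual markup or code.

```typescript
// Minimal stand-ins for DOM nodes so the sketch runs outside a browser.
// The `data-line` attribute name is an assumption, not GitHub's markup.
interface NodeLike {
  dataset: Record<string, string | undefined>;
  parentNode: NodeLike | null;
}

// Walks up from the event target to the nearest element carrying data-line.
function findLineNumber(target: NodeLike | null): number | null {
  let el = target;
  while (el !== null) {
    const raw = el.dataset.line;
    if (raw !== undefined) return Number(raw);
    el = el.parentNode;
  }
  return null;
}

// One handler for the whole diff instead of one per line.
function createDiffClickHandler(onLineClick: (line: number) => void) {
  return (event: { target: NodeLike | null }): void => {
    const line = findLineNumber(event.target);
    if (line !== null) onLineClick(line);
  };
}

const container: NodeLike = { dataset: {}, parentNode: null };
const row: NodeLike = { dataset: { line: "42" }, parentNode: container };
const cell: NodeLike = { dataset: {}, parentNode: row };

const clicked: number[] = [];
const handleClick = createDiffClickHandler((line) => clicked.push(line));
handleClick({ target: cell }); // resolves to line 42 through the parent row
handleClick({ target: container }); // no data-line ancestor: ignored
```

In a browser, the same lookup could be done with `closest("[data-line]")` on the event target; the manual walk is used here only to keep the example dependency-free.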

The O(1) data access refactor is subtle but significant. v1 accumulated O(n) lookups across shared state. v2 switched to JavaScript Maps for all line-level lookups: comment presence, selection state, and hover state are all constant-time reads. They also restricted useEffect to the top level of diff files and added linting rules to prevent it from creeping back into diff line components.
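
As an illustration of the lookup change, here is a small sketch comparing an array scan with a Map index. The `LineComment` shape and function names are assumptions made for the example.

```typescript
// Illustrative data shape; the field names are assumptions.
type LineComment = { line: number; body: string };

const lineComments: LineComment[] = [
  { line: 3, body: "nit: rename" },
  { line: 120, body: "why is this needed?" },
  { line: 120, body: "agree" },
];

// O(n) per lookup: every rendered line scans the whole array.
function hasCommentSlow(line: number): boolean {
  return lineComments.some((c) => c.line === line);
}

// Build the index once, then each line does a constant-time read.
const commentsByLine = new Map<number, LineComment[]>();
for (const comment of lineComments) {
  const bucket = commentsByLine.get(comment.line);
  if (bucket) bucket.push(comment);
  else commentsByLine.set(comment.line, [comment]);
}

function hasCommentFast(line: number): boolean {
  return commentsByLine.has(line);
}
```

With 10,000 lines on screen, the array scan does work proportional to lines times comments, while the Map version does one index build plus one constant-time read per line.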

For the very largest pull requests (p95+, over 10,000 lines), even the optimized v2 components couldn't keep up. TanStack Virtual provides windowing (list virtualization), keeping only the visible diff lines in the DOM. That brought a 10x reduction in heap usage and dropped INP from 275-700ms to 40-80ms for those extreme cases.
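
The core arithmetic behind windowing is small. The sketch below is not TanStack Virtual's API; it assumes a fixed row height, and `getVisibleRange` and its parameters are hypothetical names.

```typescript
// Windowing math only; not TanStack Virtual's API.
// A fixed row height is a simplifying assumption.
interface VisibleRange {
  start: number; // first row index to render
  end: number; // one past the last row index to render
}

function getVisibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  rowCount: number,
  overscan = 5,
): VisibleRange {
  const first = Math.floor(scrollTop / rowHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight);
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(rowCount, last + overscan),
  };
}

// 10,000 rows of 20px each, scrolled to 20,000px in an 800px viewport.
const visible = getVisibleRange(20_000, 800, 20, 10_000);
// visible is { start: 995, end: 1045 }: 50 rows rendered instead of 10,000
```

Real diff lines have variable heights (wrapped code, inline comments), which is one of the things a virtualization library handles with measurement on top of this basic calculation.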

Key takeaways:

  • Reduce React component count aggressively; code duplication beats abstraction tax at scale
  • Centralize event handling with data attributes instead of per-component handlers
  • Use O(1) data structures (Maps) for all hot-path lookups in list-heavy UIs
  • Restrict useEffect to top-level components and enforce this with lint rules
  • TanStack Virtual handles p95+ cases where even optimized components can't keep up

Why do I care: The GitHub team's approach is unusually practical and measurable. Every decision has a before/after number attached. If you're working on a high-density list UI with complex interactions, this article is required reading.

The uphill climb of making diff lines performant - GitHub Engineering


If You Can't See the Boundary, You Can't Reason About the System

TLDR: A thoughtful developer shared that he'd rather build a client-only React app than use Next.js App Router because the RSC server/client boundary is invisible at runtime. The author responds by building RSC Boundary, a dev-only overlay that makes the boundary visible.

Summary: This post articulates a frustration I've heard from experienced developers repeatedly: the hardest part of RSC isn't understanding the concept, it's that you can't verify your mental model against what's actually running. You read the code and infer where server and client responsibilities sit, but you can't quickly check whether your inference is correct. You end up asking "what do I think is happening" instead of "what is happening."

The author's framing generalizes nicely beyond React. Any architecture with hidden boundaries creates the same failure mode: developers build mental models from incomplete signals, those models work until they don't, and debugging is expensive because the runtime gives you little help validating assumptions. Microservices without request-flow visibility, edge versus origin without a clear execution path, any system where tooling doesn't show the boundary under real traffic. The principle is the same.

The comparison to Nuxt DevTools is pointed. First-party, in-app inspection that surfaces components, payloads, and server routes in one coherent surface actually gets used because it meets developers where they already are. Next.js is moving in the right direction with its dev indicator and the next-devtools-mcp server, but neither currently shows the server/client split as a page-level map.

RSC Boundary is the author's direct response: a dev-only overlay that wraps your root layout and highlights client component roots and server regions while you work. It doesn't replace React DevTools. It answers one specific question: where does the server/client split show up on this page, right now? The author is honest that this is a sketch, not a substitute for understanding import graphs.

Key takeaways:

  • Invisible architecture boundaries cause debugging failures more than complex architecture does
  • RSC observability tools are lagging behind RSC adoption, causing real adoption friction
  • Nuxt DevTools remains a benchmark for what first-party in-app inspection should feel like
  • RSC Boundary (dev-only overlay) provides the missing server/client map for Next.js App Router
  • The missing view is often the difference between "we should use this" and "let's simplify"

Why do I care: This is the developer experience argument against RSC that doesn't get enough airtime. The architectural model is defensible; the tooling isn't there yet. If you're teaching or onboarding developers to App Router, RSC Boundary is worth adding to your setup immediately.

If you can't see the boundary, you can't reason about the system


The Vertical Codebase: Structuring Code by Domain, Not by Type

TLDR: TkDodo argues for organizing codebases vertically by domain rather than horizontally by type (components, hooks, utils), citing Sentry's codebase as a cautionary tale of what horizontal structure looks like after 10 years.

Summary: The horizontal split is seductive when you start: components here, hooks there, utils over there, types in their own folder. It's clean at first. Then you add 200 files to the components directory, and the only thing they have in common is that they're React components. Finding anything requires knowing what it is technically, not what it does. The Sentry codebase example is instructive: profiling code split across components/profiling, types/profiling, utils/profiling, and views/profiling. Four directories for one feature. Coupling without cohesion.

The vertical approach groups code by what it does rather than what it is technically. Everything widget-related goes into src/widgets. That can be components, hooks, types, utils, constants, whatever lives there. You don't need to know whether something is a hook or a component to find it. You need to know what part of the product it belongs to. This also aligns naturally with how product teams are structured, which is a nice side effect.

The author pushes the argument into monorepo territory, which is where it gets its teeth. Individual verticals become packages with their own dependencies and explicitly defined public interfaces through package.json exports. Now the coupling becomes visible rather than implicit. Teams can't accidentally use each other's internal utilities without declaring a dependency. The eslint-plugin-boundaries approach works as an intermediate step for codebases that aren't ready to go full monorepo.
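
The boundary rule itself is simple enough to express as a function. This is a hypothetical sketch of the kind of check a lint rule or package `exports` map enforces, not the eslint-plugin-boundaries implementation; the `src/<vertical>/index.ts` convention is an assumption.

```typescript
// Hypothetical rule: a file may import anything inside its own vertical, but
// only the public entry point (src/<vertical>/index.ts) of another vertical.
function verticalOf(path: string): string | null {
  const match = /^src\/([^/]+)\//.exec(path);
  return match?.[1] ?? null;
}

function isAllowedImport(importer: string, imported: string): boolean {
  const from = verticalOf(importer);
  const to = verticalOf(imported);
  if (from === null || to === null) return true; // outside src/<vertical>/: not checked here
  if (from === to) return true; // same vertical: internals are fair game
  return imported === `src/${to}/index.ts`; // cross-vertical: public API only
}

const sameVertical = isAllowedImport("src/widgets/WidgetList.tsx", "src/widgets/utils/format.ts");
const viaPublicApi = isAllowedImport("src/dashboards/Page.tsx", "src/widgets/index.ts");
const reachesInside = isAllowedImport("src/dashboards/Page.tsx", "src/widgets/utils/format.ts");
// sameVertical is true, viaPublicApi is true, reachesInside is false
```

Turning each vertical into a package moves this check from a lint rule into module resolution itself, since anything not listed in `exports` cannot be imported at all.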

I find the "AI doesn't care" dismissal interesting to think through. The author's response is that AI agents need the same things humans need: clear boundaries, fast feedback loops, and navigable structure. Agents are already better at new codebases than at codebases that have grown organically. That's not a coincidence; it's a structural argument for keeping your code organized.

Key takeaways:

  • Horizontal structure groups by type (components/hooks/utils); vertical groups by domain (what code does)
  • Vertical structure aligns with team ownership and makes cross-feature coupling visible
  • Monorepo packages with explicit exports enforce the public/private boundary without relying on convention
  • eslint-plugin-boundaries is a viable intermediate step before committing to monorepo structure
  • AI agents benefit from vertical structure for the same reasons humans do: clear boundaries and navigability

Why do I care: This is the codebase structure argument I've been making in architecture reviews for years, with a concrete vocabulary. The vertical codebase aligns with how I think about code ownership and change locality, and Sentry's codebase is an honest case study of what happens when you don't make the change early.

The Vertical Codebase


Now More Than Ever, You Need to Master Custom ESLint Rules

TLDR: A detailed walkthrough of building custom ESLint rules using AST analysis, starting with a real useEffect derived-state antipattern, showing how AST Explorer makes rule development practical, and arguing that custom rules are the most reliable way to enforce team coding standards.

Summary: The premise here is personal: the author kept writing the same PR comment about derived state in useEffect hooks. Three reviewers would pile on, the author would push back, seven comments for a three-line change. The solution was to encode the knowledge into an ESLint rule that catches it automatically. What followed was a genuinely educational three-day dive into how JavaScript's Abstract Syntax Tree works.

The AST Explorer introduction is the best part of this article. The insight that every tool in the JavaScript ecosystem (Babel, Prettier, TypeScript, ESLint) works with the same tree representation of your code is one of those things that, once you see it, you can't unsee. The practical consequence is that writing an ESLint rule becomes pattern matching: paste bad code into AST Explorer, find the node types, write a function that matches that pattern.

The derived state rule itself is a good example of narrow scope done right. It catches only the obvious cases, the ones where a useEffect body consists entirely of setter calls from useState. It won't catch every variation, and the author is explicit that this is intentional. The first version of any custom rule should be narrow. You can always widen it when you see what it misses. False positives are more costly than false negatives for team tooling.
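
A narrow rule of this kind might look roughly like the following. This is a sketch, not the article's rule: it follows ESLint's `meta` plus `create(context)` rule shape, but it recognizes setters by the `setX` naming convention rather than by tracing `useState` declarations, and the local `AstNode` and `RuleContext` types are simplified stand-ins. The rule is exercised with a hand-built AST so the example runs without ESLint installed.

```typescript
// Sketch of a narrow rule. Assumption: setters are recognized by the `setX`
// naming convention rather than by tracing useState declarations.
type AstNode = { type: string; [key: string]: any };
type RuleContext = { report(descriptor: { node: AstNode; message: string }): void };

const isSetterCall = (stmt: AstNode): boolean =>
  stmt.type === "ExpressionStatement" &&
  stmt.expression.type === "CallExpression" &&
  stmt.expression.callee.type === "Identifier" &&
  /^set[A-Z]/.test(stmt.expression.callee.name);

const noDerivedStateEffect = {
  meta: {
    type: "suggestion",
    docs: { description: "Disallow useEffect bodies that only call state setters" },
    schema: [],
  },
  create(context: RuleContext) {
    return {
      CallExpression(node: AstNode) {
        if (node.callee.type !== "Identifier" || node.callee.name !== "useEffect") return;
        const callback = node.arguments[0];
        if (!callback || callback.body?.type !== "BlockStatement") return;
        const statements: AstNode[] = callback.body.body;
        // Narrow on purpose: flag only when every statement is a setter call.
        if (statements.length > 0 && statements.every(isSetterCall)) {
          context.report({
            node,
            message: "Derive this value during render instead of syncing it in useEffect.",
          });
        }
      },
    };
  },
};

// Hand-built AST helpers, roughly what AST Explorer would show for
// useEffect(() => { <callee>(); }, [])
const ident = (idName: string): AstNode => ({ type: "Identifier", name: idName });
const effectWith = (calleeName: string): AstNode => ({
  type: "CallExpression",
  callee: ident("useEffect"),
  arguments: [
    {
      type: "ArrowFunctionExpression",
      body: {
        type: "BlockStatement",
        body: [
          {
            type: "ExpressionStatement",
            expression: { type: "CallExpression", callee: ident(calleeName), arguments: [] },
          },
        ],
      },
    },
  ],
});

const reports: string[] = [];
const visitor = noDerivedStateEffect.create({ report: (d) => reports.push(d.message) });
visitor.CallExpression(effectWith("setFullName")); // reported
visitor.CallExpression(effectWith("fetchData")); // not a setter: ignored
```

In a real project the same object would be exported from a local rules directory and tested with ESLint's RuleTester against source strings instead of hand-built nodes.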

The AI section at the end is genuinely interesting. The author makes the case that custom ESLint rules are more effective than AI coding guidelines because they run after the AI writes its code. A lint error gives the AI a red underline to fix. A guidelines file gets forgotten in long context windows. The rule doesn't need to understand why the pattern is wrong; it just flags it.

Key takeaways:

  • AST Explorer is the essential tool for custom ESLint rule development
  • Start rules narrow: catch obvious cases with zero false positives, widen later
  • Custom local plugins don't require npm publishing, just a local eslint-rules directory and flat config
  • ESLint RuleTester makes rule testing trivial with valid/invalid code arrays
  • Custom rules are more reliable than AI guidelines for enforcing team patterns

Why do I care: Writing custom ESLint rules is one of those skills with compounding returns. The first rule takes time; the fifth is quick. And the specific argument about rules being more reliable than AI guidelines for enforcing patterns is one I'll be repeating in discussions about AI-assisted development.

Now more than ever, you need to master custom ESLint rules


You Can't Cancel a JavaScript Promise (Except Sometimes You Can)

TLDR: The Inngest SDK interrupts async workflow functions by returning a promise that never resolves, letting the garbage collector clean up the suspended function. It's a legitimate control flow technique that avoids the try/catch pitfalls of exception-based interruption.

Summary: The setup is clever: JavaScript has no built-in promise cancellation, but you can return a promise that never resolves, await it in a function, and the function stops at that point. No exceptions, no try/catch, no special return values. The function just stops. The garbage collector cleans up the suspended frame when nothing holds a reference to the hanging promise.

The context that makes this interesting is serverless workflow execution. The Inngest SDK needs to run long-running workflows on infrastructure with short-lived function timeouts. A workflow might take hours but each invocation can only run for seconds. The SDK needs to execute one step, save the result, interrupt the function, and re-invoke it later to pick up where it left off. The user's code should know nothing about any of this.

The alternatives are compared honestly. Throwing exceptions is the obvious approach, but any try/catch in user code becomes a trap that silently swallows the interruption. Generators give clean interruption semantics (you just stop calling .next()), but they require function* syntax and break down with concurrency since yield is sequential. A never-resolving promise gives you both: native async/await syntax and reliable interruption.

The memoization system that makes this work across multiple invocations is explained in detail. The execute function runs the workflow function in the background, waits for it to advance through already-completed steps (which resolve as microtasks), then checks if it found a new step. A setTimeout of 0 creates a macrotask boundary that lets all the microtasks drain first. Completed steps are stored in a Map and replayed instantly on subsequent invocations.
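
The mechanics described above can be approximated in a few lines. This is a simplified model, not the Inngest SDK: `execute`, `StepTools`, `memo`, and the example workflow are hypothetical names chosen for illustration.

```typescript
// Simplified model, not the Inngest SDK.
type StepTools = { run<T>(id: string, fn: () => T): Promise<T> };
type Workflow = (step: StepTools) => Promise<unknown>;

const memo = new Map<string, unknown>(); // results of completed steps

async function execute(workflow: Workflow): Promise<{ ranStep: string | null }> {
  const discovered: Array<{ id: string; fn: () => unknown }> = [];

  const step: StepTools = {
    run<T>(id: string, fn: () => T): Promise<T> {
      if (memo.has(id)) return Promise.resolve(memo.get(id) as T); // replay as a microtask
      if (discovered.length === 0) discovered.push({ id, fn }); // first unseen step
      return new Promise<T>(() => {}); // never settles: the workflow stops here
    },
  };

  void workflow(step); // run in the background; deliberately not awaited
  // Macrotask boundary: all replayed steps (microtasks) drain before this resolves.
  await new Promise<void>((resolve) => setTimeout(resolve, 0));

  const next = discovered[0];
  if (next === undefined) return { ranStep: null }; // the workflow ran to completion
  memo.set(next.id, next.fn()); // execute exactly one new step and save its result
  return { ranStep: next.id };
}

const sent: string[] = [];
const onboarding: Workflow = async (step) => {
  const user = await step.run("load-user", () => "ada");
  try {
    await step.run("send-email", () => sent.push(`emailed ${user}`));
  } catch {
    sent.push("swallowed"); // never runs: interruption throws nothing
  }
};

// Each call stands in for one short-lived invocation of the same function.
async function demo(): Promise<Array<string | null>> {
  const first = await execute(onboarding);
  const second = await execute(onboarding);
  const third = await execute(onboarding);
  return [first.ranStep, second.ranStep, third.ranStep];
}
```

Awaiting `demo()` yields `["load-user", "send-email", null]`: each invocation replays memoized steps, runs one new step, and leaves the workflow suspended on a promise that nothing references afterwards. The `try/catch` in the workflow never fires, which is the advantage over exception-based interruption.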

The garbage collection proof via FinalizationRegistry is a nice touch. The article demonstrates explicitly that hanging promises are collected when they become unreachable.

Key takeaways:

  • Promises that never resolve are a legitimate control flow tool, not a memory leak
  • The garbage collector collects hanging promises when nothing holds a reference to them
  • Exception-based interruption breaks when user code has try/catch; hanging promises don't
  • Generators offer clean interruption but require different syntax and don't compose with Promise.all
  • The pattern works because microtasks drain before macrotasks, enabling predictable step memoization

Why do I care: This is the kind of JavaScript deep knowledge that separates library authors from application developers. I may never build a workflow SDK, but understanding how promise resolution interacts with the garbage collector and the event loop is directly applicable to any complex async coordination code.

You can't cancel a JavaScript promise (except sometimes you can) - Inngest Blog


What Encore Learned Building a Rust Runtime for TypeScript

TLDR: Encore built a 67,000-line Rust runtime that handles the full HTTP lifecycle, database pooling, pub/sub, and tracing for TypeScript applications running on Node.js, achieving 9x the throughput of Express.js.

Summary: The engineering decisions here are well-reasoned and the tradeoffs are explained honestly. The choice to write a Rust runtime for TypeScript applications rather than extending the existing Go runtime came down to two things: wanting a single Rust core that could eventually power multiple language runtimes (like Prisma and Pydantic do), and the fact that Node.js is fundamentally single-threaded. Moving infrastructure concerns to Rust means the HTTP lifecycle, database connection management, pub/sub, and tracing all run fully multi-threaded on tokio. That's not achievable within Node.js.

The Go sidecar prototype had real latency problems. A single API request touching a database and publishing an event would cross IPC boundaries six or seven times, adding 2-4ms overhead per request just from serialization and context switching. The runtime needed to live in the same process, which meant napi-rs bindings.

The interesting technical challenge is bidirectional communication between Rust and JavaScript. Node-API (N-API) is designed for calling native code from JavaScript. Calling in the other direction (dispatching a pub/sub message to a TypeScript handler) required forking napi-rs's ThreadSafeFunction to capture return values. When the handler returns a promise, Rust chains a .then callback that resolves back into a tokio channel.

The embedded Pingora gateway is the architectural choice I find most interesting. Instead of running nginx or envoy as a separate process, the gateway lives in the same process as the runtime. User-defined auth handlers written in TypeScript execute directly inside the gateway. No serialization, no IPC, no separate process to monitor. The runtime handles everything from accepting connections to collecting traces, and the TypeScript application code is pure business logic.

The performance numbers are significant. At 150 concurrent workers over 10 seconds, Encore.ts handles 121,000 requests/second against Express.js's 15,700, a gap of roughly 7.7x. With validation enabled (where Encore uses the type information already extracted by its TypeScript parser), Express drops to 11,800 while Encore stays at 107,000, which widens the gap to about 9x.

Key takeaways:

  • Moving infrastructure concerns to Rust reserves Node.js's event loop for business logic only
  • napi-rs bidirectional communication requires custom work to capture return values from TypeScript handlers
  • In-process API gateway (Pingora) eliminates IPC overhead and enables TypeScript-defined auth handlers
  • The TypeScript parser extracts infrastructure declarations and type information at compile time, which the runtime then uses for request validation
  • Express.js's validation bottleneck is where Encore's advantages compound the most

Why do I care: This isn't directly applicable to most application development, but it's an excellent illustration of what's possible when you're willing to own the infrastructure layer. The architecture also explains why Encore's performance scales so differently from frameworks that stay inside JavaScript for all infrastructure concerns.

What We Learned Building a Rust Runtime for TypeScript – Encore Blog


Agent React DevTools: Giving AI Agents Access to React Internals

TLDR: Callstack released a CLI that connects AI agents directly to React DevTools, letting agents inspect the live component tree, profiling data, and render performance rather than relying on visual UI trees alone.

Summary: The limitation being addressed here is real. AI agents testing React applications currently work mostly from the accessibility tree or visual representation of the UI. That's fine for simulating interactions and checking rendered output, but it falls short when something isn't rendering because a query hasn't resolved, or when a component is re-rendering more than expected. The agent sees a stuck UI without knowing why.

Agent React DevTools connects to the React DevTools protocol to give agents a live snapshot of the React component tree. This means agents can see the same profiling data, render timing, and component hierarchy that developers use when debugging manually. Instead of inferring what's happening from visual state, the agent can inspect actual component state, props, and re-render patterns.

The Rozenite integration extends this to third-party plugins. TanStack Query state, MMKV storage, React Navigation routing context, all the things that modern React Native apps rely on but that the core component tree doesn't expose. Rozenite for Agents makes those plugins accessible to AI agents through the same unified CLI.

The longer-term vision here is agentic QA that works the way experienced human developers debug. I'm not sure how far automated agents can push quality assurance before they need the kind of judgment that comes from understanding why a component is in a certain state, not just that it is. But having access to React internals is a prerequisite for that next level.

Key takeaways:

  • AI agents relying only on visual/accessibility trees can't diagnose why something isn't rendering
  • Agent React DevTools connects directly to the React DevTools protocol for live component tree access
  • Rozenite for Agents extends this to third-party plugins (TanStack Query, MMKV, React Navigation)
  • Setup is via npx skills add callstackincubator/agent-react-devtools
  • Useful for agent-driven QA on React and React Native applications

Why do I care: If you're building AI-assisted testing workflows for React apps, this is the missing piece that lets agents do more than surface-level interaction testing. Worth experimenting with if your QA process is currently bottlenecked by what visual inspection can catch.

Agent React DevTools: Debug React Apps with AI Agents