Mistral's Cloud Agents, NVIDIA's Omni Model, and the Week AI Got Everywhere

Published on 30.04.2026

AI & AGENTS

TLDR

This week felt like one of those moments where the pace just... accelerated. Mistral shipped async cloud coding agents, NVIDIA dropped a capable open multimodal model, Anthropic plugged Claude into Photoshop and Blender, and Google Gemini learned to generate actual documents. A lot moved at once.

Mistral Vibe: Coding Agents That Work Overnight

Mistral launched Vibe, and the pitch is simple but kind of wild: async cloud coding agents powered by their Medium 3.5 model that can open GitHub PRs while you're at dinner. It hits 77.6% on SWE-bench Verified, which puts it in serious territory.

What I keep thinking about is the "async" part. Not a chat interface where you type and wait. You describe what needs to happen, hand it off, and come back to open PRs. That workflow shift matters more than the benchmark number. We've had autocomplete for years. We've had chat-in-editor tools for a while too. But autonomous agents that pick up a task and run with it overnight? That's a different category of tool.
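That hand-off pattern is easier to see in code. Here's a minimal sketch of the submit-and-poll shape of an async agent workflow; the `AgentClient` class and its methods are entirely hypothetical illustrations, not Mistral's actual API, and the "work" is stubbed out:

```python
# Hypothetical client for an async coding-agent service. These names are
# illustrative only -- Vibe's real interface may look nothing like this.
class AgentClient:
    def __init__(self):
        self._tasks = {}

    def submit(self, repo, description):
        """Hand a task off; returns immediately with a task id, not a result."""
        task_id = f"task-{len(self._tasks) + 1}"
        # A real service would clone the repo, run the agent, and open a PR.
        self._tasks[task_id] = {"repo": repo, "description": description,
                                "status": "queued"}
        return task_id

    def poll(self, task_id):
        """Check back later: the deliverable is a PR URL, not a chat reply."""
        task = self._tasks[task_id]
        task["status"] = "done"  # stub: pretend the agent finished overnight
        task["pr_url"] = f"https://github.com/{task['repo']}/pull/1"
        return task

client = AgentClient()
tid = client.submit("acme/webapp", "Migrate the auth module to OAuth 2.1")
# ...go to dinner...
result = client.poll(tid)
print(result["pr_url"])  # a PR link, ready for human review
```

The point of the shape, not the stub: `submit` returns before any work happens, and the result arrives as a reviewable artifact rather than a streaming chat response.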

There are obvious questions about code quality, security review, and how much handholding the task description needs. But the direction is clear.

Mistral Vibe announcement

NVIDIA Nemotron Nano Omni: One Model for Vision, Audio, and Language

NVIDIA released Nemotron 3 Nano Omni, a 30B open multimodal model that handles vision, audio, and language in one package. They're claiming 9x higher throughput than comparable models, which is a striking number if it holds up in practice.

The "nano" branding is a bit confusing for a 30B model, but the throughput angle is interesting. Raw capability benchmarks get all the attention, but inference speed is what actually determines whether something is useful in a real product. If you can run a multimodal model at 9x the throughput, you can afford to call it more, build faster feedback loops, and handle more users.
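A quick back-of-envelope shows why throughput is the number product people care about. The 9x figure is NVIDIA's claim; the baseline tokens-per-second value below is made up purely for illustration:

```python
# Back-of-envelope: what a claimed 9x throughput gain means for serving cost.
# The baseline figure is hypothetical -- only the 9x ratio comes from the claim.
baseline_tokens_per_sec_per_gpu = 1_000
speedup = 9
omni_tokens_per_sec_per_gpu = baseline_tokens_per_sec_per_gpu * speedup

# To serve a workload of 90,000 tokens/sec:
workload = 90_000
gpus_baseline = workload / baseline_tokens_per_sec_per_gpu  # 90 GPUs
gpus_omni = workload / omni_tokens_per_sec_per_gpu          # 10 GPUs

print(gpus_baseline, gpus_omni)  # 90.0 10.0
```

Same capability, a ninth of the hardware, which is the difference between a demo and a deployable feature.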

Open weights also matter here. The ability to run it locally, fine-tune it, and audit it is worth something that leaderboard numbers don't capture.

NVIDIA Nemotron Nano Omni

Claude Gets Embedded in Creative Tools

Anthropic connected Claude to nine creative applications: Photoshop, Blender, Ableton, Fusion, and others. The idea is that you describe what you want, and Claude acts on the tool directly. Prompt-based 3D modeling. Prompt-based scene editing. Prompt-based audio arrangement.

This is genuinely different from what we've had before. Chat-based AI assistants have been adjacent to creative workflows, giving suggestions you then implement yourself. Embedding Claude inside the tool so it can actually execute changes is a different relationship between the model and the software.

Whether it works well in practice depends entirely on the quality of the integrations. But the concept (creative tools where you describe intent and the tool responds) is where a lot of this is heading. I'm curious how Blender users specifically react, given that community's strong DIY culture and high tolerance for learning complex interfaces.

Claude in creative apps

Google Gemini Generates Actual Files

Google Gemini can now produce full PDF, Word, Excel, Slides, and LaTeX files directly in chat. Not markdown you paste somewhere. Actual formatted documents.

This sounds small, but it closes a real gap. The copy-paste-into-blank-template step was always annoying, especially for anything that needed proper formatting. Now you describe what you need and get a file back. For business use cases (reports, summaries, decks), that's a meaningful reduction in friction.

LaTeX support is interesting too. That's a signal about academic and technical users, people who care about precise typesetting and aren't going to accept a Word doc approximation.

Google Gemini file generation

Poolside Laguna: Two More Coding Models Worth Watching

Poolside shipped Laguna M.1 and the open-weights Laguna XS.2. The M.1 hits 72.5% on SWE-bench Verified, and the XS.2 is a 33B model you can run on a Mac.

The "runnable on a Mac" framing is doing a lot of work in that sentence. Running capable agentic coding models locally, without API costs, without sending your code to a third party, is something a lot of developers care about. 33B is still hefty, but modern MacBooks with Apple Silicon can handle it. If the benchmark numbers translate to real-world performance, this is worth testing.
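A rough memory estimate shows why 33B is at the edge of local-friendly. This sketch assumes weights dominate memory usage (the KV cache and runtime overhead add more in practice) and uses the usual precision options for local inference:

```python
# Rough weights-only memory estimate for a 33B-parameter model.
# Assumption: weights dominate; KV cache and overhead add several GB on top.
params = 33e9

bytes_per_param = {
    "fp16": 2.0,  # full half precision
    "q8":   1.0,  # 8-bit quantization
    "q4":   0.5,  # 4-bit quantization, the common local-inference choice
}

for fmt, b in bytes_per_param.items():
    gb = params * b / 1e9
    print(f"{fmt}: ~{gb:.1f} GB")
# fp16 needs ~66 GB; at 4-bit the weights fit in ~16.5 GB, within reach of
# a 32 GB or 64 GB Apple Silicon machine with unified memory.
```

So "runs on a Mac" almost certainly means a quantized build on a higher-memory configuration, not fp16 on a base model.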

Poolside Laguna models

Funding Rounds Worth Noting

Ineffable Intelligence raised a $1.1B seed round at a $5.1B valuation. A billion-dollar seed. The pitch is a "superlearner" that builds knowledge through reinforcement learning rather than human labels. The valuation and round size are remarkable even by recent standards.

ComfyUI pulled in $30M at a $500M valuation to scale their open-source visual AI workflow platform. ComfyUI has a passionate community and a genuinely useful node-based interface for working with diffusion models. Commercial backing without abandoning open-source is a tricky balance to maintain.

Tesla disclosed a roughly $2B all-stock acquisition of an unnamed AI hardware company in their Q1 filing. Unnamed. A two billion dollar deal and they haven't said who. That's going to generate speculation for a while.

AI investment news

Key Takeaways

  • Mistral Vibe introduces async cloud coding agents that file GitHub PRs autonomously, hitting 77.6% on SWE-bench Verified
  • NVIDIA's Nemotron Nano Omni is a 30B open multimodal model claiming 9x throughput improvements over comparable models
  • Claude is now embedded directly in Photoshop, Blender, Ableton, and six other creative tools for prompt-driven editing
  • Google Gemini can generate fully formatted PDF, Word, Excel, Slides, and LaTeX files natively in chat
  • Poolside's Laguna XS.2 is a 33B open-weights coding model that runs on consumer Mac hardware
  • Ineffable Intelligence's $1.1B seed round and Tesla's unnamed $2B acquisition are the week's funding stories