Grzegorz Motyl

© 2026 Grzegorz Motyl. Raising the bar of professional software development.


    Kilo Code for VS Code Goes GA — Parallel Agents, Unified Core, and a New Dev Workflow

    Published on 02.04.2026

    #kilocode
    #generated
    #en
    GENERAL

    So let's talk about something that caught my eye this week. Kilo Code — the AI coding assistant that's been living in VS Code for a while now — just shipped what they're calling their biggest update since launch. The new version is generally available today, and it's not just a polish pass. They rebuilt the whole thing from the ground up on something called OpenCode server, and I think that architectural shift is worth paying attention to.

TL;DR: Kilo Code for VS Code is now GA on a rebuilt, portable OpenCode server foundation. It brings parallel tool execution, parallel subagents with git worktree isolation, a built-in diff reviewer with line-level comments, and cross-surface session continuity between the CLI and editor.

    Let's back up for a second. The old Kilo Code extension served over 2.2 million developers, which is not nothing. But here's the thing that was quietly limiting the whole project — every surface Kilo ran on, whether that was the VS Code extension, the CLI, or JetBrains, was secretly running VS Code internals under the hood, even when those components had nothing to do with what the user was actually trying to accomplish. That's a classic case of an early architectural decision calcifying into a constraint. To their credit, the team recognized it and decided to fix it at the root rather than keep patching around it.

    The new foundation is OpenCode server — MIT-licensed, open-source, and explicitly designed to be editor-agnostic. The VS Code extension and the CLI now share the same engine. That matters more than it might sound. When you fix a bug or add a feature to the core, every surface gets it at once. No more divergence, no more "works in the CLI but not the editor" situations. From a software architecture standpoint, this is the right call. Shared core, thin surfaces. Clean.

Now, the headline feature is parallelism, and I want to be honest about what that means in practice. Previously, the agent would work sequentially — read a file, wait, search the codebase, wait. Now it can execute multiple tool calls at the same time. Files get read concurrently, terminal commands run in parallel, searches happen simultaneously. You feel this immediately. It's one of those changes the demo almost undersells: waiting on an agent is a frustration developers internalize as background noise, and then suddenly it's just gone.
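To make the sequential-versus-parallel difference concrete, here's a minimal sketch in Python using `asyncio.gather`. The tool names (`read_file`, `search_codebase`) and latencies are stand-ins I made up for illustration, not Kilo's actual internals:

```python
import asyncio
import time

async def read_file(path):
    # Simulated I/O latency standing in for a real tool call.
    await asyncio.sleep(0.1)
    return f"contents of {path}"

async def search_codebase(query):
    await asyncio.sleep(0.1)
    return f"matches for {query!r}"

async def run_tools():
    # Run sequentially, these three calls would take ~0.3s; gathered,
    # they overlap and finish in roughly the time of the slowest (~0.1s).
    return await asyncio.gather(
        read_file("src/app.ts"),
        read_file("src/auth.ts"),
        search_codebase("login"),
    )

start = time.perf_counter()
results = asyncio.run(run_tools())
elapsed = time.perf_counter() - start
```

The point isn't the mechanism (Kilo's engine is TypeScript, not Python asyncio) — it's that overlapping independent tool calls collapses total wait time to the slowest call instead of the sum of all of them.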

    But here's where it gets genuinely interesting — parallel subagents. When a task is complex enough, Kilo can now spin up multiple independent agents and run them simultaneously. Imagine asking Kilo to build a feature, and it creates an implementation agent, a test-writing agent, and a documentation agent, each working on its piece in parallel and then merging results back to the parent. You can also define your own custom subagent configurations to match your team's actual workflow. Now, I want to push back slightly here — coordination and merge conflicts between parallel agents working on the same codebase are a real problem, and the article doesn't spend much time on how Kilo handles conflicts when multiple agents have modified overlapping files. That's the hard part of parallelism, and glossing over it in announcement copy is understandable but worth keeping in mind as you evaluate this in practice.
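The fan-out-and-merge pattern described above can be sketched like this — again an illustrative model, with hypothetical role names and a `sleep` standing in for real model-backed agents:

```python
import asyncio

async def subagent(role, task):
    # Stand-in for a model-backed agent; each role works on its slice
    # of the task independently of the others.
    await asyncio.sleep(0.05)
    return role, f"[{role}] done: {task}"

async def orchestrate(task, roles=("implementation", "tests", "docs")):
    # Fan out one subagent per role, run them concurrently,
    # then merge the results back in the parent.
    results = await asyncio.gather(*(subagent(r, task) for r in roles))
    return dict(results)

merged = asyncio.run(orchestrate("add login endpoint"))
```

What this sketch deliberately leaves out is exactly the hard part flagged above: what the parent does when two subagents' results conflict.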

    The Agent Manager is the control panel for all of this. You open multiple Kilo tabs, assign each one a role, and monitor what's happening across all of them. And critically, there's git worktree support. Each agent can operate in an isolated copy of the repository, which is the right answer to the conflict problem I just raised — if each agent has its own worktree, they can't step on each other's code. One agent adds an API endpoint, another refactors auth, a third writes tests, all simultaneously, all in separate worktrees. You review the results, merge what you want, and commit or open a PR. This is a mature approach to parallelism, and I'm glad they built it this way rather than trying to do something clever with locks or queues.
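Under the hood, worktree isolation rests on the standard `git worktree add` command, which gives each checkout its own working directory and branch against a shared object store. Here's a self-contained sketch of the setup step — the helper name `add_agent_worktree` and the agent roles are my invention, but the git commands are real:

```python
import os
import subprocess
import tempfile

def run(args, cwd):
    subprocess.run(args, cwd=cwd, check=True, capture_output=True)

def add_agent_worktree(repo, agent):
    # One branch plus one isolated checkout per agent (hypothetical helper).
    path = os.path.join(os.path.dirname(repo), f"wt-{agent}")
    run(["git", "worktree", "add", "-b", f"agent/{agent}", path], cwd=repo)
    return path

# Throwaway repo so the sketch is self-contained.
base = tempfile.mkdtemp()
repo = os.path.join(base, "repo")
os.makedirs(repo)
run(["git", "init"], cwd=repo)
run(["git", "-c", "user.name=demo", "-c", "user.email=demo@example.com",
     "commit", "--allow-empty", "-m", "init"], cwd=repo)

worktrees = [add_agent_worktree(repo, a) for a in ("impl", "tests", "docs")]
```

Each agent then edits only its own directory and branch; merging back is an ordinary `git merge` or pull request, which is exactly where the human review step belongs.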

    There's also a built-in diff reviewer, and this one I find genuinely useful in concept. Every change an agent makes is visible file by file in either unified or split view — standard stuff. But the interesting part is that you can leave line-level comments directly on the diff, exactly like you would in a GitHub pull request review. You annotate specific lines, hit send, and all of those comments with their file paths, line numbers, and surrounding code context get sent to the agent as structured input. This is a meaningful improvement over the binary approve-or-reject interaction pattern most AI coding tools offer. You're having a targeted conversation about specific lines rather than wrestling with an entire changeset at once.
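The article says each comment travels with its file path, line number, and surrounding code context. A structured payload along those lines might look like this — the schema below is my guess for illustration, not Kilo's published format:

```python
from dataclasses import dataclass, asdict

@dataclass
class ReviewComment:
    # Hypothetical shape: the announcement only says comments carry
    # file paths, line numbers, and surrounding code context.
    file: str
    line: int
    body: str
    context: str

comments = [
    ReviewComment(
        file="src/auth.ts",
        line=42,
        body="Reuse the existing token helper instead of raw storage access.",
        context="const token = localStorage.getItem('jwt')",
    ),
]
payload = {
    "type": "review_feedback",
    "comments": [asdict(c) for c in comments],
}
```

Whatever the real wire format is, the design idea is the same: the agent receives targeted, location-anchored feedback it can act on line by line, instead of a single approve/reject signal for the whole changeset.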

    There's also a model comparison feature that I think will get underused but is actually clever. You can run multiple agents on the same prompt using different models side by side — say Claude Opus alongside GPT-5 — and compare results directly. For ambiguous architectural decisions or complex refactors, this is a genuinely useful tool. The honest challenge here is that most developers won't take the time to do this unless the task is high-stakes enough to justify the effort. But for those cases, it's there.

    Session continuity rounds out the picture. Because the VS Code extension and CLI share the same portable core, you can start a coding session in the terminal while SSH'd into a server, and pick it back up in VS Code when you're at your desk. The article makes this sound seamless, and architecturally it should be — shared engine means shared state format — but I'd want to see how this actually handles things like local file paths, environment differences, and long-running sessions before calling it solved.

    One thing worth noting: this GA release included provider settings configurable directly in the extension — so you no longer need CLI setup to configure which AI model you're connecting to. The MCP marketplace is also available natively. These were pre-release friction points that got resolved before shipping, which is evidence that the feedback loop between their beta testers and the team actually worked.

    The bottom line is that this is a serious architectural investment that should pay dividends for a while. Shared core, parallelism done right with worktree isolation, and a code review workflow that treats AI output with appropriate skepticism by requiring human annotation at the line level. Whether it lives up to that in daily use is something only time will tell, but the foundation looks solid.
