DEBATE · PAIR · PLAN→EXECUTE · VERIFY: Dual-Model AI Coding
JaCode's design: two AI models in structured conversation modes beat one model running alone. Four modes, each for a different class of coding work.
The premise
Today's production AI coding assistants (Cursor, GitHub Copilot, Aider, Zed's AI assistant) pair one human with one model. The model answers, the human reviews, the cycle continues. This is a reasonable baseline, and it's what the current tooling is optimized for.
JaCode's premise is that one human and two models in structured conversation is a better primitive than one human and one model. Not always — one model is faster and cheaper. But for specific classes of coding work, two models outperform one by material margins. JaCode's design centers on four conversation modes that each capture a different way two models collaborate better than one.
Status note: JaCode is in active development and has not shipped. This post describes the designed capabilities, not currently released functionality. Mode semantics and the actual training pairs are still being refined.
Why two models, specifically
Three failure modes of single-model coding assistants motivate the design:
- Blind-spot compounding. A single model that makes a wrong assumption in step 3 will typically compound that assumption into step 7. The user catches it if they're paying close attention — but the purpose of the assistant was to reduce close-attention load.
- Confirmation bias. A model asked "is this correct?" about its own prior output will answer "yes" substantially more often than it should. Self-review is unreliable by construction.
- Role ambiguity. A model that's simultaneously the planner, implementer, and verifier does none of those three roles as well as a model focused on one of them.
Two models do not eliminate these issues; they blunt them. A second model with an independent perspective catches what the first missed with higher-than-chance probability, and when both models agree, the user has a stronger signal than when one model is running solo.
The four modes
DEBATE — Adversarial Review
One model proposes a solution. The second model argues the opposite position — not necessarily because the opposite is correct, but because explicit devil's-advocate framing surfaces assumptions the first model glossed over.
When to use it: Before making architectural decisions. Before committing code that will be hard to revert. When the stakes of being wrong are higher than the cost of an extra minute of discussion.
What it produces: Two distinct positions, each defended in the model's own terms. The user reads both and decides — or asks the models to converge on a reconciled position.
When it's overkill: Routine refactors, straightforward bug fixes, well-specified tasks. DEBATE is slower and more expensive than PAIR; the overhead only pays off when the decision being debated is load-bearing.
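A minimal sketch of the DEBATE exchange, assuming a generic chat-completion interface. JaCode's real client API is unreleased, so the `Model` stub and the prompts are illustrative only:

```python
from dataclasses import dataclass


class Model:
    """Stand-in chat client; JaCode's real client API is unreleased."""

    def __init__(self, name: str):
        self.name = name

    def complete(self, prompt: str) -> str:
        raise NotImplementedError("wire this to a real provider")


@dataclass
class DebateResult:
    proposal: str  # the first model's position, defended in its own terms
    rebuttal: str  # the second model's explicit devil's-advocate position


def debate(proposer: Model, critic: Model, task: str) -> DebateResult:
    proposal = proposer.complete(
        f"Propose a solution to the following task and defend it:\n{task}"
    )
    # The critic argues the opposite regardless of whether it agrees;
    # the adversarial framing is what surfaces glossed-over assumptions.
    rebuttal = critic.complete(
        f"Task:\n{task}\n\nProposed solution:\n{proposal}\n\n"
        "Argue against this proposal. Attack its assumptions, not its tone."
    )
    return DebateResult(proposal=proposal, rebuttal=rebuttal)
```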
PAIR — Pair Programming
Two models trade driver and navigator roles in a shared buffer, mirroring the classic human-human pair programming pattern. The driver writes code; the navigator watches, comments, suggests direction changes, catches typos and logic errors. Roles swap periodically or on user command.
When to use it: Implementation work where progress is the primary goal but quality can't regress. Feature implementation on a codebase where the model might miss local conventions.
What it produces: Progress on the task with continuous cross-checking. The navigator catches the kinds of errors that would otherwise survive into diff review.
When it's overkill: Single-file trivial changes, code generation from a well-specified schema. PAIR is sized for real work, not one-liners.
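A sketch of the PAIR loop under the same assumptions, reusing the hypothetical `Model` stub from the DEBATE sketch. The driver round-trips the whole buffer each turn so the shared state stays explicit:

```python
def pair_session(model_a: Model, model_b: Model, task: str, rounds: int = 4) -> str:
    """Alternate driver/navigator each round; `Model` is the stub from the DEBATE sketch."""
    driver, navigator = model_a, model_b
    buffer = ""  # the shared buffer both models see
    for _ in range(rounds):
        # The driver writes; round-tripping the full buffer keeps state explicit.
        buffer = driver.complete(
            f"Task: {task}\nCurrent buffer:\n{buffer}\n"
            "Continue the implementation. Return the complete updated buffer."
        )
        # The navigator watches and comments without touching the buffer.
        notes = navigator.complete(
            f"Task: {task}\nDriver's buffer:\n{buffer}\n"
            "You are navigating: flag typos, logic errors, and needed "
            "direction changes. Be brief."
        )
        buffer = driver.complete(
            f"Navigator notes:\n{notes}\n\nBuffer:\n{buffer}\n"
            "Apply the notes you agree with. Return the complete updated buffer."
        )
        driver, navigator = navigator, driver  # periodic role swap
    return buffer
```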
PLAN→EXECUTE — Separation of Concerns
A planning model writes a specification (interfaces, data flow, test cases, error conditions). An execution model implements the spec. The planner's output is a structured artifact; the executor treats it as a contract.
The design goal is to exploit the different strengths of different models: some models plan well but implement noisily; some implement cleanly but plan shallowly. Separating the roles lets each model work within its strength.
When to use it: Multi-file refactors. New feature implementation where the API design matters. Any task where "what are we building" and "how is it written" are both non-trivial.
What it produces: A plan document (consumable by humans as the implementation blueprint) plus the implementation itself (which can be reviewed against the plan).
When it's overkill: Simple tasks where planning adds ceremony. The mode is designed for tasks where the planning step would have existed anyway as a comment or README note.
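A sketch of the handoff, again on the hypothetical `Model` stub. The JSON keys mirror the spec contents named above; they are assumptions for illustration, not JaCode's real plan schema:

```python
import json

SCHEMA_HINT = (
    "Return only JSON with keys: interfaces, data_flow, test_cases, error_conditions."
)


def plan_then_execute(planner: Model, executor: Model, task: str) -> tuple[dict, str]:
    # The plan is a structured artifact, not prose. Real code would validate
    # it against a schema and retry on malformed output; here we fail loudly.
    plan = json.loads(planner.complete(f"Write a spec for:\n{task}\n{SCHEMA_HINT}"))
    # The executor treats the plan as a contract it may not renegotiate.
    implementation = executor.complete(
        "Implement exactly this spec; do not redesign it:\n"
        + json.dumps(plan, indent=2)
    )
    return plan, implementation
```

Failing on malformed JSON rather than silently re-prompting preserves the contract property: the executor never sees an unvetted plan.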
VERIFY — Independent Audit
One model writes code. A second model reviews it against the original intent — as if it were a code reviewer who saw the diff cold, without watching the writing process. Before the code ships to the user, the verifier produces a structured audit: what the code does, whether it matches the stated intent, what the likely failure modes are.
When to use it: Before commits. Before pull requests. Before any state-changing action (deploy, database migration, infrastructure change).
What it produces: A short, structured review. Not a score — a list of concrete observations the user can act on (or dismiss). Verification is the mode closest to "don't let the model YOLO it."
When it's overkill: Local scratch work. Exploratory code where verification's ceremony exceeds the work's risk.
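A sketch of the audit step on the same hypothetical `Model` stub. The `Audit` fields mirror the structured review described above and are assumed, not JaCode's real output format:

```python
import json
from dataclasses import dataclass, field


@dataclass
class Audit:
    summary: str           # what the code actually does
    matches_intent: bool   # whether it satisfies the stated intent
    observations: list[str] = field(default_factory=list)  # concrete, actionable notes


def verify(verifier: Model, intent: str, diff: str) -> Audit:
    # The verifier sees the diff cold: stated intent plus final diff,
    # with no access to the writer's intermediate reasoning.
    raw = verifier.complete(
        f"Intent:\n{intent}\n\nDiff:\n{diff}\n\n"
        'Return only JSON: {"summary": ..., "matches_intent": true|false, '
        '"observations": [...]}. Observations must be concrete, not a score.'
    )
    data = json.loads(raw)
    return Audit(
        summary=data["summary"],
        matches_intent=data["matches_intent"],
        observations=data["observations"],
    )
```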
What JaCode does underneath
Behind the four modes, JaCode's designed architecture includes:
- Multi-model orchestration. Target support: Claude, GPT, Gemini, Ollama, or any OpenAI-compatible endpoint. Models can be mixed per mode (e.g., Claude as planner, GPT as executor in PLAN→EXECUTE); a sketch of this per-mode mixing follows this list.
- Hybrid execution. Local inference (via Ollama or local-weight models from Logos) alongside cloud APIs, with automatic fallback. Privacy-sensitive workflows can stay entirely local.
- Developer workflow automation. Hooks, permissions, task management, and session persistence.
- Agent system. Specialized sub-agents for build validation, code review, and verification — composed within the four primary modes.
- MCP extensibility. Model Context Protocol support for custom tool integrations without modifying the core.
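To make the per-mode mixing concrete, here is one plausible shape for a mode-to-model mapping. JaCode's configuration surface is unreleased, so the keys, model names, and format are all assumptions:

```python
# Illustrative shape only; JaCode's real configuration format is unreleased.
# Model names are examples, and any OpenAI-compatible endpoint could slot in.
MODE_CONFIG = {
    "debate":       {"proposer": "claude-sonnet", "critic": "gpt-4o"},
    "pair":         {"a": "claude-sonnet", "b": "gemini-pro"},
    "plan_execute": {"planner": "claude-opus", "executor": "gpt-4o"},
    # Privacy-sensitive verification can keep the writer fully local via Ollama.
    "verify":       {"writer": "ollama/qwen2.5-coder", "verifier": "claude-sonnet"},
}
```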
How this compares to existing tools
| | JaCode (designed) | Cursor | Aider | Copilot |
|---|---|---|---|---|
| Single-model / multi-model | Multi (dual) | Single | Single | Single |
| Explicit conversation modes | 4 structured modes | Chat + inline | Chat + code-mode | Inline + chat |
| Local + cloud execution | Hybrid with fallback | Cloud-first | Bring-your-own-model | Cloud-only |
| MCP support | Planned | Partial | No | No |
| Status | Pre-release | Shipping | Shipping | Shipping |
The existing tools are real, useful, and shipping today. JaCode is a bet on a different primitive — not a replacement for every single-model use case, but an alternative for the work where single-model coding has plateaued.
Where to follow development
JaCode doesn't have a public release yet. If you're interested in the design, the places to follow are the JaCode product page and DDG's broader communications; launch announcements and early-access details will surface there first.
For the underlying primitive (forkable, replayable AI execution trees) that underpins the agent system, see opentine, which ships on PyPI today (pip install opentine) as the DDG-maintained reference implementation of the Agent Run Protocol. opentine's content-addressed run trees are the natural substrate for JaCode's multi-model conversation modes — a single JaCode session will be, under the hood, an opentine run tree with model-swap branches at each mode transition.
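To make "content-addressed run tree with model-swap branches" concrete, here is a toy model of the structure in plain Python. It is not opentine's actual API (take that from the package itself); it only shows why content addressing makes runs forkable and replayable:

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class RunNode:
    """Toy content-addressed run-tree node; not opentine's actual API."""
    model: str                 # which model produced this step
    payload: str               # the step's output
    parent: str | None = None  # content address of the parent node
    children: list[str] = field(default_factory=list)

    def address(self) -> str:
        # The node's identity is a hash of its contents plus its parent's
        # address, so identical histories deduplicate and replay is exact.
        blob = json.dumps(
            {"model": self.model, "payload": self.payload, "parent": self.parent},
            sort_keys=True,
        )
        return hashlib.sha256(blob.encode()).hexdigest()[:12]


# A mode transition (e.g. PAIR -> VERIFY) becomes a branch whose child
# node swaps in a different model.
root = RunNode(model="claude-sonnet", payload="PAIR: implement parser")
branch = RunNode(model="gpt-4o", payload="VERIFY: audit parser diff",
                 parent=root.address())
root.children.append(branch.address())
```

Because each address folds in the parent's address, two sessions that diverge only at a mode transition share every node up to the branch point, which is what makes forking and replay cheap.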