By George Sarris (0xcircuitbreaker) · April 20, 2026 · 6 min read

opentine vs LangGraph vs Temporal vs DSPy: Choosing an Agent Runtime

A practical comparison of four agent runtimes on run identity, forkability, replay, model-agnosticism, and production fit — and when each one is the right answer.

The problem space — what is an "agent runtime"?

Four different projects use the word "agent" and mean four different things. Before comparing, it's worth naming what each one is actually for:

  • LangGraph — a stateful graph framework from LangChain Inc (reached stable v1.0 in October 2025, usable standalone or integrated with LangChain abstractions). You define nodes (tools, LLM calls) and edges (routing logic), with checkpointer-based state persistence.
  • Temporal — a durable workflow engine. Not agent-specific. You write code; Temporal persists the execution state so a crashed worker resumes from where it left off. Agents are one use case; billing pipelines, ETL, and saga orchestration are others.
  • DSPy — a Stanford-origin project for declarative LLM programming. You write modules and DSPy's optimizers (MIPROv2, GEPA, BetterTogether, LeReT) compile them into instructions, few-shot demos, or weight updates — labeled training data is one input mode, others are reflection-based. The abstraction is programs as compiled artifacts, not runs as forkable graphs.
  • opentine — a content-addressed run tree. Every agent step is a node in a DAG, hashed from its inputs, forkable and replayable from any point. Built around the primitive "git for agent runs."

These projects overlap in use cases but occupy different primitive spaces. Picking between them is mostly a question of which primitive fits your architecture.
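The content-addressed primitive is easiest to see in code. Below is a toy sketch in plain Python — not opentine's actual hashing scheme, and `step_hash` is an invented name — showing the core idea: a step's identity is a pure function of its parent and its inputs, so a fork is just a second child of an existing node and re-deriving the same step yields the same identity.

```python
import hashlib
import json

def step_hash(parent_hash: str, inputs: dict) -> str:
    """Derive a step's identity from its parent hash and its inputs.

    Toy illustration: identity is a pure function of content, so
    identical steps hash to the same node.
    """
    payload = json.dumps({"parent": parent_hash, "inputs": inputs},
                         sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()[:12]

# A run is a chain of hashes; a fork is a second child of an existing
# node -- a pointer operation, no state copy required.
root = step_hash("", {"prompt": "summarize the report"})
step_a = step_hash(root, {"tool": "read", "path": "report.md"})
fork_b = step_hash(root, {"tool": "read", "path": "appendix.md"})

# Re-deriving the same step reproduces the same identity (replayable).
assert step_a == step_hash(root, {"tool": "read", "path": "report.md"})
assert step_a != fork_b
```

This is the sense in which a fork is cheap: nothing is copied, a new node simply records a different input against the same parent hash.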

Feature matrix

| Feature | opentine | LangGraph | Temporal | DSPy |
| --- | --- | --- | --- | --- |
| Run identity model | Content-addressed DAG | Graph state checkpoints | Workflow ID + event history | Compiled program artifact |
| Fork runs from any step | Yes (pointer op) | Manual state copy | Signal-based branch (heavy) | Not natively (recompile) |
| Bit-exact replay | Yes (deterministic sampling) | Replay from checkpoint | Event-sourced replay | Via prompt cache |
| Pause / resume across machines | Yes (portable .tine file) | With external persistence | Yes (designed for it) | Not the abstraction |
| Model-agnostic provider layer | Anthropic, OpenAI, Google, Ollama, OpenAI-compatible | LangChain providers | Bring your own | Bring your own |
| Visual console | 3 first-party (TUI, desktop, web) | LangSmith (free tier + paid) | Temporal Web UI | Notebook-oriented |
| Event-sourced kernel | Yes | Partial | Yes (the whole point) | No |
| Diff two runs side-by-side | Yes (first-class) | Not natively | Compare event histories manually | Not natively |
| Deployment surface | CLI + libraries (Python) | Python library | Workers + server | Python library |
| Best for | Agent experimentation, debugging, reproducibility | LangChain users who need graphs | Durable workflows beyond agents | Prompt compilation research |

LangGraph — when it's the right answer

LangGraph is the natural choice if you're already invested in LangChain or want a stateful graph runtime. Your retrievers, tools, memory systems, and callbacks all compose into a LangGraph flow with minimal additional code. The LangSmith tracing product (free Developer tier for light use, paid plans beyond) adds visibility.

LangGraph is the right answer if:

  • You're writing in Python and already use LangChain's abstractions (retrievers, memory, output parsers, tool adapters)
  • You need a stateful graph with conditional routing and cycles
  • Your observability story runs through LangSmith (free tier or paid)
  • You don't need cross-machine resumability of agent runs or bit-exact replay

Where LangGraph stops being the right answer:

  • Run forkability is a first-class need. LangGraph lets you save state checkpoints, but a fork is a deep copy plus a diverged execution — not a cheap pointer. When you fork often, branching many variants from one step during an experiment, that copy cost compounds.
  • You need bit-exact replay against a content-addressed step identity. LangGraph's replay is stateful, not content-addressed.
  • You don't want the LangChain abstraction layer. If LangChain is more machinery than you want, LangGraph inherits it.
  • You need deeper visibility without paying for a tracing product. opentine ships three consoles (TUI, native desktop, browser) in the open-source core; LangSmith's free Developer tier is generous for light use, but production-scale tracing pushes you onto a paid plan.

Temporal — when it's the right answer

Temporal is the strongest durable-workflow system on the market. Its model is: write code that calls activities and workflows, Temporal persists every event, workers crash and restart without losing state. Agents are one natural use case but far from the only one.
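Temporal's real model involves workers, activities, and a server, but the event-sourcing idea at its core fits in a few lines of plain Python — a toy, not Temporal's API, with an invented `Workflow` class: persist every event before applying it, and recovering state after a crash is just replaying the log.

```python
from dataclasses import dataclass, field

@dataclass
class Workflow:
    """Toy event-sourced workflow: state is derived from the log."""
    log: list = field(default_factory=list)
    state: dict = field(default_factory=dict)

    def apply(self, event: dict) -> None:
        # Pure state transition: the only way state changes.
        if event["kind"] == "charged":
            self.state["balance"] = self.state.get("balance", 0) - event["amount"]
        elif event["kind"] == "provisioned":
            self.state["instances"] = self.state.get("instances", 0) + 1

    def execute(self, event: dict) -> None:
        self.log.append(event)   # persist first...
        self.apply(event)        # ...then apply

    def replay(self) -> "Workflow":
        # A restarted worker rebuilds identical state from the log.
        fresh = Workflow(log=list(self.log))
        for event in fresh.log:
            fresh.apply(event)
        return fresh

wf = Workflow()
wf.execute({"kind": "charged", "amount": 30})
wf.execute({"kind": "provisioned"})
assert wf.replay().state == wf.state  # crash-recovery invariant
```

The production system adds durable storage, task queues, and determinism constraints on workflow code, but the invariant is the same: the log is the source of truth.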

Temporal is the right answer if:

  • Your agent is embedded in a larger system with non-agent workflows (billing, provisioning, long-running sagas). Unifying the orchestration layer has value.
  • Durability, retries, and crash-safety are load-bearing requirements. Temporal's event-sourcing is production-battle-tested.
  • You're comfortable deploying and operating the Temporal server or paying for Temporal Cloud.
  • You don't need agent-specific abstractions (model adapters, tool definitions, run diffs) — you'll build those on top.

Where Temporal stops being the right answer:

  • The problem is agent-shaped, not workflow-shaped. Temporal makes you spell out activities and workflows explicitly. For an experimentation-heavy agent workload where you're constantly forking and comparing runs, Temporal's ceremony is high.
  • You want a visual console of the agent's reasoning. Temporal's UI is workflow-history-shaped, not DAG-visualization-shaped.
  • You want model-adapter abstractions. Temporal doesn't ship them; you'd build them or compose Temporal with something else.
  • You don't want to run a server. Temporal's local-dev is fine, but the production story wants Temporal Cloud or a self-hosted cluster.

opentine and Temporal are not in direct competition for most workloads — they're at different altitudes. A realistic architecture has Temporal orchestrating outer workflows and opentine executing the agent runs inside specific activities, with the .tine output stored as a workflow artifact.

DSPy — when it's the right answer

DSPy is an academic-origin framework that treats programs as compiled artifacts. You write declarative modules, and DSPy's optimizers (MIPROv2 for Bayesian prompt/demo search, GEPA for reflection-based evolution, BetterTogether and LeReT for weight updates) turn them into instructions, few-shot demos, or actual model weights. Training data is one input mode; reflection-based optimizers don't require labeled examples. The research contribution is significant, and the gap to production use is narrowing as the optimizer stack matures.
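The compile-time primitive can be caricatured in a few lines — a toy, not DSPy's API, and `compile_program` and the stubbed `run` function are invented here. The shape is: search candidate instructions, score each against a metric on examples, and keep the winner as the artifact.

```python
def compile_program(candidates, examples, run, metric):
    """Toy prompt compiler: return the instruction that scores best
    under `metric` across the training examples."""
    def score(instruction):
        return sum(metric(run(instruction, x), y) for x, y in examples)
    return max(candidates, key=score)

# Hypothetical task with a stubbed "model": the right instruction
# makes it uppercase its input.
examples = [("abc", "ABC"), ("ok", "OK")]
run = lambda instruction, x: x.upper() if "uppercase" in instruction else x
metric = lambda pred, gold: int(pred == gold)

best = compile_program(
    ["Echo the input.", "Return the input in uppercase."],
    examples, run, metric,
)
assert best == "Return the input in uppercase."
```

Real DSPy optimizers search a much richer space (instructions, demos, even weights), but the output is the same kind of thing: a compiled program, not a run.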

DSPy is the right answer if:

  • You have a clear metric and training examples, and you want a compiler to optimize prompt structure automatically.
  • You're willing to accept DSPy's abstraction ceiling: the framework shapes how you think about agents, not the other way around.
  • You don't need run-level forkability, cross-machine resumability, or bit-exact replay — the primitive is program compilation, not run execution.

Where DSPy stops being the right answer:

  • You need to debug a specific agent run in detail. DSPy's abstraction hides the individual step-level detail that opentine's content-addressed DAG exposes.
  • Your workflow doesn't fit the compile-and-run pattern. Exploratory, iteration-heavy agent work lives in a different execution mode.
  • You want a portable run artifact. DSPy compiles to a program; opentine produces a .tine file you can ship, inspect, and replay independently.

DSPy is a research tool that has production uses in narrow cases (classification, retrieval-heavy tasks). opentine is a runtime tool for broader agent-execution concerns. They can be composed: a DSPy-compiled module can run inside an opentine step, and the compilation step itself can be a content-addressed opentine sub-graph.

Decision tree

Answer these in order:

  1. Is your primary problem durable workflows (not agent-specific)? If yes → Temporal. If no → continue.
  2. Is your primary problem prompt compilation with training data? If yes → DSPy. If no → continue.
  3. Are you already deeply invested in LangChain's abstractions? If yes → LangGraph. If no → continue.
  4. Do you need forkable runs, bit-exact replay, content-addressed step identity, a portable run file, or model-agnostic provider adapters? If any of these → opentine.
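For readers who prefer code, the same four questions in order — illustrative only; the `pick_runtime` name and its flags are invented for this sketch:

```python
def pick_runtime(*, durable_workflows: bool, prompt_compilation: bool,
                 invested_in_langchain: bool) -> str:
    """Encode the decision tree above: first matching question wins;
    the forkable-runs / replay / portability bucket is the fallthrough."""
    if durable_workflows:
        return "Temporal"
    if prompt_compilation:
        return "DSPy"
    if invested_in_langchain:
        return "LangGraph"
    return "opentine"

assert pick_runtime(durable_workflows=False, prompt_compilation=False,
                    invested_in_langchain=False) == "opentine"
```

The ordering matters: a team with durable-workflow needs and a LangChain investment still lands on Temporal first, composing the others inside it if needed.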

What opentine deliberately doesn't do

Fair comparison requires naming non-features:

  • opentine is not a durable workflow engine. If your requirement is "survive process crashes with saga semantics across non-agent activities," Temporal is correct. opentine persists step state but does not provide saga / compensation semantics.
  • opentine does not ship retrievers, memory systems, or output parsers. The agent kernel is minimal — Read, Write, Edit, Shell primitives plus a planner/executor split with approval gates. Rich retrieval and memory are upstream of the runtime and can be built with any library.
  • opentine is Python-first: a Python SDK and three consoles. JVM / Go / Rust SDKs are not on the near-term roadmap.
  • opentine does not host model endpoints. Model calls go out to Anthropic / OpenAI / Google / Ollama / any OpenAI-compatible endpoint. opentine does not operate inference infrastructure.

Summary

opentine is the right answer when:

  • Run identity matters. You need to say "this exact step, here's its content hash" and have that mean something.
  • Forking is frequent. You iterate on agents by branching from a specific point, not re-running from scratch.
  • Replay has to be bit-exact. Regression tests, multi-agent evaluations, drift detection.
  • Model-agnosticism is non-negotiable. You want to swap Claude for GPT for Ollama without touching agent code.
  • You need a visual console without paying for a tracing product. Three first-party consoles ship with the core.

opentine product page →


Topics: agent-frameworks, comparison, langgraph, temporal, dspy, buyer-guide
