Content-Addressed Steps: How opentine Guarantees Bit-Exact Replays
Every step in an opentine run is hashed from its inputs into a content-addressed ID. Identical inputs always produce the same step, making replays and caching free.
What content-addressing means in opentine
Every step in an opentine run has an ID derived from the hash of its inputs. In the current core (see opentine.core.step_id), the hash is computed over three canonical components:
- kind — the step type (prompt, tool-call, tool-result, etc.)
- inputs — the step's payload dict (the prompt text, the tool name + arguments, the fetched bytes, etc.)
- parent_id — the ID of the step this one follows
```python
import hashlib

import msgspec


def step_id(kind, inputs, parent_id):
    blob = msgspec.json.encode({"k": kind.value, "i": inputs, "p": parent_id})
    return hashlib.sha256(blob).hexdigest()[:12]
```

The ID is a 12-character prefix of the SHA-256 — short enough to reference in a terminal, long enough to avoid collisions at agent-run scale. Two steps built from the same (kind, inputs, parent_id) triple produce the same ID every time.
This is the same primitive git uses for commits and Nix uses for packages. opentine applies it to agent execution.
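To see the determinism property in isolation, here is a self-contained sketch. It swaps msgspec for stdlib json with sorted keys (an assumption made for portability; the core uses msgspec.json.encode) and passes kind as a plain string rather than an enum:

```python
import hashlib
import json

def step_id(kind, inputs, parent_id):
    # Stand-in encoder: stdlib json with sorted keys yields a
    # deterministic byte stream, mirroring the core's msgspec call.
    blob = json.dumps({"k": kind, "i": inputs, "p": parent_id},
                      sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()[:12]

a = step_id("prompt", {"text": "summarize the report"}, None)
b = step_id("prompt", {"text": "summarize the report"}, None)
c = step_id("prompt", {"text": "summarize the memo"}, None)

assert a == b   # identical (kind, inputs, parent_id) -> identical ID
assert a != c   # any change to the inputs -> a new ID
```

Changing the parent_id has the same effect as changing the inputs: the hash covers the whole triple, so a step's identity pins down its entire ancestry.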
What the ARP spec formalizes beyond the current core
The Agent Run Protocol specification — the framework-agnostic standard opentine also implements — extends content-addressing in ways the reference core will pull in over time:
- Richer hash inputs: system prompt, tool schemas, model identifier + version, sampling parameters — all intended to be part of the canonical hash so two runs with different models produce different IDs even when the user text matches. The shipping core today hashes the inputs payload; deeper inputs are recorded in step metadata but not yet in the ID itself.
- Canonical serialization rules: lexicographic JSON key ordering, float normalization, NFC Unicode, trailing-zero stripping. The ARP spec documents these; opentine today relies on msgspec's default deterministic ordering, which is stable in practice across msgspec versions but is not yet a formally specified canonical form.
- Object-store layout: the spec defines a content-addressed .arp/objects/&lt;prefix&gt;/&lt;suffix&gt;/ layout for cross-run deduplication. Today opentine serializes whole runs to .tine files; the object-layer CAS is a roadmap item.
The posts in this silo describe both the shipping core and the direction the ARP spec is pulling it. Where a feature is aspirational, we flag it.
Why it matters — the three properties it unlocks
1. Replay is bit-exact when the model is deterministic
If you replay a step whose inputs include a deterministic-sampling model call (temperature 0, fixed seed, or a local model with fixed weights), the output is bit-identical to the original. No "re-run and hope it matches" — the semantics are precisely reproducible.
For debugging, regression tests, and multi-agent evaluations, this is the difference between "I think it was working yesterday" and "the step ID is identical, here's the diff between then and now."
For stochastic-sampling runs, the step ID still uniquely identifies the inputs that were requested — so a replay tells you whether the model's behavior has drifted between the original run and now. If the model is newer (e.g., Anthropic updated the underlying weights), the output will differ, and the diff is the drift signal.
2. Resume and fork reuse work without re-execution
Since step IDs are stable, a run that re-executes a subtree doesn't need to recompute ancestors that are already serialized in the .tine file. tine resume and tine fork re-materialize the parent chain by reading it from disk — no model round-trips for steps that have already happened.
A note on caching vs. resume: content-addressing is the foundation a transparent step cache can be built on, and the ARP spec describes that object-layer cache. The shipping opentine core today does not yet implement automatic cache-lookup on new runs (each tine run starts fresh); what it does implement is resume and fork from persisted step trees, which gets you most of the practical benefit for iteration workflows. Treat cross-run caching as a near-term addition, not a current feature.
Practical effect today: iterating on an agent workflow where only the later steps change costs near-zero tokens to re-run the early steps, because you resume or fork from the saved tree rather than starting a new run.
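A minimal sketch of what resume does with the persisted tree, using a plain dict as a hypothetical in-memory stand-in for a .tine file (the record shape and IDs here are illustrative, not the on-disk format):

```python
# Hypothetical stand-in for a persisted step tree:
# step_id -> record holding its parent pointer and saved output.
saved = {
    "aaa111": {"parent": None,     "output": "plan"},
    "bbb222": {"parent": "aaa111", "output": "search results"},
    "ccc333": {"parent": "bbb222", "output": "draft answer"},
}

def parent_chain(step_id, store):
    """Rematerialize ancestors by reading the store -- no model calls."""
    chain = []
    while step_id is not None:
        chain.append(step_id)
        step_id = store[step_id]["parent"]
    return list(reversed(chain))  # root first

print(parent_chain("ccc333", saved))
```

Everything resume needs is a chain walk over saved records; the expensive part of the original run (the model round-trips that produced each output) never repeats.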
3. Forking is a pointer operation
When you fork a run from step 36 — changing the next prompt, tool definition, or model — you don't need to copy the preceding 35 steps. The fork is a new run that shares the step IDs 1–35 with the parent and diverges at 36. Disk and network overhead is a handful of bytes.
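The sharing falls out of the hash construction. A sketch under the same stdlib-json stand-in as before (msgspec in the real core): two runs that agree on their early steps compute identical ancestor IDs, so nothing needs copying.

```python
import hashlib
import json

def step_id(kind, inputs, parent_id):
    blob = json.dumps({"k": kind, "i": inputs, "p": parent_id},
                      sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()[:12]

def build_chain(prompts):
    """Return the step-ID chain for a sequence of prompt steps."""
    parent, ids = None, []
    for text in prompts:
        parent = step_id("prompt", {"text": text}, parent)
        ids.append(parent)
    return ids

base   = build_chain(["plan", "search", "draft"])
forked = build_chain(["plan", "search", "rewrite"])  # diverge at step 3

assert base[:2] == forked[:2]   # shared ancestors: identical IDs, zero copies
assert base[2]  != forked[2]    # the divergence point gets a fresh ID
```

The fork only needs to record its new tail plus a pointer into the shared prefix, which is why the overhead is bytes, not megabytes.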
If you later want to compare the two runs, opentine can diff them: identical subtrees are elided; divergence points are highlighted. The fork primitive post covers the user-facing ergonomics of this.
How the hashing stays stable
Naive content-addressing breaks on serialization-order ambiguity. If today {"a": 1, "b": 2} and tomorrow {"b": 2, "a": 1} represent the same logical input, a naive hash assigns them different IDs.
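The ambiguity is easy to reproduce with stdlib json, which preserves dict insertion order by default; sorting keys before hashing restores a single canonical byte stream:

```python
import hashlib
import json

a = {"a": 1, "b": 2}
b = {"b": 2, "a": 1}   # same logical mapping, different insertion order
assert a == b          # equal as Python dicts

# Naive serialization keeps insertion order: different bytes, different hash.
assert json.dumps(a) != json.dumps(b)

# Canonical serialization: sorted keys, no incidental whitespace.
def canon(d):
    return json.dumps(d, sort_keys=True, separators=(",", ":")).encode()

assert hashlib.sha256(canon(a)).digest() == hashlib.sha256(canon(b)).digest()
```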
opentine's shipping core uses msgspec.json.encode, which produces a deterministic byte order for equivalent inputs on the same msgspec version and platform — stable enough that the same (kind, inputs, parent_id) triple reliably yields the same step ID.
The ARP specification goes further and defines a formal canonical form (lexicographic JSON key ordering, NFC Unicode normalization, normalized floats, trailing-zero stripping, fixed enum/boolean representations) so that step IDs are portable across runtimes and implementations. That canonical form is the direction the reference core is moving — today it is the spec, tomorrow it is also the enforcement layer.
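To make the Unicode rule concrete, here is an illustrative reading of the canonical form using only the stdlib (this is a sketch of the idea, not the ARP spec's exact algorithm; float and enum normalization are omitted for brevity):

```python
import json
import unicodedata

def canonicalize(value):
    """Recursively NFC-normalize all strings in a JSON-like value."""
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)
    if isinstance(value, dict):
        return {canonicalize(k): canonicalize(v) for k, v in value.items()}
    if isinstance(value, list):
        return [canonicalize(v) for v in value]
    return value

def canonical_bytes(value):
    return json.dumps(canonicalize(value), sort_keys=True,
                      separators=(",", ":"), ensure_ascii=False).encode("utf-8")

# "é" as one precomposed codepoint vs. "e" + combining accent:
# visually identical inputs hash to the same canonical bytes.
assert canonical_bytes({"name": "caf\u00e9"}) == canonical_bytes({"name": "cafe\u0301"})
```

Without NFC normalization, two runtimes that happen to produce different Unicode compositions of the same visible text would mint different step IDs, which is exactly the cross-implementation portability the spec is guarding against.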
The run tree (with DAG-shaped composition)
Each run's step graph today is a tree — every step has a single parent_id. Forks create new runs that share ancestor step IDs with the parent run but diverge from the fork point forward. When you look at the full picture across a family of forked runs, the composition is DAG-shaped: shared ancestor subtrees, divergent branches.
Runs are serialized to portable .tine files — a single archive containing every step's metadata (ID, kind, inputs, outputs) and the parent-chain edges. A .tine file is self-contained: ship it to a colleague, they can replay, inspect, or fork it locally.
Step IDs are globally consistent because the hash inputs are. Two .tine files that both contain a step whose ID is fe3a767307a4 describe the same step, and import tooling can deduplicate on that property — though the object-level cross-run deduplication store is part of the ARP-spec roadmap (the current core stores whole-run .tine files).
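A toy sketch of what dedup-on-import can look like once the object layer lands, with two runs' step tables keyed by ID (fe3a767307a4 is the ID from the text; the other IDs and record shapes are hypothetical):

```python
# Two runs that share one step. Keying storage by step ID means the
# shared step is stored once, because identical ID implies identical step.
run_a = {"fe3a767307a4": {"kind": "prompt"},
         "0b1c2d3e4f5a": {"kind": "tool-call"}}
run_b = {"fe3a767307a4": {"kind": "prompt"},
         "9f8e7d6c5b4a": {"kind": "tool-result"}}

merged = {**run_a, **run_b}   # dict union deduplicates on the ID key
assert len(merged) == 3       # four entries collapse to three unique steps
```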
What this gives you in the three consoles
opentine exposes three first-party consoles over the same content-addressed kernel:
| Console | Purpose | What content-addressing unlocks |
|---|---|---|
| opentine-tui | Terminal dashboard (Textual) | Live filtering on step IDs, keyboard-driven fork / resume / diff against any parent step |
| opentine-gui | Native desktop (Dear PyGui) | Interactive node-editor of the DAG, minimap navigation, fork from any visual node |
| opentine-web | Browser (Starlette + Mermaid.js) | Shareable DAG views via URL, REST API endpoints keyed on step ID, embedding in internal tooling |
Each console reads the same .tine runs directory (configurable via OPENTINE_RUNS_DIR / TINE_RUNS_DIR). Switching surfaces mid-workflow is a no-op — your state is in the on-disk run files, identified by the same content-derived step IDs in every console.
Comparison: how this differs from other agent frameworks
| Framework | Step identity model | Replay | Fork cost |
|---|---|---|---|
| opentine | Content-addressed hash of inputs | Bit-exact (deterministic sampling) | Pointer operation |
| LangGraph | State checkpointing with custom serializer | Replay from checkpoint, not bit-exact | Deep-copy of state |
| DSPy | Program compilation + prompt caching | Cache keyed on compiled program | Recompile to fork |
| Raw agent loops (no framework) | None | Impossible without re-running everything | Re-run from scratch |
The distinctive property is that content-addressing makes replay, caching, and forking the same mechanism — all three reduce to "look up this ID in the content-addressed store." Other frameworks treat them as separate concerns.
When content-addressing doesn't help
A realistic post about any primitive should cover its limits:
- Non-deterministic model sampling. If you run at high temperature with no seed, two executions of the same inputs produce different outputs. The step ID is stable (keyed on inputs), but the realized output in the .tine file is whatever the model returned that time. Resume/fork reuses the saved output; a fresh run from the same inputs would get a different one. Callers can rely on the saved tree for reproducibility or override it by deleting the stored output — explicit invalidation rather than automatic.
- External mutable state. If a step calls an external API whose response depends on wall-clock time or an external database's state, the outputs will differ between runs. opentine records this by including the external fetch response in the step's content-addressed inputs for downstream dependencies, but it cannot retroactively make the external world deterministic.
- Model provider drift. If Anthropic silently updates claude-opus-4-7 weights between runs, the content-derived step ID remains the same (under the current hash scope) but the output drifts. Under the ARP spec's richer hash inputs this changes — the model identifier and version become part of the ID, so drift shows up as a different step ID. Until that lands, "same ID" does not imply "same output" across time when a model identifier is coarse-grained.
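Until model identity is part of the hash, detecting provider drift is a comparison you make yourself: replay the step, then diff the fresh output against the saved one. A minimal sketch, with hypothetical output strings standing in for real model completions:

```python
import hashlib

def output_digest(text):
    """Short fingerprint of a realized output, for cheap comparison."""
    return hashlib.sha256(text.encode()).hexdigest()[:12]

# Saved output from the original run vs. a fresh completion obtained by
# replaying the same step ID today (values are illustrative).
saved_output = "The capital of France is Paris."
fresh_output = "Paris is the capital of France."

drifted = output_digest(saved_output) != output_digest(fresh_output)
assert drifted  # same inputs, same step ID, different realized output
```

A digest mismatch at an unchanged step ID is exactly the drift signal described above: the inputs you requested are provably identical, so the divergence sits on the provider's side.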
In practice most agent workflows are tolerant of all three caveats, and the 90% case where content-addressing works cleanly is where most of the value lives.
Where to go next
- opentine: Why the Fork Primitive Matters — the user-level story
- opentine vs LangGraph vs Temporal vs DSPy — runtime comparison
- The opentine product page for installation, CLI reference, and the three consoles
Or install the CLI and inspect a run yourself: tine run <your-agent.py> and then tine show <run_id> to see the content-addressed step tree that gets built.
More from opentine
opentine vs LangGraph vs Temporal vs DSPy: Choosing an Agent Runtime
LangGraph fits LangChain users who don't need forkable runs. Temporal fits durable-workflow problems where agents are incidental. DSPy fits prompt-compilation research. opentine fits when run forkability, bit-exact replay, and model-agnosticism are the primitive you're building on.
opentine: Why the Fork Primitive Matters
opentine treats every agent step as a content-addressed node in a DAG so you can fork, replay, and diff runs the way git lets you branch and rebase code.