Replay debugging
Replay debugging is the fastest way to answer “what actually happened” in NodalMerge systems. Because state is derived from history deterministically, you can reproduce and inspect outcomes from node artifacts instead of guessing from UI behavior.
When to use replay debugging
Use replay debugging for:
- Divergent client outcomes
- Suspected policy timeline mismatches
- Post-incident validation of canonical state
- Migration/cutover verification
Core workflow
- Capture relevant node pack artifact(s)
- Replay with expected context (policy timeline if needed)
- Compare canonical hash and resolved state
- Isolate mismatch source (data, policy, transport, or app assumptions)
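The workflow above can be sketched in a few lines. This is a minimal illustration, not NodalMerge's actual API or artifact format: the operation shape (`seq`, `key`, `value`), the `replay` and `canonical_hash` helpers, and the choice of SHA-256 over sorted JSON are all assumptions made for the example.

```python
import hashlib
import json


def replay(ops):
    """Fold operations into key/value state (hypothetical op format).

    Each op is a dict with "seq", "key", and "value". Sorting by "seq"
    makes the result independent of the order in which ops were captured.
    """
    state = {}
    for op in sorted(ops, key=lambda o: o["seq"]):
        state[op["key"]] = op["value"]
    return state


def canonical_hash(state):
    """Hash the state in a canonical form: sorted keys, compact separators."""
    encoded = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


# Two captures of the same history, received in different orders.
capture_a = [
    {"seq": 1, "key": "title", "value": "draft"},
    {"seq": 2, "key": "title", "value": "final"},
    {"seq": 3, "key": "owner", "value": "ana"},
]
capture_b = list(reversed(capture_a))

hash_a = canonical_hash(replay(capture_a))
hash_b = canonical_hash(replay(capture_b))
print(hash_a == hash_b)  # True: same history, same canonical hash
```

The point of the sketch is the comparison step: if two replays of the same history produce different hashes, the inputs differ somewhere, and that narrows the search to data or policy rather than transport or UI.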
Replay command
How to interpret results
Replay output gives:
- Resolved key/value state
- Canonical hash
Read the result as follows:
- Hash matches expected baseline -> likely transport/UI or interpretation issue
- Hash mismatch with same expected inputs -> data/policy artifact mismatch
- Replay failure (missing parent/signature/etc.) -> artifact integrity or capture completeness issue
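The three cases above map naturally onto a small triage helper. The sketch below is illustrative only: `ReplayOutcome`, `classify`, and the `NEXT_STEP` table are hypothetical names, and the inputs are assumed to be a replayed hash, a baseline hash, and an optional error message from a failed replay.

```python
from enum import Enum


class ReplayOutcome(Enum):
    MATCH = "match"
    MISMATCH = "mismatch"
    FAILED = "failed"


def classify(replayed_hash, baseline_hash, error=None):
    """Classify a replay result into one of the three cases."""
    if error is not None or replayed_hash is None:
        return ReplayOutcome.FAILED
    if replayed_hash == baseline_hash:
        return ReplayOutcome.MATCH
    return ReplayOutcome.MISMATCH


# Where to look next for each outcome, mirroring the list above.
NEXT_STEP = {
    ReplayOutcome.MATCH: "check transport, UI, or interpretation",
    ReplayOutcome.MISMATCH: "check data and policy artifacts",
    ReplayOutcome.FAILED: "check artifact integrity and capture completeness",
}

outcome = classify("abc123", "abc123")
print(outcome.value, "->", NEXT_STEP[outcome])
```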
Debug checklist
When replay and runtime disagree:
- Confirm artifact completeness (not partial pack)
- Confirm policy timeline context matches runtime window
- Confirm compared environments use equivalent inputs
- Confirm app’s “canonical” view isn’t actually speculative lane output
Common root causes
- Incomplete pack capture (missing parent history)
- Wrong policy timeline used for replay
- Comparing different checkpoint windows
- UI assuming latest-server snapshot instead of deterministic replay-derived truth
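Incomplete capture is often the cheapest cause to rule out first. As a rough sketch, assuming each node in a pack can be read as a dict with an `id` and a list of `parents` (a hypothetical shape, not the real pack format), missing parent history can be detected like this:

```python
def missing_parents(nodes):
    """Return parent ids that nodes reference but the pack does not contain."""
    present = {node["id"] for node in nodes}
    missing = set()
    for node in nodes:
        for parent in node.get("parents", []):
            if parent not in present:
                missing.add(parent)
    return sorted(missing)


# Node "c" references parent "b", which was not captured.
pack = [
    {"id": "a", "parents": []},
    {"id": "c", "parents": ["b"]},
]
print(missing_parents(pack))  # ['b']
```

An empty result does not prove the pack is complete, since a genuine root also has no parents, but a non-empty result is a strong sign the capture was partial.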
Practical evidence bundle
For serious incidents, retain:
- Source pack artifact
- Replay command and policy arguments used
- Output canonical hash
- Relevant runtime logs and metric windows
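One way to keep these items together is a small manifest stored next to the pack artifact. The following is a sketch under stated assumptions: the field names and the `build_manifest` helper are hypothetical, and the replay command is recorded as an opaque string exactly as it was run.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_manifest(pack_bytes, replay_command, policy_args, canonical_hash, log_refs):
    """Collect the evidence items into one JSON-serializable record."""
    return {
        # Checksum of the pack lets anyone confirm they replay the same bytes.
        "pack_sha256": hashlib.sha256(pack_bytes).hexdigest(),
        "replay_command": replay_command,
        "policy_args": list(policy_args),
        "canonical_hash": canonical_hash,
        "log_refs": list(log_refs),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


manifest = build_manifest(
    pack_bytes=b"example pack contents",
    replay_command="<replay command as run>",
    policy_args=["<policy timeline argument>"],
    canonical_hash="<hash printed by replay>",
    log_refs=["<log or metrics window reference>"],
)
print(json.dumps(manifest, indent=2))
```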
Automation suggestions
Automate replay checks in:
- High-risk migrations
- Promotion workflows in room-family topologies
- Restore/disaster-recovery drills