Runtime attribution
Runtime attribution explains where time is spent, not just how fast a benchmark finished. Use this page to avoid incorrect optimization decisions (for example, over-tuning core CRDT logic when overhead is actually transport or host-boundary related).
Why attribution matters
End-to-end latency in collaborative systems is a composite of multiple layers:
- Core merge/replay logic
- Serialization and envelope processing
- Host/runtime boundary crossings
- Persistence and I/O paths
- Transport delivery and reconnect behavior
Attribution layers
Use this conceptual breakdown:
- Core engine layer
  - Deterministic CRDT/replay operations
  - Signature checks and hash work
- Runtime boundary layer
  - FFI/bridge marshaling
  - Host/runtime adapter overhead
- Transport layer
  - WebSocket flow control
  - Reconnect, fan-out, and relay filtering behavior
- Persistence/lifecycle layer
  - Node/blob persistence latency
  - Hydration, GC, and room lifecycle effects
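
One way to make the breakdown above measurable is to accumulate wall-clock time per layer and report each layer's share of the total. The following is a minimal sketch, not part of NodalMerge: the `LayerTimer` class, its `measure` method, and the four layer keys are hypothetical names chosen for illustration.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical layer keys mirroring the conceptual breakdown above.
LAYERS = ("core", "boundary", "transport", "persistence")


class LayerTimer:
    """Accumulates wall-clock time per attribution layer."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def measure(self, layer):
        if layer not in LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record elapsed time even if the measured block raises.
            self.totals[layer] += time.perf_counter() - start

    def shares(self):
        """Return each measured layer's fraction of total measured time."""
        total = sum(self.totals.values())
        if total == 0:
            return {}
        return {layer: t / total for layer, t in self.totals.items()}
```

Wrapping each stage of a request in `with timer.measure("transport"): ...` and printing `timer.shares()` at the end of a run gives a rough per-layer split. Wall-clock shares of this kind include any waiting inside the measured block, so they indicate where to look first rather than providing a precise CPU profile.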
Reading benchmark outcomes correctly
When analyzing benchmark results:
- Separate steady-state throughput from cold-path setup costs.
- Compare like-for-like topology and auth posture.
- Record per-layer metrics alongside benchmark outputs so that a delta can be attributed to a specific layer.
- Alternate baseline and candidate across multiple runs to reduce drift bias.
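
As an illustration of separating cold-path setup from steady state, the sketch below splits a series of per-iteration latencies at a warm-up boundary and summarizes each side on its own. The function name and the idea of a fixed warm-up count are assumptions made for this example; a real harness may detect the boundary differently.

```python
from statistics import median


def split_cold_and_steady(samples_ms, warmup_iterations):
    """Summarize cold-path and steady-state latencies separately.

    samples_ms: per-iteration latencies in milliseconds, in run order.
    warmup_iterations: how many leading iterations count as cold path
    (an assumed, caller-chosen boundary).
    """
    if warmup_iterations < 0 or warmup_iterations >= len(samples_ms):
        raise ValueError("warmup_iterations must leave at least one steady sample")
    cold = samples_ms[:warmup_iterations]
    steady = samples_ms[warmup_iterations:]
    return {
        "cold_median_ms": median(cold) if cold else None,
        "steady_median_ms": median(steady),
    }
```

Reporting the two medians side by side keeps a one-time setup cost from being averaged into the steady-state figure, and vice versa.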
Known sensitivity axes
In NodalMerge workloads, overhead sensitivity often differs by operation type:
- Map/list operations often show modest overhead shifts across runtime profiles.
- Blob-heavy paths are usually more sensitive to transport/storage choices.
- Auth/policy features may add measurable but acceptable overhead depending on host profile.
Methodology baseline
Use repeatable benchmark methodology:
- Fixed command mix and payload sizes
- Explicit peer-count tiers
- Repeated runs per cell
- Alternating baseline/candidate sequences to expose drift
- Environment capture (host hardware/runtime/OS)
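
The methodology points above can be sketched as a small schedule builder plus an environment snapshot. This is an assumed shape, not a NodalMerge tool: `build_schedule`, `capture_environment`, and the example operation names are hypothetical. The schedule flips which variant runs first on each repeat, so slow drift (thermal throttling, background load) does not consistently favor one side.

```python
import os
import platform
from itertools import product


def build_schedule(operations, peer_tiers, repeats):
    """Return (operation, peers, variant, repeat) tuples in alternating order."""
    schedule = []
    for op, peers in product(operations, peer_tiers):
        for i in range(repeats):
            # Even repeats run baseline first, odd repeats run candidate first.
            if i % 2 == 0:
                order = ("baseline", "candidate")
            else:
                order = ("candidate", "baseline")
            for variant in order:
                schedule.append((op, peers, variant, i))
    return schedule


def capture_environment():
    """Snapshot basic host details to store next to the results."""
    return {
        "platform": platform.platform(),
        "machine": platform.machine(),
        "python": platform.python_version(),
        "cpu_count": os.cpu_count(),
    }
```

For example, `build_schedule(["map_set", "blob_put"], [2, 8, 32], 4)` yields 48 runs covering every cell for both variants. Storing `capture_environment()` with each result set makes it possible to spot environment changes when two runs disagree.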
Practical interpretation rubric
Use simple decision zones for candidate vs baseline deltas:
- Green: within tolerated overhead budget for target workload
- Yellow: investigate likely boundary/transport/storage causes
- Red: regressions large enough to block promotion until explained/fixed
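
The zones above can be expressed as a small classifier over the relative delta. The thresholds here (5% for green, 20% for red) are placeholder assumptions for illustration only; the actual budgets should come from whatever the team has agreed for the target workload.

```python
def classify_delta(baseline_ms, candidate_ms, green_budget=0.05, red_threshold=0.20):
    """Map a candidate-vs-baseline latency delta to a decision zone.

    green_budget and red_threshold are relative deltas (0.05 means 5%).
    Both defaults are illustrative, not recommended values.
    """
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    delta = (candidate_ms - baseline_ms) / baseline_ms
    if delta <= green_budget:
        return "green"
    if delta < red_threshold:
        return "yellow"
    return "red"
```

A candidate that is faster than baseline has a negative delta and falls in the green zone under this sketch.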
Attribution-to-action mapping
When a benchmark degrades, map symptom to likely layer:
- Merge/replay CPU increase -> core or signature path investigation
- Large host variance with stable core microbench -> boundary/runtime overhead
- Blob regressions with stable map/list -> transport/blob storage path
- Tail spikes with reconnect/queue signals -> backpressure and session lifecycle
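
The mapping above can be written down as a simple rule table so that triage is consistent between people and runs. The `Symptoms` fields and the `likely_layers` function are hypothetical names for this sketch; the boolean inputs would be filled in by a human or by whatever checks the benchmark harness already performs.

```python
from dataclasses import dataclass


@dataclass
class Symptoms:
    """Observed signals for one degraded benchmark (all fields are assumed inputs)."""
    core_cpu_up: bool = False
    host_variance_high: bool = False
    core_microbench_stable: bool = True
    blob_regressed: bool = False
    map_list_stable: bool = True
    tail_spikes: bool = False
    reconnect_or_queue_signals: bool = False


def likely_layers(s):
    """Return the layers worth investigating first, in rule order."""
    layers = []
    if s.core_cpu_up:
        layers.append("core or signature path")
    if s.host_variance_high and s.core_microbench_stable:
        layers.append("boundary/runtime overhead")
    if s.blob_regressed and s.map_list_stable:
        layers.append("transport/blob storage path")
    if s.tail_spikes and s.reconnect_or_queue_signals:
        layers.append("backpressure and session lifecycle")
    return layers
```

An empty result means none of the listed patterns matched, which is itself useful: the regression needs a broader look rather than a layer-specific one.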
Operator/CI integration guidance
For release confidence:
- Keep benchmark artifacts versioned per run.
- Record baseline and candidate with identical scenario matrix.
- Gate promotion on agreed delta budgets per operation family.
- Pair benchmark verdict with observability metrics from realistic workloads.
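
A promotion gate over per-family budgets might look like the following sketch. The function name, the dictionary shapes, and the family names in the usage note are assumptions; the point is only that each operation family is checked against its own budget, and that a family with no agreed budget fails the gate instead of passing silently.

```python
def gate_promotion(deltas, budgets):
    """Check relative deltas against per-family budgets.

    deltas:  operation family -> relative delta (0.15 means 15% slower)
    budgets: operation family -> maximum tolerated relative delta
    Returns (ok, failures), where failures maps family -> reason.
    """
    failures = {}
    for family, delta in deltas.items():
        budget = budgets.get(family)
        if budget is None:
            failures[family] = "no budget defined"
        elif delta > budget:
            failures[family] = f"delta {delta:.1%} exceeds budget {budget:.1%}"
    return (not failures, failures)
```

For instance, with deltas of `{"map": 0.02, "blob": 0.15}` and budgets of `{"map": 0.05, "blob": 0.10}`, the gate fails on the `blob` family only. A CI job could write the `failures` mapping into the versioned run artifact so the reason for a blocked promotion is kept with the data.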
Common mistakes
- Treating all overhead as CRDT-core overhead
- Ignoring environment drift between runs
- Using only average latency and ignoring tail behavior
- Comparing different topology/auth profiles as if equivalent
- Optimizing hot paths that are not current bottlenecks
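
To illustrate why average latency alone can mislead, the sketch below computes nearest-rank percentiles next to the mean. The nearest-rank method is one common percentile definition chosen here for simplicity; other tools may interpolate and report slightly different values.

```python
import math
from statistics import mean


def percentile(samples, p):
    """Nearest-rank percentile of samples for 0 < p <= 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least p% of samples at or below it.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def summarize(samples):
    """Report the mean together with median and tail percentiles."""
    return {
        "mean": mean(samples),
        "p50": percentile(samples, 50),
        "p99": percentile(samples, 99),
        "max": max(samples),
    }
```

With 98 samples at 10 ms plus one at 500 ms and one at 900 ms, the mean is 23.8 ms and the median is 10 ms, while p99 is 500 ms. The mean alone would hide that a small share of operations is far slower than the rest.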