Performance overview
This page gives you the benchmark framing needed to evaluate NodalMerge performance without overfitting to single-run numbers. Use it as the entry point before diving into attribution or operation-specific benchmark artifacts.
What benchmarks should answer
NodalMerge benchmarking should answer:
- Is current behavior within acceptable performance budgets?
- Did a change cause meaningful regression?
- Which layer likely caused the change?
- Is the system still safe to promote under the expected workload shape?
Benchmark dimensions that matter
At minimum, benchmark matrices should vary:
- Peer cardinality
- Operation mix (map/list/blob)
- Payload sizes
- Auth/policy posture
- Runtime host surface
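As a rough illustration of the dimensions above, a scenario matrix can be built as the Cartesian product of the values chosen for each dimension. This is a minimal sketch, not NodalMerge tooling: the `DIMENSIONS` table, its keys, and every value in it are hypothetical placeholders to replace with the tiers you actually care about.

```python
from itertools import product

# Hypothetical dimension values (assumptions for illustration only).
DIMENSIONS = {
    "peers": [2, 8, 32],
    "op_mix": ["map-heavy", "list-heavy", "blob-heavy"],
    "payload_bytes": [256, 64 * 1024],
    "auth": ["off", "on"],
    "host": ["node", "browser"],
}


def build_matrix(dimensions):
    """Return one dict per combination of dimension values."""
    names = list(dimensions)
    return [dict(zip(names, combo)) for combo in product(*dimensions.values())]


matrix = build_matrix(DIMENSIONS)
print(len(matrix))  # 3 * 3 * 2 * 2 * 2 = 72 scenarios
```

A full product grows quickly, so in practice you may prefer to prune combinations that never occur in your deployment rather than drop a whole dimension.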
Scenario classes
Maintain at least these scenario classes:
- Steady-state sync: typical collaborative command mix
- Auth/policy-enabled: capability and governance overhead posture
- Blob-sensitive: payload-heavy behavior
- Replay/validation: deterministic reconstruction cost
- Lifecycle-sensitive: persistence/GC/topology side effects under load
Baseline vs candidate methodology
For credible comparisons:
- Keep scenario matrix fixed.
- Run baseline and candidate in alternating sequence when possible.
- Capture environment metadata (hardware/runtime/OS).
- Compare per-dimension deltas, not only aggregate averages.
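The methodology above can be sketched in a few lines: alternate baseline and candidate runs so that environmental drift affects both sides, then compute a relative delta per scenario instead of a single aggregate. The function names, the scenario keys, and the sample latencies below are all made up for illustration; the median is used here as one reasonable summary, not as a prescribed statistic.

```python
from statistics import median


def interleave(rounds):
    """Alternate baseline and candidate runs: baseline, candidate, baseline, ..."""
    order = []
    for _ in range(rounds):
        order.extend(["baseline", "candidate"])
    return order


def per_dimension_deltas(baseline, candidate):
    """Relative change in median latency for each scenario key.

    Both arguments map a scenario key to a list of latency samples (ms).
    A positive delta means the candidate is slower.
    """
    deltas = {}
    for key in baseline:
        b = median(baseline[key])
        c = median(candidate[key])
        deltas[key] = (c - b) / b
    return deltas


# Hypothetical samples: map got faster, blob got slower.
baseline = {"map/8-peers": [10.0, 10.0, 10.0], "blob/8-peers": [40.0, 40.0, 40.0]}
candidate = {"map/8-peers": [9.0, 9.0, 9.0], "blob/8-peers": [50.0, 50.0, 50.0]}

run_order = interleave(2)
deltas = per_dimension_deltas(baseline, candidate)
for key, delta in deltas.items():
    print(f"{key}: {delta:+.1%}")
```

In this made-up data an aggregate average would blend a 10% improvement on map with a 25% regression on blob, which is why the per-dimension view matters.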
Reading results safely
When interpreting results:
- Look at map/list/blob separately
- Compare p95/p99 and not only mean
- Validate stability across peer-count tiers
- Cross-check with runtime attribution signals
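To show why tail percentiles matter alongside the mean, here is a small sketch that summarizes samples per operation type. It uses a hand-written nearest-rank percentile (one of several common definitions), and the sample data is invented so that two slow blob operations out of one hundred are easy to spot.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples_by_op):
    """Compute mean, p95, and p99 separately for each operation type."""
    return {
        op: {
            "mean": mean(samples),
            "p95": percentile(samples, 95),
            "p99": percentile(samples, 99),
        }
        for op, samples in samples_by_op.items()
    }


# Hypothetical latencies in ms: blob has two 500 ms outliers in 100 samples.
samples_by_op = {
    "map": [10.0] * 100,
    "blob": [10.0] * 98 + [500.0] * 2,
}

summary = summarize(samples_by_op)
for op, stats in summary.items():
    print(op, stats)
```

With this data the blob mean roughly doubles to about 19.8 ms and p95 stays at 10 ms, while p99 jumps to 500 ms, so a check on mean and p95 alone would understate the tail.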
Suggested decision bands
Use configurable decision bands for release gating:
- Green: deltas within agreed operation-specific budgets
- Yellow: investigate and justify before promotion
- Red: block promotion pending fix or explicit risk acceptance
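One possible way to encode these bands is a per-operation pair of thresholds on the relative delta. The sketch below is an assumption about how such a gate could look: the `Band` type, the `BUDGETS` table, and every threshold in it are hypothetical and should come from budgets your team has agreed on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    yellow_at: float  # relative regression at which investigation starts
    red_at: float     # relative regression at which promotion is blocked


# Hypothetical operation-specific budgets (assumptions, not recommended values).
BUDGETS = {
    "map": Band(yellow_at=0.05, red_at=0.15),
    "list": Band(yellow_at=0.05, red_at=0.15),
    "blob": Band(yellow_at=0.10, red_at=0.25),
}


def classify(op, delta, budgets=BUDGETS):
    """Map a relative delta (positive means slower) to a decision band."""
    band = budgets[op]
    if delta >= band.red_at:
        return "red"
    if delta >= band.yellow_at:
        return "yellow"
    return "green"


print(classify("map", 0.08))   # yellow: above the map budget
print(classify("blob", 0.08))  # green: within the looser blob budget
```

Keeping the thresholds in data rather than in code makes the bands configurable per release line without touching the gating logic.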
Benchmark artifact hygiene
Treat benchmark outputs as release artifacts:
- Timestamped result files
- Baseline/candidate pair tracking
- Scenario metadata embedded in results
- Clear provenance of tooling/version
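A result file that follows these points might look like the sketch below. The field names, the `pair_id` convention, and the filename pattern are all hypothetical, and the Python `platform` calls stand in for whatever environment probe your runtime host provides.

```python
import json
import platform
from datetime import datetime, timezone


def build_artifact(role, pair_id, scenario, samples_ms, tool_version, now=None):
    """Bundle samples with the metadata needed to interpret them later."""
    now = now or datetime.now(timezone.utc)
    return {
        "timestamp": now.isoformat(),
        "role": role,          # "baseline" or "candidate"
        "pair_id": pair_id,    # ties a baseline file to its candidate file
        "scenario": scenario,  # the dimension values this run used
        "environment": {
            "machine": platform.machine(),
            "os": platform.system(),
            "python": platform.python_version(),
        },
        "tooling": {"version": tool_version},
        "samples_ms": samples_ms,
    }


def artifact_filename(artifact):
    """Derive a sortable filename; assumes the timestamp is in UTC."""
    stamp = datetime.fromisoformat(artifact["timestamp"]).strftime("%Y%m%dT%H%M%SZ")
    return f"{stamp}_{artifact['pair_id']}_{artifact['role']}.json"


artifact = build_artifact(
    role="baseline",
    pair_id="pair-001",
    scenario={"peers": 8, "op_mix": "blob-heavy", "auth": "on"},
    samples_ms=[41.2, 39.8, 40.5],
    tool_version="0.0.0-example",
)
print(artifact_filename(artifact))
print(json.dumps(artifact, indent=2))
```

Embedding the scenario and environment in the file itself means a result can still be interpreted after it has been copied away from the directory or CI job that produced it.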
Common benchmark anti-patterns
- Comparing non-equivalent environments
- Using one benchmark flavor to represent all workloads
- Ignoring auth/policy-enabled scenarios
- Ignoring blob-heavy cases
- Treating microbench wins as guaranteed product-level wins
What to do after a regression
- Confirm reproducibility with a rerun.
- Identify affected dimension (map/list/blob, peers, host, auth mode).
- Check runtime attribution to isolate likely layer.
- Decide: optimize, accept with rationale, or revert.
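For the "identify affected dimension" step, one simple heuristic is to count, for each dimension value, what share of the scenarios using that value regressed. This is only a sketch under stated assumptions: the result shape, the threshold, and the sample deltas are hypothetical, and the output narrows where to look rather than proving a cause.

```python
from collections import defaultdict


def regressed_share_by_dimension(results, threshold):
    """Share of regressed scenarios for each (dimension, value) pair.

    results is a list of (scenario_dict, relative_delta) tuples; a scenario
    counts as regressed when its delta exceeds the threshold.
    """
    totals = defaultdict(int)
    regressed = defaultdict(int)
    for scenario, delta in results:
        for dim_value in scenario.items():
            totals[dim_value] += 1
            if delta > threshold:
                regressed[dim_value] += 1
    return {key: regressed[key] / totals[key] for key in totals}


# Hypothetical deltas: only blob scenarios regress, regardless of auth mode.
results = [
    ({"op": "map", "auth": "off"}, 0.01),
    ({"op": "map", "auth": "on"}, 0.02),
    ({"op": "blob", "auth": "off"}, 0.30),
    ({"op": "blob", "auth": "on"}, 0.28),
]

shares = regressed_share_by_dimension(results, threshold=0.10)
for (dim, value), share in sorted(shares.items()):
    print(f"{dim}={value}: {share:.0%} of scenarios regressed")
```

A value whose share is near 100% while its siblings sit near 0% (here `op=blob`) is the natural starting point for the runtime attribution check.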