Storage and blobs

NodalMerge separates transaction history storage from blob payload storage. That split is intentional:

Node history is small, frequent, and query-oriented
Blob bytes are large, less frequent, and lifecycle-oriented

This page explains the architecture-level model that keeps persistence durable without breaking deterministic replay.

Core storage model

NodalMerge persists two data classes:

Nodes: signed transaction DAG entries
Blobs: content-addressed binary payloads referenced by hash

Structured state is reconstructed by replaying nodes. Blobs are fetched and retained by hash reference, not embedded inline in every transaction.
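The two data classes can be pictured with a minimal sketch. The `Node` shape, the `replay` function, and the use of SHA-256 as the content hash are assumptions made for illustration, not NodalMerge's actual types; signatures are omitted.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


def blob_hash(data: bytes) -> str:
    """Content address of a payload (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class Node:
    """One transaction DAG entry (hypothetical shape, signature omitted)."""
    node_id: str
    parents: tuple[str, ...]
    key: str
    value: str  # a small structured value, or a blob hash reference


def replay(nodes: list[Node]) -> dict[str, str]:
    """Rebuild structured state by applying nodes in order."""
    state: dict[str, str] = {}
    for node in nodes:
        state[node.key] = node.value
    return state


# Blob bytes live in their own store, keyed by hash.
blobs: dict[str, bytes] = {}
avatar = b"large image bytes would go here"
ref = blob_hash(avatar)
blobs[ref] = avatar

# The transaction history carries only the reference, never the bytes.
history = [
    Node("n1", (), "title", "Roadmap"),
    Node("n2", ("n1",), "avatar", ref),
]
state = replay(history)
```

Replaying `history` yields the structured state, and the payload is resolved separately through `blobs[state["avatar"]]`.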

Why nodes and blobs are split

This split supports practical backend composition:

Node backends can use SQL/document stores optimized for many small writes
Blob backends can use filesystem or object storage optimized for larger objects

The server persistence surface is designed so these halves can be mixed without changing replication semantics.
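One way to picture a persistence surface with two independent halves is a pair of small interfaces. The names `NodeStore`, `BlobStore`, and `Persistence` below are hypothetical and are not taken from the NodalMerge API.

```python
from __future__ import annotations

from typing import Protocol


class NodeStore(Protocol):
    """Many small, ordered writes per room."""
    def append(self, room: str, node: bytes) -> None: ...
    def load(self, room: str) -> list[bytes]: ...


class BlobStore(Protocol):
    """Fewer, larger objects addressed by hash."""
    def put(self, blob_hash: str, data: bytes) -> None: ...
    def get(self, blob_hash: str) -> bytes | None: ...


class MemoryNodeStore:
    def __init__(self) -> None:
        self._rooms: dict[str, list[bytes]] = {}

    def append(self, room: str, node: bytes) -> None:
        self._rooms.setdefault(room, []).append(node)

    def load(self, room: str) -> list[bytes]:
        return list(self._rooms.get(room, []))


class MemoryBlobStore:
    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, blob_hash: str, data: bytes) -> None:
        self._blobs[blob_hash] = data

    def get(self, blob_hash: str) -> bytes | None:
        return self._blobs.get(blob_hash)


class Persistence:
    """Pairs any node backend with any blob backend."""
    def __init__(self, nodes: NodeStore, blobs: BlobStore) -> None:
        self.nodes = nodes
        self.blobs = blobs


persistence = Persistence(MemoryNodeStore(), MemoryBlobStore())
persistence.nodes.append("room-1", b"node-a")
persistence.blobs.put("abc123", b"payload")
```

Because callers depend only on the two interfaces, swapping either half (for example a SQL node store with an object-storage blob store) leaves the calling code unchanged.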

Durability modes

In-memory mode

Without a configured persistent store, server state is memory-only. Use this for:

Local development
Ephemeral demos
Short-lived test environments

In-memory mode is intentionally conservative about lifecycle automation such as idle eviction: with no persistent copy to rehydrate from, evicted state would be lost for good.

Durable mode

With store configuration enabled, server state survives restarts. A common built-in durable layout uses:

SQLite-backed node persistence
File-based blob persistence in content-addressed paths

Durability enables safe room hydration and lifecycle jobs like idle eviction and blob GC.
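The durable layout described above can be sketched with the Python standard library. The table schema, the two-character directory fan-out, and the function names are assumptions for illustration only; the built-in store's actual schema and paths may differ.

```python
from __future__ import annotations

import hashlib
import sqlite3
import tempfile
from pathlib import Path


def open_node_db(path: str) -> sqlite3.Connection:
    """Open (or create) a SQLite file holding per-room node history."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS nodes ("
        " room TEXT NOT NULL, seq INTEGER NOT NULL, payload BLOB NOT NULL,"
        " PRIMARY KEY (room, seq))"
    )
    return db


def append_node(db: sqlite3.Connection, room: str, payload: bytes) -> int:
    """Append one node and commit; returns its per-room sequence number."""
    with db:  # commits on success, rolls back on error
        seq = db.execute(
            "SELECT COALESCE(MAX(seq), 0) + 1 FROM nodes WHERE room = ?", (room,)
        ).fetchone()[0]
        db.execute(
            "INSERT INTO nodes (room, seq, payload) VALUES (?, ?, ?)",
            (room, seq, payload),
        )
    return seq


def blob_path(root: Path, digest: str) -> Path:
    """Content-addressed path: <root>/<first 2 hex chars>/<rest>."""
    return root / digest[:2] / digest[2:]


def put_blob(root: Path, data: bytes) -> str:
    """Write a blob under its hash; identical payloads are stored once."""
    digest = hashlib.sha256(data).hexdigest()
    target = blob_path(root, digest)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        tmp = target.with_suffix(".tmp")
        tmp.write_bytes(data)
        tmp.replace(target)  # rename into place so readers never see a partial file
    return digest


workdir = Path(tempfile.mkdtemp())
db = open_node_db(str(workdir / "nodes.db"))
append_node(db, "room-1", b"node-a")
append_node(db, "room-1", b"node-b")
digest = put_blob(workdir / "blobs", b"payload")
db.close()

# Simulated restart: reopen the same files and read everything back.
db = open_node_db(str(workdir / "nodes.db"))
restored = [
    row[0]
    for row in db.execute(
        "SELECT payload FROM nodes WHERE room = ? ORDER BY seq", ("room-1",)
    )
]
db.close()
```

After the simulated restart, `restored` holds the nodes in their original order and the blob is still readable at its content-addressed path.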

Hydration and runtime behavior

When a room is created in durable mode, its persisted state is loaded into in-memory runtime structures. Conceptually, hydration proceeds in this order:

Load persisted nodes for the room
Apply nodes into runtime graph state
Load persisted blobs into runtime blob store
Resume room processing from hydrated state

This preserves replayability while keeping hot room access fast after load.
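The four hydration steps can be sketched as a single function. The `Room` structure, the tuple layout of a persisted node, and the `hydrate` name are hypothetical stand-ins for whatever the server uses internally.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Assumed persisted node layout: (node_id, parent_ids, key, value)
PersistedNode = tuple[str, tuple[str, ...], str, str]


@dataclass
class Room:
    room_id: str
    graph: dict[str, tuple[str, ...]] = field(default_factory=dict)  # node -> parents
    state: dict[str, str] = field(default_factory=dict)
    blobs: dict[str, bytes] = field(default_factory=dict)
    ready: bool = False


def hydrate(
    room_id: str,
    persisted_nodes: list[PersistedNode],
    persisted_blobs: dict[str, bytes],
) -> Room:
    room = Room(room_id)
    # Steps 1 and 2: take the persisted nodes and apply each to the runtime graph.
    for node_id, parents, key, value in persisted_nodes:
        room.graph[node_id] = parents
        room.state[key] = value
    # Step 3: load persisted blobs into the runtime blob store.
    room.blobs.update(persisted_blobs)
    # Step 4: the room can resume processing from the hydrated state.
    room.ready = True
    return room


nodes: list[PersistedNode] = [
    ("n1", (), "title", "Draft"),
    ("n2", ("n1",), "title", "Final"),
]
room = hydrate("room-1", nodes, {"abc123": b"payload"})
```

Hydrating twice from the same persisted input produces the same runtime state, which is the replayability property the flow is meant to preserve.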

Blob architecture

Blobs are content-addressed and referenced from map keys via blob hashes. Key properties:

Hash identity enables deduplication: identical payloads resolve to the same hash
Binary payload transport stays separate from transaction DAG exchange
Blob retrieval can use direct URL flows where the backend supports them

This keeps DAG replication efficient while still supporting large media/file payloads.

Blob lifecycle and GC safety

Blob retention is governed by reachability, not peer presence. At a high level:

A blob is live if referenced by authoritative room history/state rules
Unreferenced blobs are first tombstoned, then deleted after a grace period
Blobs that are re-referenced during the grace period must be preserved

The two-phase model avoids accidental hard deletes caused by transient state churn.
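A minimal sketch of a two-phase sweep follows. The `sweep` function, its parameters, and the in-memory `tombstones` map are assumptions for illustration; a real implementation would persist tombstones and delete from the blob backend.

```python
from __future__ import annotations


def sweep(
    blobs: set[str],
    live: set[str],
    tombstones: dict[str, float],
    now: float,
    grace: float,
) -> set[str]:
    """Run one GC pass. Mutates `blobs` and `tombstones`; returns deleted hashes."""
    deleted: set[str] = set()
    for blob_hash in sorted(blobs):
        if blob_hash in live:
            # Re-referenced (or never unreferenced): clear any tombstone.
            tombstones.pop(blob_hash, None)
        elif blob_hash not in tombstones:
            # Phase 1: mark as unreferenced, but keep the bytes.
            tombstones[blob_hash] = now
        elif now - tombstones[blob_hash] >= grace:
            # Phase 2: grace period elapsed while still unreferenced.
            deleted.add(blob_hash)
    for blob_hash in deleted:
        blobs.discard(blob_hash)
        del tombstones[blob_hash]
    return deleted


blobs = {"a", "b", "c"}
tombstones: dict[str, float] = {}

first = sweep(blobs, {"a"}, tombstones, now=0.0, grace=60.0)        # b and c tombstoned
second = sweep(blobs, {"a", "b"}, tombstones, now=30.0, grace=60.0)  # b rescued
third = sweep(blobs, {"a", "b"}, tombstones, now=90.0, grace=60.0)   # c deleted
```

In this run, `b` is referenced again before its grace period ends and survives, while `c` stays unreferenced for the full window and is the only blob hard-deleted.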

Important correctness rule

Blob GC must be driven by deterministic reachability semantics from room data, not transport/session behavior. That ensures storage cleanup does not change replay outcomes or cause missing payloads during normal catch-up flows.
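The rule can be made concrete by computing the live set as a pure function of room data. The sketch below assumes, for illustration only, that a blob is live when the replayed state still references it; the actual reachability rules may also retain blobs referenced from history. The function deliberately takes no session or connection argument.

```python
from __future__ import annotations


def live_blobs(nodes: list[tuple[str, str | None]]) -> set[str]:
    """Blob hashes referenced by the state obtained from replaying `nodes`.

    Each node is an assumed (key, blob_ref) pair; a ref of None clears the key.
    """
    state: dict[str, str] = {}
    for key, blob_ref in nodes:
        if blob_ref is None:
            state.pop(key, None)
        else:
            state[key] = blob_ref
    return set(state.values())


nodes: list[tuple[str, str | None]] = [
    ("avatar", "h1"),
    ("avatar", "h2"),  # replaces h1
    ("doc", "h3"),
    ("doc", None),     # clears the reference to h3
]

live = live_blobs(nodes)
```

Because the result depends only on the node list, any server replaying the same history computes the same live set, whether zero or a hundred peers are connected.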

Operational implications

From an architecture perspective:

Enable durable storage before relying on long-lived room history
Treat node backups as replay-critical artifacts
Treat blob retention policy as part of product data governance
Keep grace windows explicit and environment-specific

Detailed tuning, flags, and runbook guidance belong in operator docs.

Common design mistakes

Treating in-memory mode as production durability
Coupling blob liveness to active WebSocket sessions
Deleting blobs without a tombstone phase and grace period
Designing app state that assumes inline blob payloads in map values
