How Substrate stores data: append-only logs, LSM indexes, interval trees, and content-addressed artifacts.
Storage is organized into three planes with distinct responsibilities. The truthlog is canonical; everything else is derived.
Truthlog: an immutable, append-only commit log. Every write goes here first. Commits are hash-chained and collected in a merkle tree for tamper evidence.
log/tenant=<t>/shard=<s>/segment=<n>.blk
log/.../segment=<n>.idx
log/.../merkle/
Index plane: LSM-based indexes and columnar snapshots for fast queries. Fully rebuildable from the truthlog. Never authoritative.
kv/wal/
kv/sst/level=N/*.sst
snapshots/sys=<bucket>/*.parquet
Vault: content-addressed blob storage. Deduped, encrypted, chunked. Holds documents, code, logs, builds, and SBOMs.
cas/chunk/<hash_prefix>/<digest>
cas/manifest/<root_hash>
cas/keys/<root_hash>
┌─────────────────────────────────────────────┐
│ magic │ version │ tenant_id │ shard_id │
├─────────────────────────────────────────────┤
│ commit_id │ sys_time │ entry_count │
├─────────────────────────────────────────────┤
│ header_crc │
├─────────────────────────────────────────────┤
│ entries... (length-prefixed) │
│ - version_insert │
│ - version_close │
│ - assertion │
│ - policy │
│ - redaction_event │
├─────────────────────────────────────────────┤
│ commit_hash │ prev_hash │ signature_ref │
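└─────────────────────────────────────────────┘
Commits are encoded as protobuf with deterministic serialization: map fields are sorted by key and default values are written explicitly, so the same commit always produces the same bytes and therefore a reproducible hash.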
Each commit includes prev_hash, the hash of its predecessor, creating an unbroken chain. Verifiable without full log traversal.
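As a rough illustration of the chaining described above, the sketch below hashes each commit's canonical bytes together with prev_hash and checks that every link points at its predecessor. Sorted-key JSON stands in for deterministic protobuf and SHA-256 is an assumed hash; the names `make_commit`, `verify_chain`, and `GENESIS` are hypothetical and not part of Substrate's format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder prev_hash for the first commit


def canonical_bytes(entries):
    # Stand-in for deterministic protobuf: sorted keys, no extra whitespace.
    return json.dumps(entries, sort_keys=True, separators=(",", ":")).encode()


def make_commit(entries, prev_hash):
    digest = hashlib.sha256(prev_hash.encode() + canonical_bytes(entries)).hexdigest()
    return {"entries": entries, "prev_hash": prev_hash, "commit_hash": digest}


def verify_chain(commits, start=GENESIS):
    # Recompute every hash and check each link against the previous commit.
    prev = start
    for c in commits:
        expected = make_commit(c["entries"], c["prev_hash"])["commit_hash"]
        if c["prev_hash"] != prev or c["commit_hash"] != expected:
            return False
        prev = c["commit_hash"]
    return True


c1 = make_commit([{"type": "version_insert", "lineage": "a"}], GENESIS)
c2 = make_commit([{"type": "version_close", "lineage": "a"}], c1["commit_hash"])
print(verify_chain([c1, c2]))
```

Passing a known commit hash as `start` lets the same check run over any contiguous run of commits rather than the whole log.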
Commits are leaves in a merkle tree. Signed roots published periodically. Inclusion and consistency proofs available.
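To make the inclusion proof idea concrete, here is a minimal sketch of a binary merkle tree over commit bytes. The leaf/node prefix bytes, the rule that an unpaired node is promoted unchanged, and all function names are assumptions chosen for illustration; the source does not specify Substrate's tree construction. It assumes at least one leaf.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def merkle_levels(leaves):
    # Build all levels bottom-up; an unpaired node is promoted unchanged.
    level = [h(b"\x00" + leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(h(b"\x01" + level[i] + level[i + 1]))
            else:
                nxt.append(level[i])
        level = nxt
        levels.append(level)
    return levels


def inclusion_proof(levels, index):
    # Collect sibling hashes from leaf to root, tagged with the sibling's side.
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append(("L" if sibling < index else "R", level[sibling]))
        index //= 2
    return proof


def verify_inclusion(leaf, proof, root):
    # Fold the proof into the leaf hash and compare with the published root.
    node = h(b"\x00" + leaf)
    for side, sibling in proof:
        node = h(b"\x01" + sibling + node) if side == "L" else h(b"\x01" + node + sibling)
    return node == root


commits = [b"commit-0", b"commit-1", b"commit-2"]
levels = merkle_levels(commits)
root = levels[-1][0]
proof = inclusion_proof(levels, 2)
print(verify_inclusion(b"commit-2", proof, root))
```

A verifier only needs the leaf, the proof, and a signed root, which is what makes periodic root publication useful.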
Check authn/authz and schema. Reject invalid writes before logging.
Encode entries in deterministic protobuf. Compute integrity hashes.
Write to local segment file. Fsync for durability.
Replicate to quorum (5 replicas default). Commit on quorum ack.
Add commit to merkle log. Generate inclusion proof.
Stream to index plane. Update LSM and snapshots.
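Two of the steps above, the durable local write and the quorum check, can be sketched briefly. The 4-byte big-endian length prefix and the reading of "quorum" as a strict majority are assumptions, and `append_durable` and `quorum_reached` are hypothetical names, not Substrate APIs.

```python
import os
import struct
import tempfile


def append_durable(path, payload):
    # Append one length-prefixed record, then fsync so it survives a crash.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, struct.pack(">I", len(payload)) + payload)
        os.fsync(fd)
    finally:
        os.close(fd)


def quorum_reached(acks, replicas=5):
    # Assumption: quorum means a strict majority of the replica set.
    return acks >= replicas // 2 + 1


path = os.path.join(tempfile.mkdtemp(), "segment.blk")
append_durable(path, b"commit-bytes")
print(quorum_reached(3), quorum_reached(2))
```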
Point-in-time lookups per lineage
key: (tenant, lineage_id, valid_from)
val: { valid_to, versions_by_sys }
versions_by_sys = (
sys_from, sys_to, value_ptr, assertion_id
)
Supports AS-OF queries: find the floor entry by valid_from, then binary search its versions for the matching system-time interval.
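The two-step lookup can be sketched with the standard `bisect` module. The in-memory `rows` layout, half-open intervals, and the use of infinity for open-ended intervals are assumptions made for this example.

```python
import bisect

INF = float("inf")  # assumed marker for an open-ended interval

# Hypothetical rows for one (tenant, lineage_id), sorted by valid_from.
# Each row: (valid_from, valid_to, versions_by_sys), versions sorted by sys_from.
# Each version: (sys_from, sys_to, value_ptr, assertion_id).
rows = [
    (0, 10, [(100, 200, "ptr-a1", "as-1"), (200, INF, "ptr-a2", "as-2")]),
    (10, INF, [(150, INF, "ptr-b1", "as-3")]),
]


def as_of(rows, valid_t, sys_t):
    # Step 1: floor(valid_from), the last row starting at or before valid_t.
    i = bisect.bisect_right([r[0] for r in rows], valid_t) - 1
    if i < 0:
        return None
    _valid_from, valid_to, versions = rows[i]
    if valid_t >= valid_to:
        return None
    # Step 2: binary search for the version whose [sys_from, sys_to) holds sys_t.
    j = bisect.bisect_right([v[0] for v in versions], sys_t) - 1
    if j < 0 or sys_t >= versions[j][1]:
        return None
    return versions[j][2]


print(as_of(rows, valid_t=5, sys_t=150))
print(as_of(rows, valid_t=5, sys_t=250))
```

The first call returns the value pointer as the system knew it at time 150; the second returns the later correction for the same valid time.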
What changed when
key: (tenant, sys_bucket, sys_from, commit_id)
val: { changed_lineage_ids, changed_edge_ids }
Enables diff queries. "What changed between s1 and s2?" scans buckets and joins to affected lineages.
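A diff query over this index might look like the sketch below. The bucket width, the dictionary standing in for an LSM range scan, and the (s1, s2] boundary convention are all assumptions.

```python
BUCKET = 100  # assumed bucket width in system-time units

# Hypothetical change-index rows: (sys_bucket, sys_from, commit_id) -> changed ids.
change_index = {
    (0, 40, "c1"): {"lineages": {"L1"}, "edges": set()},
    (1, 120, "c2"): {"lineages": {"L2"}, "edges": {"E7"}},
    (2, 230, "c3"): {"lineages": {"L1", "L3"}, "edges": set()},
}


def changed_between(index, s1, s2):
    # Visit only buckets overlapping (s1, s2], then filter rows by sys_from.
    # A real store would range-scan by key prefix instead of iterating a dict.
    lineages, edges = set(), set()
    buckets = range(s1 // BUCKET, s2 // BUCKET + 1)
    for (bucket, sys_from, _commit), val in sorted(index.items()):
        if bucket in buckets and s1 < sys_from <= s2:
            lineages |= val["lineages"]
            edges |= val["edges"]
    return lineages, edges


print(changed_between(change_index, 50, 200))
```

The returned ids are what a caller would then join against the lineage index to fetch the affected versions.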
Graph adjacency with temporal intervals
key: (tenant, src, type, valid_interval_key, edge_id)
val: { dst, valid_to, props_ptr }
Interval tree structure for valid-time stabbing queries. Fast 1-hop and bounded n-hop traversal.
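The sketch below shows what a stabbing query and a bounded traversal compute. It deliberately uses a simpler stand-in than an interval tree: edges sorted by valid_from, a bisect cut-off, and a filter on valid_to. The adjacency layout and function names are hypothetical.

```python
import bisect
from collections import deque

INF = float("inf")  # assumed marker for an open-ended interval

# Hypothetical adjacency: (src, type) -> edges sorted by valid_from.
# Each edge: (valid_from, valid_to, dst, edge_id).
adjacency = {
    ("a", "calls"): [(0, 5, "b", "e1"), (3, INF, "c", "e2")],
    ("c", "calls"): [(4, INF, "d", "e3")],
}


def neighbors_at(adj, src, etype, t):
    edges = adj.get((src, etype), [])
    # Edges starting after t cannot contain t, so cut them off with bisect.
    hi = bisect.bisect_right([e[0] for e in edges], t)
    return [e[2] for e in edges[:hi] if t < e[1]]


def reachable_at(adj, src, etype, t, max_hops):
    # Bounded breadth-first traversal over edges valid at time t.
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for dst in neighbors_at(adj, node, etype, t):
            if dst not in seen:
                seen.add(dst)
                frontier.append((dst, depth + 1))
    return seen - {src}


print(neighbors_at(adjacency, "a", "calls", 4))
print(sorted(reachable_at(adjacency, "a", "calls", 4, max_hops=2)))
```

A true interval tree answers the same query without scanning every edge that started before t, which matters for long-lived, high-degree nodes.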
Point-in-time frozen state
Immutable columnar files (Parquet) at system-time bucket boundaries. For system-time < now, use snapshot + delta overlay.
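The snapshot plus delta overlay can be illustrated with plain dictionaries. The delta tuple shape and the use of None as a deletion marker are assumptions; reading actual Parquet files is out of scope for this sketch.

```python
def state_at(snapshot, snapshot_sys, deltas, sys_t):
    # snapshot: frozen {key: value} taken at system time snapshot_sys.
    # deltas: (sys_time, key, value) sorted by sys_time; None marks a deletion.
    state = dict(snapshot)
    for sys_time, key, value in deltas:
        if sys_time <= snapshot_sys:
            continue  # already captured by the snapshot
        if sys_time > sys_t:
            break  # later than the requested system time
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


snapshot = {"a": 1, "b": 2}
deltas = [(90, "a", 0), (110, "a", 5), (120, "b", None), (130, "c", 9)]
print(state_at(snapshot, 100, deltas, 125))
```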
Artifacts in the vault use content-defined chunking and merkle DAGs for deduplication and integrity verification.
Content-defined chunking (CDC) splits files at content-driven boundaries. Similar files share chunks for dedup.
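A minimal sketch of content-defined chunking follows, using a gear-style rolling hash. The hash construction, the size bounds, and the cut mask are assumptions picked to keep the example small; the source does not say which CDC algorithm Substrate uses.

```python
import hashlib

# Gear table: 256 pseudo-random 32-bit values, derived deterministically.
GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:4], "big") for i in range(256)]
MIN_SIZE, MAX_SIZE = 64, 1024  # assumed chunk size bounds in bytes
MASK = (1 << 7) - 1  # cut when the low 7 hash bits are zero (roughly 1 in 128 positions)


def chunk(data):
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # The low bits of h depend only on the most recent bytes, so
        # boundaries follow content rather than absolute offsets.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


data = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(200))
original = chunk(data)
shifted = chunk(b"inserted-prefix" + data)
shared = set(original) & set(shifted)
print(len(original), len(shifted), len(shared))
```

Because boundaries depend on local content, an insertion near the start typically disturbs only the first chunk or two, and the remaining chunks can be stored once.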
Multihash format: <algo><length><digest>.
Default SHA-256 (0x12) with 32-byte digest.
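For the default case, the encoding is small enough to show directly. This sketch handles only SHA-256, where both the algorithm code and the length fit in a single byte; the helper names are hypothetical.

```python
import hashlib

SHA2_256 = 0x12  # multihash code for SHA-256


def multihash_sha256(data):
    digest = hashlib.sha256(data).digest()
    # <algo><length><digest>: 0x12, then 0x20 (32), then the 32 digest bytes.
    return bytes([SHA2_256, len(digest)]) + digest


def parse_multihash(mh):
    algo, length, digest = mh[0], mh[1], mh[2:]
    if len(digest) != length:
        raise ValueError("digest length mismatch")
    return algo, digest


mh = multihash_sha256(b"hello")
print(mh[:2].hex(), len(mh))
```

The self-describing prefix is what lets the store change hash algorithms later without ambiguity about how an existing digest was produced.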
Large artifacts are trees of chunks. Root hash verifies entire artifact. Individual chunks independently verifiable.
Each chunk is encrypted with a data encryption key (DEK) using AES-256-GCM. The DEK is wrapped by the tenant's key encryption key (KEK), which is held in an HSM. Key metadata is stored separately.
SST blocks: Zstd (balanced speed/ratio)
IDs: dictionary + delta encoding
Timestamps: delta-of-delta within blocks
Dense neighbors: Roaring bitmaps for high-degree nodes
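To show why delta-of-delta suits timestamps, the sketch below encodes a block as the first value followed by differences between successive deltas. Regularly spaced timestamps collapse to values near zero, which compress well. The function names are hypothetical and the bit-packing stage a real encoder would add is omitted.

```python
def encode_dod(timestamps):
    # First value raw, then each delta minus the previous delta.
    if not timestamps:
        return []
    out = [timestamps[0]]
    prev, prev_delta = timestamps[0], 0
    for t in timestamps[1:]:
        delta = t - prev
        out.append(delta - prev_delta)
        prev, prev_delta = t, delta
    return out


def decode_dod(encoded):
    # Rebuild deltas by running sum, then timestamps by a second running sum.
    if not encoded:
        return []
    out = [encoded[0]]
    prev_delta = 0
    for dod in encoded[1:]:
        prev_delta += dod
        out.append(out[-1] + prev_delta)
    return out


block = [1000, 1010, 1020, 1031, 1041]
encoded = encode_dod(block)
print(encoded)  # [1000, 10, 0, 1, -1]
print(decode_dod(encoded) == block)
```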