Storage Engine

How Substrate stores data: append-only logs, LSM indexes, interval trees, and content-addressed artifacts.

Physical architecture

Storage is organized into three planes with distinct responsibilities. The truthlog is canonical; everything else is derived.

1

Truthlog Plane

Canonical

Immutable, append-only commit log. Every write goes here first. Hash-chained commits with merkle tree for tamper evidence.

log/tenant=<t>/shard=<s>/segment=<n>.blk
log/.../segment=<n>.idx
log/.../merkle/

2

Index Plane

Derived

LSM-based indexes and columnar snapshots for fast queries. Fully rebuildable from truthlog. Never authoritative.

kv/wal/
kv/sst/level=N/*.sst
snapshots/sys=<bucket>/*.parquet

3

Vault Plane

Artifacts

Content-addressed blob storage. Deduped, encrypted, chunked. Documents, code, logs, builds, SBOMs.

cas/chunk/<hash_prefix>/<digest>
cas/manifest/<root_hash>
cas/keys/<root_hash>

Truthlog format

Segment layout

┌─────────────────────────────────────────────┐
│ magic │ version │ tenant_id │ shard_id     │
├─────────────────────────────────────────────┤
│ commit_id │ sys_time │ entry_count          │
├─────────────────────────────────────────────┤
│ header_crc                                  │
├─────────────────────────────────────────────┤
│ entries... (length-prefixed)                │
│   - version_insert                          │
│   - version_close                           │
│   - assertion                               │
│   - policy                                  │
│   - redaction_event                         │
├─────────────────────────────────────────────┤
│ commit_hash │ prev_hash │ signature_ref     │
└─────────────────────────────────────────────┘

Canonical encoding

Protobuf with deterministic serialization. Map fields sorted by key. Default values explicit. Enables reproducible hashes.

Hash chain

Each commit includes prev_hash, creating an unbroken chain. Verifiable without full log traversal.
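To make the chaining concrete, here is a minimal sketch. It assumes SHA-256 and assumes that commit_hash covers prev_hash followed by the canonical payload bytes; the exact fields hashed, the all-zero genesis value, and the function names are illustrative assumptions, not the documented format.

```python
import hashlib

GENESIS = b"\x00" * 32  # assumed prev_hash of the very first commit


def commit_hash(prev_hash: bytes, payload: bytes) -> bytes:
    # Assumption: the hash covers prev_hash followed by the canonical payload.
    return hashlib.sha256(prev_hash + payload).digest()


def build_chain(payloads):
    """Link each payload to its predecessor via prev_hash."""
    chain, prev = [], GENESIS
    for payload in payloads:
        h = commit_hash(prev, payload)
        chain.append({"payload": payload, "prev_hash": prev, "commit_hash": h})
        prev = h
    return chain


def verify_chain(chain) -> bool:
    """Recompute every link; any edited payload or reordered commit fails."""
    prev = GENESIS
    for commit in chain:
        if commit["prev_hash"] != prev:
            return False
        if commit_hash(prev, commit["payload"]) != commit["commit_hash"]:
            return False
        prev = commit["commit_hash"]
    return True
```

Changing any payload after the fact changes its commit_hash, which no longer matches the prev_hash stored in the next commit.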

Merkle log

Commits are leaves in a merkle tree. Signed roots published periodically. Inclusion and consistency proofs available.
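The sketch below illustrates an inclusion proof over a list of commits. It is a simplified construction, assuming SHA-256, 0x00/0x01 domain-separation prefixes, and an unpaired node carried up unchanged; the real tree shape, signing of roots, and consistency proofs are not shown, and all function names are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(data: bytes) -> bytes:
    return _h(b"\x00" + data)  # prefix keeps leaves distinct from inner nodes


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def _next_level(level):
    out = [node_hash(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
    if len(level) % 2:
        out.append(level[-1])  # unpaired node is carried up unchanged
    return out


def merkle_root(leaves):
    """Root over a non-empty list of leaf payloads."""
    level = [leaf_hash(x) for x in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def inclusion_proof(leaves, index):
    """Sibling hashes from leaf to root, each tagged with the sibling's side."""
    level = [leaf_hash(x) for x in leaves]
    proof = []
    while len(level) > 1:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append(("L" if sibling < index else "R", level[sibling]))
        level = _next_level(level)
        index //= 2
    return proof


def verify_inclusion(leaf, proof, root) -> bool:
    acc = leaf_hash(leaf)
    for side, sibling in proof:
        acc = node_hash(sibling, acc) if side == "L" else node_hash(acc, sibling)
    return acc == root
```

A verifier needs only the leaf, the proof, and a signed root, so proof size grows with the logarithm of the number of commits.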

Write path

1

Validate

Check authn/authz and schema. Reject invalid writes before logging.

2

Canonicalize

Encode entries in deterministic protobuf. Compute integrity hashes.

3

Append to segment

Write to local segment file. Fsync for durability.

4

Raft replication

Replicate to 5 replicas by default. Commit once a quorum (a majority, 3 of 5) acknowledges.

5

Update merkle

Add commit to merkle log. Generate inclusion proof.

6

Async indexing

Stream to index plane. Update LSM and snapshots.
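The commit rule in step 4 can be sketched as follows. Raft commits an entry once a majority of replicas has acknowledged it; the function names and the representation of acks as a set of replica IDs are assumptions for illustration.

```python
def quorum_size(replicas: int) -> int:
    """Smallest majority of a replica set."""
    return replicas // 2 + 1


def is_committed(acks: set, replicas: int = 5) -> bool:
    # acks: IDs of replicas (leader included) that have durably appended
    # the entry. Hypothetical representation, not a documented API.
    return len(acks) >= quorum_size(replicas)
```

With the default of 5 replicas, a write commits after 3 acknowledgements and tolerates 2 unavailable replicas.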

Index structures

BT_PRIMARY

Point-in-time lookups per lineage

key: (tenant, lineage_id, valid_from)
val: { valid_to, versions_by_sys }

versions_by_sys = (
  sys_from, sys_to, value_ptr, assertion_id
)

Supports AS-OF queries. Find the entry with the greatest valid_from at or before the query's valid time (the floor), then binary search versions_by_sys for the interval containing the query's system time.
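A minimal in-memory sketch of that lookup for a single lineage, assuming half-open intervals [from, to), entries sorted by valid_from, and versions sorted by sys_from. The dict layout and the as_of name are hypothetical stand-ins for the real index rows.

```python
import bisect

INF = float("inf")  # stands in for an open-ended interval


def as_of(entries, valid_time, sys_time):
    """Return the value_ptr visible at (valid_time, sys_time), or None."""
    # Step 1: floor lookup on valid_from.
    # (Key lists are rebuilt per call for brevity; a real index is pre-sorted.)
    valid_keys = [e["valid_from"] for e in entries]
    i = bisect.bisect_right(valid_keys, valid_time) - 1
    if i < 0 or valid_time >= entries[i]["valid_to"]:
        return None
    # Step 2: binary search the system-time versions of that entry.
    versions = entries[i]["versions_by_sys"]
    sys_keys = [v[0] for v in versions]
    j = bisect.bisect_right(sys_keys, sys_time) - 1
    if j < 0:
        return None
    sys_from, sys_to, value_ptr, assertion_id = versions[j]
    return value_ptr if sys_time < sys_to else None


entries = [
    {"valid_from": 0, "valid_to": 10,
     "versions_by_sys": [(100, 200, "p1", "a1"), (200, INF, "p2", "a2")]},
    {"valid_from": 10, "valid_to": INF,
     "versions_by_sys": [(150, INF, "p3", "a3")]},
]
print(as_of(entries, valid_time=5, sys_time=150))  # p1
print(as_of(entries, valid_time=5, sys_time=250))  # p2
```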

SYS_DELTA

What changed when

key: (tenant, sys_bucket, sys_from, commit_id)
val: { changed_lineage_ids, changed_edge_ids }

Enables diff queries. "What changed between s1 and s2?" scans buckets and joins to affected lineages.
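A diff query over this index might look like the sketch below. It assumes sys_bucket is sys_from divided by a fixed bucket width and that the range is (s1, s2]; both are assumptions, as is the dict standing in for the sorted key space.

```python
import bisect


def changed_between(sys_delta, tenant, s1, s2, bucket_width):
    """Collect lineage and edge IDs changed in the system-time range (s1, s2]."""
    keys = sorted(sys_delta)  # an LSM range scan would already be key-ordered
    # Seek to the first key of the first relevant bucket for this tenant.
    start = bisect.bisect_left(keys, (tenant, s1 // bucket_width))
    last_bucket = s2 // bucket_width
    lineages, edges = set(), set()
    for key in keys[start:]:
        key_tenant, bucket, sys_from, _commit_id = key
        if key_tenant != tenant or bucket > last_bucket:
            break  # scanned past the requested range
        if s1 < sys_from <= s2:
            lineages.update(sys_delta[key]["changed_lineage_ids"])
            edges.update(sys_delta[key]["changed_edge_ids"])
    return lineages, edges
```

The returned IDs are then joined against the primary index to fetch the before and after versions of each affected lineage.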

OUT_ADJ / IN_ADJ

Graph adjacency with temporal intervals

key: (tenant, src, type, valid_interval_key, edge_id)
val: { dst, valid_to, props_ptr }

Interval tree structure for valid-time stabbing queries. Fast 1-hop and bounded n-hop traversal.
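To show what a valid-time stabbing query returns, the sketch below answers "which edges from src are valid at time t?". It deliberately swaps the interval tree for a simpler sorted scan: binary search on valid_from, then a filter on valid_to. The row layout and the neighbors_at name are assumptions, and intervals are taken as half-open.

```python
import bisect


def neighbors_at(out_adj, tenant, src, edge_type, t):
    """1-hop destinations whose edge interval [valid_from, valid_to) contains t."""
    # rows: (valid_from, valid_to, edge_id, dst), sorted by valid_from
    rows = out_adj.get((tenant, src, edge_type), [])
    starts = [row[0] for row in rows]
    upper = bisect.bisect_right(starts, t)  # rows that start at or before t
    return [dst for (_vf, valid_to, _eid, dst) in rows[:upper] if t < valid_to]


out_adj = {
    ("t1", "a", "calls"): [
        (0, 10, "e1", "b"),
        (5, 20, "e2", "c"),
        (15, 30, "e3", "d"),
    ],
}
print(neighbors_at(out_adj, "t1", "a", "calls", 7))   # ['b', 'c']
print(neighbors_at(out_adj, "t1", "a", "calls", 12))  # ['c']
```

An interval tree gives the same answer while avoiding the filter over every earlier-starting edge, which matters for nodes with long histories.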

Snapshots

Point-in-time frozen state

Immutable columnar files (Parquet) written at system-time bucket boundaries. For queries at a system time earlier than now, read a snapshot and overlay the deltas up to the requested time.
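The overlay can be sketched with plain dicts standing in for Parquet files. The sketch assumes the latest snapshot at or before the target time is used as the base, that deltas are sorted by system time, and that a value of None marks a closed version; all of these are illustrative assumptions.

```python
def state_at(snapshots, deltas, target):
    """Reconstruct {lineage_id: value} as of system time `target`."""
    # snapshots: {boundary_sys_time: {lineage_id: value}}
    # deltas: list of (sys_time, lineage_id, value), sorted by sys_time
    boundaries = [b for b in snapshots if b <= target]
    base = max(boundaries) if boundaries else None
    state = dict(snapshots[base]) if base is not None else {}
    for sys_time, lineage_id, value in deltas:
        if base is not None and sys_time <= base:
            continue  # already reflected in the snapshot
        if sys_time > target:
            break  # later than the requested time
        if value is None:
            state.pop(lineage_id, None)  # version closed
        else:
            state[lineage_id] = value
    return state
```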

Content addressing

Artifacts in the vault use content-defined chunking and merkle DAGs for deduplication and integrity verification.

Chunking

Content-defined chunking (CDC) splits files at content-driven boundaries. Similar files share chunks for dedup.
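The sketch below shows one common way to do content-defined chunking: a gear-style rolling hash that cuts wherever the top bits of the hash are zero. The gear table, the 11-bit cut condition, and the size limits are illustrative assumptions; the source does not say which CDC algorithm or parameters Substrate uses.

```python
import hashlib

# Fixed pseudo-random 64-bit value per byte (derived from SHA-256 so the
# table is reproducible). Hypothetical table, chosen only for this sketch.
GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:8], "big")
        for i in range(256)]
MASK64 = (1 << 64) - 1


def cdc_chunks(data: bytes, bits: int = 11, min_size: int = 256,
               max_size: int = 8192):
    """Split data where the rolling hash's top `bits` bits are all zero."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Each step shifts older bytes out, so h depends on the last 64 bytes.
        h = ((h << 1) + GEAR[byte]) & MASK64
        size = i - start + 1
        at_boundary = size >= min_size and (h >> (64 - bits)) == 0
        if at_boundary or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks
```

Because boundaries depend on nearby content rather than byte offsets, inserting a few bytes near the start of a file shifts only the chunks around the edit; later chunks keep the same bytes and therefore the same digests.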

Hashing

Multihash format: <algo><length><digest>. Default SHA-256 (0x12) with 32-byte digest.
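For the default algorithm, the encoding works out as in this short sketch. In multihash the algorithm code and length are unsigned varints; both 0x12 and 32 fit in a single byte, so the prefix is the two bytes 0x12 0x20. The function name is hypothetical.

```python
import hashlib

SHA2_256 = 0x12  # multihash code for SHA-256


def multihash_sha256(data: bytes) -> bytes:
    """Encode <algo><length><digest> for a SHA-256 digest."""
    digest = hashlib.sha256(data).digest()
    return bytes([SHA2_256, len(digest)]) + digest


mh = multihash_sha256(b"hello")
print(len(mh))        # 34: 2 prefix bytes + 32 digest bytes
print(mh[:2].hex())   # 1220
```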

Merkle DAG

Large artifacts are trees of chunks. Root hash verifies entire artifact. Individual chunks independently verifiable.

Encryption

Each chunk is encrypted with a data encryption key (DEK) using AES-256-GCM. The DEK is wrapped by the tenant's key encryption key (KEK), which is held in an HSM. Key metadata is stored separately.

Compression strategy

SST blocks

Zstd (balanced speed/ratio)

IDs

Dictionary + delta encoding

Timestamps

Delta-of-delta within blocks

Dense neighbors

Roaring bitmaps for high-degree nodes
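As an illustration of the timestamp scheme, the sketch below encodes a block as its first timestamp followed by delta-of-delta values. Regularly spaced timestamps collapse to runs of zeros, which then compress well. The integer-list output is an assumption for readability; a real encoder would bit-pack the values.

```python
def dod_encode(timestamps):
    """First value, then the change in delta between consecutive timestamps."""
    if not timestamps:
        return []
    out = [timestamps[0]]
    prev, prev_delta = timestamps[0], 0
    for t in timestamps[1:]:
        delta = t - prev
        out.append(delta - prev_delta)
        prev, prev_delta = t, delta
    return out


def dod_decode(encoded):
    """Inverse of dod_encode."""
    if not encoded:
        return []
    timestamps = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod
        timestamps.append(timestamps[-1] + delta)
    return timestamps


ts = [1000, 1010, 1020, 1030, 1041]
print(dod_encode(ts))  # [1000, 10, 0, 0, 1]
```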
