Substrate Database

The bitemporal truth store that powers CFS implementations. Append-only, graph-native, verifiable. Designed for org reality, built for AI.

Graph-Native · Bitemporal · Tamper-Evident

What is Substrate?

Substrate is a database architecture designed for "org reality": the web of entities, relationships, events, and artifacts that make up how an organization actually works.

Bitemporal by design

Every fact has two time dimensions: valid time (when it was true in reality) and system time (when the system knew it). You can ask "what was true then?" and "what did we believe then?" as two separate questions.
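
To make the two time axes concrete, here is a minimal in-memory sketch in Python. It is not Substrate's API: the `Fact` record, its field names, and the `as_of` helper are assumptions made for illustration, and the rule "latest recorded assertion wins" is one simple resolution policy among several.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fact:
    # Hypothetical record shape, not Substrate's schema.
    subject: str
    attribute: str
    value: str
    valid_from: date            # valid time: when it became true in reality
    valid_to: Optional[date]    # exclusive end of validity; None = still true
    recorded_at: date           # system time: when the system learned it


def as_of(facts, subject, attribute, valid_at, known_at):
    """Return the value believed at `known_at` to hold at `valid_at`."""
    candidates = [
        f for f in facts
        if f.subject == subject
        and f.attribute == attribute
        and f.recorded_at <= known_at                      # system-time filter
        and f.valid_from <= valid_at                       # valid-time filter
        and (f.valid_to is None or valid_at < f.valid_to)
    ]
    if not candidates:
        return None
    # Assumed policy: the most recently recorded matching assertion wins.
    return max(candidates, key=lambda f: f.recorded_at).value


LOG = [
    Fact("payments", "owner", "team-a", date(2023, 11, 1), None, date(2023, 11, 1)),
    # Correction recorded on Jan 10: ownership had actually moved on Dec 1.
    Fact("payments", "owner", "team-b", date(2023, 12, 1), None, date(2024, 1, 10)),
]
```

With this data, asking about mid-December as known on January 5th yields `team-a`, while the same valid-time question asked with knowledge as of January 20th yields `team-b`. The original record is never modified; the correction is simply a later entry.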

Graph-native

Entities and relationships are first-class. Traversal is indexed. "What depends on this service?" is a fast query, not a join explosion. Property graph model aligned with ISO GQL.

Append-only truth

The canonical log never changes. Corrections append new records; they don't rewrite history. Every commit is hash-chained. Merkle proofs verify integrity.
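
A hash chain of this kind can be sketched in a few lines of Python. This is an illustration under assumptions, not Substrate's commit format: the `TruthLog` class, the JSON encoding, and the all-zero genesis hash are hypothetical choices.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash preceding the first commit


def commit_hash(prev_hash, payload):
    # Canonical JSON so the same payload always hashes to the same value.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class TruthLog:
    """Append-only log where each commit includes the previous commit's hash."""

    def __init__(self):
        self._commits = []

    def append(self, payload):
        prev = self._commits[-1]["hash"] if self._commits else GENESIS
        entry = {"prev": prev, "payload": payload,
                 "hash": commit_hash(prev, payload)}
        self._commits.append(entry)
        return entry["hash"]

    def verify(self):
        """Recompute every hash; any edit to an earlier commit breaks the chain."""
        prev = GENESIS
        for commit in self._commits:
            if commit["prev"] != prev:
                return False
            if commit["hash"] != commit_hash(prev, commit["payload"]):
                return False
            prev = commit["hash"]
        return True
```

A correction is expressed as a new `append` call that refers to the earlier commit. Editing an existing payload in place would make `verify()` return `False`.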

Agent-native

Built for AI from the start. Capability-scoped access. Citations on every claim. Safe write workflows. Minimal disclosure by default.

Architecture

Substrate separates concerns into three physical planes. The canonical truthlog is the source of all authority; everything else is derived.

Truthlog Plane

Canonical

Immutable append-only commit log. Hash chain + Merkle tree. Signed roots. This is the audit-proof foundation.

  • Segment files per tenant shard
  • Raft quorum for writes
  • Merkle log for tamper evidence

Index Plane

Derived

LSM-based indexes + columnar snapshots. Fast queries. Fully rebuildable from truthlog if needed.

  • Bitemporal interval indexes
  • Graph adjacency indexes
  • System-time snapshots

Vault Plane

Artifacts

Content-addressed chunk store. Deduplicated, encrypted, policy-aware. Docs, code, logs, builds, SBOMs.

  • Multihash content addressing
  • Per-object encryption (DEK/KEK)
  • Retention + redaction policies
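
As a rough illustration of content addressing and deduplication, consider the Python sketch below. It makes several simplifying assumptions: a plain SHA-256 hex digest stands in for a multihash, encryption and retention policies are left out, and the `ChunkStore` class and its tiny default chunk size are hypothetical.

```python
import hashlib


class ChunkStore:
    """Stores fixed-size chunks keyed by the hash of their content."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size  # unrealistically small, for demonstration
        self.chunks = {}              # digest -> chunk bytes

    def put(self, data):
        """Split data into chunks and return the ordered list of digests."""
        manifest = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Identical chunks map to the same key, so they are stored once.
            self.chunks.setdefault(digest, chunk)
            manifest.append(digest)
        return manifest

    def get(self, manifest):
        """Reassemble the original bytes from a manifest of digests."""
        return b"".join(self.chunks[d] for d in manifest)
```

Because the key is derived from the content, storing the same chunk twice costs nothing extra, and a reader can recompute the hash to check that a chunk was not altered.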

Key insight: Indexes are never the source of truth. If an index disagrees with the truthlog, the index is wrong and gets rebuilt. This is how you get both performance and correctness.
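
The rule can be sketched as follows, assuming a toy truthlog of `link` and `unlink` records and an adjacency index derived from it. The record shape and function names are hypothetical, and a real rebuild would be incremental and parallel rather than a full replay.

```python
from collections import defaultdict


def build_adjacency(truthlog):
    """Replay the log from the start to derive the adjacency index."""
    index = defaultdict(set)
    for record in truthlog:
        if record["op"] == "link":
            index[record["src"]].add(record["dst"])
        elif record["op"] == "unlink":
            # Removal is itself an appended record, not a deletion from the log.
            index[record["src"]].discard(record["dst"])
    # Drop nodes with no remaining edges so comparisons are stable.
    return {node: edges for node, edges in index.items() if edges}


def check_and_repair(truthlog, index):
    """Return (index, rebuilt). The log always wins a disagreement."""
    expected = build_adjacency(truthlog)
    if index != expected:
        return expected, True
    return index, False
```

For example, an index that still lists an edge after the log recorded its `unlink` is treated as stale and replaced by the replayed result.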

Key capabilities

AS-OF

Time-travel queries

Query the graph at any point in valid-time or system-time. "What did we know on January 5th about what was true in December?"

DIFF

Change detection

"What changed between Monday and Friday?" Fast delta queries using system-time partitioned indexes.

PROVENANCE

Full audit trail

Every fact traces to an assertion: who asserted it, from what sources, using what derivation. Citations are first-class.
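
One way to picture a provenance chain is as a walk over assertions that point at their sources and at the assertions they were derived from. The Python sketch below assumes a hypothetical `Assertion` record; the field names and sample data are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class Assertion:
    # Hypothetical shape: who said what, based on which artifacts and claims.
    id: str
    claim: str
    asserted_by: str
    sources: list = field(default_factory=list)       # artifact ids
    derived_from: list = field(default_factory=list)  # upstream assertion ids


def provenance_chain(assertions, assertion_id):
    """Depth-first walk from a claim back through everything it rests on."""
    chain, seen, stack = [], set(), [assertion_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # shared upstream assertions are reported once
        seen.add(current)
        assertion = assertions[current]
        chain.append(assertion)
        # Reverse so upstream assertions are visited in their listed order.
        stack.extend(reversed(assertion.derived_from))
    return chain


def citations(chain):
    """Every artifact cited anywhere in the chain, deduplicated and sorted."""
    return sorted({src for assertion in chain for src in assertion.sources})


ASSERTIONS = {
    "a1": Assertion("a1", "deploy changed pool size", "ci-bot",
                    sources=["build-log"]),
    "a2": Assertion("a2", "db connections exhausted", "monitoring",
                    sources=["metrics-dump"]),
    "a3": Assertion("a3", "incident caused by pool misconfiguration", "alice",
                    sources=["postmortem-doc"], derived_from=["a1", "a2"]),
}
```

Calling `provenance_chain(ASSERTIONS, "a3")` returns the conclusion followed by the two assertions it was derived from, and `citations` collects the artifacts behind all three.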

TRAVERSE

Bounded graph walks

Efficient 1-hop, n-hop, and pattern-matching queries. Dependency graphs, blast radius, and ownership chains are all indexed.
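
A bounded walk can be sketched as a breadth-first search with a depth limit and a node cap. The function name, the adjacency-dict input, and the "stop and flag truncation" behavior are assumptions for illustration, not Substrate's query semantics.

```python
from collections import deque


def bounded_bfs(adjacency, start, max_depth, node_cap):
    """Breadth-first walk limited by depth and by total nodes visited.

    Returns (visited, truncated), where visited maps node -> depth and
    truncated is True if the node cap stopped the walk early.
    """
    visited = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        depth = visited[node]
        if depth == max_depth:
            continue  # do not expand beyond the depth bound
        for neighbor in adjacency.get(node, ()):
            if neighbor in visited:
                continue
            if len(visited) >= node_cap:
                return visited, True
            visited[neighbor] = depth + 1
            queue.append(neighbor)
    return visited, False


# Hypothetical "who depends on me" edges for a blast-radius query.
DEPENDENTS = {
    "payments": ["checkout", "billing"],
    "checkout": ["web"],
    "web": ["cdn"],
    "billing": [],
}
```

With `max_depth=2`, a walk from `payments` reaches `checkout`, `billing`, and `web` but not `cdn`. Returning a truncation flag lets the caller tell a complete answer from a capped one.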

REDACT

Delete without destroying

Cryptographic erasure via key shredding. The audit trail stays intact ("something existed here") but content is unrecoverable.
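
The idea behind key shredding can be shown with a small Python sketch. Loud caveat: the cipher here is a toy keystream built from SHA-256, used only so the example runs on the standard library. It is not authenticated and is no substitute for a vetted scheme such as AES-GCM. The `RedactableStore` class is hypothetical.

```python
import hashlib
import secrets


def _keystream_xor(key, data):
    # TOY cipher for illustration only: XOR with SHA-256(key || offset) blocks.
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


class RedactableStore:
    def __init__(self):
        self.records = {}  # object id -> ciphertext; never deleted
        self.keys = {}     # object id -> per-object key; can be shredded

    def put(self, object_id, plaintext):
        key = secrets.token_bytes(32)
        self.keys[object_id] = key
        self.records[object_id] = _keystream_xor(key, plaintext)

    def get(self, object_id):
        key = self.keys.get(object_id)
        if key is None:
            return None  # key is gone (or never existed): content unreadable
        return _keystream_xor(key, self.records[object_id])

    def redact(self, object_id):
        # Destroy only the key; the stored record remains as evidence.
        self.keys.pop(object_id, None)
```

After `redact`, the ciphertext is still present in `records`, so the trail shows that an object existed, but without the key `get` can no longer return its content.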

VERIFY

Tamper evidence

Inclusion proofs for any commit. Consistency proofs for the log. Independent witnesses can verify no history was rewritten.

Performance targets

Substrate optimizes for interactive queries on "system=now" while supporting arbitrary historical access. These are the design targets:

<150ms p95 1-hop expansion

Expand neighbors for a node with degree ≤5,000

<800ms p95 bounded BFS

Depth-3 traversal with 50k node cap

<2s Daily diff query

"What changed in the last 24 hours?"

500k/sec Ingestion rate

Sustained facts per second per shard

<50ms Inclusion proof

p99 proof generation for any committed tx

24h/1B Full reindex

Rebuild indexes from truthlog (parallel)

These targets assume "system=now" queries (the common case). Historical queries may use snapshot lookup + delta application with modest overhead.

Reference scenario

Substrate is designed for orgs like this: a company with repos, services, tickets, incidents, contracts, and policies. A new leader joins and needs to understand reality fast.

"Show me the dependency graph for the payments service as of last month"

Bitemporal query: valid-time = last month, system-time = now. Returns what was actually true then, with current knowledge.

"What did we believe about service ownership on January 5th?"

System-time query. Shows what the org knew at that point, even if later corrections changed the picture.

"What changed in the last week that could affect API reliability?"

Diff query + blast radius. Uses system-time deltas and graph traversal to surface relevant changes.
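
The system-time delta half of such a query could be sketched as a comparison of two snapshots. This naive Python version rebuilds both snapshots from scratch, whereas the text describes partitioned indexes; the record shape and function names are assumptions.

```python
from datetime import date


def snapshot(records, known_at):
    """Latest value per key among records the system had by `known_at`."""
    state = {}
    for record in sorted(records, key=lambda r: r["recorded_at"]):
        if record["recorded_at"] <= known_at:
            state[record["key"]] = record["value"]
    return state


def diff(records, start, end):
    """Map each changed key to (old_value, new_value) between two system times."""
    before, after = snapshot(records, start), snapshot(records, end)
    changes = {}
    for key in before.keys() | after.keys():
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


RECORDS = [
    {"key": "api.region", "value": "eu", "recorded_at": date(2024, 1, 3)},
    {"key": "payments.owner", "value": "team-a", "recorded_at": date(2024, 1, 5)},
    {"key": "payments.owner", "value": "team-b", "recorded_at": date(2024, 1, 10)},
    {"key": "ledger.tier", "value": "1", "recorded_at": date(2024, 1, 11)},
]
```

Diffing January 8th against January 12th reports the ownership change and the newly recorded `ledger.tier`, and omits `api.region` because it did not change. The changed keys could then seed a graph traversal to estimate blast radius.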

"Show me the provenance chain for this incident conclusion"

Provenance query. Traces the claim back through assertions, sources, and derivations. Cites every artifact.
