Holographic Tombstones#
An AI agent under load generates more episodic memory than any reasonable VRAM budget can hold. Standard systems handle this by evicting old memory to disk and accepting that “evicted” effectively means “gone” — when the reasoning engine reaches for that memory it finds a null pointer, or worse, a stale one.
ANNIE refuses that trade. When a memory page is evicted, what remains in its place is a Holographic Tombstone: a small, fixed-size cryptographic witness that preserves the structural integrity of the memory graph and can be rehydrated on demand within a hard latency budget.
What it does#
A Tombstone is a 64-byte object that takes the place of an evicted memory payload. It contains a cryptographic digest of the original state, a monotonic generation counter, a nanosecond-precision timestamp, an event-flag bitmask describing what kind of memory it stands in for, and an integrity checksum.
Three things matter about that description:
- It fits in a cache line. Sixty-four bytes is not an accident. The reasoning engine pays the cost of a single cache load when it traverses a tombstone, not a heap walk.
- It is structurally indistinguishable from a live pointer at the call site. The reasoning engine does not branch on “is this memory loaded or evicted?” It dereferences. If the page is hot, it gets the page. If the page is cold, it gets rehydrated transparently and gets the page.
- It is cryptographically anchored to the ledger. A tombstone is not just a forwarding address — it is a proof that the evicted state was the state ANNIE saw and approved. Tampering with the backing store and replaying the tombstone produces a checksum mismatch, and the rehydration fails closed.
We call it “holographic” because, like a hologram, a small piece of it still contains the structure of the whole. The full payload may be elsewhere, but the tombstone alone is sufficient to prove what was there.
Why it matters#
Without this mechanism, two ugly options remain:
- Keep everything in VRAM. Bounds your agent’s working set to your GPU’s memory. Fine for demos, untenable for long-running operations.
- Evict freely and accept null reads. The reasoning engine has to handle “this memory may or may not exist” on every access. Branches everywhere. Latency variance explodes. Determinism is lost.
Tombstones eliminate the choice. Working sets scale beyond physical VRAM, the reasoning engine sees a uniform memory model, and the ledger remains intact because the tombstone references it.
How you observe it#
Tombstones are an internal mechanism — you do not interact with them directly. What you observe is the property they provide:
- Bounded rehydrate latency. On supported platforms, ANNIE commits to rehydration within the documented budget. If a rehydration exceeds it, the event is logged to the ledger as a soft fault and a counter is incremented in the metrics export.
- No null-pointer reasoning paths. The reasoning engine is not required to handle a “memory not found” case at the call site; rehydration is transparent or the read fails closed.
- Eviction-safe replay. Replaying a ledger reproduces the same approvals and denials regardless of which payloads happened to be evicted at the time, because the tombstones contain everything the kernel needed to verify the original decision.
What it does not do#
A Tombstone is not a compression scheme. It does not let you reconstruct the original payload from the witness alone — only verify it, locate it, and trigger its retrieval. If the backing store is destroyed, the data is gone; the tombstone will detect this and fail the rehydrate cleanly rather than hand the reasoning engine a corrupted page.
Tombstones do not extend indefinitely. They are subject to the platform’s retention policy, which is configurable per deployment.
Related#
- The Ledger of Reality — where tombstones are anchored.
- The Iron Lung — the deterministic control loop that the rehydrate budget feeds into.