KAPEX (pronounced K-Apex) is memoryware — memory middleware that gives any AI application governed, persistent memory. It intercepts LLM inputs and outputs, maintains a hierarchical memory graph with salience-scored entity nodes, and injects the highest-salience context at query time. Built by Sandstone Cloud.

Memoryware is a new software category coined by Sandstone Cloud. It refers to middleware that governs AI memory — not just storing conversations, but scoring them for salience, decaying them over time, detecting contradictions, and controlling how confidently past context is presented. KAPEX is the first product in this category.

How is KAPEX different from Mem0 or Zep?

Mem0 stores memories and retrieves them by similarity. Zep provides session management with keyword-based retrieval. KAPEX scores memories using a multi-signal composite base score, applies temporal decay, uses three-channel retrieval (salience + recency + constraints), and gates presentation confidence with groundedness integrity scoring. Other products store memories — KAPEX governs which ones matter.

How does KAPEX handle memory decay?

In KAPEX, worked-through memories fade while unresolved content persists. This prevents stale, repetitive memories from dominating retrieval and ensures the context window reflects where the user is now, not where they were. Patent pending.

What LLMs does KAPEX work with?

KAPEX is provider-agnostic. It works with Anthropic Claude, OpenAI GPT, Google Gemini, Meta Llama, Mistral, Cohere, and any model accessible via OpenAI-compatible API, including local models via Ollama, LM Studio, or vLLM. You bring the key.

How do I integrate KAPEX?

KAPEX offers four integration surfaces: OpenAI-compatible proxy (zero code changes), MCP server (Claude Desktop, Cursor, Windsurf), REST API, and webhooks. Client libraries are available for Python and JavaScript. Typical integration takes under a day for a senior engineer.

What happens if Sandstone Cloud goes down?

If you are on the self-hosted plan, your KAPEX container continues running independently with a grace period. Your memory graph is stored in your PostgreSQL database, which you control. You would lose access to future updates and cloud-only features, but your existing memory data and current functionality remain intact.

How does KAPEX handle privacy and PII?

KAPEX includes a PII scrubber that detects and redacts sensitive information (SSN, credit cards, bank accounts, passports) at ingestion time before it enters the memory graph. GDPR Article 17 deletion is supported — users can request complete memory deletion. All provider API keys are encrypted with Fernet before storage.

Patent pending · Memoryware

Your AI forgets
everything.
We fix that.

Name: KAPEX
Author: Sandstone Cloud

KAPEX is memoryware — memory middleware that scores what matters, lets the rest fade, and injects the right context at query time. One memory graph across every LLM. Tell one model something. Ask another. It knows.

Self-hosted, licensed middleware that runs in your VPC. Try it in our hosted sandbox. Deploy it in your own infrastructure for production.

Start free trial Read the docs

Free sandbox · 50 messages/mo · No credit card · Self-hosted for production

How the integration reads

1 Send the turn. Your app passes the user's message and your LLM's reply to KAPEX. KAPEX scores, stores, and updates the memory graph.
2 Recall what matters. At query time, KAPEX returns a token-budgeted context block of the highest-salience memories for that user.
3 Inject into your prompt. Paste the context into your system prompt. The model now remembers — across sessions, across products, across providers.

How it works

Five operations, between your app and your model.

KAPEX runs as middleware. No model changes. No prompt rewriting. Drop it in via SDK, MCP server, or REST.

01 · Intercept

Captures I/O

Sits between your app and any LLM. No model changes required. Reads inputs and outputs.

02 · Extract

Entities & topics

Entity extraction, topic detection, relationship mapping. Builds the structure the scorer reads from.

03 · Score

12-signal composite

Twelve independent linguistic signals fuse into a legitimacy score at write-time. Cold-start capable from message one.

04 · Decay

Processing-modulated

Processed memories fade. Unresolved ones persist. The graph self-manages without manual pruning. Patent pending.

05 · Inject

At query time

Top-salience nodes assemble into a token-budgeted context block, ready for your prompt.

Salience over time

Conventional recall is flat. KAPEX models the curve.

Every prior approach treats memories as equal-weight nodes that strengthen with access. KAPEX takes a different approach: worked-through content fades from active context while unresolved content persists at high salience. Patent pending.

Conventional recall — flat, undifferentiated

KAPEX salience — scored, decayed, restored

Three differentiators

Other products store memories. KAPEX understands which ones matter.

12-signal composite scoring

Every memory node is scored at write-time across twelve independent linguistic signals. Not recency. Not frequency. Cold-start capable from message one. Patent pending.

Adaptive memory lifecycle

Memories that have been addressed naturally fade, while unresolved content persists. Worked-through topics make room for what matters now. The graph self-manages without manual curation. Patent pending.

13-module safety layer

Crisis detection, anti-fabrication guard, PII scrubbing, trigger management, GDPR per-node deletion. Runs identically regardless of which LLM you use downstream.

Use Cases

Built for any AI product where memory drives retention.

AI companions

Character memory across sessions. Personality consistency. Story continuity that holds up over weeks, not turns. The user’s world stays coherent because the memory graph never forgets who matters to them.

AI coding

Codebase context across sessions. The Postgres-vs-MySQL decision from six months ago persists. Routine noise decays. Your coding assistant remembers architectural choices without stuffing the context window.

AI therapy & coaching

Clinical-grade session memory. Treatment progress tracking. Per-client salience models that stay stable across months. Resolved grief decays from active context while unresolved concerns persist.

AI sales & SDR

Prospect memory across multi-week sequences. Stakeholder context that persists across every touchpoint in a deal cycle.

AI meeting tools

Cross-meeting memory. Recurring-participant context. Decision tracking that doesn’t forget what was decided in March.

AI education

Per-student learning models. Adaptive difficulty that persists across sessions. What the learner knows is treated as state, not noise.

KAPEX vs. alternatives

Storage. Search. Both miss the layer that matters.

Capability

KAPEX

Mem0

Zep

Raw RAG

Salience scoring · 12-signal composite

—

Governed memory decay · patent pending

—

Cold-start scoring · salient from message one

—

Safety layer · crisis, anti-fabrication, PII, GDPR

—

Entity hierarchy · domain → entity → facet

—

MCP native · 8 tools · stdio, SSE, HTTP

—

A/B validated · preference under NDA

—

Per-node deletion · GDPR / right-to-be-forgotten

—

Multi-tenant isolation

—

Provider-agnostic · any LLM

Self-hosted by default

—

Enterprise

KAPEX vs. everyone

Memory Quality Over Time

Every competitor flatlines. Scored, governed memory compounds.

Illustrative. Based on observed degradation patterns across 150+ builder conversations and KAPEX study data (1,655 participants, 99K+ messages).

The data

Blinded A/B study. 1,655 participants. The gap never closes.

Users chatted with two AI panels side by side — one with KAPEX memory, one without. They didn’t know which was which. At session one, it’s a coin flip. By session twenty, they choose KAPEX four out of five times.

80%

Preference at depth

3,744

Blinded ratings

p<10^-17

Significance

Patents filed

The longer they use it, the more they choose it

S1-3

46%

S11-20

69%

S21-30

80%

Preference monotonically increases with session depth. The memory graph becomes increasingly differentiated. Flat memory can’t catch up.

86%

of users with 10+ sessions ultimately preferred KAPEX

Integration

Deploy, connect, inject.

KAPEX ships as a Docker container. Run it in your VPC, point your app at it, inject the salience context into your LLM prompt. Typical integration is under a day — and your users’ data never leaves your infrastructure.

01 · Deploy

Run the container in your cloud

One-click CloudFormation template for AWS, or a signed Docker package for any environment that runs containers. You bring PostgreSQL, optional Redis, and your LLM key. Sandstone supplies the container and the license.

AWS · GCP · Azure · on-prem · air-gap

02 · Connect

Pick a surface

Your app talks to your KAPEX endpoint via Python SDK, MCP server (stdio · SSE · HTTP), or plain REST. Same engine, three surfaces — provider-agnostic from any language or framework.

SDK · MCP · REST · OpenAPI 3.1

03 · Inject

Memory in your prompt

Send each turn to KAPEX, get a token-budgeted salience context block back, paste it into your system prompt. No model changes. Works with Claude, GPT, Gemini, Llama — you bring the key.

model-agnostic · zero prompt rewriting

Your data never leaves

Self-hosted by default. Sandstone has no access to your data.

User messages, memory graphs, scoring state, and conversation history live in your PostgreSQL instance — in your VPC, your region, your encryption. The container performs a daily license heartbeat (key hash only, zero user data) and runs for 7 days offline before degrading. Enterprise tier offers full air-gapped deployment. See deployment model →

Pricing

Licensed software. Self-hosted. Your data stays yours.

We gate volume, never intelligence — every plan includes the full salience engine. Paid plans are self-hosted in your VPC. The free tier runs in our hosted sandbox for evaluation.

Free

$0/ forever

50 messages/mo · Full engine · Hosted sandbox

Full scoring engine (same as paid)
50 messages/month
1 LLM provider
Memory graph visualization

Start free trial

Starter

$49/month

100K messages/mo · Unlimited retention

Everything in Free
6 LLM providers
MCP server · SDK · REST
Webhooks · Graph export
Email support

Start free trial

Popular

Scale

$199/month

1M messages/mo · Custom decay · Priority support

Everything in Starter
6 LLM providers
Custom decay parameters
Custom safety policies
Priority support · Slack · 12hr SLA

Start free trial

Enterprise

Custom

Unlimited · Self-hosted · Air-gap capable

Everything in Scale
SAML · SCIM · RBAC
SOC 2 · HIPAA BAA · DPA
Dedicated support + SLA

Talk to us

Full scoring engine at every tier — we gate volume, never intelligence. 30-day free trial on Starter and Scale. No credit card required to start.
See full pricing & feature matrix

Built for production

30 patents filed. 4,600+ tests passing. Built by Sandstone Cloud.

Patent pending

Give your AI a memory that matters.

Try the sandbox free. Deploy in your VPC for production.

Start free trial Talk to us

Self-hosted in your VPC. Your data never leaves your infrastructure.