12-signal composite scoring
Every memory node is scored at write-time across twelve independent linguistic signals. Not recency. Not frequency. Cold-start capable from message one. Patent pending.
KAPEX is memoryware — memory middleware that scores what matters, lets the rest fade, and injects the right context at query time. One memory graph across every LLM. Tell one model something. Ask another. It knows.
Self-hosted, licensed middleware that runs in your VPC. Try it in our hosted sandbox. Deploy it in your own infrastructure for production.
How it works
KAPEX runs as middleware. No model changes. No prompt rewriting. Drop it in via SDK, MCP server, or REST.
Sits between your app and any LLM. No model changes required. Reads inputs and outputs.
Entity extraction, topic detection, relationship mapping. Builds the structure the scorer reads from.
Twelve independent linguistic signals fuse into a legitimacy score at write-time. Cold-start capable from message one.
Processed memories fade. Unresolved ones persist. The graph self-manages without manual pruning. Patent pending.
Top-salience nodes assemble into a token-budgeted context block, ready for your prompt.
Salience over time
Every prior approach treats memories as equal-weight nodes that strengthen with access. KAPEX takes a different approach: worked-through content fades from active context while unresolved content persists at high salience. Patent pending.
Three differentiators
Every memory node is scored at write-time across twelve independent linguistic signals. Not recency. Not frequency. Cold-start capable from message one. Patent pending.
Memories that have been addressed naturally fade, while unresolved content persists. Worked-through topics make room for what matters now. The graph self-manages without manual curation. Patent pending.
Crisis detection, anti-fabrication guard, PII scrubbing, trigger management, GDPR per-node deletion. Runs identically regardless of which LLM you use downstream.
Use Cases
Character memory across sessions. Personality consistency. Story continuity that holds up over weeks, not turns. The user’s world stays coherent because the memory graph never forgets who matters to them.
Codebase context across sessions. The Postgres-vs-MySQL decision from six months ago persists. Routine noise decays. Your coding assistant remembers architectural choices without stuffing the context window.
Clinical-grade session memory. Treatment progress tracking. Per-client salience models that stay stable across months. Resolved grief decays from active context while unresolved concerns persist.
Prospect memory across multi-week sequences. Stakeholder context that persists across every touchpoint in a deal cycle.
Cross-meeting memory. Recurring-participant context. Decision tracking that doesn’t forget what was decided in March.
Per-student learning models. Adaptive difficulty that persists across sessions. What the learner knows is treated as state, not noise.
KAPEX vs. alternatives
KAPEX vs. everyone
Every competitor flatlines. Scored, governed memory compounds.
Illustrative. Based on observed degradation patterns across 150+ builder conversations and KAPEX study data (1,655 participants, 99K+ messages).
The data
Users chatted with two AI panels side by side — one with KAPEX memory, one without. They didn’t know which was which. At session one, it’s a coin flip. By session twenty, they choose KAPEX four out of five times.
Preference monotonically increases with session depth. The memory graph becomes increasingly differentiated. Flat memory can’t catch up.
Integration
KAPEX ships as a Docker container. Run it in your VPC, point your app at it, inject the salience context into your LLM prompt. Typical integration is under a day — and your users’ data never leaves your infrastructure.
One-click CloudFormation template for AWS, or a signed Docker package for any environment that runs containers. You bring PostgreSQL, optional Redis, and your LLM key. Sandstone supplies the container and the license.
Your app talks to your KAPEX endpoint via Python SDK, MCP server (stdio · SSE · HTTP), or plain REST. Same engine, three surfaces — provider-agnostic from any language or framework.
Send each turn to KAPEX, get a token-budgeted salience context block back, paste it into your system prompt. No model changes. Works with Claude, GPT, Gemini, Llama — you bring the key.
User messages, memory graphs, scoring state, and conversation history live in your PostgreSQL instance — in your VPC, your region, your encryption. The container performs a daily license heartbeat (key hash only, zero user data) and runs for 7 days offline before degrading. Enterprise tier offers full air-gapped deployment. See deployment model →
Pricing
We gate volume, never intelligence — every plan includes the full salience engine. Paid plans are self-hosted in your VPC. The free tier runs in our hosted sandbox for evaluation.
Full scoring engine at every tier — we gate volume, never intelligence.
30-day free trial on Starter and Scale. No credit card required to start.
See full pricing & feature matrix
Built for production
30 patents filed. 4,600+ tests passing. Built by Sandstone Cloud.
Try the sandbox free. Deploy in your VPC for production.