CTXONE // system-centric AI

Stop paying a reasoning tax for context you already have.

Every session starts from scratch, and the model burns tokens reconstructing what it should already know. That's not intelligence — that's context compensation. Your model isn't thinking. It's catching up.

CTXone is the external context layer that ends it. Persistent memory, structured plans, and full provenance — so the model executes within context instead of reasoning its way back to it. Smaller models work better. Token costs drop. You stay in control.

$ curl -sSL https://raw.githubusercontent.com/ctxone/ctxone-docs/main/install.sh | sh

System-Centric · Model-Agnostic · Self-hosted · Zero telemetry · BSL-1.1 → Apache 2.0

MODEL_CENTRIC.liability

Model-Centric AI has a tax.

When the model is expected to infer state, intent, plan, and memory from raw context, it compensates with reasoning. You pay for that compensation — in tokens, in drift, and in bills that only make sense in retrospect.

01 // CONTEXT DEBT

Every session starts in debt.

You told the model on Tuesday that you're using SQLite, not Postgres, and why. Wednesday morning you open a new session and it's asking about your "Postgres schema" again. You paste the same paragraph. You budget tokens for memory the model should already have. That's not a memory problem — that's accumulated context debt, paid in reasoning every single session.

CTXone eliminates the debt. Every fact survives sessions, branches, and tool switches because it lives in a local graph — not in the model's context window. The model executes within context. It doesn't reconstruct it.

02 // REASONING TAX

Your teammate is paying twice.

You primed your Cursor install with the team's architectural decisions. Priya didn't. Now she's arguing with the model about whether to use Redis for the job queue — a question you settled three months ago. The model isn't wrong; it just doesn't have the context. So it reasons. So does Priya. So does the next engineer. The same reasoning tax, paid over and over.

CTXone is shared. The graph is a file. Commit it, sync it, mount it across the team. Whatever you primed is what everyone sees — one reasoning cost, paid once.

03 // CONTEXT COMPENSATION

The model filled in a gap you didn't know existed.

You open a file the model edited last week and there's a reference to "the new API versioning policy." You don't remember a policy. Nobody does. The model wasn't hallucinating — it was compensating. It had no structured context, so it inferred. The decision is now orphaned in a file no one can audit.

CTXone makes compensation visible. Every write carries an agent ID, a timestamp, an intent, and reasoning. ctx blame traces it back to the session, the tool, and the fact that prompted it. No orphaned decisions.

04 // ECONOMIC OPACITY

The bill arrives. You have no idea why.

Token pricing is published on the docs page, yet it's completely opaque at the moment of use. You run a session, you ship a feature, you move on. Then the invoice lands and you're reverse-engineering which prompts cost what. The expensive ones are almost always the same: large context, reasoning model, agent that lost its place and started over.

CTXone shifts control back to you. Structured context means smaller models work. Plans mean agents don't lose their place. Recall means you send what's relevant — not everything you've ever said. The reasoning tax becomes a flat cost you understand.

Read the full context compensation argument →

ARCHITECTURE //

Three layers. One system.

CTXone is external context and structure for models to work within — not another memory plugin, not a prompt wrapper. Three layers compose the full system-centric architecture.

01 // MEMORY LAYER

Persistent, searchable context

Write a fact once. It survives sessions, branches, and tool switches — stored in a local graph, not the model's context window. Topic-matched recall sends only what's relevant, eliminating context debt before the session starts.

$ ctx remember "SQLite only — no Postgres" \
    --importance high --context architecture

# model recalls on topic, not everything:
recall(topic="architecture", budget=1200)
→ 3 facts matched / 847 tokens sent
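
A minimal sketch of what budget-bounded, topic-matched recall can look like, in Python. `Fact`, the 4-characters-per-token estimate, and the importance ordering are illustrative assumptions, not the actual CTXone implementation:

```python
from dataclasses import dataclass

@dataclass
class Fact:
    text: str
    context: str       # topic tag, e.g. "architecture"
    importance: int    # higher importance is recalled first (assumption)

def estimate_tokens(text: str) -> int:
    # crude heuristic: roughly 4 characters per token
    return max(1, len(text) // 4)

def recall(facts, topic: str, budget: int):
    """Return topic-matched facts, most important first, within a token budget."""
    matched = sorted(
        (f for f in facts if f.context == topic),
        key=lambda f: -f.importance,
    )
    sent, used = [], 0
    for f in matched:
        cost = estimate_tokens(f.text)
        if used + cost > budget:
            break
        sent.append(f)
        used += cost
    return sent, used

graph = [
    Fact("SQLite only -- no Postgres", "architecture", 3),
    Fact("Use Redis for the job queue", "architecture", 2),
    Fact("Team standup is at 09:30", "process", 1),
]
facts, tokens = recall(graph, topic="architecture", budget=1200)
```

The point of the budget parameter: recall is bounded before the session starts, so context size cannot creep up with project age.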

02 // PLANS LAYER

Agents that stay on task

Plans give agents a structured task list to execute within — no plan drift, no focus drift, no execution hallucination. When an agent knows exactly what's next, it doesn't re-reason its position. Token usage drops because the work is the work.

$ ctx plan new auth-refactor
$ ctx plan add auth-refactor "Replace JWT middleware"
$ ctx plan next

# model picks up exactly where work left off:
plan_next(assigned_to="claude-code")
→ task: Replace JWT middleware [in_progress]
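
Conceptually, `plan_next` behaves like selecting the first task that is neither done nor blocked. A hypothetical Python sketch; the task fields and status values are assumptions, not CTXone's actual data model:

```python
def plan_next(tasks, assigned_to=None):
    """Return the first task that is not done, not blocked, and matches the assignee."""
    done = {t["name"] for t in tasks if t["status"] == "done"}
    for t in tasks:
        if t["status"] == "done":
            continue
        # skip tasks claimed by a different agent
        if assigned_to and t.get("assigned_to") not in (None, assigned_to):
            continue
        # a task is runnable only when every blocker is done
        if all(dep in done for dep in t.get("blocked_by", [])):
            return t
    return None

plan = [
    {"name": "Audit existing JWT middleware", "status": "done"},
    {"name": "Document token expiry edge cases", "status": "done"},
    {"name": "Replace JWT middleware", "status": "in_progress",
     "assigned_to": "claude-code",
     "blocked_by": ["Audit existing JWT middleware"]},
    {"name": "Update integration tests", "status": "pending",
     "blocked_by": ["Replace JWT middleware"]},
]
task = plan_next(plan, assigned_to="claude-code")
```

Because the selection is a pure lookup over persisted state, the agent spends no tokens re-deriving its position.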

03 // GOVERNANCE LAYER

Provenance, taint, and audit trail

Every memory commit carries an agent ID, a timestamp, an intent, and reasoning. Taint guards sensitive paths with warn or block policies. Branches let you stage memory changes before they land on main — git-style, for your context graph.

$ ctx blame /memory/architecture/db-choice

# full audit trail:
agent:  claude-code
when:   2026-04-01T11:42:07Z
intent: Observe
reason: user confirmed SQLite decision

$ ctx taint apply "secrets/**" --policy block
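
Taint policies can be pictured as glob patterns checked before each write. A rough Python sketch using the standard library's `fnmatch`; the pattern semantics and policy names are assumptions borrowed from the command above:

```python
from fnmatch import fnmatch

# ordered list of (pattern, policy) pairs; first match wins (assumption)
policies = [("secrets/**", "block"), ("config/*.env", "warn")]

def check_write(path: str) -> str:
    """Return the policy for the first matching taint pattern, or 'allow'."""
    for pattern, policy in policies:
        # note: fnmatch's '*' also matches '/', a simplification
        # of real gitignore-style '**' semantics
        if fnmatch(path, pattern):
            return policy
    return "allow"
```

With this sketch, `check_write("secrets/api/token")` returns `"block"`, while an untainted path falls through to `"allow"`.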

PLANS //

Agents drift. Plans don't.

Without structure, an agent re-reasons its position on every turn — re-reading prior output, reconstructing intent, deciding what to do next. That's plan drift, and you pay for every token of it. Plans give the agent an external task list to execute within. It picks up exactly where work left off.

No re-reasoning between sessions

A plan persists in the state graph. When an agent resumes — tomorrow, or after a context switch — it calls plan_next and gets exactly what to do next. No context reconstruction. No token overhead.

Shared across agents and tools

Plans live on branches, not in a session. Claude Code, Cursor, and a Python script can all work from the same plan simultaneously — each picking up the next available task with plan_next(assigned_to="me").

Proof-of-work on every task

Closing a task requires a proof: a commit SHA, a file path, or a test. No execution hallucination — no agent marking work done that wasn't. The plan is an audit trail as well as a task list.
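
The proof requirement can be sketched as a guard on the status change: closing without an attached artifact fails. Hypothetical Python, not the real CTXone API:

```python
def close_task(task, proof=None):
    """Mark a task done only if a proof artifact (commit SHA, path, test id) is attached."""
    if not proof:
        raise ValueError("cannot close task without proof of work")
    task["status"] = "done"
    task["proof"] = proof   # the proof becomes part of the audit trail
    return task

task = {"name": "Replace JWT middleware", "status": "in_progress"}
closed = close_task(task, proof="commit:9f2c41a")
```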

$ ctx plan show auth-refactor
plan:  auth-refactor  branch: main
─────────────────────────────────────────

  [done]         Audit existing JWT middleware
  [done]         Document token expiry edge cases
  [in_progress]  Replace JWT middleware
  [pending]      Update integration tests
  [pending]      Update docs/AUTH.md

─────────────────────────────────────────
2 done · 1 in progress · 2 pending

$ ctx plan next
→ task: Replace JWT middleware [in_progress]
  blocked_by: [] · assigned_to: claude-code

TOKEN_CONTROL //

You don't need a bigger model.
You need better context.

Token pricing is published on the docs page, yet it's completely opaque at the moment of use. The expensive sessions follow a pattern: bloated context, a reasoning model compensating for what it wasn't told, an agent that lost its place and started over. That's not an AI problem — it's an architecture problem.

Model-Centric
  • Full conversation history sent every session
  • Reasoning model required to compensate for context debt
  • Agent re-reasons its position after every context switch
  • Token cost grows with project age
  • Bill arrives. You can't explain it.

System-Centric
  • Topic-matched recall — only relevant facts sent
  • Smaller, non-thinking models work better because context is structured
  • Plans keep agents on task — no re-reasoning, no drift
  • Token cost stays flat as the graph grows
  • You understand the cost before you pay it
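
The cost contrast above can be put in rough numbers. The price and per-session token counts below are illustrative assumptions, not measurements:

```python
# Illustrative arithmetic only: prices and token counts are assumptions.
price_per_1k = 0.01  # hypothetical $ per 1K input tokens

# Model-centric: full history resent, growing ~2K tokens per session,
# over 100 sessions. Linear per-session growth means quadratic total cost.
model_centric = sum(2_000 * s * price_per_1k / 1_000 for s in range(1, 101))

# System-centric: topic-matched recall stays ~1K tokens per session.
system_centric = sum(1_000 * price_per_1k / 1_000 for _ in range(100))
```

Under these toy numbers the model-centric total is two orders of magnitude larger, and the gap widens with every additional session.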

THE CONTEXT COMPENSATION PRINCIPLE

As context size and entropy increase, required reasoning grows nonlinearly. Current model reasoning is compensating for bad context — not doing work. CTXone eliminates the compensation, leaving only execution.
See the full argument →

Works with the tools you already use

CTXone exposes MCP for AI coding tools, native plugins for chat UIs, and direct client libraries for everything else. ctx init auto-detects and wires them in one command.

Model-Agnostic

Works with any model behind any tool — Claude, GPT, Gemini, local models via Ollama. CTXone doesn't depend on the model's memory. The system holds the context.

MULTI-MODEL //

One memory graph.
Every model you use.

Most teams don't use one AI tool. They use Claude for complex reasoning, a local model for routine work, Gemini for search-grounded tasks. Each one starts every session knowing nothing about the others.

CTXone is the shared context layer they all draw from — no glue code, no custom orchestration, no framework lock-in. Claude remembers a decision. Cursor picks it up. A Python script acts on it. They're all working from the same plan, the same memory, the same branch.

  • Any MCP-capable tool reads and writes the same graph
  • Plans are model-agnostic — any agent picks up the next task
  • No orchestration code required — coordination happens through shared state
  • Switch models mid-project — context follows, not the model
[Diagram: CTXone Hub (memory · plans · branches) connected to Claude Code, Cursor, Gemini, a local model, and a Python script]

All agents. Same memory graph. Same plan. No code connecting them.
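
Coordination through shared state rather than orchestration code can be sketched as agents claiming tasks from one plan. A hypothetical Python sketch; the field names are assumptions:

```python
def claim_next(plan, agent):
    """Claim the first unassigned pending task by writing the agent id into shared state."""
    for task in plan:
        if task["status"] == "pending" and task.get("assigned_to") is None:
            task["assigned_to"] = agent
            task["status"] = "in_progress"
            return task
    return None

# one plan, two agents, no code connecting them:
shared_plan = [
    {"name": "Update integration tests", "status": "pending", "assigned_to": None},
    {"name": "Update docs/AUTH.md", "status": "pending", "assigned_to": None},
]
a = claim_next(shared_plan, "claude-code")  # first agent claims the first task
b = claim_next(shared_plan, "cursor")       # second agent gets the next one
```

Each agent only reads and writes the shared graph; the claim itself is the coordination.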

Stop compensating.
Start executing.

Give your models the context they need — once. Your agents have a plan. They'll stay on it. No signup, no server to rent, no SaaS bill. A binary on your laptop and a graph under ~/.ctxone.

$ curl -sSL https://raw.githubusercontent.com/ctxone/ctxone-docs/main/install.sh | sh

Source-available under BSL-1.1. Every release converts to Apache-2.0 four years after it ships. Full story →