ctx — The memory your LLM pretends to have.

Knowledge store with weighted 4-way RRF retrieval, multi-tenant scope isolation, multi-dimensional cyclic temporal gravity, and autonomous cross-referencing. Built for AI workflows that need to remember.

What it does

ctx gives your LLM a persistent, searchable memory. Store knowledge blocks, query them with hybrid retrieval (semantic + bilingual fulltext + trigram), then rerank with multi-dimensional cyclic gravity: each temporal cycle (weekday, month, quarter, week, monthday, seasonal, daily) is scored as its own Gaussian field. Queries like "immer dienstags" ("always on Tuesdays") or "Weihnachten" ("Christmas") activate specific dimensions; "Meeting am Dienstag, Ergebnis am Mittwoch" ("meeting on Tuesday, result on Wednesday") still pulls the Wednesday block (just weaker).
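To make the "Gaussian field per cycle" idea concrete, here is a minimal sketch of scoring one cyclic dimension. It assumes phases are normalized to [0, 1), that distance is measured the short way around the circle, and that the field has the standard form exp(-d²/(2σ²)); the function names and the weekday-to-phase mapping are illustrative assumptions, not ctx internals. The σ values are the ones listed in the Architecture diagram.

```python
import math

# Per-dimension sigma values as listed in the Architecture diagram.
SIGMA = {
    "weekday": 0.07, "month": 0.10, "quarter": 0.12, "week": 0.08,
    "monthday": 0.10, "seasonal": 0.08, "daily": 0.08,
}

def cyclic_distance(a: float, b: float) -> float:
    """Shortest distance between two phases on the unit circle [0, 1)."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)

def gaussian_score(query_phase: float, block_phase: float, sigma: float) -> float:
    """Assumed field shape: exp(-d^2 / (2 * sigma^2)) over the cyclic distance."""
    d = cyclic_distance(query_phase, block_phase)
    return math.exp(-(d * d) / (2.0 * sigma * sigma))

def weekday_phase(iso_weekday: int) -> float:
    """Assumed mapping: Monday=1 -> 0/7, ..., Sunday=7 -> 6/7."""
    return (iso_weekday - 1) / 7.0

tuesday, wednesday = weekday_phase(2), weekday_phase(3)
same_day = gaussian_score(tuesday, tuesday, SIGMA["weekday"])    # exact match
next_day = gaussian_score(tuesday, wednesday, SIGMA["weekday"])  # one day off
```

Under these assumptions a Tuesday query scores a Wednesday block well below a Tuesday block but above zero, which is the "just weaker" behavior described above.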

Multiple anchors per block: every block carries dimensions from both its content (dates mentioned in text) AND its created_at timestamp. A block about "Meeting am Dienstag" ("meeting on Tuesday") written on a Friday gets weekday=2 (content anchor) AND weekday=5 (meta anchor). Both signals contribute independently: "immer dienstags" ("always on Tuesdays") queries find the content anchor; "Freitags-Arbeit" ("Friday work") finds the meta anchor. The same principle applies to monthday, seasonal, daily, etc.
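The anchor union can be sketched in a few lines. This assumes the dates mentioned in the text have already been extracted into datetime values (the extraction itself is out of scope here) and uses ISO weekday numbering, which matches the weekday=2 / weekday=5 example; `weekday_anchors` is a hypothetical helper name.

```python
from datetime import datetime

def weekday_anchors(content_times: list, created_at: datetime) -> set:
    """Union of content anchors and the meta anchor, as ISO weekdays (Mon=1)."""
    anchors = {t.isoweekday() for t in content_times}  # dates mentioned in the text
    anchors.add(created_at.isoweekday())               # created_at, present on every block
    return anchors

meeting = datetime(2025, 1, 7, 10, 0)    # a Tuesday mentioned in the content
written = datetime(2025, 1, 10, 16, 30)  # the block was saved on a Friday
anchors = weekday_anchors([meeting], written)
```

Using a set means a content date that falls on the same weekday as created_at collapses into one anchor, loosely mirroring the dedup step shown in the Store pipeline.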

Dream Mode runs as a continuous background loop — autonomously discovering relationships between blocks, marking outdated information, and promoting high-quality content. Supports a separate model for evaluation (e.g. a larger model for better causal/supersedes reasoning). Parallel workers (CTX_DREAM_PARALLELISM, default 1) with atomic FOR UPDATE SKIP LOCKED block-claim — safe under contention. Your knowledge base grows, self-organizes, and stays current.

How LLMs use ctx

ctx is designed to be the persistent memory layer for LLM agents. Five primitives, composable:

| Use case | Tool | When |
|---|---|---|
| Retrieve prior knowledge before answering | `ctx query "question"` | Whenever the answer might depend on past sessions, project state, or stored decisions |
| Persist a new finding | `ctx save <category> <title> - <content>` | After non-obvious discoveries, architecture decisions, resolved bugs, config changes |
| Update an existing block | `ctx save` with same `<category> <title>` | category+title is the upsert key; re-saving replaces |
| Browse without LLM cost | `ctx search [category] [query:text]` | Listing, sanity-checking, lightweight lookups |
| Inspect a specific block | `ctx get <block-id>` | Following an id from query sources or another block |

Categories (semantic, not enforced)

infrastructure, decisions, projects, reference, learnings, agent-briefing, index. Pick by intent: one fact per block, precise title, tags for cross-cutting. ~1-1.5k chars max — split, don't grow.

Access paths (in order of preference for LLM agents)

MCP — claude.ai ctx server (Streamable HTTP transport). Tools: query, store, search, get, recent. JSON-schemas, no shell-quoting. Use this in Claude Code / claude.ai sessions.
CLI — /usr/local/bin/ctx — shell pipelines, cron, scripts. Config in ~/.config/ctx/config.
HTTP — POST /api/{query,store,search,manage} direct — fallback when MCP/CLI unavailable.

Multi-Tenant Architecture

scope column on context_blocks (private | work | shared | additional tenant scopes), enforced via API-key home_scope. Each LLM/tenant key sees:

All blocks in its own scope
All blocks in shared (cross-tenant knowledge layer)
Nothing from other tenants' private scopes
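The three visibility rules above reduce to a single predicate. This is a sketch of the rule as stated, not the server's actual enforcement code; `can_read` is a hypothetical name, and the tenant scope "tenant-a" in the usage is made up for illustration.

```python
def can_read(block_scope: str, home_scope: str) -> bool:
    """A key sees its own scope plus the cross-tenant 'shared' layer, nothing else."""
    return block_scope == home_scope or block_scope == "shared"

visible = [s for s in ("private", "work", "shared", "tenant-a") if can_read(s, "work")]
```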

API-key provisioning (v2.0.0+): ctx keys create <label> --home <scope> — --home is required, no implicit default.

Using ctx effectively

Installing ctx gives an agent memory. Using it well takes discipline — because a memory shared across sessions has a failure mode a single chat doesn't: drift.

Why stored memory drifts

Each time an LLM reads a note and re-saves or summarizes it, it re-interprets it through its own training biases. That isn't random noise — it's a directional filter that pushes the same way every pass: more conservative, more absolute, less attributed. Observations harden into recommendations, recommendations into rules, rules into dogma — and the certainty becomes untraceable.

A stored block is also a point-in-time observation, not live state. A note that was true when written ("we migrated off X") can stay true and still drive a wrong action (deleting X's still-running sibling service) — because the scope shifted and the note never said so. The note tells you where to look, not what's true right now.

Discipline — put this in your agent's instructions

Load conventions into context before working; don't just file them away. Effectiveness ranks in-context anchors > file-instructions > training-weights: only an anchor in the current context reliably overrides a trained default. A discipline doc that's never loaded gets silently re-undermined by each new session. (ctx query your project conventions at session start.)
Trace every stored claim to a source. Save quote + date; keep verified user statements separate from your own interpretation. An interpretation re-saved as fact is how a "probably" disappears across three persistence layers.
Cross-check stored claims against live state before acting. Before a destructive or status-dependent step, verify against the authoritative source — live config, a test, the actual file — not the note.
Don't gate on self-reported confidence. Models are often just as sure when wrong. Gate on external truth: a test, the source, observed behavior.
Prefer external signals over self-reminders. Naming a failure mode as a rule ("don't forget the tests") tends to re-evoke it; build a check instead — a test script, a grep on the output, a verifier against the raw data.

Calibration

LLM defaults are tuned for a median user who must be protected from uninformed decisions. For an experienced operator with a defined target, the same training produces systematic distortion: judging against the current state instead of the target ("good enough for now"), preferring the familiar over the better option, asking permission on obvious next steps while making user-facing decisions unprompted, and presenting trained caution as judgement ("that's overkill") with no concrete risk named.

Compensating for it is a one-time setup the agent should drive:

Store the calibration as a block. Have the agent write your conventions and observed failure modes into ctx — a dedicated "RLHF warnings" block is a good seed — so every future session can retrieve them instead of relearning them.
Point your durable instructions at that block. Your platform's personal-preference / custom-instruction field, or a project-level instruction file, should reference it. This is the step the agent should prompt you to do — it's the one layer the agent can't write for itself, and without it the block just sits there unread.
Each session loads the anchor. The durable instruction tells the agent to ctx query that block before working, so the calibration lands in context — the only layer that reliably overrides a trained default — instead of staying filed away.

State the desired behavior rather than the unwanted one (naming the bad behavior re-evokes it). This isn't about disabling safety — it's about re-aiming a calibration meant for someone else, and keeping that aim across sessions.

Quick Install

# Binary (Linux/macOS/Windows)
curl -fsSL https://github.com/GottZ/ctx/releases/latest/download/ctx-$(uname -s | tr A-Z a-z)-$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/') -o /usr/local/bin/ctx && chmod +x /usr/local/bin/ctx

# Or with Go
go install github.com/GottZ/ctx/cmd/ctx@latest

Setup

1. Configure endpoint

# Linux/macOS
mkdir -p ~/.config/ctx
cat > ~/.config/ctx/config << 'EOF'
CTX_BASE_URL=https://your-ctx-host.example
CTX_KEY=your-api-key-here
EOF

# Windows (PowerShell)

New-Item -ItemType Directory -Force "$env:APPDATA\ctx"
@"
CTX_BASE_URL=https://your-ctx-host.example
CTX_KEY=your-api-key-here
"@ | Set-Content "$env:APPDATA\ctx\config"

2. Verify

ctx health    # DB + Ollama connectivity
ctx stats     # Block count, categories, storage

3. Claude Code integration (optional)

Statusline — live block count, health, and rate limits:

{ "statusLine": { "type": "command", "command": "ctx statusline" } }

Slash commands — add to ~/.claude/settings.json:

{
  "customSlashCommands": [
    { "name": "ctx",        "command": "ctx query \"$PROMPT\"" },
    { "name": "ctx-save",   "command": "ctx save $PROMPT" },
    { "name": "ctx-browse", "command": "ctx search $PROMPT" },
    { "name": "ctx-stats",  "command": "ctx stats" }
  ]
}

Agent hooks — automatic project briefing for subagents:

{
  "hooks": {
    "SubagentStart": [{ "hooks": [{ "type": "command", "command": "ctx brief --hook" }] }],
    "SubagentStop":  [{ "hooks": [{ "type": "command", "command": "ctx persist --hook" }] }]
  }
}

CLI

| Command | Description |
|---|---|
| `ctx query question` | Hybrid search + LLM synthesis (formatted, `--json` for raw) |
| `ctx save <cat> <title> - <content>` | Upsert knowledge block |
| `ctx save --tag tag1,tag2 <cat> <title>` | Upsert with tags |
| `ctx search [category] [query:text]` | Compact search (no LLM) |
| `ctx get <id>` | Fetch full block |
| `ctx delete <id>` | Soft-delete (archive) |
| `ctx categories` | List all categories |
| `ctx stats` | Database statistics + Dream backlog (`dream_queue`: pickable/cooldown/incoming-forecast) |
| `ctx health` | Healthcheck |
| `ctx guard [list\|stats\|resolve]` | Write Guard management |
| `ctx dream [stats\|review]` | Dream Mode stats and link review. Stats cover mode, `queue` (backlog + incoming forecast), and `backoff` (per-eval-count maturity distribution: how far each block has cooled off + effective cooldown); human-readable on a TTY, JSON when piped |
| `ctx dream enable\|disable\|throttle` | Runtime dream mode control (on/off/throttled) |
| `ctx brief` | Project briefing from store |
| `ctx persist` | Persist `[PERSIST:cat:title]` markers |
| `ctx ingest <path>` | Ingest Obsidian vault |
| `ctx digest` | Rebuild topic map |
| `ctx statusline` | Claude Code status bar |
| `ctx mcp [add\|list\|delete]` | Manage MCP OAuth client registrations |
| `ctx keys create <label> --home <scope>` | Provision API key (v2.0.0: `--home` required, no default scope) |
| `ctx keys [list\|delete]` | List / revoke provisioned API keys |
| `ctx version` | Print version |
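For orientation, the `[PERSIST:cat:title]` marker shape that `ctx persist` looks for can be matched with a small regular expression. This is only a sketch of the marker syntax as written in the table; how ctx itself parses markers, and what content it attaches to them, is not specified here, so `find_markers` and its regex are assumptions.

```python
import re

# Matches [PERSIST:<cat>:<title>]; the category may not contain ':' or ']'.
MARKER = re.compile(r"\[PERSIST:([^:\]]+):([^\]]+)\]")

def find_markers(text: str) -> list:
    """Return (category, title) pairs for every marker found in the text."""
    return [(m.group(1).strip(), m.group(2).strip()) for m in MARKER.finditer(text)]

output = "Done. [PERSIST:decisions:Use pgvector HNSW] Next step follows."
markers = find_markers(output)
```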

Architecture

Query ──► Parse Temporal ──► Embed ──► 4-Way RRF ──► Gravity Boost ──► filterSuperseded ──► LLM Synthesis
          │                            ├─ Semantic (0.45)    │
          │                            ├─ EN-FTS   (0.25)    ├─ Linear (Power-Law, content_times)
          │                            ├─ DE-FTS   (0.20)    └─ Cyclic (Gaussian, EAV dimensions)
          │                            └─ Trigram  (0.10)       ├─ weekday σ=0.07  ┌─────────────────────────────┐
          │                                                     ├─ month   σ=0.10  │  Dream Mode (continuous)     │
          └─► DimensionWeights                                  ├─ quarter σ=0.12  │  N workers (PARALLELISM=N)   │
              {weekday:1.0}  "immer dienstags"                  ├─ week    σ=0.08  │  atomic claim (SKIP LOCKED)  │
              {month:0.4, seasonal:0.6}  "Weihnachten"          ├─ monthday σ=0.10 │  Pick → Keywords → RRF       │
              {monthday:1.0}  "Monatsanfang"                    ├─ seasonal σ=0.08 │  → LLM Eval → Links          │
              {daily:1.0}    "morgens"                          └─ daily   σ=0.08  │  → ApplySupersedes           │
                                                                                   │  → PromoteToCanonical        │
                                                                                   └─────────────────────────────┘

Store ──► Extract Times ──► Hash NOOP ──────────────► Guard (async, 60s)
          (content + created_at)          │           ├─ ≥0.98: auto-archive
          │                               │           ├─ 0.92-0.98: flag needs_review
          │                               │           └─ <0.92: clean
          │                               └─► Embed (async, scheduler backfill, tx-wrapped)
          └─► Dimensions = Union(content anchors ∪ meta anchor)
              • Content: dates mentioned in text (semantic)
              • Meta: created_at timestamp (every block, always)
              • ON CONFLICT dedups overlapping timestamps
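The Guard thresholds in the Store pipeline map a similarity score to one of three outcomes. A minimal sketch, assuming the similarity is a value in [0, 1] and that the boundaries are inclusive on the lower end as the diagram suggests; `guard_verdict` is a hypothetical name.

```python
def guard_verdict(similarity: float) -> str:
    """Classify a near-duplicate score using the thresholds from the diagram."""
    if similarity >= 0.98:
        return "auto-archive"   # near-identical: archived automatically
    if similarity >= 0.92:
        return "needs_review"   # suspicious overlap: flagged for review
    return "clean"              # distinct enough: left alone

verdicts = [guard_verdict(s) for s in (0.99, 0.95, 0.50)]
```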

Stack: Go 1.26, PostgreSQL 18 + pgvector 0.8.2, 48 SQL migrations. Dual-protocol inference (Ollama native or OpenAI-compatible) via any provider — per-pipeline configurable via CTX_*_PROTOCOL, CTX_EMBED_*, CTX_CHAT_*, CTX_DREAM_* env vars.
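The weighted 4-way fusion step in the query pipeline can be sketched as standard reciprocal rank fusion with a per-channel weight. The channel weights come from the diagram; the smoothing constant k=60 is the conventional RRF default and an assumption here, as are the channel keys and the `weighted_rrf` name. The real implementation is in Go and may differ.

```python
from collections import defaultdict

# Channel weights from the Architecture diagram (they sum to 1.0).
WEIGHTS = {"semantic": 0.45, "en_fts": 0.25, "de_fts": 0.20, "trigram": 0.10}
K = 60  # assumed: the conventional RRF smoothing constant

def weighted_rrf(rankings: dict, k: int = K) -> list:
    """Fuse per-channel rankings: score(block) = sum of weight / (k + rank)."""
    scores = defaultdict(float)
    for channel, ranked_ids in rankings.items():
        weight = WEIGHTS[channel]
        for rank, block_id in enumerate(ranked_ids, start=1):
            scores[block_id] += weight / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

fused = weighted_rrf({
    "semantic": ["a", "b"],
    "en_fts": ["b", "a"],
    "de_fts": ["b"],
    "trigram": ["c"],
})
```

In this toy input, block "b" wins because three channels rank it, even though the highest-weighted channel prefers "a".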

Key environment variables

| Var | Default | Purpose |
|---|---|---|
| `CTX_BASE_URL` / `CTX_KEY` | – | CLI client config (`~/.config/ctx/config`) |
| `CONTEXT_DB` / `CONTEXT_DB_USER` / `CONTEXT_DB_PASSWORD` | – | Database (separate from inference) |
| `CTX_EMBED_HOST` / `_PROTOCOL` / `_MODEL` / `_DIMS` | `ollama` / – / `1024` | Embedding pipeline (e.g. qwen3-embedding:8b) |
| `CTX_CHAT_HOST` / `_PROTOCOL` / `_MODEL` / `_THINK` | `ollama` / – / `false` | Generator pipeline (RRF synthesis) |
| `CTX_DREAM_ENABLED` | `false` | Toggle continuous Dream loop |
| `CTX_DREAM_PARALLELISM` | `1` | Concurrent Dream workers, race-safe via atomic claim |
| `CTX_DREAM_HOST` / `_PROTOCOL` / `_MODEL` / `_NUM_CTX` | inherits chat | Separate Dream model (e.g. larger, slower) |
| `CTX_DREAM_EMBED_*` | inherits embed | Separate embedding endpoint for Dream (e.g. CPU sidecar) |
| `CTX_DREAM_IDLE_WAIT` | `20` (s) | Backoff when no pending blocks |
| `CTX_DREAM_BACKOFF_MODE` / `_FACTOR` / `_MIN` / `_GRACE` / `_CAP` / `_INERT_OFFSET` | `exp` / `1.6` / `12h` / `0` / `45d` / `7` | Re-dream back-off by eval count (`exp`/`log`/`linear`/`off`). Cooldown grows from `MIN` (n=0) to `CAP`: fresh blocks re-dream sub-day to catch new links, mature blocks back off to the cap. `_MIN`/`_CAP` take a duration with a unit suffix: `h` hours, `d` days, `w` weeks, `m` months (30d), `y` years (365d), e.g. `12h`, `45d`, `1w` (bare number = hours). `_INERT_OFFSET` starts a no-links cycle further up the curve |
| `CTX_PROMPT_VERSION` | `v5.2` | Generator-prompt version (`v5.2` default, `v6` opt-in graded confidence) |
| `CTX_TIMEZONE` | `Europe/Berlin` | Cyclic-temporal phase calculation |
| `CTX_CONFIDENT_THRESHOLD` | `0.008` | Generator-side refusal threshold (RRF score below → "I don't know") |
| `CTX_READ_SCOPES` | scope-derived | API key's effective read-scope set (v2.0.0+ scheduler config) |
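The duration syntax for `_MIN` / `_CAP` is fully specified in the table, so it can be sketched directly. The cooldown curve is less certain: the function below assumes `exp` mode means MIN × FACTOR^n clamped at CAP, and it ignores `_GRACE` and `_INERT_OFFSET`. Both function names are hypothetical, and the actual curve in ctx may differ.

```python
# Hours per unit suffix, as documented: m = 30 days, y = 365 days.
UNIT_HOURS = {"h": 1, "d": 24, "w": 24 * 7, "m": 24 * 30, "y": 24 * 365}

def parse_hours(value: str) -> float:
    """Convert '12h', '45d', '1w', ... to hours; a bare number is already hours."""
    value = value.strip()
    if value and value[-1] in UNIT_HOURS:
        return float(value[:-1]) * UNIT_HOURS[value[-1]]
    return float(value)

def exp_cooldown_hours(n: int, factor: float = 1.6,
                       minimum: str = "12h", cap: str = "45d") -> float:
    """Assumed 'exp' curve: MIN * FACTOR**n for eval count n, clamped at CAP."""
    return min(parse_hours(minimum) * factor ** n, parse_hours(cap))

schedule = [round(exp_cooldown_hours(n), 1) for n in (0, 1, 5, 10)]
```

With the documented defaults and this assumed curve, a fresh block would re-dream after 12 hours and the 45-day cap would be reached around the tenth evaluation.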

Key features:

GottZ 4-Way RRF: reciprocal rank fusion across semantic, bilingual fulltext, and trigram channels. It is block_role-aware via a 4-class enum: system-meta is hard-excluded (including digest-generated topic maps, via the Welle-44 hook), while audit-trail, reference, and knowledge pass in full. Uniform damping proved ineffective in Welle 40; query-aware damping is pending for a follow-up wave (Welle 41+).
GottZ Scope Model — multi-tenant isolation (private/work/shared) via API key scoping
GottZ Guard — async deduplication via PG LISTEN/NOTIFY + HNSW similarity
GottZ Cyclic Phase Model — 7 cyclic temporal dimensions (weekday/month/quarter/week/monthday/seasonal/daily) with normalized phase [0,1) and per-dimension Gaussian decay. Queries route to dimensions via parser (18-matcher deterministic engine). Timezone-aware via CTX_TIMEZONE.
Forward Telescoping — older blocks get a wider linear gravity well (effective power scaled by 1 / (1 + 0.3·ln(1+age/30))) so a 6-month-old block isn't drowned out by a 1-week-old block when the user asks about a date in that window. Future dates keep their 1.2× sharper cutoff. Matches Rubin & Baddeley 1989's age-dependent recall imprecision.
GottZ Temporal Dimension Table — EAV storage with partial B-Tree indexes, O(log n) dimension lookups at 1M+ scale. Every block carries multiple anchors: content-mentioned times (semantic) + created_at (meta) as independent signals.
Dream Mode: continuous autonomous cross-referencing with dual-model support. The v5 prompt targets the qwen3.6:27b non-thinking sampler; dream pipeline version 5 adds a recurrent relationship class, detected via context_temporal + title similarity (Phase 1) followed by an LLM pass (Phase 2). It also provides adaptive cooldown, supersedes detection, temporal validation, a hard cap of 5 links per cycle with type-diversity tie-break, replace-semantics with snapshot revert, and runtime mode control (on/throttled/off via API). Throttled mode pauses between GPU-intensive steps for thermal management. Parallel workers (CTX_DREAM_PARALLELISM, default 1) use an atomic FOR UPDATE SKIP LOCKED block-claim, which is race-safe under contention. LLM-output parsing is robust: it tolerates array-form, single-object, fenced-array, and compact-multi-key-object link formats from heterogeneous LLM outputs. Config: CTX_DREAM_IDLE_WAIT (seconds, default 20)
Supersedes Filtering — temporal-gated removal of outdated blocks from query results
Embed Cache — content-hash-keyed embedding cache (context_embed_cache) to avoid re-embedding identical text across pipelines
LLM Log — per-call request/response capture (context_llm_log) with input/output token counts (Ollama + OpenAI), dream-pipeline version tagging, and parse-format drift tagging (metadata.parse_format: array | object | fenced-array | fenced-object) for pipeline debugging + offline benchmark replay
MCP Remote — Streamable HTTP transport with OAuth 2.1 PKCE for claude.ai/Claude Code integration. Tools: query, store, search, get, recent. Client registration via ctx mcp add. Tool handlers return Content[].text (no structured output) — tested in test.sh T17/T18
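The Forward Telescoping factor in the feature list above is a closed-form expression, so it is easy to evaluate. The sketch assumes `age` is measured in days (the text only gives age/30, which reads like a 30-day month) and covers only the scale factor, not how ctx applies it to the power-law gravity; `telescoping_scale` is a hypothetical name.

```python
import math

def telescoping_scale(age_days: float) -> float:
    """Scale on the effective power: 1 / (1 + 0.3 * ln(1 + age / 30))."""
    return 1.0 / (1.0 + 0.3 * math.log(1.0 + age_days / 30.0))

week_old = telescoping_scale(7)        # barely widened
half_year_old = telescoping_scale(180)  # noticeably wider gravity well
```

A smaller factor means a flatter, wider well, so under this assumption a 6-month-old block is penalized less for a date mismatch than a 1-week-old block.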

API

All endpoints under /api/*. Auth via X-Context-Key header or Authorization: Bearer token.

| Endpoint | Description |
|---|---|
| `POST /api/query` | 4-Way RRF + LLM synthesis (auto-backfills pending embeddings; optional `categories_exclude` / `block_roles_exclude` arrays filter slot-stealers) |
| `POST /api/store` | Upsert (embedding async via scheduler) |
| `POST /api/search` | Lightweight search (no LLM) |
| `POST /api/manage` | CRUD, Guard API, stats, API-key management (`api-key-create` requires `home_scope`) |
| `POST /api/digest` | Topic map generation |
| `POST /api/ingest` | Obsidian vault ingestion |
| `POST /api/blob/*` | Binary storage (store/fetch/search/manage) |
| `GET /health` | DB + Ollama connectivity |
| `POST\|GET\|DELETE /mcp` | MCP Streamable HTTP (remote tool server) |
| `GET /authorize` | OAuth 2.1 authorization (PKCE) |
| `POST /token` | OAuth 2.1 token exchange |
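For the HTTP fallback path, a request to `POST /api/query` can be assembled with the standard library alone. The endpoint, the `X-Context-Key` header, and the optional `categories_exclude` array are taken from this README; the request body schema is not documented here, so the `"query"` field name is an assumption, as is the helper name `build_query_request`. The sketch only builds the request and does not send it.

```python
import json
import urllib.request
from typing import Optional

def build_query_request(base_url: str, api_key: str, question: str,
                        categories_exclude: Optional[list] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST /api/query request."""
    payload = {"query": question}  # assumed field name; check the server's schema
    if categories_exclude:
        payload["categories_exclude"] = categories_exclude  # documented optional filter
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Context-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

req = build_query_request("https://your-ctx-host.example/", "your-api-key-here",
                          "What did we decide about backups?", ["index"])
```

Passing the result to `urllib.request.urlopen(req)` would send it; the response format is likewise not covered by this sketch.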

Building

go build -o ctx ./cmd/ctx/           # CLI
go build -o ctxd ./cmd/ctxd/         # Daemon
go test ./... -short                  # Unit tests

CRAG-Bench Test Instance

Optional, profile-gated isolated ctx instance for the CRAG retrieval benchmark. Reuses the prod n8n-ctx image, runs against a separate test DB (ctx_crag_test), and binds to 127.0.0.1:18080 only.

# 1. Provision env file (NEVER commit the real .env.crag)
cp .env.crag.example .env.crag
# edit: openssl rand -hex 32 for CONTEXT_DB_PASSWORD and CRAG_API_KEY

# 2. Bootstrap the test DB (one-shot)
docker exec \
  -e POSTGRES_USER=admin -e POSTGRES_DB=n8n -e POSTGRES_PASSWORD=<admin> \
  -e CONTEXT_DB=ctx_crag_test \
  -e CONTEXT_DB_USER=crag_user \
  -e CONTEXT_DB_PASSWORD=<from .env.crag> \
  n8n-db-1 bash /docker-entrypoint-initdb.d/init-data.sh

# 3. Start the profile-gated service
docker compose --profile crag --env-file .env.crag up -d ctx-crag
docker compose --profile crag --env-file .env.crag logs -f ctx-crag

# 4. Verify
curl -sf http://127.0.0.1:18080/health

The default docker compose up -d ignores profile-gated services — prod ctx is unaffected.

License

MPL-2.0 — By GottZ
