Merged
11 changes: 9 additions & 2 deletions README.md
@@ -222,6 +222,11 @@ provider-backed ELF evidence was required.
and surfaces it as `page_version_diff` in benchmark artifacts. The live command now
reports `version_diff_coverage = 1.000` while preserving deterministic page content
hashes and `source_mutation_allowed = false`.
- Graph topic-map reports after XY-1020: the June 20 follow-up adds
`elf.graph_report/v1` through service, HTTP, and MCP readback. Reports use
Postgres graph-lite facts to show current, historical, future, sourced, inferred,
ambiguous, stale, and superseded markers without introducing a separate graph
database or replacing source evidence.
- Operator-approved public-proxy addendum after XY-930: the June 19 follow-up runs
`cargo make baseline-production-private-addendum` with a simulated/public-proxy
production corpus manifest approved for this stage. The run records 12 documents,
@@ -348,6 +353,7 @@ Detailed evidence and interpretation:
- [Service-Native Dreaming Readback Report - June 19, 2026](docs/evidence/benchmarking/2026-06-19-service-native-dreaming-readback-report.md)
- [OpenMemory UI/Export Product Readback Report - June 19, 2026](docs/evidence/benchmarking/2026-06-19-openmemory-ui-export-product-readback-report.md)
- [Operator-Approved Public-Proxy Production-Private Addendum - June 19, 2026](docs/evidence/benchmarking/2026-06-19-operator-approved-public-proxy-production-private-addendum.md)
- [Graph Topic-Map Report - June 20, 2026](docs/evidence/benchmarking/2026-06-20-graph-topic-map-report.md)
- [Knowledge Workspace Version-Diff Report - June 20, 2026](docs/evidence/benchmarking/2026-06-20-knowledge-workspace-version-diff-report.md)
- [Live Knowledge-Page Rebuild/Lint Report - June 20, 2026](docs/evidence/benchmarking/2026-06-20-live-knowledge-page-rebuild-lint-report.md)
- [Live Baseline Benchmark Runbook](docs/runbook/benchmarking/live_baseline_benchmark.md)
@@ -451,8 +457,9 @@ Detailed comparison, mechanism-level analysis, and source map:
- [Dreaming Product Surface Follow-Up Research](docs/research/dreaming_product_surface_followup.md)

Latest real-world benchmark report: June 20, 2026. Latest external research refresh:
-June 11, 2026; June 20 adds the Knowledge Workspace Version-Diff Report - June 20, 2026
-and the Live Knowledge-Page Rebuild/Lint Report - June 20, 2026 after the June 19
+June 11, 2026; June 20 adds the Graph Topic-Map Report - June 20, 2026,
+Knowledge Workspace Version-Diff Report - June 20, 2026, and the Live
+Knowledge-Page Rebuild/Lint Report - June 20, 2026 after the June 19
XY-930 operator-approved public-proxy production addendum and service-native Dreaming
readback, the qmd debug-ergonomics Dreaming retest, the June 17 competitor-strength
closeout, and the June 16 temporal reconciliation, live consolidation self-check,
79 changes: 68 additions & 11 deletions apps/elf-api/src/routes.rs
@@ -54,17 +54,17 @@ use elf_service::{
DocsGetRequest, DocsGetResponse, DocsPutRequest, DocsPutResponse, DocsSearchL0Request,
DocsSearchL0Response, EntityMemoryViewRequest, EntityMemoryViewResponse, Error, EventMessage,
GranteeKind, GraphQueryEntityRef, GraphQueryPredicateRef, GraphQueryRequest,
-GraphQueryResponse, IngestionProfileSelector, KnowledgePageGetRequest,
-KnowledgePageLintRequest, KnowledgePageLintResponse, KnowledgePageRebuildRequest,
-KnowledgePageRebuildResponse, KnowledgePageResponse, KnowledgePageSearchRequest,
-KnowledgePageSearchResponse, KnowledgePagesListRequest, KnowledgePagesListResponse,
-ListRequest, ListResponse, MemoryHistoryGetRequest, MemoryHistoryResponse, NoteFetchRequest,
-NoteFetchResponse, NoteProvenanceBundleResponse, NoteProvenanceGetRequest, PayloadLevel,
-PublishNoteRequest, QueryPlan, RankingRequestOverride, RebuildReport, SearchDetailsRequest,
-SearchDetailsResult, SearchExplainRequest, SearchExplainResponse, SearchIndexItem,
-SearchRequest, SearchResponse, SearchSessionGetRequest, SearchTimelineGroup,
-SearchTimelineRequest, SearchTrajectoryResponse, SearchTrajectorySummary, ShareScope,
-SpaceGrantRevokeRequest, SpaceGrantRevokeResponse, SpaceGrantUpsertRequest,
+GraphQueryResponse, GraphReportRequest, GraphReportResponse, IngestionProfileSelector,
+KnowledgePageGetRequest, KnowledgePageLintRequest, KnowledgePageLintResponse,
+KnowledgePageRebuildRequest, KnowledgePageRebuildResponse, KnowledgePageResponse,
+KnowledgePageSearchRequest, KnowledgePageSearchResponse, KnowledgePagesListRequest,
+KnowledgePagesListResponse, ListRequest, ListResponse, MemoryHistoryGetRequest,
+MemoryHistoryResponse, NoteFetchRequest, NoteFetchResponse, NoteProvenanceBundleResponse,
+NoteProvenanceGetRequest, PayloadLevel, PublishNoteRequest, QueryPlan, RankingRequestOverride,
+RebuildReport, SearchDetailsRequest, SearchDetailsResult, SearchExplainRequest,
+SearchExplainResponse, SearchIndexItem, SearchRequest, SearchResponse, SearchSessionGetRequest,
+SearchTimelineGroup, SearchTimelineRequest, SearchTrajectoryResponse, SearchTrajectorySummary,
+ShareScope, SpaceGrantRevokeRequest, SpaceGrantRevokeResponse, SpaceGrantUpsertRequest,
SpaceGrantsListRequest, TextPositionSelector, TextQuoteSelector, TraceBundleGetRequest,
TraceBundleResponse, TraceGetRequest, TraceGetResponse, TraceRecentListRequest,
TraceRecentListResponse, TraceTrajectoryGetRequest, UnpublishNoteRequest, UpdateRequest,
@@ -285,6 +285,16 @@ struct GraphQueryBody {
explain: Option<bool>,
}

#[derive(Clone, Debug, Deserialize)]
struct GraphReportBody {
subject: GraphQueryEntityRef,
predicate: Option<GraphQueryPredicateRef>,
scopes: Option<Vec<String>>,
as_of: Option<String>,
limit: Option<u32>,
explain: Option<bool>,
}

#[derive(Clone, Debug, Deserialize)]
struct SearchCreateRequest {
mode: SearchMode,
@@ -652,6 +662,7 @@ pub fn router(state: AppState) -> Router {
.route("/v2/searches/{search_id}/timeline", routing::get(searches_timeline))
.route("/v2/searches/{search_id}/notes", routing::post(searches_notes))
.route("/v2/graph/query", routing::post(graph_query))
.route("/v2/graph/report", routing::post(graph_report))
.route("/v2/notes", routing::get(notes_list))
.route(
"/v2/notes/{note_id}",
@@ -1846,6 +1857,52 @@ async fn graph_query(
Ok(Json(response))
}

#[utoipa::path(
post,
path = "/v2/graph/report",
tag = "graph",
request_body = Value,
responses(
(status = 200, description = "Source-backed graph topic-map report.", body = Value),
(status = 400, description = "Invalid request.", body = ErrorBody),
(status = 401, description = "Authentication required.", body = ErrorBody),
(status = 403, description = "Scope denied.", body = ErrorBody),
(status = 422, description = "Non-English input rejected.", body = ErrorBody),
(status = 500, description = "Internal error.", body = ErrorBody),
)
)]
async fn graph_report(
State(state): State<AppState>,
headers: HeaderMap,
payload: Result<Json<GraphReportBody>, JsonRejection>,
) -> Result<Json<GraphReportResponse>, ApiError> {
let ctx = RequestContext::from_headers(&headers)?;
let read_profile = required_read_profile(&headers)?;
let Json(payload) = payload.map_err(|err| {
tracing::warn!(error = %err, "Invalid request payload.");

json_error(StatusCode::BAD_REQUEST, "INVALID_REQUEST", "Invalid request payload.", None)
})?;
let as_of = parse_optional_rfc3339(payload.as_of.as_ref(), "$.as_of")?;
let response = state
.service
.graph_report(GraphReportRequest {
tenant_id: ctx.tenant_id,
project_id: ctx.project_id,
agent_id: ctx.agent_id,
read_profile,
subject: payload.subject,
predicate: payload.predicate,
scopes: payload.scopes,
as_of,
limit: payload.limit,
explain: payload.explain,
})
.await?;

Ok(Json(response))
}

#[utoipa::path(
post,
path = "/v2/searches",
45 changes: 45 additions & 0 deletions apps/elf-eval/tests/real_world_job_benchmark.rs
@@ -304,6 +304,14 @@ fn graph_rag_citation_navigation_promotion_report_markdown_path() -> Result<Path
.join("2026-06-19-graph-rag-citation-navigation-promotion-report.md"))
}

fn graph_topic_map_report_markdown_path() -> Result<PathBuf> {
Ok(workspace_root()?
.join("docs")
.join("evidence")
.join("benchmarking")
.join("2026-06-20-graph-topic-map-report.md"))
}

fn operator_approved_public_proxy_private_addendum_report_markdown_path() -> Result<PathBuf> {
Ok(workspace_root()?
.join("docs")
@@ -3822,6 +3830,43 @@ fn graph_rag_citation_navigation_promotion_preserves_typed_non_passes() -> Resul
Ok(())
}

#[test]
fn graph_topic_map_report_wires_source_backed_graph_lite_readback() -> Result<()> {
let markdown = fs::read_to_string(graph_topic_map_report_markdown_path()?)?;
let benchmarking_index = fs::read_to_string(benchmarking_index_path()?)?;
let readme = fs::read_to_string(readme_path()?)?;
let graph_report_service =
fs::read_to_string(workspace_root()?.join("packages/elf-service/src/graph_report.rs"))?;
let api_routes = fs::read_to_string(workspace_root()?.join("apps/elf-api/src/routes.rs"))?;
let mcp_server = fs::read_to_string(workspace_root()?.join("apps/elf-mcp/src/server.rs"))?;
let graph_spec =
fs::read_to_string(workspace_root()?.join("docs/spec/system_graph_memory_postgres_v1.md"))?;

assert!(markdown.contains("Graph Topic-Map Report - June 20, 2026"));
assert!(markdown.contains("elf.graph_report/v1"));
assert!(markdown.contains("sourced"));
assert!(markdown.contains("inferred"));
assert!(markdown.contains("ambiguous"));
assert!(markdown.contains("stale"));
assert!(markdown.contains("superseded"));
assert!(markdown.contains("valid_from"));
assert!(markdown.contains("valid_to"));
assert!(markdown.contains("valid_at"));
assert!(markdown.contains("invalid_at"));
assert!(graph_report_service.contains("ELF_GRAPH_REPORT_SCHEMA_V1"));
assert!(graph_report_service.contains("GraphReportSummary"));
assert!(graph_report_service.contains("build_topic_map"));
assert!(api_routes.contains("/v2/graph/report"));
assert!(mcp_server.contains("elf_graph_report"));
assert!(graph_spec.contains("elf.graph_report/v1"));
assert!(graph_spec.contains("Graphiti/Zep `valid_at` and `invalid_at`"));
assert!(benchmarking_index.contains("2026-06-20-graph-topic-map-report.md"));
assert!(readme.contains("Graph Topic-Map Report - June 20, 2026"));
assert!(readme.contains("Graph topic-map reports after XY-1020"));

Ok(())
}

fn assert_openviking_trajectory_materialization_summary(report: &Value) -> Result<()> {
assert_eq!(
report.pointer("/schema").and_then(Value::as_str),
22 changes: 21 additions & 1 deletion apps/elf-mcp/src/server.rs
@@ -268,6 +268,15 @@ impl ElfMcp {
self.forward(HttpMethod::Post, "/v2/graph/query", params, None).await
}

#[rmcp::tool(
name = "elf_graph_report",
description = "Build a source-backed graph topic map with current, historical, future, inferred, ambiguous, stale, and superseded fact markers.",
input_schema = graph_report_schema()
)]
async fn elf_graph_report(&self, params: JsonObject) -> Result<CallToolResult, ErrorData> {
self.forward(HttpMethod::Post, "/v2/graph/report", params, None).await
}

#[rmcp::tool(
name = "elf_events_ingest",
description = "Ingest an event by extracting evidence-bound notes using the configured LLM extractor.",
@@ -1024,6 +1033,10 @@ fn graph_query_schema() -> Arc<JsonObject> {
}))
}

fn graph_report_schema() -> Arc<JsonObject> {
graph_query_schema()
}

fn events_ingest_schema() -> Arc<JsonObject> {
Arc::new(rmcp::object!({
"type": "object",
@@ -1603,7 +1616,7 @@ mod tests {

type RequestRecorder = Arc<Mutex<Option<oneshot::Sender<RecordedRequest>>>>;

-const ALL_TOOL_DEFINITIONS: [ToolDefinition; 31] = [
+const ALL_TOOL_DEFINITIONS: [ToolDefinition; 32] = [
ToolDefinition::new(
"elf_notes_ingest",
HttpMethod::Post,
@@ -1616,6 +1629,12 @@ mod tests {
"/v2/graph/query",
"Query graph entities and relations by structured criteria.",
),
ToolDefinition::new(
"elf_graph_report",
HttpMethod::Post,
"/v2/graph/report",
"Build a source-backed graph topic map with current, historical, future, inferred, ambiguous, stale, and superseded fact markers.",
),
ToolDefinition::new(
"elf_events_ingest",
HttpMethod::Post,
@@ -1828,6 +1847,7 @@ mod tests {
let expected = [
"elf_notes_ingest",
"elf_graph_query",
"elf_graph_report",
"elf_events_ingest",
"elf_core_blocks_get",
"elf_entity_memory_get",
74 changes: 74 additions & 0 deletions docs/evidence/benchmarking/2026-06-20-graph-topic-map-report.md
@@ -0,0 +1,74 @@
---
type: Evidence
title: "Graph Topic-Map Report - June 20, 2026"
description: "Checked-in benchmark evidence record: Graph Topic-Map Report - June 20, 2026."
resource: docs/evidence/benchmarking/2026-06-20-graph-topic-map-report.md
status: active
authority: current_state
owner: evidence
last_verified: 2026-06-20
tags:
- docs
- evidence
- benchmarking
---
# Graph Topic-Map Report - June 20, 2026

Goal: Close XY-1020's graph-lite product increment by proving ELF can report
Postgres-backed temporal graph facts as source-backed topic maps without introducing
a separate graph database or hidden source of truth.
Read this when: You need to know whether graph facts expose current, historical,
future, sourced, inferred, ambiguous, stale, and superseded status markers.
Inputs: `packages/elf-service/src/graph_report.rs`, `/v2/graph/report`,
`elf_graph_report`, and `docs/spec/system_graph_memory_postgres_v1.md`.
Outputs: Service, HTTP, MCP, and documentation evidence for `elf.graph_report/v1`.

## Executive Judgment

ELF now has a first-class graph report surface for one subject entity. The report
uses existing Postgres graph-lite facts, evidence links, predicate registry metadata,
validity windows, and supersession rows. It returns a topic map plus fact rows with
status markers for `sourced`, `inferred`, `ambiguous`, `stale`, and `superseded`
states.

This is an ELF-native graph-memory readback improvement. It does not claim Graphiti,
Zep, GraphRAG, RAGFlow, LightRAG, llm-wiki, gbrain, or graphify parity. Graphiti/Zep
`valid_at` and `invalid_at` vocabulary remains adapter-boundary terminology only;
ELF internal schema and reports use `valid_from` and `valid_to`.
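
The vocabulary boundary described above can be sketched as a plain field rename. This is a minimal illustration, assuming a hypothetical adapter that receives Graphiti/Zep-style windows; the `ExternalWindow`, `InternalWindow`, and `to_internal` names are not taken from the ELF codebase or from Graphiti/Zep, and keeping timestamps as optional RFC 3339 strings is also an assumption.

```rust
/// External validity window as an adapter might receive it.
/// Hypothetical shape; only the field names come from the report text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ExternalWindow {
    pub valid_at: Option<String>,
    pub invalid_at: Option<String>,
}

/// Internal validity window using ELF's `valid_from`/`valid_to` vocabulary.
/// Hypothetical shape for illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct InternalWindow {
    pub valid_from: Option<String>,
    pub valid_to: Option<String>,
}

/// Rename the fields at the adapter boundary. Values pass through unchanged,
/// so `valid_at`/`invalid_at` never appear past this point.
pub fn to_internal(window: ExternalWindow) -> InternalWindow {
    InternalWindow {
        valid_from: window.valid_at,
        valid_to: window.invalid_at,
    }
}
```

Keeping the rename in one function means the external terms stay confined to the adapter, which matches the stated rule that internal schema and reports only use `valid_from` and `valid_to`.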

## Command Evidence

| Command | Result |
| --- | --- |
| `cargo test -p elf-service graph_report -- --nocapture` | Passed; proves temporal/source/supersession markers and topic-map edges are shaped by service code. |
| `cargo test -p elf-mcp registers_all_tools -- --nocapture` | Passed; guards that `elf_graph_report` remains registered. |
| `cargo test -p elf-eval --test real_world_job_benchmark graph_topic_map_report_wires_source_backed_graph_lite_readback -- --nocapture` | Passed; guards the service, HTTP, MCP, spec, README, and evidence-report wiring. |
| `cargo make check` | Passed; runs formatting, docs, clippy, vstyle, and workspace tests. |

## Contract Readback

| Surface | Contract |
| --- | --- |
| Service | `ElfService::graph_report(GraphReportRequest)` returns `elf.graph_report/v1`. |
| HTTP | `/v2/graph/report` builds a source-backed graph topic-map report under the authenticated read profile. |
| MCP | `elf_graph_report` forwards to `/v2/graph/report` for agent readback. |
| Storage | Existing Postgres graph-lite tables remain authoritative; no graph database is introduced. |
| Vocabulary | Internal schema uses `valid_from`/`valid_to`; Graphiti/Zep `valid_at`/`invalid_at` remains adapter-boundary vocabulary. |

## Status Markers

| Marker | Meaning |
| --- | --- |
| `sourced` | The fact has one or more `graph_fact_evidence.note_id` links. |
| `inferred` | The predicate is pending or unresolved rather than operator-activated. |
| `ambiguous` | Multiple current facts conflict under a single-cardinality predicate. |
| `stale` | The fact is historical at the report `as_of` timestamp. |
| `superseded` | A `graph_fact_supersessions` row links the fact to a replacement. |
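
The marker table above can be read as five independent checks on one fact. The sketch below is an assumption-laden illustration and not the code in `graph_report.rs`: the `FactInputs` struct, its field names, and `status_markers` are hypothetical, and each field stands in for a lookup the service would perform against the graph-lite tables.

```rust
/// Precomputed inputs for one fact. Hypothetical shape for illustration;
/// each field mirrors one row of the marker table.
pub struct FactInputs {
    /// Number of `graph_fact_evidence.note_id` links.
    pub evidence_note_count: usize,
    /// True when the predicate is operator-activated (not pending or unresolved).
    pub predicate_active: bool,
    /// True when multiple current facts conflict under a single-cardinality predicate.
    pub conflicts_under_single_cardinality: bool,
    /// True when the fact is historical at the report `as_of` timestamp.
    pub historical_at_as_of: bool,
    /// True when a `graph_fact_supersessions` row links the fact to a replacement.
    pub has_supersession_row: bool,
}

/// Collect every marker that applies; markers are not mutually exclusive.
pub fn status_markers(fact: &FactInputs) -> Vec<&'static str> {
    let mut markers = Vec::new();
    if fact.evidence_note_count > 0 {
        markers.push("sourced");
    }
    if !fact.predicate_active {
        markers.push("inferred");
    }
    if fact.conflicts_under_single_cardinality {
        markers.push("ambiguous");
    }
    if fact.historical_at_as_of {
        markers.push("stale");
    }
    if fact.has_supersession_row {
        markers.push("superseded");
    }
    markers
}
```

Under these assumptions a fact can carry several markers at once, for example a sourced fact that is also stale and superseded.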

## Follow-Up Queue

| Follow-up | Reason |
| --- | --- |
| XY-1021 | Dreaming/background proposal review can now cite graph report markers before recommending rebuilds or mutations. |
| XY-1022 | Plugin/admin surfaces can expose graph report readback without bypassing source evidence. |
| XY-1023 | Benchmark adapters can score graph report parity only after comparable external artifacts exist. |
1 change: 1 addition & 0 deletions docs/evidence/benchmarking/index.md
@@ -43,5 +43,6 @@ Routes to: Benchmarking evidence concepts under `docs/evidence/benchmarking/`.
- `2026-06-19-operator-approved-public-proxy-production-private-addendum.md`: Operator-Approved Public-Proxy Production-Private Addendum - June 19, 2026; closes the current XY-930 proxy/simulated-corpus stage with 8/8 query pass, 0 wrong_result, and explicit boundaries that this is not real private-corpus or provider-backed proof.
- `2026-06-19-qmd-debug-ergonomics-dreaming-retest-report.md`: qmd Debug-Ergonomics Dreaming Retest Report - June 19, 2026; confirms qmd's default top-k/replay edge is unchanged while ELF keeps the narrow operator-debug trace/stage visibility wins.
- `2026-06-19-service-native-dreaming-readback-report.md`: Service-Native Dreaming Readback Report - June 19, 2026; materializes memory summary, proactive brief, and scheduled-memory derived outputs through `ElfService` readback with 9 pass, 0 wrong_result, and 2 typed XY-930 blockers.
- `2026-06-20-graph-topic-map-report.md`: Graph Topic-Map Report - June 20, 2026; adds the ELF-native `elf.graph_report/v1` readback for Postgres graph-lite facts with sourced, inferred, ambiguous, stale, and superseded topic-map markers.
- `2026-06-20-knowledge-workspace-version-diff-report.md`: Knowledge Workspace Version-Diff Report - June 20, 2026; proves ELF knowledge pages now expose previous-version diff metadata without perturbing page content hashes while preserving citation, lint, and source-of-truth boundaries.
- `2026-06-20-live-knowledge-page-rebuild-lint-report.md`: Live Knowledge-Page Rebuild/Lint Report - June 20, 2026; adds a Docker-contained ELF service-native knowledge-page materialization command while preserving llm-wiki, gbrain, GraphRAG, RAGFlow, LightRAG, and graphify as separate comparison targets until they emit comparable scored page artifacts.
4 changes: 4 additions & 0 deletions docs/log.md
@@ -71,3 +71,7 @@ logs.
rebuild metadata now exposes `elf.knowledge_page.version_diff/v1`, live benchmark
artifacts expose `page_version_diff`, and the Docker-contained live knowledge
report now publishes `version_diff_coverage`.
- Added the Graph Topic-Map report for XY-1020. ELF now exposes
`elf.graph_report/v1` through service, HTTP, and MCP readback, using existing
Postgres graph-lite facts with sourced, inferred, ambiguous, stale, and superseded
markers while keeping `valid_from`/`valid_to` as the internal temporal vocabulary.
7 changes: 7 additions & 0 deletions docs/spec/system_graph_memory_postgres_v1.md
@@ -217,6 +217,13 @@ Supersession rule (write-time):
- `historical` when `valid_to <= read_at`.
- `future` when `valid_from > read_at`.
- Search relation context may include historical facts when they are evidence-linked to a returned note, but it must label them as historical instead of silently treating them as current.
- Graph report APIs expose `elf.graph_report/v1` topic maps from the same Postgres
graph-lite tables. Report facts must retain `valid_from`, `valid_to`,
`evidence_note_ids`, and supersession links, and must mark sourced, inferred,
ambiguous, stale, and superseded states distinctly.
- Graphiti/Zep `valid_at` and `invalid_at` vocabulary is adapter-boundary
terminology only. ELF internal schema, reports, docs, and service payloads use
`valid_from` and `valid_to`.
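
The read-time labels in the rules above can be sketched as a small classifier. This is an illustration only: `TemporalStatus` and `temporal_status` are hypothetical names, timestamps are simplified to Unix seconds, and treating every fact that is neither `future` nor `historical` as current (including an open-ended `valid_to`) is an assumption, since only the `historical` and `future` conditions are quoted here.

```rust
/// Temporal status of a graph fact relative to a read timestamp.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TemporalStatus {
    Current,
    Historical,
    Future,
}

/// Classify a fact window against `read_at`.
/// Hypothetical helper; all timestamps are Unix seconds for illustration.
pub fn temporal_status(valid_from: i64, valid_to: Option<i64>, read_at: i64) -> TemporalStatus {
    if valid_from > read_at {
        // `future` when `valid_from > read_at`.
        TemporalStatus::Future
    } else if matches!(valid_to, Some(end) if end <= read_at) {
        // `historical` when `valid_to <= read_at`.
        TemporalStatus::Historical
    } else {
        // Assumed default: the window has started and has not yet closed.
        TemporalStatus::Current
    }
}
```

The `future` check runs first, so a malformed window that both starts and ends out of order is reported as future in this sketch; a real implementation may prefer to reject such rows.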

============================================================
7. CALL EXAMPLES