1 change: 1 addition & 0 deletions doc/user/content/architecture-patterns/_index.md
@@ -12,3 +12,4 @@ menu:
Pattern | Description
--------|------------
[Live Context Graph](/architecture-patterns/live-context-graph/) | Model your business as a compounding ontology of live data products and build apps, services, and AI agents on top of it.
[Use an ontology table](/architecture-patterns/ontology/) | Create an ontology table of join relationships that agents query before writing multi-table SQL.
88 changes: 88 additions & 0 deletions doc/user/content/architecture-patterns/ontology.md
@@ -0,0 +1,88 @@
---
title: "Use an ontology table"
description: "Create an ontology table that helps agents write correct joins."
menu:
main:
parent: architecture-patterns
weight: 10
---

The ontology table is a curated catalog of join relationships between tables in
your database. Each row describes a single join: the columns in one table that
reference columns in another.

An agent can query the ontology table through the `query` tool of the
Materialize [MCP server](/integrations/mcp-server/) before it writes multi-table
SQL.

{{< note >}}
This pattern relies on the MCP server's `query` tool, which is enabled by
default starting in v26.27 for the agent MCP server and v26.30 for the developer
MCP server.
{{</ note >}}

Review comment from the PR author (contributor): I left it out, but do we want to distinguish this from the internal mz_internal.mz_ontology_... views?


```sql
CREATE TABLE ontology (
table_name text NOT NULL,
columns text[] NOT NULL,
referenced_table text NOT NULL,
referenced_columns text[] NOT NULL
);

COMMENT ON TABLE ontology IS
'Defines the join relationships between tables in the database. Each row
describes a single join: the columns in table_name that reference
referenced_columns in referenced_table. ALWAYS query this table before
writing any multi-table query. Use it to confirm exact join keys rather
than guessing column names. Filter by table_name OR referenced_table to
find all relationships involving a given table.';

COMMENT ON COLUMN ontology.table_name IS
'The dependent table, the one that holds the foreign key.';
COMMENT ON COLUMN ontology.columns IS
'The FK columns in table_name, in order. Pair positionally with referenced_columns.';
COMMENT ON COLUMN ontology.referenced_table IS
'The parent table, the one being pointed to.';
COMMENT ON COLUMN ontology.referenced_columns IS
'The PK or unique columns in referenced_table, in order matching columns.';

CREATE DEFAULT INDEX ON ontology;
```
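
The column comments say that `columns` and `referenced_columns` pair positionally, which matters most for multi-column keys. The following is a minimal sketch of such a row. The tables `shipment_lines` and `order_lines` and their columns are hypothetical names used only for illustration; they are not part of this guide's schema.

```sql
-- Hypothetical example: shipment_lines references order_lines through a
-- two-column key whose column names differ between the two tables.
INSERT INTO ontology (table_name, columns, referenced_table, referenced_columns) VALUES
    ('shipment_lines', ARRAY['order_id', 'line_no'],
     'order_lines',    ARRAY['order_id', 'line_number']);

-- Reading the arrays position by position gives the join condition:
--   element 1: shipment_lines.order_id = order_lines.order_id
--   element 2: shipment_lines.line_no  = order_lines.line_number
SELECT s.*
FROM shipment_lines AS s
JOIN order_lines AS l
  ON s.order_id = l.order_id
 AND s.line_no = l.line_number;
```

Keeping both arrays the same length and in the same order is what lets an agent derive the `ON` clause without guessing.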

## Agent system prompt

Add the following to the agent's system prompt to instruct it to consult the
ontology table before joining tables:

```text
Before writing or executing any joins, query the ontology table for the involved table names. Use the returned join keys verbatim.
```
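
As a sketch of what that lookup might look like, assume the agent is about to join the `orders` and `customers` tables from the example schema in the next section. It would first run a query along these lines through the `query` tool; the exact SQL an agent produces will vary.

```sql
-- List every documented relationship that involves either table,
-- whether it appears as the dependent or the referenced side.
SELECT table_name, columns, referenced_table, referenced_columns
FROM ontology
WHERE table_name IN ('orders', 'customers')
   OR referenced_table IN ('orders', 'customers');
```

Filtering on both `table_name` and `referenced_table` follows the guidance in the table comment, so relationships are found regardless of which side of the join a table sits on.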

## Example: e-commerce schema

Given the following tables and join-relevant columns:

| Table | Key columns |
| --- | --- |
| `customers` | `id`, `email` |
| `addresses` | `id`, `customer_id` |
| `orders` | `id`, `customer_id`, `shipping_address_id` |
| `order_items` | `id`, `order_id`, `product_id` |
| `products` | `id`, `category_id` |
| `categories` | `id` |
| `support_tickets` | `id`, `customer_email` *(implicit join, no FK)* |

Populate the ontology table as follows:

```sql
INSERT INTO ontology (table_name, columns, referenced_table, referenced_columns) VALUES
('addresses', ARRAY['customer_id'], 'customers', ARRAY['id']),
('orders', ARRAY['customer_id'], 'customers', ARRAY['id']),
('orders', ARRAY['shipping_address_id'], 'addresses', ARRAY['id']),
('order_items', ARRAY['order_id'], 'orders', ARRAY['id']),
('order_items', ARRAY['product_id'], 'products', ARRAY['id']),
('products', ARRAY['category_id'], 'categories', ARRAY['id']),
('support_tickets', ARRAY['customer_email'], 'customers', ARRAY['email']);
```

Tables with multiple relationships, like `orders`, contribute one row per
relationship. Implicit joins, such as `support_tickets` → `customers`, are
documented exactly like the declared foreign-key relationships.
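
With those rows in place, an agent can copy the join keys directly into its SQL. The queries below are an illustrative sketch of the result, using only the tables and columns listed above; the selected columns and aliases are arbitrary choices.

```sql
-- Declared relationships: orders -> customers and orders -> addresses.
-- Join keys come from the two ontology rows where table_name = 'orders'.
SELECT o.id AS order_id, c.email AS customer_email, a.id AS address_id
FROM orders AS o
JOIN customers AS c ON o.customer_id = c.id
JOIN addresses AS a ON o.shipping_address_id = a.id;

-- Implicit relationship: support_tickets -> customers, joined on email
-- because the ontology row pairs customer_email with customers.email.
SELECT t.id AS ticket_id, c.id AS customer_id
FROM support_tickets AS t
JOIN customers AS c ON t.customer_email = c.email;
```

The second query is the kind of join an agent is unlikely to guess correctly from column names alone, since `support_tickets` carries no `customer_id` column.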
3 changes: 3 additions & 0 deletions doc/user/content/integrations/mcp-server/_index.md
@@ -31,6 +31,9 @@ and support the MCP `initialize`, `tools/list`, and `tools/call` methods.

## See also

- [Use an ontology table](/architecture-patterns/ontology/) to curate join
relationships that agents query through the `query` tool before writing
multi-table SQL.
- [MCP Server
Troubleshooting](/integrations/mcp-server/mcp-server-troubleshooting/)
- [Appendix: MCP Server (Python)](/integrations/mcp-server/llm) for locally-run,
23 changes: 15 additions & 8 deletions doc/user/content/integrations/mcp-server/mcp-agent.md
@@ -781,14 +781,6 @@ curl -X POST <baseURL>/api/mcp/agent \

## Start querying

Once connected to the MCP server, you can query your curated data products using
either natural language or SQL:

- *Via `materialize-agent`: What data products can I query?*
- *SELECT * FROM mcp_product_performance LIMIT 5;*
- *What's the `total_revenue` for product 42?*
- *Perform a Pareto analysis on my products.*

{{< warning >}}

By default, the [`query` tool](/integrations/mcp-server/mcp-agent-tools/#query)
@@ -803,9 +795,24 @@ configuration](/integrations/mcp-server/mcp-agent-config/).

{{< /warning >}}

{{< tip >}}
Because the `query` tool can join across objects, consider maintaining an
[ontology table](/architecture-patterns/ontology/): a curated catalog of the
join relationships in your schema that the agent can query to confirm exact join
keys before writing multi-table SQL.
{{< /tip >}}

Once connected to the MCP server, you can query your curated data products using
either natural language or SQL:

- *Via `materialize-agent`: What data products can I query?*
- *SELECT * FROM mcp_product_performance LIMIT 5;*
- *What's the `total_revenue` for product 42?*
- *Perform a Pareto analysis on my products.*

## Related pages

- [Use an ontology table](/architecture-patterns/ontology/)
- [`materialize-agent` MCP Server available
tools](/integrations/mcp-server/mcp-agent-tools/)
- [`materialize-agent` MCP Server
7 changes: 7 additions & 0 deletions doc/user/content/integrations/mcp-server/mcp-developer.md
@@ -468,6 +468,12 @@ curl -X POST <baseURL>/api/mcp/developer \

## Start asking questions

{{< tip >}}
When the agent reads your user objects with the `query` tool, an [ontology
table](/architecture-patterns/ontology/) of curated join relationships in your
schema helps it confirm exact join keys before writing multi-table SQL.
{{< /tip >}}

Once connected to the MCP server, you can ask natural language questions like:

| Question | What the agent does | Tool |
@@ -500,6 +506,7 @@ The privileges required to use the `materialize-developer` MCP server are:

## Related pages

- [Use an ontology table](/architecture-patterns/ontology/)
- [`materialize-developer` MCP Server available
tools](/integrations/mcp-server/mcp-developer-tools/)
- [`materialize-developer` MCP Server