Merged
36 changes: 18 additions & 18 deletions apps/web/src/content/docs/docs/evaluation/eval-files.mdx
@@ -407,26 +407,26 @@ dataset rows out of oversized inline YAML, see [Benchmark Provenance](/docs/guid

## Environment Variable Interpolation

All string fields in eval files support `${{ VAR }}` syntax for environment variable interpolation. This enables portable eval configs that work across machines and CI environments without hardcoded paths.
All string fields in eval files support `{{ env.VAR }}` syntax for environment variable interpolation. This enables portable eval configs that work across machines and CI environments without hardcoded paths.

```yaml
workspace:
  repos:
    - path: ./RepoA
      repo: "${{ REPO_A_URL }}"
      commit: "${{ REPO_A_COMMIT }}"
      repo: "{{ env.REPO_A_URL }}"
      commit: "{{ env.REPO_A_COMMIT }}"

tests:
  - id: test-1
    input: "Evaluate the code in ${{ PROJECT_NAME }}"
    criteria: "${{ EVAL_CRITERIA }}"
    input: "Evaluate the code in {{ env.PROJECT_NAME }}"
    criteria: "{{ env.EVAL_CRITERIA }}"
```

### Behavior

- **Syntax:** `${{ VARIABLE_NAME }}` with optional whitespace around the name
- **Syntax:** `{{ env.VARIABLE_NAME }}` with optional whitespace around the name
- **Missing variables** resolve to an empty string
- **Partial interpolation** is supported: `${{ HOME }}/repos/${{ PROJECT }}` becomes `/home/user/repos/myproject`
- **Partial interpolation** is supported: `{{ env.HOME }}/repos/{{ env.PROJECT }}` becomes `/home/user/repos/myproject`
- **Non-string values** (numbers, booleans) are not affected
- Interpolation is applied recursively to all nested objects and arrays
- Works in YAML eval files, external YAML/JSONL case files, and external workspace config files
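
Reviewer note: the behavior listed above for the `{{ env.VAR }}` form can be modeled roughly as follows. This is an illustrative Python sketch, not AgentV's implementation, which is not shown in this diff. `interpolate_env`, `ENV_PATTERN`, and the `depth` / `enabled` fields are hypothetical names chosen for the example.

```python
import re
from typing import Any, Mapping

# Hypothetical pattern for this sketch: {{ env.NAME }} with optional whitespace.
ENV_PATTERN = re.compile(r"\{\{\s*env\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def interpolate_env(value: Any, env: Mapping[str, str]) -> Any:
    """Recursively resolve {{ env.NAME }} placeholders inside string values."""
    if isinstance(value, str):
        # Missing variables fall back to an empty string.
        return ENV_PATTERN.sub(lambda m: env.get(m.group(1), ""), value)
    if isinstance(value, list):
        return [interpolate_env(item, env) for item in value]
    if isinstance(value, dict):
        return {key: interpolate_env(item, env) for key, item in value.items()}
    # Numbers, booleans, and None pass through untouched.
    return value


example_env = {"HOME": "/home/user", "PROJECT": "myproject"}
example_config = {
    "workspace": {
        # "depth" is a made-up numeric field, used only to show pass-through.
        "repos": [{"path": "{{ env.HOME }}/repos/{{env.PROJECT}}", "depth": 1}]
    },
    # EVAL_CRITERIA is deliberately absent from example_env.
    "tests": [{"id": "test-1", "criteria": "{{ env.EVAL_CRITERIA }}", "enabled": True}],
}
resolved = interpolate_env(example_config, example_env)
```

In this model the partial placeholders in `path` are joined into one string, the unset `EVAL_CRITERIA` becomes an empty string, and the numeric and boolean values are returned as-is.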
@@ -438,8 +438,8 @@ tests:
# workspace.yaml — works on any machine
repos:
  - path: ./my-repo
    repo: "${{ MY_REPO_URL }}"
    commit: "${{ MY_REPO_COMMIT }}"
    repo: "{{ env.MY_REPO_URL }}"
    commit: "{{ env.MY_REPO_COMMIT }}"
```

```bash
@@ -450,31 +450,31 @@ MY_REPO_COMMIT=main

## Per-Test Template Variables

Eval YAML also supports per-test `vars` for data-driven prompt templates. Use `{{name}}` placeholders in test-facing text fields, and AgentV resolves them when the suite loads.
Eval YAML also supports per-test `vars` for data-driven prompt templates. Use `{{ vars.name }}` placeholders in test-facing text fields, and AgentV resolves them when the suite loads.

```yaml
input: "Answer clearly: {{question}}"
input: "Answer clearly: {{ vars.question }}"

tests:
  - id: capital
    vars:
      question: What is the capital of France?
      expected_answer: Paris
    criteria: "Answers {{question}} correctly"
    criteria: "Answers {{ vars.question }} correctly"
    input:
      - role: user
        content: "Question: {{question}}"
    expected_output: "{{expected_answer}}"
        content: "Question: {{ vars.question }}"
    expected_output: "{{ vars.expected_answer }}"
```

### Behavior

- `vars` is defined per test as an object
- `{{name}}` and dotted paths like `{{ user.name }}` are supported
- Substitution applies to suite-level `input`, test `input`, `input_files`, `criteria`, `expected_output`, and conversation turn `input` / `expected_output`
- `{{ vars.name }}` and dotted paths like `{{ vars.user.name }}` are supported
- Substitution applies to suite-level `input`, test `input`, `input_files`, `criteria`, `expected_output`, assertion values/metrics, and conversation turn `input` / `expected_output` / assertions
- When the whole string is a single placeholder, the original JSON value is preserved
- Missing variables are left unchanged, so unrelated template syntax is not silently blanked out
- `vars` interpolation is separate from environment interpolation: `{{question}}` uses test data, `${{ PROJECT_NAME }}` uses environment variables
- Missing variables render as empty strings following Nunjucks semantics
- `vars` interpolation is separate from environment interpolation: `{{ vars.question }}` uses test data, `{{ env.PROJECT_NAME }}` uses environment variables
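
Reviewer note: a rough Python model of the `{{ vars.name }}` rules in this hunk, following the updated wording (dotted paths, value preservation for a whole-string placeholder, empty string for missing variables). The PR adds `nunjucks` as a dependency, so the real resolver is presumably Nunjucks-based; the regex approach and the names `render`, `lookup`, and `PLACEHOLDER` are assumptions made for illustration only.

```python
import re
from collections.abc import Mapping
from typing import Any

# Hypothetical pattern: {{ vars.some.path }} with optional whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*vars\.([A-Za-z_][\w.]*)\s*\}\}")


def lookup(path: str, data: Mapping[str, Any]) -> Any:
    """Walk a dotted path through nested mappings; return None if any step is missing."""
    current: Any = data
    for part in path.split("."):
        if not isinstance(current, Mapping) or part not in current:
            return None
        current = current[part]
    return current


def render(template: str, test_vars: Mapping[str, Any]) -> Any:
    """Resolve vars placeholders in one string field."""
    whole = PLACEHOLDER.fullmatch(template)
    if whole:
        # The string is exactly one placeholder: keep the original value type.
        value = lookup(whole.group(1), test_vars)
        return "" if value is None else value

    def substitute(match: re.Match) -> str:
        value = lookup(match.group(1), test_vars)
        # Missing variables render as an empty string.
        return "" if value is None else str(value)

    return PLACEHOLDER.sub(substitute, template)


test_vars = {"person": {"name": "Ada"}, "retries": 3}
greeting = render("Greets {{ vars.person.name }} warmly", test_vars)
retries = render("{{ vars.retries }}", test_vars)
blank = render("Hi {{ vars.missing }}!", test_vars)
```

Here `greeting` is an ordinary string with the dotted path filled in, `retries` stays an integer because the template is a single placeholder, and `blank` shows the missing variable collapsing to nothing.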

## JSONL Format

20 changes: 16 additions & 4 deletions bun.lock

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions examples/features/README.md
@@ -72,8 +72,8 @@ Focused examples for specific AgentV capabilities. Find your use case below, the
| [input-files-shorthand](input-files-shorthand/) | Attach files to every test using a compact shorthand |
| [suite-level-input](suite-level-input/) | Prepend a shared system prompt to every test in the suite |
| [suite-level-input-files](suite-level-input-files/) | Share file attachments across every test in the suite |
| [env-interpolation](env-interpolation/) | Inject environment variables into eval config with `${{ VAR }}` |
| [test-vars-templating](test-vars-templating/) | Inject per-test `vars` into `{{name}}` templates in eval fields |
| [env-interpolation](env-interpolation/) | Inject environment variables into eval config with `{{ env.VAR }}` |
| [test-vars-templating](test-vars-templating/) | Inject per-test `vars` into `{{ vars.name }}` templates in eval fields |

---

6 changes: 3 additions & 3 deletions examples/features/env-interpolation/README.md
@@ -1,6 +1,6 @@
# Environment Variable Interpolation

Demonstrates `${{ VAR }}` syntax for portable eval configs.
Demonstrates `{{ env.VAR }}` syntax for portable eval configs.

## Usage

@@ -14,7 +14,7 @@ Or create a `.env` file — AgentV loads `.env` files automatically from the dir

## Features

- **Full-value**: `criteria: "${{ EVAL_CRITERIA }}"` — entire field from env var
- **Partial/inline**: `"must be ${{ EXPECTED }} and clear"` — env var within a string
- **Full-value**: `criteria: "{{ env.EVAL_CRITERIA }}"` — entire field from env var
- **Partial/inline**: `"must be {{ env.EXPECTED }} and clear"` — env var within a string
- **Missing vars**: resolve to empty string (downstream validation catches required blanks)
- **All fields**: works in any string field — criteria, input, workspace paths, etc.
8 changes: 4 additions & 4 deletions examples/features/env-interpolation/evals/dataset.eval.yaml
@@ -1,6 +1,6 @@
# Environment Variable Interpolation Example
#
# All string fields support ${{ VAR }} syntax for env variable interpolation.
# Config-load fields support {{ env.VAR }} syntax for env variable interpolation.
# Missing variables resolve to empty string.
#
# Usage:
@@ -10,7 +10,7 @@
# Or use a .env file in the project root:
# CUSTOM_SYSTEM_PROMPT=You are a helpful assistant who always greets warmly.

description: Demonstrates ${{ VAR }} interpolation in eval fields
description: Demonstrates {{ env.VAR }} interpolation in eval fields

target: llm

@@ -19,13 +19,13 @@ tests:
- id: full-value
criteria: Responds with a friendly greeting
input: "Hello!"
expected_output: "${{ EXPECTED_GREETING }}"
expected_output: "{{ env.EXPECTED_GREETING }}"

# Partial/inline interpolation: env var embedded in a larger string
- id: partial-value
criteria: Response uses the system prompt persona
input:
- role: system
content: "${{ CUSTOM_SYSTEM_PROMPT }}"
content: "{{ env.CUSTOM_SYSTEM_PROMPT }}"
- role: user
content: "Hi there!"
6 changes: 3 additions & 3 deletions examples/features/test-vars-templating/README.md
@@ -1,6 +1,6 @@
# Per-Test Vars Templating

Demonstrates `tests[].vars` with `{{name}}` placeholders in eval files.
Demonstrates `tests[].vars` with `{{ vars.name }}` placeholders in eval files.

## Usage

@@ -11,6 +11,6 @@ agentv eval examples/features/test-vars-templating/evals/dataset.eval.yaml
## Features

- **Per-test data**: each test defines its own `vars` object
- **Template substitution**: `{{question}}` and dotted paths like `{{expected.answer}}`
- **Template substitution**: `{{ vars.question }}` and dotted paths like `{{ vars.expected.answer }}`
- **Suite-level templates**: shared `input` can reference per-test vars too
- **Separate from env interpolation**: `{{question}}` uses test data, `${{ VAR }}` uses environment variables
- **Separate from env interpolation**: `{{ vars.question }}` uses test data, `{{ env.VAR }}` uses environment variables
18 changes: 9 additions & 9 deletions examples/features/test-vars-templating/evals/dataset.eval.yaml
@@ -1,7 +1,7 @@
# Per-test vars templating example
#
# tests[].vars provides per-test data for {{name}} placeholders in eval fields.
# Placeholders support dotted paths like {{expected.answer}}.
# tests[].vars provides per-test data for {{ vars.name }} placeholders in eval fields.
# Placeholders support dotted paths like {{ vars.expected.answer }}.
#
# Usage:
# agentv eval examples/features/test-vars-templating/evals/dataset.eval.yaml
@@ -12,7 +12,7 @@ target: llm

input:
- role: system
content: "You are a concise assistant answering {{category}} questions."
content: "You are a concise assistant answering {{ vars.category }} questions."

tests:
- id: capital-france
@@ -21,9 +21,9 @@ tests:
question: What is the capital of France?
expected:
answer: Paris
criteria: "Answers {{question}} correctly"
input: "Question: {{question}}"
expected_output: "{{expected.answer}}"
criteria: "Answers {{ vars.question }} correctly"
input: "Question: {{ vars.question }}"
expected_output: "{{ vars.expected.answer }}"

- id: greet-ada
vars:
@@ -32,8 +32,8 @@ tests:
name: Ada
expected:
answer: Hello, Ada!
criteria: "Greets {{person.name}} warmly"
criteria: "Greets {{ vars.person.name }} warmly"
input:
- role: user
content: "Say hello to {{person.name}}."
expected_output: "{{expected.answer}}"
content: "Say hello to {{ vars.person.name }}."
expected_output: "{{ vars.expected.answer }}"
2 changes: 2 additions & 0 deletions packages/core/package.json
@@ -47,6 +47,7 @@
"fast-glob": "^3.3.3",
"json5": "^2.2.3",
"micromatch": "^4.0.8",
"nunjucks": "^3.2.4",
"yaml": "^2.8.3",
"zod": "^3.23.8"
},
@@ -72,6 +73,7 @@
},
"devDependencies": {
"@types/micromatch": "^4.0.10",
"@types/nunjucks": "^3.2.6",
"zod-to-json-schema": "^3.25.1"
}
}