just harness-smoke now runs the deterministic no-live-token fake /v1/responses smoke suite.
The runner targets code-rs/target/dev-fast/code by default and gives a clear error if the binary is missing.
exec-basic-smoke.json proves basic code exec --json startup, final assistant message emission, event count shape, request capture, and exit 0.
The harness now injects current Codex-base fake-server config with -c openai_base_url=..., skips the removed legacy --max-seconds flag when unsupported, stages scenario-local .code/skills fixtures into isolated CODE_HOME/skills, stores raw events in summary artifacts, and quotes fake gh shim paths.
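The binary lookup and fake-server wiring described above can be sketched roughly as follows. This is a minimal illustration, not the harness's actual code: the function names `resolve_binary` and `build_command`, the error wording, and the position of the prompt argument are assumptions. Only the default path, `exec --json`, and the `-c openai_base_url=...` override come from the notes above.

```python
from pathlib import Path

# Default path taken from the notes above; everything else is hypothetical.
DEFAULT_BINARY = Path("code-rs/target/dev-fast/code")


def resolve_binary(override=None):
    """Return the binary path, or fail with an actionable message if it is missing."""
    path = Path(override) if override else DEFAULT_BINARY
    if not path.is_file():
        raise FileNotFoundError(
            f"code binary not found at {path}; build it first or pass an explicit path"
        )
    return path


def build_command(binary, prompt, fake_server_url):
    """Assemble an exec invocation that points the CLI at the fake server.

    The argument order (prompt last) is an assumption for illustration.
    """
    return [
        str(binary),
        "exec",
        "--json",
        "-c",
        f"openai_base_url={fake_server_url}",
        prompt,
    ]
```

Failing early in `resolve_binary` keeps a missing build from surfacing later as an opaque subprocess error inside a scenario.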
Validation:
./build-fast.sh passed cleanly before merge.
just harness-smoke passed all six deterministic scenarios before merge.
Focused review agents inspected the branch. Two findings were fixed before merge: the event-count schema needed to expect item.agent_message rather than the legacy msg.agent_message, and the fake gh shim path needed shell quoting.
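The two review findings can be illustrated with a small sketch. The event shapes below are assumptions made for illustration: an event carrying a nested `item` object is counted as `item.<type>`, and a legacy event carrying a nested `msg` object is counted as `msg.<type>`. The real event schema and the real shim contents may differ, and `event_key`, `count_events`, and `render_gh_shim` are hypothetical names.

```python
import json
import shlex
from collections import Counter


def event_key(event):
    """Map one decoded event to a counting key (assumed shapes, see lead-in)."""
    item = event.get("item")
    if isinstance(item, dict):
        return "item." + str(item.get("type", "unknown"))
    msg = event.get("msg")
    if isinstance(msg, dict):
        return "msg." + str(msg.get("type", "unknown"))
    return str(event.get("type", "unknown"))


def count_events(jsonl_text):
    """Count events per key from newline-delimited JSON output."""
    counts = Counter()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if line:
            counts[event_key(json.loads(line))] += 1
    return counts


def render_gh_shim(target):
    """Render a POSIX shell shim that forwards all arguments to `target`.

    shlex.quote keeps paths containing spaces or shell metacharacters intact.
    """
    return f'#!/bin/sh\nexec {shlex.quote(target)} "$@"\n'
```

With counting keyed this way, a schema that still expects `msg.agent_message` fails visibly when the output only contains `item.agent_message` events.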
Remaining parity questions discovered by this gate:
The old config-disabled manual-skill explicit invocation scenario no longer matches current Codex-base behavior and is excluded from the first smoke gate pending a product decision.
The stderr markers that the old context-ledger assertions checked no longer exist in current code, so those assertions were replaced by request-shape assertions.
The old image replay omission text changed; the current gate asserts that no raw data:image/ payload is replayed and that the generated-image path is carried forward.
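A request-shape assertion of the kind mentioned for image replay might look like the sketch below. It assumes the captured request is available as decoded JSON and that the generated-image path appears somewhere in a string value. The helper names and the sample path used in any call are hypothetical, and the real harness may check a narrower part of the request.

```python
def iter_strings(value):
    """Yield every string value nested anywhere inside decoded JSON."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for child in value.values():
            yield from iter_strings(child)
    elif isinstance(value, list):
        for child in value:
            yield from iter_strings(child)


def check_image_replay(request_body, expected_path):
    """Return a list of problems; an empty list means the request shape is acceptable."""
    strings = list(iter_strings(request_body))
    problems = []
    if any("data:image/" in s for s in strings):
        problems.append("raw data:image/ payload found in request")
    if not any(expected_path in s for s in strings):
        problems.append(f"generated-image path {expected_path} not carried forward")
    return problems
```

Returning a list of problems rather than raising on the first one lets a scenario summary report both failures at once.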
Acceptance Criteria
Add a deterministic runner script for all fake /v1/responses scenarios.
The runner uses code-rs/target/dev-fast/code by default and errors with a clear message when the binary is missing.
Add a basic exec smoke scenario that expects code exec --json to return a final assistant message and exit 0.
Update the harness README and local AGENTS.md so future agents know this is the Dogfood Parity 1 P0 gate.
Update repo workflow metadata to include the harness command as a quality gate.
Validate with ./build-fast.sh and the deterministic harness command.
Out Of Scope
Live-model GitHub planning smoke in CI or release gates.
Auto Drive, Code Bridge/browser, Auto Review, and multi-agent feature restoration.
Large direct code copies from the pre-pivot Every Code branch.
Finish Line
tools/code-exec-harness is the first deterministic no-live-token gate for Dogfood Parity 1, running against the dev-fast code binary.
Current Status
State: First gate landed in PR #406.