feat(integrations): support SPECIFY_<KEY>_EXTRA_ARGS env var for agent subprocess flags#2596
feat(integrations): support SPECIFY_<KEY>_EXTRA_ARGS env var for agent subprocess flags#2596doquanghuy wants to merge 8 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a generic env-var hook SPECIFY_<KEY>_EXTRA_ARGS so operators can append CLI flags to the spawned agent subprocess (e.g. --dangerously-skip-permissions for Claude in CI/non-interactive contexts) without editing skills, workflows, or integration code. The hook is implemented once on IntegrationBase and invoked from the three concrete build_exec_args implementations in base.py.
Changes:
- New
IntegrationBase._apply_extra_args_env_varhelper that readsSPECIFY_<KEY>_EXTRA_ARGS, normalizes the key (hyphens→underscores, uppercased), andshlex.splits the value intoargs. MarkdownIntegration,TomlIntegration, andSkillsIntegrationbuild_exec_argsnow call the helper between[key, -p, prompt]and the--model/--output-formatflags.- Adds
tests/integrations/test_extra_args.pycovering no-op default, single/multi-token parsing, whitespace handling, per-integration scoping, key normalization, andrequires_cli=Falseshort-circuit.
Show a summary per file
| File | Description |
|---|---|
| src/specify_cli/integrations/base.py | Adds _apply_extra_args_env_var helper and calls it from the three concrete build_exec_args implementations. |
| tests/integrations/test_extra_args.py | New test module exercising the env-var hook through stub SkillsIntegration subclasses. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 2/2 changed files
- Comments generated: 2
…ncode/Copilot
Four integrations override `build_exec_args` and were silently
ignoring the env-var hook introduced in the previous commit:
- CodexIntegration (`codex exec ...`)
- DevinIntegration (`devin -p ...`)
- OpencodeIntegration (`opencode run ...`)
- CopilotIntegration (`copilot -p ...`)
Each now calls `self._apply_extra_args_env_var(args)` between the
base argv and the canonical Spec Kit flags (matching the placement
in `MarkdownIntegration`, `TomlIntegration`, and `SkillsIntegration`),
so operator-injected flags cannot clobber `--model` / `--output-format`
/ `--json`.
Adds 4 parameterized override-integration tests locking the wiring
per agent. Also cleans up an inline `__import__("os").environ` in the
fixture to a top-of-file `import os`.
Drive-by typing fix: guard `self.registrar_config.get(...)` in
`CopilotIntegration.add_commands` against the `None` case, matching
the pattern already used in `base.py` for the same access.
Addresses Copilot review on github#2596.
|
Pushed 8341c12 addressing both Copilot findings: 1. Override integrations wired up. 2. Test fixture cleanup. Replaced inline Coverage. Added 4 parameterized override-integration tests locking the wiring per agent — full suite is now 2947 passed (was 2943), 35 skipped, no regressions. Drive-by. Pyright surfaced a pre-existing latent AI disclosure: drafted with Claude Opus, human-reviewed. |
There was a problem hiding this comment.
Copilot's findings
Comments suppressed due to low confidence (1)
src/specify_cli/integrations/copilot/init.py:149
- CopilotIntegration overrides dispatch_command() and constructs its own
cli_argswithout calling_apply_extra_args_env_var. Sinceworkflowcommand steps callimpl.dispatch_command(...),SPECIFY_COPILOT_EXTRA_ARGSwill not actually reach the spawnedcopilotsubprocess in workflows even though build_exec_args() now supports it. Apply the env-var hook in dispatch_command() as well (ideally right aftercli_args = ["copilot", "-p", prompt]and before adding--agent/--yolo/--model/--output-formatso canonical flags still win).
args = ["copilot", "-p", prompt]
self._apply_extra_args_env_var(args)
if _allow_all():
args.append("--yolo")
if model:
args.extend(["--model", model])
if output_json:
args.extend(["--output-format", "json"])
return args
- Files reviewed: 6/6 changed files
- Comments generated: 2
…command When the Opencode prompt starts with `/`, `build_exec_args` injects `--command <X>` derived from the prompt. The previous placement of `self._apply_extra_args_env_var(args)` appended operator-injected args AFTER that `--command`, so a user setting `SPECIFY_OPENCODE_EXTRA_ARGS="--command override"` could redirect the command under typical last-wins repeated-flag CLI semantics. Move the hook to immediately after `args = [self.key, "run"]`, before the prompt-parsing block. The operator's `--command override` (if any) now precedes the Spec Kit-derived `--command speckit`, so the canonical choice wins. Adds `test_opencode_extra_args_cannot_clobber_prompt_derived_command` locking the ordering. Also corrects the module docstring to describe the hook as living in `IntegrationBase` (not `SkillsIntegration`) and to acknowledge that this file covers Codex/Devin/Opencode/Copilot in addition to SkillsIntegration stubs. Addresses Copilot review on github#2596.
|
Pushed a17b951 addressing both new Copilot findings: 1. Opencode ordering fix. Moved 2. Test docstring. Rewrote the module docstring to describe the hook as living in Full suite: 2948 passed (+1), 35 skipped, no regressions. AI disclosure: drafted with Claude Opus, human-reviewed. |
mnriem
left a comment
There was a problem hiding this comment.
Please use SPECIFY_INTEGRATION__EXTRA_ARGS so we know it is targeting the integration subsystem. Good addition overall!
`CopilotIntegration` is the only integration that overrides `dispatch_command` — it builds `cli_args` inline rather than going through `build_exec_args`. The previous commit wired `_apply_extra_args_env_var` into `build_exec_args` but workflow execution calls `dispatch_command`, so `SPECIFY_COPILOT_EXTRA_ARGS` was silently ignored at runtime. Add the hook in `dispatch_command` immediately after `cli_args = ["copilot", "-p", prompt]`, mirroring the placement in `build_exec_args` (between `-p prompt` and the canonical `--agent` / `--yolo` / `--model` / `--output-format` flags). `IntegrationBase.dispatch_command` already delegates to `build_exec_args`, so Codex, Devin, and Opencode continue to honour their respective env vars through inheritance — no further changes needed for them. Adds two end-to-end tests that monkeypatch `subprocess.run` and assert the env-var args reach the executed argv: - `test_copilot_dispatch_command_includes_extra_args` locks the bypass fix on the overridden path. - `test_codex_dispatch_command_includes_extra_args` locks the inherited-base-dispatch path for the other three integrations. Addresses Copilot review on github#2596.
|
Pushed acab474 addressing both new findings: 1. Copilot 2. End-to-end test coverage. Added two tests that monkeypatch
Full suite: 2950 passed (was 2948), 35 skipped, no regressions. AI disclosure: drafted with Claude Opus, human-reviewed. |
…XTRA_ARGS Per maintainer request: scope the operator-injected env var to the integration subsystem by prepending `INTEGRATION_` to the key segment, so it does not collide with other Spec Kit env-var namespaces. Renames everywhere it appears: - Helper `IntegrationBase._apply_extra_args_env_var` env_name format and docstring (`base.py`). - Inline comment in `CopilotIntegration.dispatch_command`. - All `monkeypatch.setenv(...)` calls, docstrings, and the autouse fixture's scope filter in `tests/integrations/test_extra_args.py`. No behaviour change beyond the variable name. Default (env var unset) still byte-identical to before this PR. Addresses review on github#2596.
|
@mnriem — done. Pushed 932cfb9 renaming the env var to Updated everywhere it lives:
No behaviour change beyond the variable name. Full suite still 2950 passed, 35 skipped. AI disclosure: drafted with Claude Opus, human-reviewed. |
| ) | ||
| extra = os.environ.get(env_name, "").strip() | ||
| if extra: | ||
| args.extend(shlex.split(extra)) |
…ting Wrap `shlex.split` in `_apply_extra_args_env_var` so an unmatched quote in `SPECIFY_INTEGRATION_<KEY>_EXTRA_ARGS` surfaces a clear `ValueError` naming the offending env var and showing the invalid value, instead of crashing workflow dispatch with a bare shlex traceback. Adds a new test locking the actionable error path. Addresses Copilot review feedback on github#2596.
|
@mnriem — addressed the latest Copilot finding in The fix.
A malformed env var now surfaces as a clear operator-facing error instead of a bare The rebase. Three of the existing commits touched the same line in Test coverage. New Branch is now AI disclosure: drafted with Claude Opus, human-reviewed. |
…t subprocess flags Read a per-integration env var (SPECIFY_<KEY>_EXTRA_ARGS) inside `SkillsIntegration.build_exec_args`, `MarkdownIntegration.build_exec_args`, and `TomlIntegration.build_exec_args` and append the parsed flags to the spawned agent's argv, gated per integration key. Operators can now opt into extra CLI flags (e.g. `SPECIFY_CLAUDE_EXTRA_ARGS=--dangerously-skip-permissions`) without modifying any SKILL or workflow YAML. Useful in CI / non-interactive contexts where the spawned `<agent> -p ...` would otherwise hang on an internal permission or input prompt invisible to the parent `specify workflow run` process. Key normalization: `kiro-cli` → `SPECIFY_KIRO_CLI_EXTRA_ARGS` (hyphen replaced with underscore, then uppercased). Default (env var unset or whitespace-only) is byte-identical to previous behaviour. Extra args are inserted between `-p prompt` and the model / output-format flags so they cannot clobber canonical Spec Kit args. Implementation: a single helper `IntegrationBase._apply_extra_args_env_var` encapsulates the env-var read + shlex parsing; each of the three concrete `build_exec_args` implementations calls it. Closes github#2595 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ncode/Copilot
Four integrations override `build_exec_args` and were silently
ignoring the env-var hook introduced in the previous commit:
- CodexIntegration (`codex exec ...`)
- DevinIntegration (`devin -p ...`)
- OpencodeIntegration (`opencode run ...`)
- CopilotIntegration (`copilot -p ...`)
Each now calls `self._apply_extra_args_env_var(args)` between the
base argv and the canonical Spec Kit flags (matching the placement
in `MarkdownIntegration`, `TomlIntegration`, and `SkillsIntegration`),
so operator-injected flags cannot clobber `--model` / `--output-format`
/ `--json`.
Adds 4 parameterized override-integration tests locking the wiring
per agent. Also cleans up an inline `__import__("os").environ` in the
fixture to a top-of-file `import os`.
Drive-by typing fix: guard `self.registrar_config.get(...)` in
`CopilotIntegration.add_commands` against the `None` case, matching
the pattern already used in `base.py` for the same access.
Addresses Copilot review on github#2596.
…command When the Opencode prompt starts with `/`, `build_exec_args` injects `--command <X>` derived from the prompt. The previous placement of `self._apply_extra_args_env_var(args)` appended operator-injected args AFTER that `--command`, so a user setting `SPECIFY_OPENCODE_EXTRA_ARGS="--command override"` could redirect the command under typical last-wins repeated-flag CLI semantics. Move the hook to immediately after `args = [self.key, "run"]`, before the prompt-parsing block. The operator's `--command override` (if any) now precedes the Spec Kit-derived `--command speckit`, so the canonical choice wins. Adds `test_opencode_extra_args_cannot_clobber_prompt_derived_command` locking the ordering. Also corrects the module docstring to describe the hook as living in `IntegrationBase` (not `SkillsIntegration`) and to acknowledge that this file covers Codex/Devin/Opencode/Copilot in addition to SkillsIntegration stubs. Addresses Copilot review on github#2596.
`CopilotIntegration` is the only integration that overrides `dispatch_command` — it builds `cli_args` inline rather than going through `build_exec_args`. The previous commit wired `_apply_extra_args_env_var` into `build_exec_args` but workflow execution calls `dispatch_command`, so `SPECIFY_COPILOT_EXTRA_ARGS` was silently ignored at runtime. Add the hook in `dispatch_command` immediately after `cli_args = ["copilot", "-p", prompt]`, mirroring the placement in `build_exec_args` (between `-p prompt` and the canonical `--agent` / `--yolo` / `--model` / `--output-format` flags). `IntegrationBase.dispatch_command` already delegates to `build_exec_args`, so Codex, Devin, and Opencode continue to honour their respective env vars through inheritance — no further changes needed for them. Adds two end-to-end tests that monkeypatch `subprocess.run` and assert the env-var args reach the executed argv: - `test_copilot_dispatch_command_includes_extra_args` locks the bypass fix on the overridden path. - `test_codex_dispatch_command_includes_extra_args` locks the inherited-base-dispatch path for the other three integrations. Addresses Copilot review on github#2596.
…XTRA_ARGS Per maintainer request: scope the operator-injected env var to the integration subsystem by prepending `INTEGRATION_` to the key segment, so it does not collide with other Spec Kit env-var namespaces. Renames everywhere it appears: - Helper `IntegrationBase._apply_extra_args_env_var` env_name format and docstring (`base.py`). - Inline comment in `CopilotIntegration.dispatch_command`. - All `monkeypatch.setenv(...)` calls, docstrings, and the autouse fixture's scope filter in `tests/integrations/test_extra_args.py`. No behaviour change beyond the variable name. Default (env var unset) still byte-identical to before this PR. Addresses review on github#2596.
…ting Wrap `shlex.split` in `_apply_extra_args_env_var` so an unmatched quote in `SPECIFY_INTEGRATION_<KEY>_EXTRA_ARGS` surfaces a clear `ValueError` naming the offending env var and showing the invalid value, instead of crashing workflow dispatch with a bare shlex traceback. Adds a new test locking the actionable error path. Addresses Copilot review feedback on github#2596.
0ddac52 to
faf5129
Compare
|
Please address test errors |
… test `test_copilot_integration_honours_extra_args` hardcoded `"copilot"` in the expected argv, but `CopilotIntegration.build_exec_args` calls `_copilot_executable()` which returns `"copilot.cmd"` on Windows (`os.name == "nt"`). The test passed on Linux/macOS and failed on all three Windows-latest matrix entries. Resolve by importing `_copilot_executable` alongside `CopilotIntegration` and using it as the first expected argv token. The companion `test_copilot_dispatch_command_includes_extra_args` already uses `index()` lookups rather than full-argv equality so it was unaffected.
|
@mnriem — Windows pytest failure fixed in Root cause. Fix. Import Audit for similar issues on the sibling PRs. Checked #2663 and #2664 for the same Windows hazard pattern:
Full local suite: 3009 passed, 40 skipped. Re-pushed to AI disclosure: drafted with Claude Opus, human-reviewed. |
| @@ -53,6 +53,7 @@ def build_exec_args( | |||
| ) -> list[str] | None: | |||
| # Codex uses ``codex exec "prompt"`` for non-interactive mode. | |||
| args: list[str] = ["codex", "exec", prompt] | |||
| The hook is implemented in `IntegrationBase._apply_extra_args_env_var` | ||
| and wired into every concrete `build_exec_args` — | ||
| `MarkdownIntegration`, `TomlIntegration`, `SkillsIntegration`, plus the | ||
| overrides in Codex, Devin, Opencode and Copilot. These tests cover both | ||
| the shared mechanism (via `SkillsIntegration` stubs near the top of the |
|
Please address Copilot feedback |
…lasses Two Copilot findings on the latest pass: 1. `CodexIntegration.build_exec_args` hardcoded the executable name as the literal `"codex"` while the env-var lookup derives from `self.key`. The two should stay coupled (matching Devin/Opencode, which both use `self.key` already). Replace the literal with `self.key` so the argv and env-var scoping cannot drift. 2. `tests/integrations/test_extra_args.py` covered the `SkillsIntegration` mechanism (via stubs near the top) and the four `build_exec_args` overrides (Codex/Devin/Opencode/Copilot) end-to-end, but did not exercise the `MarkdownIntegration` or `TomlIntegration` base implementations directly. Add bare `_MarkdownAgentStub` and `_TomlAgentStub` test stubs and a test each — the most common integration pattern (Amp, Auggie, Generic, Gemini, Tabnine, …) inherits without overriding, so the base wiring is now locked. Full suite: 3011 passed (was 3009), 40 skipped, no regressions. Addresses Copilot review feedback on github#2596.
|
@mnriem — addressed both new Copilot findings in Why does Copilot keep finding things? Honest answer: each successive pass surfaces something the previous pass missed, not a deeper code-quality issue. The PR is incrementally tightening as the visible surface area expands (override integrations → dispatch_command bypass → malformed-quoting error → Windows portability → now executable-name coupling + base-class coverage). The underlying mechanism — single helper on 1. Codex executable name decoupled from 2. Base-class coverage added. The test suite covered the
Full suite: 3011 passed (was 3009), 40 skipped, no regressions. Pushed to AI disclosure: drafted with Claude Opus, human-reviewed. |
Description
Closes #2595.
Adds a per-integration env-var hook
(
SPECIFY_INTEGRATION_<KEY>_EXTRA_ARGS) so operators can injectextra CLI flags into the spawned agent subprocess without modifying
any SKILL or workflow YAML. The motivating use case is
non-interactive contexts (CI, batch, sandboxed eval) where the
spawned
<agent> -p ...hangs silently on an internal permissionor input prompt invisible to the parent
specify workflow runprocess.
The
INTEGRATIONsegment scopes the variable to this subsystem soit does not collide with other Spec Kit env-var namespaces.
What changed
IntegrationBase._apply_extra_args_env_var(args)—reads
SPECIFY_INTEGRATION_<KEY>_EXTRA_ARGSfrom env (keynormalized:
-→_, uppercased), parses withshlex.split,and appends to args. No-op when the env var is unset or
whitespace-only.
MarkdownIntegration.build_exec_args,TomlIntegration.build_exec_args, andSkillsIntegration.build_exec_argseach call the helper betweenargs = [self.key, "-p", prompt]and the model / output-formatappends — so extra args sit between
-p promptand thecanonical Spec Kit flags and cannot clobber them.
build_exec_args(
CodexIntegration,DevinIntegration,OpencodeIntegration,CopilotIntegration) also call the helper in the matchingposition so the env var works for every
requires_cliintegration.
CopilotIntegration.dispatch_commandbuildscli_argsinlinerather than going through
build_exec_args; the hook is invokedthere too so workflow execution honours the env var (not just
build_exec_argscallers).tests/integrations/test_extra_args.pywith 14 testscovering: default no-op, single-token, multi-token via shlex,
whitespace-only treated as unset, other-integration env var
ignored (per-key scoping), key normalization
(
kiro-cli→KIRO_CLI),requires_cli: Falseshort-circuit,per-integration
build_exec_argschecks for Codex/Devin/Opencode/Copilot, Opencode precedence vs. prompt-derived
--command, and end-to-enddispatch_commandchecks (Copilotoverride path + inherited base path).
Default behaviour preserved
When
SPECIFY_INTEGRATION_<KEY>_EXTRA_ARGSis unset for the activeintegration,
build_exec_argsreturns byte-identical argv tobefore this change. Existing operators see no difference.
Examples
Testing
uv run specify --helppytest tests/ -q→2950 passed, 35 skipped (was 2936 before; +14 new tests
added in this PR).
specify workflow runagainst a Claude command step with
SPECIFY_INTEGRATION_CLAUDE_EXTRA_ARGS=--dangerously-skip-permissionsset — flag reaches the spawned
claude -p ...subprocessand the run completes non-interactively. Same workflow
without the env var still hangs as before (consistent with
the pre-PR behaviour).
Manual test results
Agent: Claude Code (CLI) | OS/Shell: macOS 14 / zsh
--dangerously-skip-permissionsappears between-p promptand--modelSPECIFY_INTEGRATION_GEMINI_EXTRA_ARGS=...does not affect Claude argv"--flag-a --flag-b 3"splits into 3 tokens" "→ no-opkiro-cli→KIRO_CLISPECIFY_INTEGRATION_KIRO_CLI_EXTRA_ARGScorrectly readrequires_cli: Falseshort-circuitbuild_exec_argsreturns None; env-var hook never reachedAI Disclosure
Used Claude Opus to draft the implementation (env-var hook +
helper extraction), the test suite, the issue body for #2595, and
this PR body. The reproducer scenario (non-interactive
claude -phang on permission prompts) was a real-world problem encountered
in operator practice. Code, tests, and design decisions were
human-reviewed before submission.