feat(cli): add lk agent session for headless text-mode agent runs #857
toubatbrian wants to merge 5 commits
Introduces a three-process model (ephemeral CLI command, detached singleton daemon, agent subprocess) that drives a Python/JS agent over TCP using the lk.agent.session protobuf protocol, with no audio/CGO dependency:

- `lk agent session start <file>`: re-execs the lk binary as a detached daemon bound to a fixed loopback port (singleton), which spawns the agent and applies text mode; rejects start if a session already runs.
- `lk agent session say "..."`: streams a user turn and renders the agent reply, tool calls/outputs, and handoffs to the terminal.
- `lk agent session end`: tears down the daemon and agent.

The CLI<->daemon control protocol reuses pkg/ipc length-prefixed framing over the same TCP port, disambiguated from agent connections by a magic preamble. The headless renderer covers all ChatItem variants plus the FunctionToolsExecuted event.

Drops the now-unnecessary U1000 file-ignore directives added while the helpers were unused.

Co-authored-by: Cursor <cursoragent@cursor.com>
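To make the shared-port idea in this commit concrete, here is a minimal sketch of length-prefixed framing with a magic preamble. It is not the `pkg/ipc` implementation: the preamble value `LKCT`, the 4-byte big-endian length header, the 1 MiB frame cap, and the names `writeFrame`, `readFrame`, `isControlConn`, and `demoRoundTrip` are all assumptions made for illustration.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// controlMagic is a hypothetical 4-byte preamble. The real value used by
// the PR is not shown in this conversation.
var controlMagic = []byte("LKCT")

// maxFrame is an assumed upper bound that protects against bogus lengths.
const maxFrame = 1 << 20

// writeFrame writes a 4-byte big-endian length followed by the payload.
func writeFrame(w io.Writer, payload []byte) error {
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(payload)))
	if _, err := w.Write(hdr[:]); err != nil {
		return err
	}
	_, err := w.Write(payload)
	return err
}

// readFrame reads one length-prefixed frame written by writeFrame.
func readFrame(r io.Reader) ([]byte, error) {
	var hdr [4]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	n := binary.BigEndian.Uint32(hdr[:])
	if n > maxFrame {
		return nil, fmt.Errorf("frame too large: %d bytes", n)
	}
	buf := make([]byte, n)
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// isControlConn peeks at the first bytes of a connection. If they match the
// preamble it consumes them and reports true; otherwise nothing is consumed,
// so an agent connection can still be read from its first byte.
func isControlConn(br *bufio.Reader) (bool, error) {
	head, err := br.Peek(len(controlMagic))
	if err != nil {
		return false, err
	}
	if !bytes.Equal(head, controlMagic) {
		return false, nil
	}
	_, err = br.Discard(len(controlMagic))
	return true, err
}

// demoRoundTrip simulates a CLI client sending the preamble plus one frame,
// and the daemon side classifying the connection and reading the frame back.
func demoRoundTrip(payload string) (bool, string, error) {
	var wire bytes.Buffer
	wire.Write(controlMagic)
	if err := writeFrame(&wire, []byte(payload)); err != nil {
		return false, "", err
	}
	br := bufio.NewReader(&wire)
	isControl, err := isControlConn(br)
	if err != nil {
		return false, "", err
	}
	msg, err := readFrame(br)
	if err != nil {
		return false, "", err
	}
	return isControl, string(msg), nil
}

func main() {
	isControl, msg, err := demoRoundTrip("say hello")
	if err != nil {
		panic(err)
	}
	fmt.Printf("control=%v payload=%q\n", isControl, msg)
}
```

Peeking rather than reading is the design point here: the daemon can classify a connection without disturbing the byte stream an agent connection expects.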
Tools that return no string (e.g. handoff tools returning an Agent) produced a bare "✓ " line. Suppress the output line when the summarized output is empty for successful calls; error outputs still render. Co-authored-by: Cursor <cursoragent@cursor.com>
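The rule described in this commit can be sketched as a small pure function. This is an illustration rather than the code in `session_render.go`: the function name `renderToolOutput` and the `✗` error marker are assumptions; only the `✓ ` prefix comes from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// renderToolOutput returns the line to print for a finished tool call,
// or "" when nothing should be printed. (Hypothetical helper.)
func renderToolOutput(output string, isError bool) string {
	summary := strings.TrimSpace(output)
	if isError {
		// Error outputs always render, even when the text is empty.
		return "✗ " + summary
	}
	if summary == "" {
		// Successful call with no string result (e.g. a handoff tool
		// returning an Agent): suppress the bare "✓ " line.
		return ""
	}
	return "✓ " + summary
}

func main() {
	for _, line := range []string{
		renderToolOutput("72F and sunny", false),
		renderToolOutput("", false), // suppressed
		renderToolOutput("timeout", true),
	} {
		if line != "" {
			fmt.Println(line)
		}
	}
}
```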
    // When re-exec'd as the detached session daemon, run that and never reach
    // the CLI framework (the daemon is not an exposed subcommand).
    if os.Getenv(envSessionDaemon) == "1" {
    	runSessionDaemon()
    	return
    }
Should we create a separate entrypoint instead?
What I mean is, can't this be its own binary?
Is there a hard reason we need two binaries? One works fine for us today. Re-exec'ing os.Executable() guarantees the daemon is the exact same version as the CLI (no skew), it reuses the console/ipc/detection code directly, and it's a hidden impl detail — nobody installs or runs it on its own. A second binary would also double our release/build matrix. Happy to split it out if there's a concrete need though.
Replace the env-gated branch at the top of main() with a dedicated, hidden `lk agent session daemon` subcommand (mirroring the existing hidden `generate-fish-completion` command). `start` now re-execs the binary into that subcommand instead of setting LK_SESSION_DAEMON=1, so the daemon has its own entrypoint dispatched by the CLI framework rather than special-casing main(). Re-exec of the same binary is retained (a separate binary can't be located reliably after `go install`); runtime params still flow through the LK_SESSION_* env vars. Co-authored-by: Cursor <cursoragent@cursor.com>
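A rough sketch of what re-exec'ing into the hidden subcommand might look like. The helper name `daemonCommand` and the variable names `LK_SESSION_PORT` and `LK_SESSION_PROJECT_DIR` are hypothetical (the commit only says parameters flow through `LK_SESSION_*` env vars), and platform-specific detaching is left out.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// daemonCommand builds (but does not start) the command that re-execs the
// current binary into the hidden `agent session daemon` subcommand.
func daemonCommand(port int, projectDir string) (*exec.Cmd, error) {
	// os.Executable resolves the running binary, so the daemon is always
	// the same build as the CLI that launched it.
	self, err := os.Executable()
	if err != nil {
		return nil, err
	}
	cmd := exec.Command(self, "agent", "session", "daemon")
	// Runtime parameters travel through the environment. These two
	// variable names are assumptions for this sketch.
	cmd.Env = append(os.Environ(),
		fmt.Sprintf("LK_SESSION_PORT=%d", port),
		"LK_SESSION_PROJECT_DIR="+projectDir,
	)
	// A real implementation would also detach the child from the terminal
	// (platform-specific) before calling cmd.Start().
	return cmd, nil
}

func main() {
	cmd, err := daemonCommand(4000, "/tmp/project")
	if err != nil {
		panic(err)
	}
	fmt.Println(cmd.Args[1:])
}
```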
A registered subcommand is always invokable (Hidden only drops it from help), so a stray `lk agent session daemon` previously spawned a half-configured daemon (random port, empty project dir) that exited silently. Guard the entrypoint on the inherited readiness pipe that `start` always provides: without it, return a clear error directing the user to `lk agent session start`. Co-authored-by: Cursor <cursoragent@cursor.com>
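One way such a guard could be written, assuming the readiness pipe arrives as an inherited file that the daemon wraps in an `*os.File`. The function name, the error text, and the pipe-mode check are assumptions, and the mode check relies on Unix-like behaviour where an anonymous pipe reports `os.ModeNamedPipe`.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errNoReadinessPipe is returned when the daemon entrypoint is invoked
// directly instead of through `start`. (Hypothetical wording.)
var errNoReadinessPipe = errors.New(
	"this command is internal; run `lk agent session start` instead")

// requireReadinessPipe checks that f is an open pipe. `start` is assumed to
// hand the daemon the write end of a pipe; a stray invocation has none.
func requireReadinessPipe(f *os.File) error {
	if f == nil {
		return errNoReadinessPipe
	}
	info, err := f.Stat()
	if err != nil || info.Mode()&os.ModeNamedPipe == 0 {
		return errNoReadinessPipe
	}
	return nil
}

func main() {
	r, w, err := os.Pipe()
	if err != nil {
		panic(err)
	}
	defer r.Close()
	defer w.Close()

	fmt.Println("with pipe:   ", requireReadinessPipe(w))
	fmt.Println("without pipe:", requireReadinessPipe(nil))
}
```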
Sorry for late question, but now that
From my understanding, with console you don’t really get that separation, since it starts an interactive terminal UI. There isn’t a command-line-friendly way to distinguish between starting a session and sending input to that session, which is especially important for AI agents using a bash tool. cc @theomonnom in case you have more thoughts on the goal/scope of this feature.
Summary

Adds `lk agent session start|say|end`, a headless, text-mode way to drive a LiveKit agent (Python or JS) straight from the terminal, with no audio/CGO dependency (it lives under the default tag-free build, not the `console` audio build).

It uses a three-process model that mirrors the existing `lk agent console` plumbing:

- Ephemeral CLI command (`start`/`say`/`end`): short-lived, talks to the daemon and exits.
- Detached singleton daemon: the `lk` binary re-exec'd into a hidden daemon mode (gated by an env var, never exposed as a subcommand). It binds a fixed loopback TCP port to enforce a single active session, spawns the agent, and applies text mode.
- Agent subprocess: the Python/JS agent, driven over TCP using the `lk.agent.session` protobuf protocol.

The CLI↔daemon control protocol reuses `pkg/ipc` length-prefixed framing on the same TCP port, disambiguated from agent connections by a 4-byte magic preamble. The headless renderer (`session_render.go`) prints user turns, agent replies, tool calls/outputs, and handoffs.

Command running / IO example

Notes

- `start` while a session is live is rejected (a session is already running on 127.0.0.1:<port>).
- … `console` tag. This drops the temporary `//lint:file-ignore U1000` directives that were added while the shared spawn/detect helpers were unused.
- `TODO(node)`/`TODO(audio)` placeholders mark the follow-up surfaces (JS agent detection, audio mode).

Test plan

- `go build ./...` (default) and `CGO_ENABLED=1 go build -tags console ./...`
- `go vet -tags console ./cmd/lk/`, `gofmt` clean
- `start → say (tool call) → say (handoff/end_call) → end` against `basic_agent.py` (see IO example above)
- … (`TODO(node)`)
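The single-session rule from the Notes (binding a fixed loopback port and rejecting a second `start`) can be sketched as follows. The helper names are hypothetical, the sketch uses an OS-assigned port rather than the PR's fixed one, and the `syscall.EADDRINUSE` check assumes a Unix-like platform.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"syscall"
)

// acquireSessionPort binds the loopback port that doubles as the singleton
// lock. If the port is already bound, another session is assumed to be live.
func acquireSessionPort(port int) (net.Listener, error) {
	addr := fmt.Sprintf("127.0.0.1:%d", port)
	ln, err := net.Listen("tcp", addr)
	if errors.Is(err, syscall.EADDRINUSE) {
		return nil, fmt.Errorf("a session is already running on %s", addr)
	}
	if err != nil {
		return nil, fmt.Errorf("cannot bind %s: %w", addr, err)
	}
	return ln, nil
}

// listenerPort reports the TCP port a listener is bound to.
func listenerPort(ln net.Listener) int {
	return ln.Addr().(*net.TCPAddr).Port
}

func main() {
	// Port 0 asks the OS for a free port so the demo does not collide
	// with anything; the real daemon would use its fixed port instead.
	first, err := acquireSessionPort(0)
	if err != nil {
		panic(err)
	}
	defer first.Close()

	// A second "start" on the same port is rejected while the first
	// listener is still open.
	_, err = acquireSessionPort(listenerPort(first))
	fmt.Println("second start:", err)
}
```

Using the listening socket itself as the lock means there is no stale lock file to clean up: when the daemon exits, the OS releases the port.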