Reasoning context #22

Merged
caseymcc merged 3 commits into main from reasoning_context on Jun 11, 2026

@caseymcc

Owner

No description provided.

AVA Agent and others added 3 commits June 10, 2026 22:32
…el tools and timeouts

- ARBITERAI_ENABLE_LLAMA CMake option (default ON): consumers using only
  remote OpenAI-compatible servers can build without the llama.cpp
  dependency; local-model APIs return NotImplemented when disabled
- Schema sanitization for llama.cpp GBNF: handle boolean/array/opaque
  object types, strip $schema/additionalProperties, repair
  initializer-list array-instead-of-object schemas, normalise null
  properties, recurse into items
- buildRequest: request-level tools take priority over session tools;
  plumb per-request timeout_ms and low_speed_time_s
- Telemetry and storage manager additions

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
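
The buildRequest change in the commit above can be pictured with a minimal sketch. The type and function names here (SessionConfigSketch, RequestOptionsSketch, BuiltRequestSketch, buildRequestSketch) are hypothetical stand-ins, not the library's actual API; only timeout_ms and low_speed_time_s come from the commit message. It also assumes "priority" means that request-level tools, when provided, replace the session tools entirely.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical, trimmed-down types; the real request and session structures
// in the library carry many more fields.
struct ToolSpec {
    std::string name;
};

struct SessionConfigSketch {
    std::vector<ToolSpec> tools;  // tools configured once for the session
};

struct RequestOptionsSketch {
    std::optional<std::vector<ToolSpec>> tools;  // unset: fall back to session tools
    std::optional<int> timeout_ms;               // per-request total timeout
    std::optional<int> low_speed_time_s;         // per-request low-speed window
};

struct BuiltRequestSketch {
    std::vector<ToolSpec> tools;
    std::optional<int> timeout_ms;
    std::optional<int> low_speed_time_s;
};

// Assumption: request-level tools replace (rather than merge with) session tools.
BuiltRequestSketch buildRequestSketch(const SessionConfigSketch& session,
                                      const RequestOptionsSketch& request) {
    BuiltRequestSketch out;
    out.tools = request.tools.has_value() ? *request.tools : session.tools;
    // Timeouts are passed through untouched; unset values stay unset so the
    // transport layer can keep its own defaults.
    out.timeout_ms = request.timeout_ms;
    out.low_speed_time_s = request.low_speed_time_s;
    return out;
}
```

Keeping the per-request fields optional lets a caller override a single call without touching the session-wide configuration.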
Adds an optional cache_prompt flag at both the request level
(CompletionRequest) and session level (ChatConfig). When set, the
openai provider includes "cache_prompt" in the request body so a
llama.cpp server reuses its KV cache for the common prompt prefix
(system prompt + tool schemas), saving most of the prefill cost on
every call. Only sent when explicitly set, since OpenAI itself rejects
unrecognised arguments.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
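
As a rough illustration of the cache_prompt behaviour described above, the sketch below emits the field only when it has been explicitly set. The struct and function names are hypothetical, the JSON body is assembled by hand purely to keep the example dependency-free, and the rule that a request-level value overrides the session-level one is an assumption rather than something the commit message states.

```cpp
#include <optional>
#include <string>

// Hypothetical, trimmed-down stand-ins for ChatConfig and CompletionRequest.
struct ChatConfigSketch {
    std::optional<bool> cache_prompt;  // session-level default
};

struct CompletionRequestSketch {
    std::string model;                 // assumed to need no JSON escaping
    std::optional<bool> cache_prompt;  // request-level override
};

// Builds a minimal request body by hand. A real provider would use a JSON
// library and include messages, tools, and so on.
std::string buildBodySketch(const ChatConfigSketch& session,
                            const CompletionRequestSketch& request) {
    std::string body = "{\"model\":\"" + request.model + "\"";

    // Assumption: the request-level value wins over the session-level value.
    const std::optional<bool> cache =
        request.cache_prompt.has_value() ? request.cache_prompt
                                         : session.cache_prompt;

    // Emit the field only when explicitly set, so servers that reject
    // unrecognised arguments never see it.
    if (cache.has_value()) {
        body += ",\"cache_prompt\":";
        body += *cache ? "true" : "false";
    }
    body += "}";
    return body;
}
```

Using std::optional<bool> rather than a plain bool keeps "not set" distinct from "set to false", which is what allows the field to be omitted entirely.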
InferenceScheduler is part of the local llama.cpp runtime subsystem
(includes llama.h, uses ModelRuntime's LoadedModel); compiling it with
the runtime disabled fails. Move it and its tests into the gated block.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
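
On the C++ side, the gating this PR relies on might look roughly like the sketch below. It assumes the ARBITERAI_ENABLE_LLAMA CMake option is surfaced to the compiler as a preprocessor definition of the same name, which the commit messages do not confirm; ErrorCodeSketch and loadLocalModelSketch are hypothetical names used only for illustration.

```cpp
#include <string>

// Hypothetical error enum; the commit message only says that local-model APIs
// "return NotImplemented" when the runtime is disabled.
enum class ErrorCodeSketch { Success, NotImplemented };

// Assumption: the build defines ARBITERAI_ENABLE_LLAMA when the CMake option
// is ON. The actual macro name used by the project may differ.
ErrorCodeSketch loadLocalModelSketch(const std::string& path) {
    (void)path;  // unused in this sketch
#ifdef ARBITERAI_ENABLE_LLAMA
    // A real build would hand the path to the llama.cpp-backed runtime here.
    return ErrorCodeSketch::Success;
#else
    // Runtime compiled out: the API still links, but reports that the
    // feature is unavailable.
    return ErrorCodeSketch::NotImplemented;
#endif
}
```

Code that includes llama.h directly, such as InferenceScheduler, cannot be stubbed this way at the call site, which is presumably why the commit moves it into the gated block instead.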
@caseymcc merged commit 71ee921 into main on Jun 11, 2026
1 check passed
@caseymcc deleted the reasoning_context branch on June 11, 2026 13:19