
KQL (Kibana Query Language) support #6491

Open
siva-abstract-security wants to merge 2 commits into quickwit-oss:main from siva-abstract-security:feature/kql-support

@siva-abstract-security
Contributor

Adds optional KQL parsing as a thin translation layer at the REST entry point. KQL inputs are parsed and lowered to QueryAst using only existing variants (BoolQuery, FullTextQuery, RangeQuery, FieldPresenceQuery, WildcardQuery, MatchAll, UserInputQuery) — the core enum, visitor traits, tag pruning, and root-search remain unchanged.

Wire surface:

  • Native REST: ?kql=<expr> on /api/v1/{index}/search, mutually exclusive with the existing query= parameter.
  • Elastic-compat JSON: {"query": {"kql": {"query": "...", ...}}} on /api/v1/_elastic/{index}/_search. Documented as a Quickwit-only extension since real Elasticsearch returns parsing_exception.

Supported grammar (matches the public Kibana KQL reference):

  • field:value terms, quoted phrases, and bare default-field terms
  • * (match-all fast path) and ?/* wildcards
  • field:* exists checks
  • Ranges: field:>=N, field:>N, field:<=N, field:<N (numeric literals are coerced to JsonLiteral::Number; non-numeric values fall back to String)
  • Boolean and/or/not (case-insensitive), juxtaposition as implicit AND, and parentheses
  • field:(a or b) value groups with proper field distribution
  • Escape semantics (\and, \:, \+)
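
The range rule in the grammar above (numeric literals become numbers, anything else stays a string) could be sketched roughly as follows. This is an illustration only, not the PR's parser: `RangeOp`, `Literal`, and `parse_range_term` are hypothetical names, and `Literal` is a stand-in for Quickwit's `JsonLiteral`. It also ignores quoting and escapes.

```rust
/// Comparison operators accepted after `field:` in this sketch (hypothetical type).
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum RangeOp {
    Gt,
    Gte,
    Lt,
    Lte,
}

/// Stand-in for a JSON literal; hypothetical, not Quickwit's `JsonLiteral`.
#[derive(Debug, PartialEq, Clone)]
pub enum Literal {
    Number(f64),
    String(String),
}

/// Splits a `field:>=N` style term into (field, operator, literal).
/// Returns `None` when the input is not a range term.
pub fn parse_range_term(input: &str) -> Option<(String, RangeOp, Literal)> {
    let (field, rest) = input.split_once(':')?;
    if field.is_empty() {
        return None;
    }
    // Check two-character operators before their one-character prefixes.
    let (op, value) = if let Some(v) = rest.strip_prefix(">=") {
        (RangeOp::Gte, v)
    } else if let Some(v) = rest.strip_prefix("<=") {
        (RangeOp::Lte, v)
    } else if let Some(v) = rest.strip_prefix('>') {
        (RangeOp::Gt, v)
    } else if let Some(v) = rest.strip_prefix('<') {
        (RangeOp::Lt, v)
    } else {
        return None;
    };
    let value = value.trim();
    if value.is_empty() {
        return None;
    }
    // Finite numeric literals become numbers; everything else stays a string.
    let literal = match value.parse::<f64>() {
        Ok(n) if n.is_finite() => Literal::Number(n),
        _ => Literal::String(value.to_string()),
    };
    Some((field.to_string(), op, literal))
}
```

Under these assumptions, `bytes:>=100` yields a numeric bound, while `ts:<2024-01-01` keeps its bound as a string and leaves interpretation to the field's type.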

Safety rails (all return HTTP 400 with specific error messages):

  • Max KQL input length: 16 KiB (REST layer)
  • Max parser nesting depth: 64
  • Max bare-token length: 1 KiB
  • Max quoted-phrase length: 4 KiB
  • {...} nested-field syntax rejected (Quickwit has no nested type)
  • Nested field qualifier inside value group rejected
  • query and kql mutually exclusive
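
A few of the rails above (mutual exclusion, the input-length cap, and the nesting-depth cap) can be illustrated with a simple pre-parse check. This is a simplified sketch under stated assumptions: the PR describes the depth guard as living inside the recursive-descent parser, whereas this version merely counts parentheses and does not account for quoted phrases or escapes. All names here (`KqlRejection`, `check_request`, the constants) are hypothetical.

```rust
/// Limits taken from the list above; the constant names are hypothetical.
pub const MAX_KQL_INPUT_BYTES: usize = 16 * 1024;
pub const MAX_NESTING_DEPTH: usize = 64;

/// Reasons a request would be rejected (each would map to an HTTP 400).
#[derive(Debug, PartialEq)]
pub enum KqlRejection {
    BothQueryAndKql,
    InputTooLong { len: usize },
    TooDeep { depth: usize },
    UnbalancedParens,
}

/// Pre-parse checks: mutual exclusion, input length, and paren nesting depth.
/// Simplification: parentheses inside quoted phrases are counted too.
pub fn check_request(query: Option<&str>, kql: Option<&str>) -> Result<(), KqlRejection> {
    let expr = match (query, kql) {
        (Some(_), Some(_)) => return Err(KqlRejection::BothQueryAndKql),
        (_, None) => return Ok(()), // no KQL supplied, nothing to validate here
        (None, Some(expr)) => expr,
    };
    if expr.len() > MAX_KQL_INPUT_BYTES {
        return Err(KqlRejection::InputTooLong { len: expr.len() });
    }
    let mut depth = 0usize;
    for ch in expr.chars() {
        match ch {
            '(' => {
                depth += 1;
                if depth > MAX_NESTING_DEPTH {
                    return Err(KqlRejection::TooDeep { depth });
                }
            }
            ')' => {
                // A closing paren with no matching opener is rejected immediately.
                depth = depth.checked_sub(1).ok_or(KqlRejection::UnbalancedParens)?;
            }
            _ => {}
        }
    }
    if depth != 0 {
        return Err(KqlRejection::UnbalancedParens);
    }
    Ok(())
}
```

Returning a distinct variant per failure is one way to produce the specific error messages the list mentions; how the PR actually structures its errors is not shown here.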

Observability:

  • quickwit_kql_parse_total counter
  • quickwit_kql_parse_failures_total counter
  • quickwit_kql_parse_duration_seconds histogram
  • Structured kql=<bool>, tantivy_grammar=<bool> fields on search log lines so SRE can split KQL vs Tantivy-grammar traffic without parsing raw query strings.
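
As a rough illustration of how the three metrics relate, the sketch below wraps a parse call and records a total, a failure count, and elapsed time. It uses only the standard library and is an assumption-laden stand-in: the PR uses Quickwit's `lazy_counter!`/`lazy_histogram!` macros, the `KqlParseMetrics` type and `observe` method are hypothetical, and the histogram is approximated here by a running sum of microseconds.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Instant;

/// Hypothetical in-process stand-in for the three KQL parse metrics.
#[derive(Default)]
pub struct KqlParseMetrics {
    pub parse_total: AtomicU64,
    pub parse_failures_total: AtomicU64,
    /// Running sum of parse time; a real histogram would bucket observations.
    pub parse_duration_micros_sum: AtomicU64,
}

impl KqlParseMetrics {
    /// Runs `parse`, recording one attempt, its duration, and whether it failed.
    pub fn observe<T, E>(&self, parse: impl FnOnce() -> Result<T, E>) -> Result<T, E> {
        let start = Instant::now();
        let result = parse();
        let elapsed_micros = start.elapsed().as_micros() as u64;
        self.parse_total.fetch_add(1, Ordering::Relaxed);
        if result.is_err() {
            self.parse_failures_total.fetch_add(1, Ordering::Relaxed);
        }
        self.parse_duration_micros_sum
            .fetch_add(elapsed_micros, Ordering::Relaxed);
        result
    }
}
```

With counters shaped like this, the failure ratio is `parse_failures_total / parse_total`, which is the kind of split the structured log fields are meant to support.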

OpenAPI:

  • query is now #[serde(default)] (semantically optional at the wire layer); utoipa override exposes both query and kql as Option<String> so generated SDK clients no longer encode the obsolete required: ["query"] contract.

Tests:

  • 246 unit tests in quickwit-query covering lexer, parser (recursive-descent with depth guard), lowering (with Tantivy-grammar escape handling for default-field deferral), metrics wiring, JSON DSL deserialization, and proptest fuzz (~6k cases) confirming the parser never panics on arbitrary input.
  • Kibana conformance corpus pinning expected ASTs for each documented KQL idiom + explicit notes on intentional divergences.
  • REST handler unit tests for kql/query mutual exclusion, whitespace handling, size caps, and search_fields propagation.
  • Integration scenarios under rest-api-tests/scenarii/kql_search/ asserting exact hit counts against a known dataset.
  • Concurrent load harness (load_test.py) mixing happy-path and adversarial shapes; multi-node docker-compose template for distributed root→leaf testing.

Line coverage on KQL production code: 95-99% per file; the remaining gaps are defensive code, test panic-guards in let-else patterns, and lazy_counter!/lazy_histogram! macro internals the coverage tool cannot introspect.

Files modified outside the new kql/ module: 6 (Cargo.lock, quickwit-cli/src/tool.rs, quickwit-query/Cargo.toml, quickwit-query/src/elastic_query_dsl/mod.rs, quickwit-query/src/lib.rs, quickwit-serve/src/search_api/rest_handler.rs). The core QueryAst enum, QueryAstVisitor, QueryAstTransformer, tag_pruning, and root-search are untouched.

@siva-abstract-security requested review from a team as code owners June 3, 2026 13:39
@siva-abstract-security changed the title from "feat(query): add KQL (Kibana Query Language) support" to "KQL (Kibana Query Language) support" Jun 3, 2026
Comment on lines +5 to +7
json:
  query:
    kql:
Collaborator

This is not an actual parameter of the Elasticsearch API, is it?

Contributor Author

@fulmicoton : You're right — KQL isn't part of the ES query DSL (it's a Kibana-side concept that compiles to standard ES DSL before reaching ES; a real cluster would reject this payload with parsing_exception).
Putting it under _elastic/ was a mistake — that namespace should stay honest to the ES wire contract.

I'll drop the Kql variant from ElasticQueryDslInner and remove this scenario. KQL remains reachable via:

GET /api/v1//search?kql=...
POST /api/v1//search with {"kql": "...", "max_hits": ...} (top-level field on SearchRequestQueryString, not under query)

Contributor Author

I have updated the PR.

@siva-abstract-security force-pushed the feature/kql-support branch 2 times, most recently from 7c37504 to 50ac897 June 4, 2026 15:37
@siva-abstract-security
Contributor Author

@fulmicoton

  1. Does KQL on the Quickwit server fit the roadmap?

If that's not a use case Quickwit is interested in supporting, I'm happy to close this and publish the
parser as a separate crate for the few users who do want it.

  2. If yes: parser shape.

    I built a standalone parser rather than extending tantivy_query_grammar.
    The grammars overlap ~80% (field:value, AND/OR/NOT, parens, wildcards) and
    diverge on a handful of points (lowercase keywords, field:>=N range syntax,
    field:* exists, field:(a or b) value groups).

    Would you prefer:

    • (a) I refactor to extend tantivy_query_grammar and replace the new parser
    • (b) Keep the standalone parser, accepting the parallel maintenance
    • (c) Something else entirely

    Any of these is fine on my end; I'd rather get the shape right than keep iterating
    on the wrong one.
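
For readers weighing the options, the `field:(a or b)` divergence listed above comes down to pushing the field qualifier onto every term inside the group. A minimal sketch of that distribution step, using hypothetical `Group` and `Query` types rather than the PR's AST or Quickwit's `QueryAst`:

```rust
/// Minimal value-group AST for this sketch; hypothetical, not the PR's types.
#[derive(Debug, PartialEq, Clone)]
pub enum Group {
    Term(String),
    Or(Vec<Group>),
    And(Vec<Group>),
    Not(Box<Group>),
}

/// Field-qualified query tree produced by distribution (also hypothetical).
#[derive(Debug, PartialEq, Clone)]
pub enum Query {
    FieldTerm { field: String, value: String },
    Or(Vec<Query>),
    And(Vec<Query>),
    Not(Box<Query>),
}

/// Pushes `field` down to every term inside the group, preserving its shape.
pub fn distribute(field: &str, group: &Group) -> Query {
    match group {
        Group::Term(value) => Query::FieldTerm {
            field: field.to_string(),
            value: value.clone(),
        },
        Group::Or(items) => Query::Or(items.iter().map(|g| distribute(field, g)).collect()),
        Group::And(items) => Query::And(items.iter().map(|g| distribute(field, g)).collect()),
        Group::Not(inner) => Query::Not(Box::new(distribute(field, inner))),
    }
}
```

Under these assumptions, `status:(a or b)` becomes the equivalent of `status:a or status:b`.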
