Skip to content

Unify SolrIndexSearcher paths with CollectorManager integration#4487

Open
markrmiller wants to merge 1 commit into
apache:mainfrom
markrmiller:unify-searcher-collector-manager
Open

Unify SolrIndexSearcher paths with CollectorManager integration#4487
markrmiller wants to merge 1 commit into
apache:mainfrom
markrmiller:unify-searcher-collector-manager

Conversation

@markrmiller
Copy link
Copy Markdown
Member

Replaces the dual sequential/parallel branches in getDocListNC and getDocListAndSetNC with a single private searchAndCollect helper that internally selects between single-slice and multi-slice execution. The top-level methods no longer fork on MultiThreadedSearcher.allowMT. Also makes canceling a query when doing parallel segment search actually do something.

Performance — this commit vs parent

1.5M docs / 150 segments, 300 samples per cell per build, combined from two independent round-robin sessions (parent and this commit runs alternated within each session against a single shared solr-home). 30 warmup + 30 timed runs per cell per round. sudo powermetrics --samplers cpu_power,thermal ran in parallel at 1s cadence for both sessions; thermal verification below.

cell                                                               parent ms±se     this commit   ms±se    delta%        z
-----------------------------------------------------------------------------------------------------------------------------
match-all-top500           baseline    mt_false                  17.98±0.042      17.72±0.025      -1.4%     -5.4
match-all-top500           baseline    mt_true                   17.85±0.039      17.73±0.026      -0.7%     -2.7
match-all-top500           ste         mt_false                  10.19±0.027      10.09±0.022      -1.0%     -2.8
match-all-top500           ste         mt_true                   10.13±0.021      10.10±0.023      -0.3%     -0.8
match-all-top500           postfilter  mt_false                  25.54±0.045      25.78±0.083      +0.9%     +2.5
match-all-top500           postfilter  mt_true                   25.41±0.045      25.62±0.033      +0.8%     +3.8
match-all-top500           both        mt_false                  19.94±0.044      20.23±0.029      +1.5%     +5.6
match-all-top500           both        mt_true                   20.12±0.080      20.23±0.035      +0.5%     +1.2

big-bool-or-top1000        baseline    mt_false                  24.76±0.053      24.88±0.038      +0.5%     +1.8
big-bool-or-top1000        baseline    mt_true                   26.31±0.046      26.31±0.038      -0.0%     -0.1
big-bool-or-top1000        ste         mt_false                   5.66±0.036       5.58±0.020      -1.5%     -2.1
big-bool-or-top1000        ste         mt_true                    5.61±0.027       5.49±0.020      -2.2%     -3.6
big-bool-or-top1000        postfilter  mt_false                  30.49±0.040      30.55±0.040      +0.2%     +1.0
big-bool-or-top1000        postfilter  mt_true                   30.47±0.041      30.58±0.036      +0.4%     +2.0
big-bool-or-top1000        both        mt_false                   6.96±0.023       6.89±0.021      -1.0%     -2.3
big-bool-or-top1000        both        mt_true                    6.89±0.037       6.98±0.022      +1.3%     +2.1

sort-by-rank-top500        baseline    mt_false                  17.73±0.025      17.83±0.026      +0.6%     +2.9
sort-by-rank-top500        baseline    mt_true                   17.76±0.025      17.78±0.027      +0.1%     +0.6
sort-by-rank-top500        ste         mt_false                  17.40±0.021      17.37±0.024      -0.2%     -0.9
sort-by-rank-top500        ste         mt_true                   17.39±0.022      17.37±0.024      -0.2%     -0.9
sort-by-rank-top500        postfilter  mt_false                  25.86±0.065      25.67±0.034      -0.8%     -2.7
sort-by-rank-top500        postfilter  mt_true                   25.74±0.030      25.79±0.080      +0.2%     +0.6
sort-by-rank-top500        both        mt_false                  23.98±0.030      23.90±0.028      -0.3%     -2.0
sort-by-rank-top500        both        mt_true                   23.93±0.027      23.96±0.030      +0.1%     +0.7

big-bool-or-score-top500   baseline    mt_false                  54.93±0.070      55.05±0.110      +0.2%     +0.9
big-bool-or-score-top500   baseline    mt_true                   53.37±0.063      53.54±0.088      +0.3%     +1.5
big-bool-or-score-top500   ste         mt_false                  19.49±0.072      19.60±0.112      +0.5%     +0.8
big-bool-or-score-top500   ste         mt_true                   19.52±0.161      19.50±0.073      -0.1%     -0.1
big-bool-or-score-top500   postfilter  mt_false                  59.68±0.167      59.63±0.065      -0.1%     -0.3
big-bool-or-score-top500   postfilter  mt_true                   59.42±0.056      59.69±0.102      +0.4%     +2.3
big-bool-or-score-top500   both        mt_false                  20.41±0.174      20.66±0.252      +1.2%     +0.8
big-bool-or-score-top500   both        mt_true                   20.15±0.039      20.20±0.044      +0.2%     +0.8

Negative delta = this commit faster than parent.

sudo powermetrics --samplers cpu_power,thermal -i 1000 ran in parallel for both sessions (~13 min each).

build sessions runs P-cluster avg MHz CPU_W avg thermal pressure
parent 2 10 P0 3041, P1 3060 8.27 Nominal (every sample)
patched 2 10 P0 3051, P1 3055 8.25 Nominal (every sample)

No Fair/Serious/Critical events for any sample in any run across either session.

…ased entry point

Replaces the dual sequential/parallel branches in getDocListNC and
getDocListAndSetNC with a single private searchAndCollect helper that
internally selects between single-slice and multi-slice execution. The
top-level methods no longer fork on MultiThreadedSearcher.allowMT.

Key changes:

- Extracts TopDocsCM, MaxScoreCM, DocSetCM, FixedBitSetCollector,
  SearchResult, TopDocsResult, and MaxScoreResult into a new
  SolrCollectorManagers utility. Adds PostFilterCM<T> (single-slice;
  setLastDelegate per newCollector, complete() at reduce) and
  DocSetCollectorCM (preserves the SortedIntDocSet sparse optimization).
  Multi-slice paths keep the FixedBitSet-based DocSetCM.
- Deletes MultiThreadedSearcher; allowMT is inlined as the
  canParallelize check inside searchAndCollect.
- Migrates the collector chain (segment-terminate-early, max-hits
  early termination, optional post-filter, optional cancellation) to a
  CollectorManager-based ChainCM applied per newCollector(). Single-slice
  vs multi-slice is chosen by ChainCM.requiresSingleSlice() OR the
  caller's parallelEligible hint.
- Replaces the outer CancellableCollector with a shared AtomicBoolean
  polled by every per-slice CancellableCollector, so a single
  tracker.cancel(queryID) is observed across all parallel slices.
- ChainCM.getScoreMode() returns the chain-wrapped collector's score
  mode so a Solr post-filter override (e.g. CollapsingPostFilter
  forcing COMPLETE) reaches populateScoresIfNeeded. Guaranteed
  non-null at the call site: Lucene's IndexSearcher.search calls
  newCollector() unconditionally before its empty-index short-circuit,
  so the call site uses Objects.requireNonNull rather than an
  unreachable fallback.
- TopDocsCM.reduce skips TopDocs.merge for a single input, preserving
  the underlying TopDocsCollector's EQUAL_TO / GREATER_THAN_OR_EQUAL_TO
  relation exactly.
- Aligns SolrMultiCollectorManager with Lucene's MultiCollector:
  per-child CollectionTerminatedException handling, per-child
  setMinCompetitiveScore aggregation (MinCompetitiveScoreAwareScorable),
  and removal of the EarlyTerminatingCollector wrap (ChainCM owns it,
  with a shared LongAdder across slices to preserve cross-slice
  maxHitsAllowed semantics). Clamps EarlyTerminatingCollector chunkSize
  to >= 1 to avoid divide-by-zero when maxDocsToCollect < 10.

A few hot-path alignments with Lucene's MultiCollector pattern that
together avoid a regression on heavy multi-term boolean OR queries:

- searchAndCollect skips SolrMultiCollectorManager entirely when only
  one inner CollectorManager is built (top-N without fl=score and
  without docSet — the common case), routing through a new
  SolrCollectorManagers.wrapSingle adapter. Mirrors Lucene's own
  MultiCollectorManager.wrap() short-circuit for the one-collector case.
- SolrMultiCollectorManager$Collectors$LeafCollectors.collect now
  matches MultiCollector.MultiLeafCollector#collect exactly: the
  per-iteration body is load-slot / null-check / call, and "all children
  terminated" is detected lazily inside the rare
  CollectionTerminatedException catch via allLeavesTerminated().
- QueryCancelledException and EarlyTerminatingCollectorException
  override fillInStackTrace() to return this, matching
  CollectionTerminatedException — these are control-flow exceptions
  and should not pay for stack-trace fill.

No behavior change: numFound, ranked doc_ids, maxScore, numFoundExact
and start match the parent commit on every probed (query, mode,
multiThreaded) cell, including post-filter, STE, and post-filter + STE
combinations.

Verified with the full org.apache.solr.search.* suite plus targeted
runs for TestMultiThreadedSearcher, TestEarlyTerminatingQueries,
TestQueryLimits, CursorPagingTest, TestRankQueryPlugin,
TestCollapseQParserPlugin, TestRandomCollapseQParserPlugin, and
TestReRankQParserPlugin.
Copy link
Copy Markdown
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very impressive... this is really an overhaul of multithreaded collection/search, not just a unification.

Glad to see benchmarks! But I'm surprised to see that solr/benchmark (JMH) wasn't used... why not? I'm also having trouble interpreting what you posted.

100 segments is a lot; not very realistic

}
}

static class FixedBitSetCollector extends SimpleCollector {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could use a bit more docs/comments to explain what's going on

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

surprisingly a net increase in LOC ... but more documentation so that's good

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants