Make top-k adaptive: over-retrieve, rerank, then apply a relative cutoff (keep passages scoring within X% of the top score), bounded by a min and max k, so that targeted questions don't pull in junk and broad questions get sufficient coverage
Benchmark against a more modern 8B model (e.g. Gemma 4 E4B, which scores 'highest' on AA Omniscience, i.e. low hallucinations), given that we don't use the full OpenScholar pipeline
Benchmark against the actual OpenScholar pipeline (generation-feedback-regeneration)
Curate our own benchmark, with expert-written OS answers (not sure how much of that exists already)
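
For the adaptive top-k item, here is a rough sketch of the selection step, to make the proposal concrete. It assumes the reranker has already run and returns non-negative scores (e.g. sigmoid-normalised to [0, 1]). The function name `adaptive_top_k` and the parameters `rel_cutoff`, `min_k` and `max_k` are placeholders, not existing code in this repo, and the default values are illustrative rather than tuned.

```python
from typing import List, Tuple

Passage = Tuple[str, float]  # (passage_id, rerank_score); hypothetical shape


def adaptive_top_k(
    scored: List[Passage],
    rel_cutoff: float = 0.8,  # keep passages scoring >= 80% of the top score (assumed default)
    min_k: int = 2,           # lower bound on passages returned (assumed default)
    max_k: int = 10,          # upper bound on passages returned (assumed default)
) -> List[Passage]:
    """Pick a variable number of reranked passages using a relative score cutoff.

    Assumes scores are non-negative; with negative scores (raw logits) the
    relative threshold is not meaningful and only min_k would apply.
    """
    if not scored:
        return []

    # Sort by rerank score, best first.
    ranked = sorted(scored, key=lambda p: p[1], reverse=True)

    # Count passages within the relative cutoff of the top score.
    threshold = ranked[0][1] * rel_cutoff
    n_within = sum(1 for _, score in ranked if score >= threshold)

    # Clamp the count to [min_k, max_k]; slicing handles short candidate lists.
    k = max(min_k, min(n_within, max_k))
    return ranked[:k]


if __name__ == "__main__":
    targeted = [("a", 0.95), ("b", 0.40), ("c", 0.35), ("d", 0.10)]
    broad = [("a", 0.90), ("b", 0.88), ("c", 0.85), ("d", 0.80), ("e", 0.30)]
    print(adaptive_top_k(targeted, min_k=1, max_k=3))  # one clear winner survives
    print(adaptive_top_k(broad, min_k=1, max_k=3))     # many near-ties, capped at max_k
```

The over-retrieve and rerank steps are left out on purpose, since they depend on whichever retriever and reranker we end up using. Note that `min_k` can pull in passages below the cutoff; whether that is desirable for targeted questions is an open question for the benchmark.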