perf!: avoid listing index files after writes#7129
Conversation
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.
Tip: disable this comment in your organization's Code Review settings.
|
@claude review once |
| ivf_mut.write(&mut writer).await?; | ||
| writer.finish().await?; | ||
| let index_size = writer.finish().await? as u64; | ||
|
|
||
| // Write the aux file | ||
| aux_ivf.write(&mut aux_writer).await?; | ||
| aux_writer.finish().await?; | ||
| let aux_size = aux_writer.finish().await? as u64; | ||
|
|
||
| Ok(existing_indices.len() - start_pos) | ||
| Ok(( | ||
| existing_indices.len() - start_pos, | ||
| vec![ | ||
| IndexFile { | ||
| path: INDEX_FILE_NAME.to_string(), | ||
| size_bytes: index_size, | ||
| }, | ||
| IndexFile { | ||
| path: INDEX_AUXILIARY_FILE_NAME.to_string(), | ||
| size_bytes: aux_size, | ||
| }, | ||
| ], | ||
| )) |
There was a problem hiding this comment.
🔴 In optimize_ivf_hnsw_indices (rust/lance/src/index/vector/ivf.rs:949-968), writer.finish().await? and aux_writer.finish().await? are dispatched against the concrete PreviousFileWriter constructed at lines 861-865/889-893, whose inherent finish() (rust/lance-file/src/previous/writer/mod.rs:253) returns the number of rows, not bytes. Those values are then stored in IndexFile.size_bytes and fed through LanceIndexStore::with_file_sizes as CachedFileSize hints, so the IVF_HNSW_SQ optimize path persists garbage sizes (off by orders of magnitude) into the manifest and into the HEAD-call-avoidance cache introduced by this PR. Fix by mirroring the sibling functions in this file (e.g. optimize_ivf_pq_indices at line 790, remap_index_file at line 1908, write_ivf_pq_file at line 1973) and capturing writer.tell().await? as u64 before Writer::shutdown.
Extended reasoning...
Bug
In optimize_ivf_hnsw_indices (rust/lance/src/index/vector/ivf.rs:949-968 in this PR), the IVF and auxiliary file sizes are captured as:
ivf_mut.write(&mut writer).await?;
let index_size = writer.finish().await? as u64;
// Write the aux file
aux_ivf.write(&mut aux_writer).await?;
let aux_size = aux_writer.finish().await? as u64;and then placed into the returned IndexFile entries as size_bytes.
Why this is wrong
writer and aux_writer are not Box<dyn Writer> here. At lines 861-865 and 889-893 they are wrapped with PreviousFileWriter::with_object_writer(...), so their concrete type is PreviousFileWriter<ManifestDescribing>. PreviousFileWriter carries two finish methods:
-
The inherent method at rust/lance-file/src/previous/writer/mod.rs:253-263, which returns
Result<usize>containingself.metadata.batch_offsets.last()— i.e. the number of rows written, not bytes:pub async fn finish(&mut self) -> Result<usize> { self.write_footer().await?; Writer::shutdown(self.object_writer.as_mut()).await?; let num_rows = self.metadata.batch_offsets.last().cloned().unwrap_or_default(); Ok(num_rows as usize) }
-
The trait
IndexWriter::finishimpl (in lance_format.rs), which returnsResult<IndexFile>.
Rust method resolution prefers the inherent method, so writer.finish().await? invokes the row-count variant. The as u64 cast at lines 950/954 also confirms this — if dispatch had hit the trait method (returning IndexFile), the cast would not compile.
The net effect: for the IVF_HNSW v1 optimize path, index_size is the count of HNSW nodes (the .idx writer) and aux_size is the count of stored vectors (the .aux writer). These row counts then become the persisted IndexFile.size_bytes.
Why this PR makes it dangerous
This PR’s explicit goal is to skip HEAD calls when opening index files by propagating writer-provided sizes into LanceIndexStore::with_file_sizes, which constructs CachedFileSize::new(size) in open_index_file. A wrong-by-orders-of-magnitude cached size feeds straight into the v2 FileReader’s end-of-file offset arithmetic (footer location, metadata length, etc.). When the reader trusts a "size_bytes" that is actually a few hundred / few thousand instead of the real megabyte+ file size, footer offset computation reads from the wrong place — surfacing as corrupt-file errors or, worse, silently wrong reads.
Even on the read paths that still issue their own HEAD (i.e. the path that bypasses with_file_sizes), the garbage values still flow into IndexMetadata::total_size_bytes() and describe_indices(), breaking the very accounting this PR adds.
How to fix
Mirror the pattern already used everywhere else in this file:
optimize_ivf_pq_indices(line 790):let size_bytes = writer.tell().await? as u64;immediately beforeWriter::shutdown(writer.as_mut()).await?;remap_index_file(line 1908): same pattern.write_ivf_pq_file(line 1973): same pattern.
For optimize_ivf_hnsw_indices, after ivf_mut.write(&mut writer).await? (and analogously for aux_writer), do something like:
ivf_mut.write(&mut writer).await?;
let index_size = writer.tell().await? as u64;
PreviousFileWriter::finish(&mut writer).await?;
aux_ivf.write(&mut aux_writer).await?;
let aux_size = aux_writer.tell().await? as u64;
PreviousFileWriter::finish(&mut aux_writer).await?;or equivalently call the trait method IndexWriter::finish (which already does tell+shutdown internally for this writer type — see rust/lance-index/src/scalar/lance_format.rs impl IndexWriter for PreviousFileWriter) and use the returned IndexFile.size_bytes.
Step-by-step proof
- Build an IVF_HNSW_SQ index containing, say, ~10,000 vectors of dimension 128, so the underlying
.idxand.auxfiles end up at hundreds of KB or more. - Trigger the segment-commit optimize path that calls
optimize_vector_indices→optimize_ivf_hnsw_indices. - After optimize, inspect the committed
IndexMetadata.filesfor the new UUID. The reportedsize_bytesforindex.idxequals the number of HNSW nodes (≈10,000) and forauxiliary.idxequals the number of stored vectors (≈10,000) — not the actual byte sizes (which are orders of magnitude larger). - Because the PR now feeds these as
CachedFileSizehints viaLanceIndexStore::with_file_sizes, a subsequent read that uses this path will compute footer offsets as if the file were ~10,000 bytes long, fetching bytes from outside the real footer region. - Compare with
optimize_ivf_pq_indices(line 790) on the same dataset: theresize_bytesreflects the true file size because that code path captureswriter.tell().await?before shutdown.
All four independent verifiers reached the same conclusion (Rust dispatch goes to the inherent PreviousFileWriter::finish, the value returned is num_rows, and these values then flow into manifest size_bytes and the new HEAD-avoidance cache). The fix is a small one-liner per writer, matching the pattern already established elsewhere in the same file.
Avoids listing index directories after new index writes by propagating
IndexFilemetadata from writer and builder APIs into manifest metadata. This covers scalar index writers, V3 vector builders, vector append/remap paths, and distributed vector merge while preserving legacy and extension fallbacks where the API cannot return files yet.This reduces extra object-store list/stat IO on fresh index creation and maintenance paths, with IO-count regression coverage for scalar and V3 vector index creation.