Use bulk writes instead of per-element writes in vector serialization#681

Open
r-devulap wants to merge 3 commits into main from bulk-writes-memorySegment

@r-devulap r-devulap commented Jun 17, 2026

Contributor

Replace inefficient element-by-element write loops with bulk write operations in MemorySegmentVectorProvider:

  • writeFloatVector: Extract underlying float array and use writeFloats() instead of looping with writeFloat()
  • writeByteSequence: Extract underlying byte array and use write() instead of looping with writeByte()
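
The difference between the two write styles can be sketched with standard-library classes only. This is a minimal illustration, not the project's code: `BulkWriteSketch`, `writePerElement`, and `writeBulk` are hypothetical names, and `java.io.DataOutputStream` stands in for the writer used by MemorySegmentVectorProvider, whose `writeFloats()` method is not reproduced here. Both paths produce the same big-endian bytes for ordinary (non-NaN) values.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.Arrays;

public class BulkWriteSketch {
    // Per-element path: one writeFloat() call per value.
    static byte[] writePerElement(float[] vector) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (float v : vector) {
                out.writeFloat(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Bulk path: encode the whole array into a buffer, then issue one write().
    static byte[] writeBulk(float[] vector) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            // ByteBuffer is big-endian by default, matching DataOutputStream.
            ByteBuffer buf = ByteBuffer.allocate(vector.length * Float.BYTES);
            buf.asFloatBuffer().put(vector);
            out.write(buf.array(), 0, buf.capacity());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        float[] vector = {1.0f, -2.5f, 3.25f, 0.0f};
        // Compare the serialized bytes from both paths.
        System.out.println(Arrays.equals(writePerElement(vector), writeBulk(vector)));
    }
}
```

The bulk path trades many small calls for a single array write, which is the same idea the two bullet points above describe.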

With highway as the SIMD backend (#668), the scalar writes show up as a bottleneck when constructing an index on an AWS x8i.24xlarge instance (2-socket, 96-core Intel GNR).

| Dataset | Index Build Time Before (s) | Index Build Time After (s) |
|---|---|---|
| openai-1536-1m | 105.59 | 45.22 |
| openai-3072-1m | 164.63 | 102.75 |

github-actions Bot commented Jun 17, 2026

Contributor

Before you submit for review:

  • Does your PR follow guidelines from CONTRIBUTIONS.md?
  • Did you summarize what this PR does clearly and concisely?
  • Did you include performance data for changes which may be performance impacting?
  • Did you include useful docs for any user-facing changes or features?
  • Did you include useful javadocs for developer oriented changes, explaining new concepts or key changes?
  • Did you trigger and review regression testing results against the base branch via Run Bench Main?
  • Did you adhere to the code formatting guidelines (TBD)?
  • Did you group your changes for easy review, providing meaningful descriptions for each commit?
  • Did you ensure that all files contain the correct copyright header?

If you did not complete any of these, then please explain below.

@tlwillke tlwillke left a comment

Collaborator


writeFloatVector is fine. writeByteSequence has a bug. You are ignoring slice offsets. MemorySegmentByteSequence supports slicing, but .heapBase().get() returns the original unsliced array. I have suggested a change that fixes this and added minimal testing.

Once this is addressed and the tests pass, it is good to go.
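
The slicing problem can be reproduced with a plain `java.nio.ByteBuffer`, which behaves analogously: `array()` on a slice returns the whole backing array, and the slice's start is only available through `arrayOffset()`. This is a sketch under that analogy, not the project's `MemorySegmentByteSequence`; the class and method names (`SliceOffsetSketch`, `sliceOf`, `writeIgnoringOffset`, `writeWithOffset`) are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;

public class SliceOffsetSketch {
    // Builds a view over backing[offset .. offset + length).
    static ByteBuffer sliceOf(byte[] backing, int offset, int length) {
        return ByteBuffer.wrap(backing, offset, length).slice();
    }

    // Mirrors the bug: starts at index 0 of the backing array,
    // so it writes bytes that lie outside the slice.
    static byte[] writeIgnoringOffset(ByteBuffer slice) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(slice.array(), 0, slice.remaining());
        return out.toByteArray();
    }

    // Offset-aware variant: starts where the slice starts.
    static byte[] writeWithOffset(ByteBuffer slice) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(slice.array(), slice.arrayOffset() + slice.position(), slice.remaining());
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] backing = {0, 1, 2, 3, 4, 5, 6, 7};
        ByteBuffer slice = sliceOf(backing, 3, 4); // logical contents: 3, 4, 5, 6

        System.out.println(Arrays.toString(writeIgnoringOffset(slice))); // [0, 1, 2, 3]
        System.out.println(Arrays.toString(writeWithOffset(slice)));     // [3, 4, 5, 6]
    }
}
```

A test that only serializes unsliced sequences cannot tell the two variants apart, since the offset is zero in that case. A regression test for this kind of bug needs at least one slice with a non-zero offset.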

On performance, while I appreciate the end-to-end benchmarking and consider it necessary, I would also like to see a unit-level microbenchmark isolating writeFloatVector and writeByteSequence to quantify the exact bulk-write speedup.

Comment on lines 100 to 103
     {
    -    for (int i = 0; i < sequence.length(); i++)
    -        out.writeByte(sequence.get(i));
    +    byte[] data = (byte[]) ((MemorySegmentByteSequence) sequence).get().heapBase().get();
    +    out.write(data, 0, sequence.length());
     }

Collaborator


Suggested change
    -{
    -    for (int i = 0; i < sequence.length(); i++)
    -        out.writeByte(sequence.get(i));
    -    byte[] data = (byte[]) ((MemorySegmentByteSequence) sequence).get().heapBase().get();
    -    out.write(data, 0, sequence.length());
    -}
    +{
    +    java.nio.ByteBuffer bb = ((MemorySegmentByteSequence) sequence).get().asByteBuffer();
    +    out.write(bb.array(), bb.arrayOffset(), bb.remaining());
    +}

Contributor Author


Thanks for pointing that out. I mistakenly assumed CI would catch this bug. I’ll fix it.

@tlwillke tlwillke added the "performance improvement" label (a contribution that aims to improve library performance, possibly along with functionality) Jun 17, 2026
…eSequence

.heapBase().get() ignores slicing and returns the base of the original
ByteArray.