Use bulk writes instead of per-element writes in vector serialization#681
Use bulk writes instead of per-element writes in vector serialization#681r-devulap wants to merge 3 commits into
Conversation
Replace inefficient element-by-element write loops with bulk write operations in MemorySegmentVectorProvider: - writeFloatVector: Extract underlying float array and use writeFloats() instead of looping with writeFloat() - writeByteSequence: Extract underlying byte array and use write() instead of looping with writeByte()
|
Before you submit for review:
If you did not complete any of these, then please explain below. |
tlwillke
left a comment
There was a problem hiding this comment.
writeFloatVector is fine. writeByteSequence has a bug. You are ignoring slice offsets. MemorySegmentByteSequence supports slicing, but .heapBase().get() returns the original unsliced array. I have suggested a change that fixes this and added minimal testing.
Once this is addressed and the tests pass, it is good to go.
On performance, while I appreciate the end-to-end benchmarking and consider it necessary, I would also like to see a unit-level microbenchmark isolating writeFloatVector and writeByteSequence to quantify the exact bulk-write speedup.
| { | ||
| for (int i = 0; i < sequence.length(); i++) | ||
| out.writeByte(sequence.get(i)); | ||
| byte[] data = (byte[]) ((MemorySegmentByteSequence) sequence).get().heapBase().get(); | ||
| out.write(data, 0, sequence.length()); | ||
| } |
There was a problem hiding this comment.
| { | |
| for (int i = 0; i < sequence.length(); i++) | |
| out.writeByte(sequence.get(i)); | |
| byte[] data = (byte[]) ((MemorySegmentByteSequence) sequence).get().heapBase().get(); | |
| out.write(data, 0, sequence.length()); | |
| } | |
| { | |
| java.nio.ByteBuffer bb = ((MemorySegmentByteSequence) sequence).get().asByteBuffer(); | |
| out.write(bb.array(), bb.arrayOffset(), bb.remaining()); | |
| } |
There was a problem hiding this comment.
Thanks for pointing that out. I mistakenly assumed CI would catch this bug. I’ll fix it.
…eSequence ,heapBase().get() ignores slicing and returns the base of the original ByteArray.
Replace inefficient element-by-element write loops with bulk write operations in MemorySegmentVectorProvider:
With highway as the SIMD backend #668 , the scalar writes show up as a bottleneck when constructing index on a AWS instance x8i.24xlarge (2-socket 96 core Intel GNR) .