[arrow-select] perf: Replace ArrayData with direct Array construction in take kernels#10176
[arrow-select] perf: Replace ArrayData with direct Array construction in take kernels#10176liamzwbao wants to merge 1 commit into
ArrayData with direct Array construction in take kernels#10176Conversation
…tion in take kernels
|
Hi @alamb, this PR is ready for review. Would appreciate a benchmark on |
|
run benchmark take_kernels |
|
🤖 Arrow criterion benchmark running (GKE) | trigger CPU Details (lscpu)Comparing issue-9298-take-array-data (e2aa396) to b79d20d (merge-base) diff File an issue against this benchmark runner |
Jefffrey
left a comment
There was a problem hiding this comment.
is this the final piece of work to close the original issue?
| Ok(Arc::new(MapArray::from(unsafe { builder.build_unchecked() }))) | ||
| let (_, offsets, entries, nulls) = list_data.into_parts(); | ||
| let entries = entries.as_struct().clone(); | ||
| Ok(Arc::new(MapArray::try_new( |
There was a problem hiding this comment.
i suppose the checks inside try_new are cheap enough to not be too much of an impact 👍
There was a problem hiding this comment.
I suggest we switch back to MapArray::new_unchecked
|
|
||
| let list_data = unsafe { list_data.build_unchecked() }; | ||
| Ok(GenericListArray::<OffsetType::Native>::from(list_data)) | ||
| GenericListArray::<OffsetType::Native>::try_new(field, offsets, child, nulls) |
There was a problem hiding this comment.
yeah, as above I think we should use new_unchecked
|
🤖 Arrow criterion benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagebase (merge-base)
branch
File an issue against this benchmark runner |
|
run benchmark take_kernels |
|
🤖 Arrow criterion benchmark running (GKE) | trigger CPU Details (lscpu)Comparing issue-9298-take-array-data (e2aa396) to b79d20d (merge-base) diff File an issue against this benchmark runner |
|
🤖 Arrow criterion benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagebase (merge-base)
branch
File an issue against this benchmark runner |
|
there seems to be an impact on stringview benchmark 🤔 |
alamb
left a comment
There was a problem hiding this comment.
there seems to be an impact on stringview benchmark 🤔
We may be hitting some sort of allocator thing again, as I don't think this PR changes any of the string view array code 🤔
arrow-rs/arrow-select/src/take.rs
Lines 637 to 641 in e2aa396
| Ok(Arc::new(MapArray::from(unsafe { builder.build_unchecked() }))) | ||
| let (_, offsets, entries, nulls) = list_data.into_parts(); | ||
| let entries = entries.as_struct().clone(); | ||
| Ok(Arc::new(MapArray::try_new( |
There was a problem hiding this comment.
I suggest we switch back to MapArray::new_unchecked
|
|
||
| let list_data = unsafe { list_data.build_unchecked() }; | ||
| Ok(GenericListArray::<OffsetType::Native>::from(list_data)) | ||
| GenericListArray::<OffsetType::Native>::try_new(field, offsets, child, nulls) |
There was a problem hiding this comment.
yeah, as above I think we should use new_unchecked
alamb
left a comment
There was a problem hiding this comment.
Thank you @liamzwbao and @Jefffrey
| Ok(Arc::new(MapArray::from(unsafe { builder.build_unchecked() }))) | ||
| let (_, offsets, entries, nulls) = list_data.into_parts(); | ||
| let entries = entries.as_struct().clone(); | ||
| Ok(Arc::new(MapArray::try_new( |
|
i mistook genericlistview in the diff for stringview 😅 |
Which issue does this PR close?
ArrayDatawith direct Array construction, when possible #9298.Rationale for this change
What changes are included in this PR?
Replaces several
ArrayDataBuilderpaths inarrow-select/src/filter.rswith direct typed array constructors.Are these changes tested?
Covered by existing tests.
Are there any user-facing changes?
No