Skip to content

[arrow-select] perf: Replace ArrayData with direct Array construction in take kernels#10176

Open
liamzwbao wants to merge 1 commit into
apache:mainfrom
liamzwbao:issue-9298-take-array-data
Open

[arrow-select] perf: Replace ArrayData with direct Array construction in take kernels#10176
liamzwbao wants to merge 1 commit into
apache:mainfrom
liamzwbao:issue-9298-take-array-data

Conversation

@liamzwbao

Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Replaces several ArrayDataBuilder paths in arrow-select/src/filter.rs with direct typed array constructors.

Are these changes tested?

Covered by existing tests.

Are there any user-facing changes?

No

@github-actions github-actions Bot added the arrow Changes to the arrow crate label Jun 22, 2026
@liamzwbao liamzwbao marked this pull request as ready for review June 22, 2026 05:26
@liamzwbao

Copy link
Copy Markdown
Contributor Author

Hi @alamb, this PR is ready for review. Would appreciate a benchmark on take_kernels

@Jefffrey

Copy link
Copy Markdown
Contributor

run benchmark take_kernels

@adriangbot

Copy link
Copy Markdown

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4767567186-605-g4nvd 6.12.68+ #1 SMP Sat May 2 07:49:07 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing issue-9298-take-array-data (e2aa396) to b79d20d (merge-base) diff
BENCH_NAME=take_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench take_kernels
BENCH_FILTER=
Results will be posted here when complete


File an issue against this benchmark runner

@Jefffrey Jefffrey left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the final piece of work to close the original issue?

Comment thread arrow-select/src/take.rs
Ok(Arc::new(MapArray::from(unsafe { builder.build_unchecked() })))
let (_, offsets, entries, nulls) = list_data.into_parts();
let entries = entries.as_struct().clone();
Ok(Arc::new(MapArray::try_new(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i suppose the checks inside try_new are cheap enough to not be too much of an impact 👍

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we switch back to MapArray::new_unchecked

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(can do as a follow on PR)

Comment thread arrow-select/src/take.rs

let list_data = unsafe { list_data.build_unchecked() };
Ok(GenericListArray::<OffsetType::Native>::from(list_data))
GenericListArray::<OffsetType::Native>::try_new(field, offsets, child, nulls)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, as above I think we should use new_unchecked

@adriangbot

Copy link
Copy Markdown

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected
Details

group                                                                     issue-9298-take-array-data             main
-----                                                                     --------------------------             ----
take bool 1024                                                            1.00   1033.2±0.67ns        ? ?/sec    1.01   1039.2±0.93ns        ? ?/sec
take bool 512                                                             1.00    573.3±0.60ns        ? ?/sec    1.01    577.0±0.60ns        ? ?/sec
take bool null indices 1024                                               1.07    913.5±6.97ns        ? ?/sec    1.00   855.3±32.20ns        ? ?/sec
take bool null values 1024                                                1.00      2.0±0.00µs        ? ?/sec    1.00      2.0±0.00µs        ? ?/sec
take bool null values null indices 1024                                   1.00  1609.3±13.50ns        ? ?/sec    1.00  1607.6±19.21ns        ? ?/sec
take check bounds i32 1024                                                1.00    583.6±1.75ns        ? ?/sec    1.13    662.4±2.62ns        ? ?/sec
take check bounds i32 512                                                 1.00    379.1±0.41ns        ? ?/sec    1.02    386.9±4.52ns        ? ?/sec
take fsb value len: 12, indices: 1024                                     1.00      2.5±0.00µs        ? ?/sec    1.05      2.6±0.00µs        ? ?/sec
take fsb value len: 12, null values, indices: 1024                        1.00      3.5±0.01µs        ? ?/sec    1.04      3.7±0.00µs        ? ?/sec
take fsb value optimized len: 16, indices: 1024                           1.00    587.6±0.70ns        ? ?/sec    1.18    691.9±3.73ns        ? ?/sec
take fsb value optimized len: 16, null values, indices: 1024              1.00   1605.6±1.53ns        ? ?/sec    1.09   1744.0±3.97ns        ? ?/sec
take i32 1024                                                             1.00    514.1±2.51ns        ? ?/sec    1.01    521.7±4.64ns        ? ?/sec
take i32 512                                                              1.00    356.1±4.25ns        ? ?/sec    1.01    358.3±6.58ns        ? ?/sec
take i32 null indices 1024                                                1.00    853.9±1.12ns        ? ?/sec    1.00    857.2±1.30ns        ? ?/sec
take i32 null values 1024                                                 1.00   1541.2±2.97ns        ? ?/sec    1.01   1551.0±1.60ns        ? ?/sec
take i32 null values null indices 1024                                    1.00   1716.3±1.59ns        ? ?/sec    1.01   1732.1±2.18ns        ? ?/sec
take list i32 1024                                                        1.01      7.9±0.02µs        ? ?/sec    1.00      7.8±0.02µs        ? ?/sec
take list i32 512                                                         1.00      4.1±0.01µs        ? ?/sec    1.03      4.3±0.02µs        ? ?/sec
take list i32 null indices 1024                                           1.00      9.5±0.10µs        ? ?/sec    1.01      9.6±0.06µs        ? ?/sec
take list i32 null values 1024                                            1.00      5.7±0.02µs        ? ?/sec    1.01      5.8±0.02µs        ? ?/sec
take list i32 null values null indices 1024                               1.00      6.7±0.12µs        ? ?/sec    1.03      6.9±0.10µs        ? ?/sec
take listview i32 1024                                                    1.00   1063.6±5.43ns        ? ?/sec    1.26   1343.2±2.63ns        ? ?/sec
take listview i32 512                                                     1.00    615.1±1.67ns        ? ?/sec    1.58    971.4±2.71ns        ? ?/sec
take listview i32 null indices 1024                                       1.00   1598.9±1.99ns        ? ?/sec    1.24   1987.5±4.05ns        ? ?/sec
take listview i32 null values 1024                                        1.00      2.1±0.00µs        ? ?/sec    1.13      2.3±0.00µs        ? ?/sec
take listview i32 null values null indices 1024                           1.00      2.4±0.01µs        ? ?/sec    1.18      2.9±0.00µs        ? ?/sec
take primitive run logical len: 1024, physical len: 512, indices: 1024    1.00     15.6±0.01µs        ? ?/sec    1.02     15.8±0.01µs        ? ?/sec
take str 1024                                                             1.00      8.1±0.01µs        ? ?/sec    1.00      8.1±0.01µs        ? ?/sec
take str 512                                                              1.00      3.7±0.00µs        ? ?/sec    1.03      3.8±0.00µs        ? ?/sec
take str null indices 1024                                                1.00      4.4±0.06µs        ? ?/sec    1.01      4.4±0.07µs        ? ?/sec
take str null indices 512                                                 1.01      2.1±0.03µs        ? ?/sec    1.00      2.0±0.04µs        ? ?/sec
take str null values 1024                                                 1.00      4.9±0.05µs        ? ?/sec    1.03      5.0±0.06µs        ? ?/sec
take str null values null indices 1024                                    1.04      3.3±0.02µs        ? ?/sec    1.00      3.2±0.02µs        ? ?/sec
take stringview 1024                                                      1.25    918.0±2.22ns        ? ?/sec    1.00    734.6±0.56ns        ? ?/sec
take stringview 512                                                       1.17    555.6±1.25ns        ? ?/sec    1.00    473.7±1.75ns        ? ?/sec
take stringview null indices 1024                                         1.00    922.7±2.81ns        ? ?/sec    1.07    984.7±1.92ns        ? ?/sec
take stringview null indices 512                                          1.01    571.3±1.43ns        ? ?/sec    1.00    564.7±1.12ns        ? ?/sec
take stringview null values 1024                                          1.08   1956.7±2.23ns        ? ?/sec    1.00   1809.6±2.64ns        ? ?/sec
take stringview null values null indices 1024                             1.00   1764.5±1.89ns        ? ?/sec    1.00   1760.4±2.91ns        ? ?/sec

Resource Usage

base (merge-base)

Metric Value
Wall time 370.1s
Peak memory 14.8 MiB
Avg memory 12.2 MiB
CPU user 367.9s
CPU sys 0.0s
Peak spill 0 B

branch

Metric Value
Wall time 365.1s
Peak memory 14.4 MiB
Avg memory 12.6 MiB
CPU user 363.4s
CPU sys 0.1s
Peak spill 0 B

File an issue against this benchmark runner

@Jefffrey

Copy link
Copy Markdown
Contributor

run benchmark take_kernels

@adriangbot

Copy link
Copy Markdown

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4767722033-606-mxp7z 6.12.68+ #1 SMP Sat May 2 07:49:07 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing issue-9298-take-array-data (e2aa396) to b79d20d (merge-base) diff
BENCH_NAME=take_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench take_kernels
BENCH_FILTER=
Results will be posted here when complete


File an issue against this benchmark runner

@adriangbot

Copy link
Copy Markdown

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected
Details

group                                                                     issue-9298-take-array-data             main
-----                                                                     --------------------------             ----
take bool 1024                                                            1.00   1033.3±1.02ns        ? ?/sec    1.01   1042.4±1.03ns        ? ?/sec
take bool 512                                                             1.00    574.3±0.40ns        ? ?/sec    1.00    575.3±1.03ns        ? ?/sec
take bool null indices 1024                                               1.05    892.0±1.19ns        ? ?/sec    1.00   851.8±31.44ns        ? ?/sec
take bool null values 1024                                                1.00      2.0±0.00µs        ? ?/sec    1.00      2.0±0.00µs        ? ?/sec
take bool null values null indices 1024                                   1.00  1613.9±15.55ns        ? ?/sec    1.00  1619.3±15.99ns        ? ?/sec
take check bounds i32 1024                                                1.00    580.8±0.71ns        ? ?/sec    1.15    666.2±1.24ns        ? ?/sec
take check bounds i32 512                                                 1.00    378.5±1.20ns        ? ?/sec    1.03    389.9±0.60ns        ? ?/sec
take fsb value len: 12, indices: 1024                                     1.00      2.5±0.00µs        ? ?/sec    1.04      2.6±0.00µs        ? ?/sec
take fsb value len: 12, null values, indices: 1024                        1.00      3.5±0.00µs        ? ?/sec    1.03      3.6±0.01µs        ? ?/sec
take fsb value optimized len: 16, indices: 1024                           1.00    586.1±0.87ns        ? ?/sec    1.17    686.5±1.67ns        ? ?/sec
take fsb value optimized len: 16, null values, indices: 1024              1.00   1602.1±1.14ns        ? ?/sec    1.10   1755.4±2.98ns        ? ?/sec
take i32 1024                                                             1.00    511.5±0.95ns        ? ?/sec    1.02    522.6±3.23ns        ? ?/sec
take i32 512                                                              1.00    351.5±1.00ns        ? ?/sec    1.01    356.7±3.44ns        ? ?/sec
take i32 null indices 1024                                                1.00    853.3±1.02ns        ? ?/sec    1.01    859.5±1.41ns        ? ?/sec
take i32 null values 1024                                                 1.00   1542.1±1.61ns        ? ?/sec    1.00   1548.1±1.09ns        ? ?/sec
take i32 null values null indices 1024                                    1.00   1717.0±1.90ns        ? ?/sec    1.01   1728.8±3.48ns        ? ?/sec
take list i32 1024                                                        1.01      7.8±0.04µs        ? ?/sec    1.00      7.8±0.04µs        ? ?/sec
take list i32 512                                                         1.00      4.2±0.02µs        ? ?/sec    1.01      4.3±0.02µs        ? ?/sec
take list i32 null indices 1024                                           1.00      9.6±0.12µs        ? ?/sec    1.01      9.7±0.07µs        ? ?/sec
take list i32 null values 1024                                            1.00      5.6±0.03µs        ? ?/sec    1.03      5.7±0.01µs        ? ?/sec
take list i32 null values null indices 1024                               1.00      6.8±0.10µs        ? ?/sec    1.03      7.0±0.08µs        ? ?/sec
take listview i32 1024                                                    1.00    974.5±2.36ns        ? ?/sec    1.41   1374.0±3.26ns        ? ?/sec
take listview i32 512                                                     1.00    613.3±1.63ns        ? ?/sec    1.49    913.6±3.05ns        ? ?/sec
take listview i32 null indices 1024                                       1.00   1678.2±3.81ns        ? ?/sec    1.23      2.1±0.01µs        ? ?/sec
take listview i32 null values 1024                                        1.00      2.0±0.00µs        ? ?/sec    1.17      2.4±0.00µs        ? ?/sec
take listview i32 null values null indices 1024                           1.00      2.5±0.00µs        ? ?/sec    1.16      2.9±0.00µs        ? ?/sec
take primitive run logical len: 1024, physical len: 512, indices: 1024    1.00     15.6±0.02µs        ? ?/sec    1.01     15.8±0.02µs        ? ?/sec
take str 1024                                                             1.00      8.1±0.00µs        ? ?/sec    1.02      8.3±0.05µs        ? ?/sec
take str 512                                                              1.00      3.7±0.00µs        ? ?/sec    1.02      3.8±0.00µs        ? ?/sec
take str null indices 1024                                                1.01      4.4±0.06µs        ? ?/sec    1.00      4.4±0.05µs        ? ?/sec
take str null indices 512                                                 1.00      2.1±0.03µs        ? ?/sec    1.04      2.1±0.03µs        ? ?/sec
take str null values 1024                                                 1.00      4.9±0.06µs        ? ?/sec    1.00      4.9±0.05µs        ? ?/sec
take str null values null indices 1024                                    1.06      3.4±0.02µs        ? ?/sec    1.00      3.2±0.02µs        ? ?/sec
take stringview 1024                                                      1.23    914.6±5.98ns        ? ?/sec    1.00    741.2±0.95ns        ? ?/sec
take stringview 512                                                       1.18    558.4±5.86ns        ? ?/sec    1.00    471.4±0.95ns        ? ?/sec
take stringview null indices 1024                                         1.01    996.2±3.39ns        ? ?/sec    1.00    983.7±2.22ns        ? ?/sec
take stringview null indices 512                                          1.01    572.9±2.78ns        ? ?/sec    1.00    564.8±1.38ns        ? ?/sec
take stringview null values 1024                                          1.08   1970.8±2.78ns        ? ?/sec    1.00   1818.6±3.08ns        ? ?/sec
take stringview null values null indices 1024                             1.03   1782.4±1.47ns        ? ?/sec    1.00   1729.4±3.22ns        ? ?/sec

Resource Usage

base (merge-base)

Metric Value
Wall time 370.1s
Peak memory 14.7 MiB
Avg memory 12.1 MiB
CPU user 368.0s
CPU sys 0.0s
Peak spill 0 B

branch

Metric Value
Wall time 375.1s
Peak memory 14.4 MiB
Avg memory 12.4 MiB
CPU user 369.4s
CPU sys 0.0s
Peak spill 0 B

File an issue against this benchmark runner

@Jefffrey

Copy link
Copy Markdown
Contributor

there seems to be an impact on stringview benchmark 🤔

@alamb alamb left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there seems to be an impact on stringview benchmark 🤔

We may be hitting some sort of allocator thing again, as I don't think this PR changes any of the string view array code 🤔

let new_nulls = take_nulls(array.nulls(), indices);
// Safety: array.views was valid, and take_native copies only valid values, and verifies bounds
Ok(unsafe {
GenericByteViewArray::new_unchecked(new_views, array.data_buffers().to_vec(), new_nulls)
})

Comment thread arrow-select/src/take.rs
Ok(Arc::new(MapArray::from(unsafe { builder.build_unchecked() })))
let (_, offsets, entries, nulls) = list_data.into_parts();
let entries = entries.as_struct().clone();
Ok(Arc::new(MapArray::try_new(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we switch back to MapArray::new_unchecked

Comment thread arrow-select/src/take.rs

let list_data = unsafe { list_data.build_unchecked() };
Ok(GenericListArray::<OffsetType::Native>::from(list_data))
GenericListArray::<OffsetType::Native>::try_new(field, offsets, child, nulls)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, as above I think we should use new_unchecked

@alamb alamb left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @liamzwbao and @Jefffrey

Comment thread arrow-select/src/take.rs
Ok(Arc::new(MapArray::from(unsafe { builder.build_unchecked() })))
let (_, offsets, entries, nulls) = list_data.into_parts();
let entries = entries.as_struct().clone();
Ok(Arc::new(MapArray::try_new(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(can do as a follow on PR)

@Jefffrey

Copy link
Copy Markdown
Contributor

i mistook genericlistview in the diff for stringview 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Epic] Replace ArrayData with direct Array construction, when possible

4 participants