Allow accessing ColumnArray's backing column #500
Open
IyeOnline wants to merge 1 commit into
When working with a ColumnArray, it is oftentimes useful to be able to access the entire, contiguous backing array at once without having to do row-wise access. This change simply allows reading access to the backing data array and the offsets.
slabko
requested changes
May 27, 2026
Contributor
slabko
left a comment
Hi @IyeOnline, please take a look at my notes.
I appreciate your PR, but when tests are missing, I eventually need to implement them myself, or I may end up postponing the PR until I have time to do so. I try to avoid accepting changes that are not covered by tests.
Please remember to add tests for the related changes.
 }

-ColumnRef ColumnArray::GetData() {
+ColumnRef ColumnArray::GetData() const {
Contributor
The reason GetData was marked non-const is that it exposes internal state that can then be modified. I do not think marking this function const is a good idea.
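The concern can be illustrated with a minimal sketch of shallow constness. The `Column` and `ArrayLike` types and the `MutateThroughConst` helper below are hypothetical stand-ins, not the library's classes; the only assumption is that `ColumnRef` behaves like a `std::shared_ptr` to a non-const column.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a column type; not the library's class.
struct Column {
    std::vector<int> values;
};

// Assumption: ColumnRef is a shared pointer to a mutable column.
using ColumnRef = std::shared_ptr<Column>;

// Hypothetical stand-in for an array column that owns a nested column.
class ArrayLike {
public:
    explicit ArrayLike(ColumnRef data) : data_(std::move(data)) {
        assert(data_ != nullptr);
    }

    // const only promises not to reseat data_; the pointee stays mutable.
    ColumnRef GetData() const { return data_; }

    std::size_t NestedSize() const { return data_->values.size(); }

private:
    ColumnRef data_;
};

// Takes the array by const reference and still changes its observable state.
inline void MutateThroughConst(const ArrayLike& array) {
    array.GetData()->values.push_back(42);
}
```

Calling `MutateThroughConst` on an `ArrayLike` changes what `NestedSize()` reports, even though the object was only reachable through a const reference. One possible alternative would be a const overload that returns a `std::shared_ptr<const Column>`; whether that fits this library's API is a separate question.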
     return data_;
 }

+const std::shared_ptr<ColumnUInt64>& ColumnArray::GetOffsets() const {
Contributor
Similar to GetData, logically this is not a const function.
When working with a ColumnArray, it is oftentimes useful to be able to
access the entire, contiguous backing array at once without having to do
row-wise access.
This change simply allows reading access to the backing data array and
the offsets.
In my concrete use case, I want to fetch data from ClickHouse and put it into our columnar (Apache Arrow) data representation.
Doing this recursively and efficiently gets much easier with this access.
Given that e.g. ColumnVector directly exposes writable access to its backing
std::vector, and ColumnArray itself is brittle against modification of passed-in columns, I don't see a downside to making this API public.
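As a rough sketch of why whole-column access helps with this conversion: assuming the array column stores one cumulative end offset per row and that the target is Arrow's 32-bit list layout (one offset per row plus a leading 0), the offsets can be translated in a single pass while the flat value array is reused as-is. `ToArrowListOffsets` is a hypothetical helper and not part of either library.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: converts cumulative end offsets (one per row) into
// Arrow-style list offsets (one per row plus a leading 0).
inline std::vector<std::int32_t> ToArrowListOffsets(
    const std::vector<std::uint64_t>& end_offsets) {
    constexpr auto kMax =
        static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());

    std::vector<std::int32_t> result;
    result.reserve(end_offsets.size() + 1);
    result.push_back(0);  // Arrow list offsets start at 0.

    std::uint64_t previous = 0;
    for (std::uint64_t end : end_offsets) {
        if (end < previous) {
            throw std::invalid_argument("offsets must be non-decreasing");
        }
        if (end > kMax) {
            throw std::overflow_error("offset does not fit into 32 bits");
        }
        result.push_back(static_cast<std::int32_t>(end));
        previous = end;
    }

    assert(result.size() == end_offsets.size() + 1);
    return result;
}
```

For three rows of lengths 2, 0 and 3, the input end offsets `{2, 2, 5}` become `{0, 2, 2, 5}`. Columns whose total element count exceeds the 32-bit range would need a 64-bit list layout instead, which this sketch does not cover.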