Allow accessing ColumnArrays backing column #500

Open
IyeOnline wants to merge 1 commit into ClickHouse:master from IyeOnline:topic/columnar-ColumnArray

Conversation

@IyeOnline
Contributor

When working with a ColumnArray, it is oftentimes useful to be able to
access the entire, contiguous backing array at once without having to do
row-wise access.

This change simply allows read access to the backing data array and
the offsets.

In my concrete use case, I want to fetch data from ClickHouse and put it into our columnar (Apache Arrow) data representation.
Doing this recursively and efficiently gets much easier with this access.

Given that e.g. ColumnVector directly exposes writable access to its backing std::vector, and ColumnArray itself is brittle
against modification of passed-in columns, I don't see a downside to making this API public.
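
To illustrate why column-wise access helps here, the following is a minimal sketch with hypothetical stand-in types (`FlatArrayColumn`, `ToListOffsets`, and `RowAt` are not part of clickhouse-cpp). It assumes the offsets column holds cumulative end positions, one per row, and that the target is an Arrow-style list layout with `rows + 1` 32-bit offsets starting at 0. Under those assumptions the conversion is a single pass over the offsets, and the value buffer can be handed over as-is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an array column: one flat value buffer plus
// cumulative end offsets (offsets[i] is one past the last value of row i).
struct FlatArrayColumn {
    std::vector<std::int64_t> values;
    std::vector<std::uint64_t> offsets;
};

// Build Arrow-style list offsets: rows + 1 entries, the first one being 0.
// The narrowing cast assumes the total value count fits into 32 bits.
std::vector<std::int32_t> ToListOffsets(const FlatArrayColumn& col) {
    std::vector<std::int32_t> out;
    out.reserve(col.offsets.size() + 1);
    out.push_back(0);
    for (std::uint64_t end : col.offsets) {
        out.push_back(static_cast<std::int32_t>(end));
    }
    return out;
}

// Row-wise access recovered from the flat layout, shown for comparison.
// Each call copies one row, which is what column-wise access avoids.
std::vector<std::int64_t> RowAt(const FlatArrayColumn& col, std::size_t row) {
    assert(row < col.offsets.size());
    const auto begin = static_cast<std::ptrdiff_t>(row == 0 ? 0 : col.offsets[row - 1]);
    const auto end = static_cast<std::ptrdiff_t>(col.offsets[row]);
    return std::vector<std::int64_t>(col.values.begin() + begin, col.values.begin() + end);
}
```

For nested arrays, the same step would be applied again to the inner column, which is where the recursive conversion comes from.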

@IyeOnline IyeOnline requested review from mzitnik and slabko as code owners May 21, 2026 17:59
Contributor

@slabko slabko left a comment

Hi @IyeOnline, please take a look at my notes.

I appreciate your PR, but when tests are missing, I eventually need to implement them myself, or I may end up postponing the PR until I have time to do so. I try to avoid accepting changes that are not covered by tests.

Please remember to add tests for the related changes.

}

- ColumnRef ColumnArray::GetData() {
+ ColumnRef ColumnArray::GetData() const {
The reason GetData was marked non-const is that it exposes internal state that can then be modified. I do not think marking this function const is a good idea.
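
The concern can be reproduced in isolation. The sketch below uses hypothetical stand-ins (`Column`, `ArrayLike`, `MutateThroughConst`), not the clickhouse-cpp classes, and assumes the getter returns a `std::shared_ptr` to a non-const column. In that case `const` on the member function only protects the pointer member itself, so a caller holding a const reference can still change the contents.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a backing column.
struct Column {
    std::vector<std::uint64_t> values;
};

// Hypothetical stand-in for an array column that owns a backing column.
class ArrayLike {
public:
    explicit ArrayLike(std::shared_ptr<Column> data) : data_(std::move(data)) {}

    // const applies to the shared_ptr member, not to the Column it points to.
    std::shared_ptr<Column> GetData() const { return data_; }

    std::size_t Size() const { return data_->values.size(); }

private:
    std::shared_ptr<Column> data_;
};

// Takes a const reference, yet still changes the observable state.
void MutateThroughConst(const ArrayLike& array) {
    array.GetData()->values.push_back(42);
}
```

This compiles without any cast, which is why a const getter of this shape is physically const but not logically const.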

return data_;
}

const std::shared_ptr<ColumnUInt64>& ColumnArray::GetOffsets() const {
Similar to GetData, logically this is not a const function.
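
One possible way to offer read access while keeping the function logically const is an overload pair where the const version returns a pointer to a const column. This is only a sketch with hypothetical names (`OffsetsColumn`, `OffsetsHolder`); it was not proposed in this thread, and whether it fits the library's `ColumnRef` conventions is an open question.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for the offsets column.
struct OffsetsColumn {
    std::vector<std::uint64_t> values;
};

// Hypothetical owner exposing its offsets with const-correct overloads.
class OffsetsHolder {
public:
    explicit OffsetsHolder(std::shared_ptr<OffsetsColumn> offsets)
        : offsets_(std::move(offsets)) {}

    // Mutable access is available only on a non-const object.
    std::shared_ptr<OffsetsColumn> GetOffsets() { return offsets_; }

    // A const object hands out a pointer to const, so the pointee is read-only.
    std::shared_ptr<const OffsetsColumn> GetOffsets() const { return offsets_; }

private:
    std::shared_ptr<OffsetsColumn> offsets_;
};

// Reads through the const overload; calling push_back here would not compile.
std::uint64_t LastOffset(const OffsetsHolder& holder) {
    const auto offsets = holder.GetOffsets();
    return offsets->values.empty() ? 0 : offsets->values.back();
}
```

A caller could still remove the const with `std::const_pointer_cast`, so this documents intent rather than enforcing it.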
