Fix aggregation flow with remote initiator #1872

Open
ianton-ru wants to merge 3 commits into antalya-26.3 from bugfix/antalya-26.3/fix_aggregation_with_remote_initiator

@ianton-ru

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fix aggregation flow with remote initiator

Documentation entry for user-facing changes

With object_storage_remote_initiator set but without the object_storage_cluster setting, StorageObjectStorageCluster::getQueryProcessingStage returned QueryProcessingStage::Enum::FetchColumns. As a result, the nodes sent all rows to the initiator and the aggregation was executed on the initiator.
Now the method returns QueryProcessingStage::Enum::WithMergeableState in the proper cases, and pre-aggregation is executed on the nodes.
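
To illustrate why the processing stage matters here, the following is a minimal sketch (not ClickHouse code) of the difference between sending raw rows and sending mergeable states. `AvgState`, `preAggregate`, and `finalize` are hypothetical names introduced only for this example; real aggregate function states are considerably more involved.

```cpp
#include <cstddef>
#include <vector>

// Partial ("mergeable") state of avg(): enough to combine results
// from several nodes without access to the raw rows.
struct AvgState
{
    double sum = 0;
    std::size_t count = 0;

    void add(double x) { sum += x; ++count; }
    void merge(const AvgState & other) { sum += other.sum; count += other.count; }
    double result() const { return count ? sum / count : 0.0; }
};

// Runs on each node: folds the local rows into a single state.
AvgState preAggregate(const std::vector<double> & rows)
{
    AvgState state;
    for (double x : rows)
        state.add(x);
    return state;
}

// Runs on the initiator: merges one state per node instead of all rows.
double finalize(const std::vector<AvgState> & states)
{
    AvgState total;
    for (const auto & s : states)
        total.merge(s);
    return total.result();
}
```

In this model, a node holding a million rows sends one `AvgState` to the initiator rather than a million values, which is roughly the effect the description attributes to WithMergeableState.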

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

@ianton-ru
Author

@codex review

@github-actions

github-actions Bot commented Jun 4, 2026

Workflow [PR], commit [0878f8f]

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e72ebfe177

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +725 to +727
if (getClusterName(context).empty() // Not cluster request
    && !(context->getSettingsRef()[Setting::object_storage_remote_initiator] // Not request with remote initiator
        && !context->getSettingsRef()[Setting::object_storage_remote_initiator_cluster].value.empty()))

[P2] Keep unsupported storages at FetchColumns

When object_storage_remote_initiator is set for a configuration that does not support cluster mode, such as icebergLocal or other local data-lake storages where getClusterName is empty because isClusterSupported is false, this new exception makes getQueryProcessingStage report WithMergeableState. IStorageCluster::read still immediately falls back to pure_storage for those configurations, and StorageObjectStorage::read ignores processed_stage, so aggregate queries can be planned as if partial aggregation happened remotely even though only raw rows were read. Please only return the distributed stage when the subsequent read path will actually use the remote/cluster execution path.

Useful? React with 👍 / 👎.
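
The suggestion above can be sketched as follows. This is a simplified model and not the actual ClickHouse implementation: `Request`, `usesDistributedRead`, and `stageFor` are hypothetical names, and `Stage` stands in for QueryProcessingStage::Enum with only the two values discussed in this PR.

```cpp
#include <string>

// Stand-in for QueryProcessingStage::Enum (only the two values discussed here).
enum class Stage { FetchColumns, WithMergeableState };

// Hypothetical, flattened view of what the stage decision depends on.
struct Request
{
    bool cluster_supported = true;         // stand-in for isClusterSupported
    std::string cluster_name;              // stand-in for getClusterName(context)
    bool remote_initiator = false;         // object_storage_remote_initiator
    std::string remote_initiator_cluster;  // object_storage_remote_initiator_cluster
};

// True only when the read is assumed to take the remote/cluster path
// rather than fall back to the plain storage.
bool usesDistributedRead(const Request & r)
{
    if (!r.cluster_supported)
        return false;
    return !r.cluster_name.empty()
        || (r.remote_initiator && !r.remote_initiator_cluster.empty());
}

// Report the distributed stage only when the read path matches it.
Stage stageFor(const Request & r)
{
    return usesDistributedRead(r) ? Stage::WithMergeableState : Stage::FetchColumns;
}
```

Deriving the stage from the same predicate that selects the read path keeps the two from disagreeing, which is the mismatch the comment warns about.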

@ianton-ru
Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 437a47dc4a

Comment on lines +121 to +123
is_remote = settings[Setting::object_storage_remote_initiator]
    && (!settings[Setting::object_storage_cluster].value.empty()
        || !settings[Setting::object_storage_remote_initiator_cluster].value.empty());

[P2] Preserve validation for missing remote initiator cluster

When object_storage_remote_initiator=1 is set on the fallback s3/object-storage table functions but both object_storage_cluster and object_storage_remote_initiator_cluster are empty, this now makes is_remote false and routes the query through BaseSimple. That bypasses the existing IStorageCluster::read validation that raises BAD_ARGUMENTS for this invalid configuration, so a misconfigured query silently runs locally instead of failing or using a remote initiator.

Useful? React with 👍 / 👎.
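
The concern can be reproduced in a small model. The sketch below is an assumption-laden illustration, not the project's code: `Settings`, `isRemote`, and `validateRemoteInitiator` are hypothetical names, and `std::invalid_argument` stands in for the BAD_ARGUMENTS exception mentioned in the comment.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical, flattened view of the three settings involved.
struct Settings
{
    bool object_storage_remote_initiator = false;
    std::string object_storage_cluster;
    std::string object_storage_remote_initiator_cluster;
};

// Mirrors the guard quoted in the review comment.
bool isRemote(const Settings & s)
{
    return s.object_storage_remote_initiator
        && (!s.object_storage_cluster.empty()
            || !s.object_storage_remote_initiator_cluster.empty());
}

// Assumed shape of the validation: a remote initiator request with no
// cluster to run on is rejected instead of silently running locally.
void validateRemoteInitiator(const Settings & s)
{
    if (s.object_storage_remote_initiator
        && s.object_storage_cluster.empty()
        && s.object_storage_remote_initiator_cluster.empty())
        throw std::invalid_argument("remote initiator requested without a cluster");
}
```

With only object_storage_remote_initiator set, `isRemote` returns false while `validateRemoteInitiator` throws, so in this model the check has to run before the routing decision or the misconfiguration goes unnoticed.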

@ianton-ru force-pushed the bugfix/antalya-26.3/fix_aggregation_with_remote_initiator branch from 437a47d to 0878f8f on June 5, 2026 09:16
@ianton-ru
Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0878f8f10d

Comment on lines +121 to +123
is_remote = settings[Setting::object_storage_remote_initiator]
    && (!settings[Setting::object_storage_cluster].value.empty()
        || !settings[Setting::object_storage_remote_initiator_cluster].value.empty());

[P2] Preserve validation for bare remote initiator requests

When object_storage_remote_initiator=1 is set on the fallback s3/object-storage table function without object_storage_cluster or object_storage_remote_initiator_cluster, this new guard leaves is_remote false, so executeImpl builds the simple storage and bypasses the validation added in StorageObjectStorageCluster/IStorageCluster. That makes an invalid remote-initiator request run locally instead of raising the documented exception, which is easy to hit when the companion cluster setting is omitted or misspelled.

Useful? React with 👍 / 👎.

Author

object_storage_remote_initiator without a cluster name or object_storage_remote_initiator_cluster raises an exception earlier.

@ianton-ru added the antalya, antalya-26.3, and port-antalya (PRs to be ported to all new Antalya releases) labels on Jun 5, 2026
Labels

antalya antalya-26.3 port-antalya PRs to be ported to all new Antalya releases
