fix: detect ScyllaDB via SUPPORTED protocol extensions, not shard count#910
fix: detect ScyllaDB via SUPPORTED protocol extensions, not shard count#910mykaul wants to merge 1 commit into
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
The driver previously identified ScyllaDB by the presence of shard-related fields (SCYLLA_NR_SHARDS, SCYLLA_SHARD, etc.) in the SUPPORTED response. When shard-awareness is disabled on the server side (allow_shard_aware_drivers: false) those fields are absent, causing the driver to misidentify a ScyllaDB cluster as Cassandra. The immediate consequence is that the driver enables peers_v2 (which ScyllaDB does not support) and fails to connect -- the same regression described in scylladb/gocql#902. Fix (mirrors scylladb/gocql#903): * Add ProtocolFeatures.is_scylla: set to True whenever ANY known Scylla-specific extension key (SCYLLA_LWT_ADD_METADATA_MARK, SCYLLA_RATE_LIMIT_ERROR, TABLETS_ROUTING_V1) is present in the SUPPORTED response, or whenever sharding_info is populated. This flag stays True even when sharding is disabled. * ProtocolFeatures.sharding_info remains None when sharding is disabled, so shard-aware connection pooling (pool.py) is correctly left inactive. * cluster.py: use is_scylla (not sharding_info is not None) to gate the peers_v2 disable and the USING TIMEOUT metadata request timeout. * metadata.py _is_not_scylla(): use is_scylla instead of the previous check (shard_id is None), which was always False after the OPTIONS exchange and therefore silently broke Cassandra trigger-metadata queries. * protocol_features.py parse_sharding_info(): fix a latent crash -- if SCYLLA_PARTITIONER or SCYLLA_SHARDING_ALGORITHM were present without SCYLLA_SHARD, int(None) would raise TypeError. Default shard_id to 0. Fixes: scylladb/python-driver#<TBD> See also: scylladb/gocql#902, scylladb/gocql#903 Fixes: scylladb#909
a64d05e to
bea45ac
Compare
There was a problem hiding this comment.
Pull request overview
This PR fixes ScyllaDB detection by using Scylla-specific SUPPORTED protocol extension keys (and sharding info when present) instead of relying on shard-related fields alone, preventing misclassification as Cassandra when shard-awareness is disabled and avoiding peers_v2-related connection failures.
Changes:
- Add
ProtocolFeatures.is_scyllaand implement Scylla detection via known Scylla-specificSUPPORTEDextension keys. - Harden sharding parsing/initialization against missing shard fields to prevent
int(None)crashes. - Gate peers_v2 usage,
USING TIMEOUT, and trigger-metadata behavior onis_scyllainstead of shard-awareness heuristics; add unit tests for detection/regressions.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
cassandra/protocol_features.py |
Introduces is_scylla detection and adjusts sharding parsing to avoid crashes. |
cassandra/shard_info.py |
Guards integer conversion for missing shard-related fields. |
cassandra/c_shard_info.pyx |
Mirrors shard-info conversion guards in the Cython implementation. |
cassandra/cluster.py |
Switches peers_v2 disabling and metadata timeout gating to is_scylla. |
cassandra/metadata.py |
Updates Scylla-vs-not logic to use features.is_scylla. |
tests/unit/test_protocol_features.py |
Adds unit tests for Scylla detection and crash regressions. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| resolved_shard_id = int(shard_id) if shard_id is not None else 0 | ||
| return resolved_shard_id, _ShardingInfo(shard_id, shards_count, partitioner, sharding_algorithm, sharding_ignore_msb, | ||
| shard_aware_port, shard_aware_port_ssl) |
| # SCYLLA_PARTITIONER passes the sharding guard, so sharding_info is populated | ||
| # with zero defaults rather than crashing. | ||
| assert pf.sharding_info is not None | ||
| assert pf.sharding_info.shards_count == 0 | ||
| assert pf.sharding_info.sharding_ignore_msb == 0 |
| assert pf.is_scylla is True | ||
| assert pf.sharding_info is not None | ||
| assert pf.sharding_info.shards_count == 0 |
Summary
The driver previously identified ScyllaDB solely by the presence of shard-related fields (
SCYLLA_NR_SHARDS,SCYLLA_SHARD, etc.) in theSUPPORTEDresponse.When shard-awareness is disabled on the server side (
allow_shard_aware_drivers: false), those fields are absent, causing the driver to misidentify a ScyllaDB cluster as Cassandra.The immediate consequence is that the driver leaves
peers_v2enabled (which ScyllaDB does not support) and fails to connect — the same regression described in scylladb/gocql#902.This fix mirrors scylladb/gocql#903.
Changes
cassandra/protocol_features.pyProtocolFeatures.is_scyllabool (defaultFalse), set by the newdetect_scylla()static method.detect_scylla()returnsTruewhen any known Scylla-specific extension key (SCYLLA_LWT_ADD_METADATA_MARK,SCYLLA_RATE_LIMIT_ERROR,TABLETS_ROUTING_V1) is present in theSUPPORTEDresponse, or whensharding_infois populated.These keys are advertised by ScyllaDB regardless of whether shard-awareness is enabled, so
is_scyllastaysTrueeven with sharding disabled.parse_sharding_info:int(shard_id)raisedTypeErrorwhenSCYLLA_PARTITIONER/SCYLLA_SHARDING_ALGORITHMwas present withoutSCYLLA_SHARD. Default to0.cassandra/shard_info.py/cassandra/c_shard_info.pyx_ShardingInfo.__init__/ShardingInfo.__init__: guardint(shards_count)andint(sharding_ignore_msb)againstNone(default to0), matching the existing pattern used forshard_aware_port. Fixes the remaining crash path where a shard-defining field was present withoutSCYLLA_NR_SHARDS.cassandra/cluster.py_uses_peers_v2disable and_metadata_request_timeoutnow gate onis_scyllainstead ofsharding_info is not None, so both work correctly when sharding is disabled.cassandra/metadata.py_is_not_scylla()now usesnot is_scyllainstead ofshard_id is None.The old check was always
Falseafter the OPTIONS exchange (Cassandra getsshard_id=0, and0 is NoneisFalse), meaning trigger-metadata queries were silently skipped even for Cassandra/DSE connections.tests/unit/test_protocol_features.pyint(None)paths.Shard-aware pooling is unaffected
pool.pycontinues to gate per-shard connection opening onsharding_info is not None.A ScyllaDB node with sharding disabled correctly gets
is_scylla=True(so peers_v2 is disabled and USING TIMEOUT works) butsharding_info=None(so no per-shard connections are opened). This matches the gocql#903 fix'sisScyllaConn() && nrShards != 0guard.Note on Cassandra behavioral change
_is_not_scylla()was previously always returningFalsepost-OPTIONS for Cassandra (shard_id=0,0 is None = False), sosystem_schema.triggersqueries were silently skipped even for Cassandra/DSE. With this fix, Cassandra/DSE connectionswill correctly issue the trigger-metadata query and populate
TableMetadata.triggers.See also