CASSANDRA-18216: Allow sharding of the SAI in-memory index #4885

Open
naseryyash wants to merge 5 commits into apache:trunk from naseryyash:CASSANDRA-18216-trunk

CASSANDRA-18216: Allow sharding of the SAI in-memory index

  • Refactor MemtableIndex from concrete class to interface
  • Add UnshardedMemtableIndex preserving existing behavior
  • Add ShardedMemtableIndex with token-range sharded TrieMemoryIndex[]
  • Extend ShardBoundaries with precomputed partition ranges and shard-to-key-range intersection
  • Add Range.intersects() overloads for IncludingExcludingBounds and ExcludingBounds
  • Add MemtableIndexManager factory routing based on index options
  • Sharding is opt-in via CREATE INDEX WITH OPTIONS = {'shards': '<count>'}
  • Reject shards option on vector indexes at validation time
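
To make the token-range routing idea above concrete, here is a minimal sketch of mapping a token to a shard through precomputed boundaries, and of finding the shards that a token range touches. `TokenShardRouter`, its methods, and the use of plain `long` tokens are assumptions made for illustration; the patch's `ShardBoundaries` and `ShardedMemtableIndex` may be structured differently.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical illustration only; this is not the ShardBoundaries class from the patch.
final class TokenShardRouter
{
    // Sorted tokens at which shards 1..n-1 begin. Shard 0 covers everything below boundaries[0].
    private final long[] boundaries;

    TokenShardRouter(long[] sortedBoundaries)
    {
        this.boundaries = sortedBoundaries.clone();
    }

    int shardCount()
    {
        return boundaries.length + 1;
    }

    // Shard i (for i >= 1) owns tokens from boundaries[i - 1] inclusive up to boundaries[i] exclusive.
    int shardFor(long token)
    {
        int pos = Arrays.binarySearch(boundaries, token);
        // Exact hit: the token starts shard pos + 1. Miss: binarySearch returns
        // -(insertionPoint) - 1, and the insertion point is the owning shard.
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    // Shards that can hold keys for the inclusive token range [minToken, maxToken].
    int[] shardsIntersecting(long minToken, long maxToken)
    {
        if (minToken > maxToken)
            throw new IllegalArgumentException("minToken must not exceed maxToken");
        return IntStream.rangeClosed(shardFor(minToken), shardFor(maxToken)).toArray();
    }
}
```

With boundaries at -100, 0 and 100 the router exposes four shards, and a query over tokens -50 through 50 only needs to visit shards 1 and 2. Wrap-around ranges and exclusive bounds, which the `Range.intersects()` overloads address, are deliberately left out of this sketch.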

patch by Yash Nasery; reviewed by Caleb Rackliffe for CASSANDRA-18216

- Refactor MemtableIndex from concrete class to interface
- Add UnshardedMemtableIndex preserving existing behavior
- Add ShardedMemtableIndex with token-range sharded TrieMemoryIndex[]
- Extend ShardBoundaries with precomputed partition ranges and shard-to-key-range intersection
- Add MemtableIndexManager factory routing based on index options
- Sharding is opt-in via CREATE INDEX WITH OPTIONS = {'shards': '<count>'}
- Reject shards option on vector indexes at validation time
- Cross-shard iterator uses MergeIterator with PrimaryKeysMergeReducer for flush path
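
The opt-in option and the vector-index rejection described above could be validated roughly as follows. This is a sketch under stated assumptions: `ShardsOptionValidator`, the default of one shard when the option is absent, the positive-integer requirement, and the exception type and messages are all hypothetical and are not taken from the patch.

```java
import java.util.Map;

// Hypothetical illustration only; names, defaults and messages are assumptions.
final class ShardsOptionValidator
{
    static final String SHARDS_OPTION = "shards";

    // Returns the requested shard count, or 1 (treated here as unsharded) when the option is absent.
    static int validate(Map<String, String> options, boolean isVectorIndex)
    {
        String raw = options.get(SHARDS_OPTION);
        if (raw == null)
            return 1;

        // The PR states that the option is rejected on vector indexes at validation time.
        if (isVectorIndex)
            throw new IllegalArgumentException("The 'shards' option is not supported on vector indexes");

        int count;
        try
        {
            count = Integer.parseInt(raw.trim());
        }
        catch (NumberFormatException e)
        {
            throw new IllegalArgumentException("The 'shards' option must be an integer, got: " + raw, e);
        }

        if (count < 1)
            throw new IllegalArgumentException("The 'shards' option must be at least 1, got: " + count);
        return count;
    }
}
```

Failing at validation time means an unsupported combination is reported when the index is created instead of surfacing later on the write path.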
…terator<PrimaryKey>

- Add MemtableIndexFlushBench for low-cardinality sharded flush measurement
- Change MemtableIndex.iterator() return type from PrimaryKeys to Iterator<PrimaryKey>
- PrimaryKeysMergeReducer chains per-shard iterators lazily via Reducer.Trivial
- Update UnshardedMemtableIndex and RowMapping.merge() to match new interface
- Add ShardedMemtableIndexTest and refactor shared helpers from TrieMemoryIndexTest into SAIRandomizedTester
- Add RangeIntersectsBoundsTest for Range.intersects() coverage across all bound types
- Add DDL test rejecting shards option on vector indexes
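
The lazy chaining of per-shard iterators mentioned above can be pictured with the sketch below. It assumes that shards are visited in token order and hold disjoint token ranges, so concatenating their sorted iterators yields globally sorted output without buffering. `ChainedShardIterator` is a hypothetical stand-in and does not reproduce `PrimaryKeysMergeReducer` or Cassandra's `MergeIterator`.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only; not the PrimaryKeysMergeReducer from the patch.
final class ChainedShardIterator<T> implements Iterator<T>
{
    private final Iterator<Iterator<T>> shards;
    private Iterator<T> current;

    // perShard must be ordered by shard, with each iterator already sorted.
    ChainedShardIterator(List<Iterator<T>> perShard)
    {
        this.shards = perShard.iterator();
    }

    @Override
    public boolean hasNext()
    {
        // Advance to the next non-empty shard only when the current one is drained.
        while (current == null || !current.hasNext())
        {
            if (!shards.hasNext())
                return false;
            current = shards.next();
        }
        return true;
    }

    @Override
    public T next()
    {
        if (!hasNext())
            throw new NoSuchElementException();
        return current.next();
    }
}
```

Elements are pulled from one shard at a time, so nothing is copied into an intermediate collection. If shard ranges could overlap, a real k-way merge with de-duplication would be needed instead of plain concatenation.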
naseryyash changed the title from "Cassandra 18216 trunk" to "CASSANDRA-18216: Allow sharding of the SAI in-memory index" on Jun 17, 2026