Add encryption infrastructure#1365
Conversation
|
May need changes to support BYOK. |
|
Needs a design doc. |
0b8a08f to
07e0e00
Compare
There was a problem hiding this comment.
Pull request overview
Adds foundational encryption plumbing to the server so partitions can persist a payload codec configuration (including a wrapped DEK) and later materialize an encrypt/decrypt codec via a KMS client abstraction.
Changes:
- Introduces a
KMSCryptoClientabstraction plus a KMS-envelope codec loader to unwrap DEKs and construct codecs. - Adds an AES-GCM payload codec and extends payload codec config models to support AES-GCM + base64url-encoded binary fields.
- Adds
cryptographyas a dependency and includes unit tests for AES-GCM, config round-tripping, and loader behavior.
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| uv.lock | Locks cryptography dependency needed for AES-GCM implementation. |
| packages/server/pyproject.toml | Adds cryptography to server runtime dependencies. |
| packages/server/src/memmachine_server/common/payload_codec/payload_codec_config.py | Adds KMS-envelope/AES-GCM codec config models + base64url (de)serialization. |
| packages/server/src/memmachine_server/common/payload_codec/kms_envelope_payload_codec_loader.py | Adds loader that unwraps DEKs via KMS and materializes codecs. |
| packages/server/src/memmachine_server/common/payload_codec/aes_gcm_payload_codec.py | Implements AES-GCM encode/decode codec using cryptography. |
| packages/server/src/memmachine_server/common/payload_codec/init.py | Exports new loader/config types. |
| packages/server/src/memmachine_server/common/kms/kms_crypto_client.py | Adds abstract KMS crypto client interface (encrypt/decrypt). |
| packages/server/src/memmachine_server/common/kms/init.py | Exposes KMS abstractions via package exports. |
| packages/server/server_tests/memmachine_server/common/payload_codec/test_payload_codec.py | Adds AES-GCM round-trip + wrong-AAD failure test. |
| packages/server/server_tests/memmachine_server/common/payload_codec/test_payload_codec_config.py | Adds AES-GCM config JSON round-trip and base64url assertions. |
| packages/server/server_tests/memmachine_server/common/payload_codec/test_kms_envelope_payload_codec_loader.py | Adds loader tests (success + unsupported config). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
sscargal
left a comment
There was a problem hiding this comment.
Summary
Nice, well-scoped infrastructure addition — clean ABCs, sensible AAD propagation, and the loader's match pattern leaves room for future codec types. CI is green across all platforms and the happy-path tests are correct.
Requesting changes for a few items not already covered by @Copilot's notes. Copilot has already flagged the nonce_size validation gap (twice) and the permissive urlsafe_b64decode behavior; those still apply and aren't repeated here.
Blocking
- GCM auth tag length check missing in
decode— see inline onaes_gcm_payload_codec.py:39. Currently conflates truncated buffers with active tampering. NotImplementedErroris the wrong exception type for an unsupported runtime config — see inline onkms_envelope_payload_codec_loader.py:45.binascii.Errorfromurlsafe_b64decodeis not wrapped withfrom err— see inlines onpayload_codec_config.py:36and:51.AGENTS.mdL94 requiresraise ... from errchaining.
Suggestions
KMSEnvelopePayloadCodecConfigis meant as an abstract base but is instantiable and has notypediscriminator; the test attest_kms_envelope_payload_codec_loader.py:80shows this leaks. See inline onpayload_codec_config.py:23.- Test gaps for defensive branches the author explicitly added (short-payload guard, tampered-ciphertext distinct from AD mismatch,
associated_data=Nonepath). See inlines on the test files.
Tests
Existing tests cover round-trip + AD-mismatch + unsupported-config rejection — the right baseline shape. The PR checklist has I have added unit tests unticked despite tests existing — likely just an oversight.
Local checks
Skipped local ruff / ty re-run. CI ran both across Python 3.12/3.13/3.14 on Linux/macOS/Windows and all 40+ checks passed.
| if len(value) < self._nonce_size: | ||
| raise ValueError("Encrypted payload is too short") |
There was a problem hiding this comment.
The length check guards against a missing nonce, but AES-GCM also requires a 16-byte auth tag. A 12–27 byte buffer passes this check and then surfaces as an opaque InvalidTag from AESGCM.decrypt, which conflates truncated ciphertext with active tampering in logs/Sentry.
Suggest:
if len(value) < self._nonce_size + 16:
raise ValueError("Encrypted payload is too short (missing nonce or auth tag)")The GCM tag is fixed at 16 bytes for AES-128/192/256.
| raise NotImplementedError( | ||
| f"Unsupported KMS envelope payload codec config: " | ||
| f"{type(config).__name__}" |
There was a problem hiding this comment.
NotImplementedError is conventionally for unimplemented abstract methods / unfinished features. An unsupported runtime config is a caller/data error — TypeError (matches Python's own behavior for unmatched match over types) or ValueError is more appropriate, so callers can distinguish the loader is incomplete from you passed a bogus config.
The test at test_kms_envelope_payload_codec_loader.py:88-89 would need a matching update.
There was a problem hiding this comment.
It represents an unfinished feature, assuming that the type passed in matches the type hint. In Python, runtime type checks are excessive. If anyone cares about typing, they can use a static type checker.
| if isinstance(value, bytes): | ||
| return value | ||
| if isinstance(value, str): | ||
| return urlsafe_b64decode(value.encode("ascii")) |
There was a problem hiding this comment.
Complements the Copilot note above: when urlsafe_b64decode rejects malformed input it raises binascii.Error, which is currently re-raised unwrapped. AGENTS.md L94 requires raise ... from err chaining so the original cause is preserved. Suggest:
import binascii
if isinstance(value, str):
try:
return urlsafe_b64decode(value.encode("ascii"))
except (binascii.Error, ValueError) as e:
raise ValueError("wrapped_dek is not valid base64url") from eSame pattern needed for associated_data at line 51.
|
|
||
|
|
||
| PayloadCodecConfigUnion = PlaintextPayloadCodecConfig | ||
| class KMSEnvelopePayloadCodecConfig(BaseModel): |
There was a problem hiding this comment.
KMSEnvelopePayloadCodecConfig is intended as an abstract base (no type discriminator) but is freely instantiable — test_kms_envelope_payload_codec_loader.py:80-96 confirms this by constructing an UnknownKMSEnvelopePayloadCodecConfig directly.
A couple of options:
- If genuinely abstract: make it
ABC(or atyping.Protocol) and move the base64 validators into a shared mixin. Bare instantiation should fail. - If purely structural: consider composition — give each concrete config an
envelope: KMSEnvelopeParamsfield instead of inheritance. That flattens the discriminated union and removes the abstract base in a union hazard.
Either way, consider model_config = ConfigDict(frozen=True) — config objects holding (wrapped) key material should be immutable.
There was a problem hiding this comment.
Composition is probably better than inheritance here from a design pattern perspective, but it will make config processing more complicated.
| b"0" * 32, | ||
| associated_data=b"partition:block", | ||
| ) | ||
| with pytest.raises(InvalidTag): |
There was a problem hiding this comment.
Three defensive branches the codec adds are currently untested — easy wins:
nonce_size <= 0constructor guard (codec line 24-25)- short-payload
ValueErrorbranch (codec line 39-40) - tampered ciphertext distinct from AD mismatch — the existing
wrong_codectest only varies the AAD, so a regression that silently disabled GCM auth-tag verification while leaving AAD intact would still pass. Flipping a byte in the ciphertext (buf = bytearray(encoded); buf[20] ^= 1) and assertingInvalidTagwould catch it.
| key_ref="partition_key", | ||
| wrapped_dek=b"wrapped-dek-bytes", | ||
| nonce_size=12, | ||
| associated_data=b"partition:context", |
There was a problem hiding this comment.
associated_data=None is never exercised in the codec, config, or loader tests. Both _decode_associated_data / _serialize_associated_data None branches and AESGCM.encrypt(...) with aad=None are unverified. A second loader test with associated_data=None would close this — it's the path most production callers will hit first if they forget to pass AAD.
Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
Purpose of the change
For supporting sensitive data encryption.
The approach to encrypting in SQL will be to store codec config (including wrapped DEK) in database.
Description
Based on #1375 for easier rebasing, but really only creates stuff for encryption.
DOES NOT HANDLE key creation/rotation.
Alternative is encryption at rest but database sees the plaintext data.
KMSCryptoClient may get implementations with the following backends:
Type of change
How Has This Been Tested?
Checklist
Maintainer Checklist