Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
71 changes: 71 additions & 0 deletions config/v1/types_apiserver.go

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the changes here are only used in the operator/v1 group - any reason these changes shouldn't be placed in that group instead of this one?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it is semantically correlated with the KMS and encryption configurations above.

Original file line number Diff line number Diff line change
Expand Up @@ -267,6 +267,77 @@ const (
EncryptionTypeKMS EncryptionType = "KMS"
)

// +openshift:validation:Enum="";healthy;unhealthy;error
type KMSPluginHealthStatus string

const (
KMSPluginHealthStatusHealthy KMSPluginHealthStatus = "Healthy"

KMSPluginHealthStatusUnhealthy KMSPluginHealthStatus = "Unhealthy"

KMSPluginHealthStatusError KMSPluginHealthStatus = "Error"
)

// +openshift:compatibility-gen:level=1
type KMSPluginHealthReport struct {

// nodeName is the name of the node this instance of the plugin runs on.
// The combination of NodeName/KeyId makes this health report unique.
// The value must be between 1 and 512 characters.
// +kubebuilder:validation:MinLength=1

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document the minimum length as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will, but what's the point of defining it in code and in the documentation. Can't you just generate this as a suffix to the docsstring automatically?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

applied

// +kubebuilder:validation:MaxLength=512
// +required
NodeName string `json:"nodeName,omitempty"`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

node names follow the DNS subdomain name format. We should use the format validation CEL library in kube for validating the input against that naming format: https://kubernetes.io/docs/reference/using-api/cel/#kubernetes-format-library

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid this will clash with the way that the node controller reports those, I'll add the CEL though.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, are you sure that this is the case?
the node metadata has no such constraints:
https://github.com/kubernetes/api/blob/master/core/v1/types.go#L7013-L7030


// keyId is the encryption-key-secret id (kms-{keyId}.sock), a unique identifier of the plugin on that node.
// This is not a cryptographic key used to encrypt/decrypt any resources.
// The value must be between 1 and 512 characters.
// +kubebuilder:validation:MinLength=1

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document the minimum length as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

applied

// +kubebuilder:validation:MaxLength=512
// +required
KeyId string `json:"keyId,omitempty"`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the key id follow a specific format that we can validate?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't make many assumptions here, this is a concatenation of some strings and integers.


// status contains a health indicator for the respective KMS plugin
// The field can have three states: healthy, unhealthy, error.
// With error and unhealthy containing additional information in Detail.
//
// +kubebuilder:validation:MinLength=1
// +kubebuilder:validation:MaxLength=10
// +required
Status KMSPluginHealthStatus `json:"status,omitempty"`

// lastChecked is a timestamp of when the probe was last checked.
// +required
LastChecked metav1.Time `json:"lastChecked,omitempty"`

// kekId refers to the remote KEK id from KMS v2 StatusResponse.key_id.
// This is not a cryptographic key, but a unique representation of the KEK.
// +kubebuilder:validation:MinLength=0
// +kubebuilder:validation:MaxLength=1024
// +optional
KEKId *string `json:"kekId,omitempty"`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean for kekId to be set to ""?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That means there is a connection error between health monitor and kms plugin. So ideally we should expect that KMSPluginHealthStatus should be error (or unhealthy)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ibihim how do you intent do deal with this? are you just keeping the last value of a KEK in the conditions or are you always wiping it on error?

Otherwise we can also make this required from the start.


// detail contains additional error/health information; omitted when healthy
// +kubebuilder:validation:MinLength=0
// +kubebuilder:validation:MaxLength=1024
// +optional
Detail *string `json:"detail,omitempty"`
Comment on lines +300 to +324

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from kekId, I think the fields here could be replaced by using typical status conditions (i.e a []metav1.Condition).

For example, I think this could probably all be wrapped into a single condition.

Happy path:

type: Healthy
status: True
reason: AsExpected
lastTransitionTime: ....

Unhealthy scenario:

type: Healthy
status: False
reason: SomethingIsWrong
message: "error: something is wrong"
lastTransitionTime: ....

While lastTransitionTime isn't necessarily as accurate as lastChecked in terms of probe state, it still signals that the status has been observed since X time and prevents unnecessary status updates to update a timestamp.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ibihim does that work for you to replace it?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we would need NodeName,KeyId,KekId however - so the condition would replace only the health status, lastchecked and detail

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do use conditions for this today and learned that it's difficult to work with. Operators end up parsing JSON field with no schema behind it. Please see for more details about the structure.

It seems that a dedicated type fits better:

  1. a message isn't meant to carry structured data. It's a human readable string by convention.

  2. there's no schema (or rather schema is implicit), no validation, and no generated client.

  3. structured types are easier to understand and work with. With a dedicated type, someone reading the API or writing a controller can immediately see what fields exist and what their types are.

  4. schema evolution seems to be simpler. if we later need to add a field a dedicated type gets a new field with proper validation markers and generated client support.

@p0lyn0mial p0lyn0mial Jun 10, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am pasting the conditions we use today from the kep i shared in the previous message:

status:
  conditions:
    - type: KMSHealthReporter_master-0
      status: "True"
      reason: AsExpected
      message: '[{"kekID":"kek-9f2c","keyID":"2","status":"healthy","lastChecked":"2026-05-08T12:34:56Z"},{"kekID":"kek-4a17","keyID":"1","status":"healthy","lastChecked":"2026-05-08T12:34:56Z"}]'
    - type: KMSHealthReporter_master-1
      status: "True"
      reason: AsExpected
      message: '[{"kekID":"kek-9f2c","keyID":"2","status":"unhealthy","lastChecked":"2026-05-08T12:34:56Z","detail":"credential lacks decrypt permission"},{"kekID":"kek-4a17","keyID":"1","status":"healthy","lastChecked":"2026-05-08T12:34:56Z"}]'
    - type: KMSHealthReporter_master-2
      status: "True"
      reason: AsExpected
      message: '[{"keyID":"2","status":"error","lastChecked":"2026-05-08T12:34:56Z","detail":"connection refused"},{"keyID":"1","status":"error","lastChecked":"2026-05-08T12:34:56Z","detail":"connection refused"}]'

}

// +openshift:compatibility-gen:level=1
// +kubebuilder:validation:MinProperties=1
type APIServerEncryptionStatus struct {
// healthReports contains all KMS plugin health reports for this APIServer.
// When omitted, no health reports are available.
// Each entry must have a unique combination of nodeName and keyId.
// +optional
// +kubebuilder:validation:MinItems=0
// +kubebuilder:validation:MaxItems=100
// +openshift:validation:XValidation,rule="self.all(x, self.exists_one(y, y.nodeName == x.nodeName && y.keyId == x.keyId))",message="each health report must have a unique nodeName/keyId combination"
// +listType=atomic

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason for this to be an atomic list?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what else should it be?

HealthReports []KMSPluginHealthReport `json:"healthReports,omitempty"`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a semantic difference between healthReports: [] and it being omitted?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope

}

type APIServerStatus struct {
}

Expand Down
50 changes: 50 additions & 0 deletions config/v1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

21 changes: 21 additions & 0 deletions config/v1/zz_generated.swagger_doc_generated.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

118 changes: 116 additions & 2 deletions openapi/generated_openapi/zz_generated.openapi.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading