diego-ssh: update loggregator-down test for non-blocking dial (tnz-96…#1142

Merged: kart2bc merged 1 commit into cloudfoundry:develop from navinms711:develop on Jun 2, 2026
navinms711 (Contributor) commented Jun 2, 2026

Summary

Fixes a test that became invalid after the non-blocking loggregator dial change (tnz-96145) landed in diego-logging-client.

Why this PR is necessary

cloudfoundry/diego-logging-client# (tnz-96145) removed grpc.WithBlock() and grpc.WithTimeout() from NewIngressClient, making the loggregator gRPC dial non-blocking (lazy). Previously, if the loggregator server was unavailable at startup, NewIngressClient returned an error after a 1-second dial timeout; ssh-proxy then logged "failed-to-initialize-metron-client" and exited with a non-zero status.

With the lazy dial, the call always succeeds immediately and ssh-proxy starts normally, retrying the connection to the loggregator server in the background.

The test:

"SSH proxy / authenticating with the diego realm metrics / when the loggregator server isn't up / exits with non-zero status code"

used:

Eventually(process.Wait()).Should(Receive(HaveOccurred()))

This now times out because ssh-proxy no longer exits when the loggregator server is unavailable. The assertion is changed to:

Consistently(process.Wait()).ShouldNot(Receive())

which correctly describes the new behaviour: ssh-proxy starts successfully and keeps running when the loggregator server is temporarily unavailable.

Prior work

Backward Compatibility

Breaking Change? No

Test-only change. No production code, BOSH jobs, or manifests are modified. The new behaviour (ssh-proxy starts successfully when the loggregator server is transiently unavailable and reconnects lazily) is the intended outcome of the diego-logging-client change and improves startup resilience during BOSH rolling upgrades.

…145)

diego-logging-client (tnz-96145) removed grpc.WithBlock() from
NewIngressClient, making the loggregator dial non-blocking. ssh-proxy
now starts successfully even when the loggregator server is unavailable
and retries the connection in the background.

The test "when the loggregator server isn't up → exits with non-zero
status code" expected the old blocking behaviour. With the non-blocking
dial ssh-proxy never exits, causing the test to time out.

Update the assertion to verify the new correct behaviour: ssh-proxy
starts and keeps running when the loggregator server is temporarily
unavailable.

Co-authored-by: Cursor <cursoragent@cursor.com>
navinms711 requested a review from a team as a code owner June 2, 2026 19:00
github-project-automation (bot) moved this from "Inbox" to "Pending Merge | Prioritized" in Application Runtime Platform Working Group Jun 2, 2026
kart2bc merged commit 9c3ae71 into cloudfoundry:develop Jun 2, 2026
9 checks passed
github-project-automation (bot) moved this from "Pending Merge | Prioritized" to "Done" in Application Runtime Platform Working Group Jun 2, 2026

2 participants