Add stop_strings and stopping_criteria support to TransformerBridge.generate #1374

Merged
jlarson4 merged 1 commit into TransformerLensOrg:dev from RecreationalMath:add-stopping-criteria on Jun 9, 2026

@RecreationalMath

Contributor

Description

  • Adds stop_strings and stopping_criteria to TransformerBridge.generate() and generate_stream(), alongside the existing stop_at_eos. HuggingFace-native stopping is already reachable via hf_generate(); this PR brings the same stopping surface to the bridge's own generation loop, which previously supported only EOS stopping.
  • stop_strings reuses HuggingFace's StopStringCriteria rather than a hand-rolled matcher, so its matching behavior is identical to HuggingFace.
  • The three stop signals (EOS via stop_at_eos, stop_strings, and stopping_criteria) are independent: any one of them ends a sequence.
  • Supported on the standard decoder-only text path (single and batched). Encoder-decoder, inputs_embeds, and multimodal inputs always raise a clear NotImplementedError. Stateful/SSM models raise only when run with use_past_kv_cache=False (their default keeps them on the supported path).
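The two rules above (any one stop signal ends a sequence, and some input paths are rejected) can be sketched in plain Python. This is a minimal illustration, not the bridge's code: `row_finished`, `check_supported`, and every parameter name other than `stop_at_eos` and `use_past_kv_cache` are hypothetical stand-ins. It also assumes the checks only apply when one of the new stop parameters is actually passed.

```python
def row_finished(hit_eos: bool, hit_stop_string: bool, hit_criteria: bool,
                 stop_at_eos: bool = True) -> bool:
    """Return True when any active stop signal fires for one sequence.

    EOS only counts when stop_at_eos is set; the other two signals
    are independent of it.
    """
    return (stop_at_eos and hit_eos) or hit_stop_string or hit_criteria


def check_supported(*, stopping_requested: bool,
                    is_encoder_decoder: bool = False,
                    uses_inputs_embeds: bool = False,
                    is_multimodal: bool = False,
                    is_stateful: bool = False,
                    use_past_kv_cache: bool = True) -> None:
    """Hypothetical guard mirroring the support matrix in the description."""
    if not stopping_requested:
        # Assumption: callers that pass neither new parameter are never rejected.
        return
    if is_encoder_decoder or uses_inputs_embeds or is_multimodal:
        raise NotImplementedError(
            "stop_strings/stopping_criteria are only supported on the "
            "decoder-only text path"
        )
    if is_stateful and not use_past_kv_cache:
        raise NotImplementedError(
            "stateful/SSM models need use_past_kv_cache=True for "
            "stop_strings/stopping_criteria"
        )
```

For example, `row_finished(False, True, False, stop_at_eos=False)` is true because a stop-string match ends the row even with EOS stopping disabled, while `row_finished(True, False, False, stop_at_eos=False)` is false.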

Notes for review

  • Backward compatibility: with both new parameters left as None (the default), generation output is byte-for-byte identical to before, so existing callers are unaffected.
  • The stop_at_eos=False interaction required a change to shared loop logic, so the diff touches the existing EOS-handling path: the early-exit and finished-row padding were previously gated on stop_at_eos, and are now gated on whether any stop signal is active. Without this, a stop_strings match with stop_at_eos=False would mark a sequence finished but never actually end generation.
  • One test exercises the default use_past_kv_cache=True path and is skipped on macOS-arm64 because of the upstream KV-cache NaN tracked in issue #1322 ("[Bug Report] [macOS-arm64] Cached eager attention NaNs in transformers v5 — blocks bridge KV-cache generation"). The skip is registered in tests/QUARANTINES.md per repo policy. Every other test runs on all platforms.
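The stop_at_eos=False interaction is easier to see in a toy loop. The sketch below is a single-sequence stand-in, not the bridge's generation loop: `toy_generate`, `next_token`, and `gate_on_any_signal` are hypothetical names, stop strings are matched with a plain `str.endswith` rather than HuggingFace's StopStringCriteria, and finished-row padding is left out because there is only one row.

```python
from typing import Callable, List, Optional


def toy_generate(next_token: Callable[[int], str], max_new_tokens: int,
                 stop_strings: Optional[List[str]] = None,
                 stop_at_eos: bool = True, eos: str = "<eos>",
                 gate_on_any_signal: bool = True) -> str:
    """Generate text token by token and return everything produced."""
    text = ""
    finished = False
    any_signal = stop_at_eos or bool(stop_strings)
    # Old behavior: the early exit was gated on stop_at_eos alone.
    # New behavior: it is gated on whether any stop signal is active.
    gate = any_signal if gate_on_any_signal else stop_at_eos
    for step in range(max_new_tokens):
        token = next_token(step)
        text += token
        if stop_at_eos and token == eos:
            finished = True
        if stop_strings and any(text.endswith(s) for s in stop_strings):
            finished = True
        if gate and finished:
            break
    return text


tokens = ["a", "b", "END", "c", "d"]
new = toy_generate(lambda i: tokens[i], 5, stop_strings=["END"],
                   stop_at_eos=False)
old = toy_generate(lambda i: tokens[i], 5, stop_strings=["END"],
                   stop_at_eos=False, gate_on_any_signal=False)
print(new)  # abEND
print(old)  # abENDcd
```

With the old gating, the sequence is marked finished at "END" but the loop keeps running to max_new_tokens, which is the failure mode the note describes.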

Tests added

A new integration suite covers single and list stop_strings, custom criteria (bare, list, and StoppingCriteriaList), the stop_at_eos=False case, no-op defaults, batched generation, generate_stream parity, and the error contracts.
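As an illustration of why the three custom-criteria shapes are tested separately, here is one way they could be normalized to a single list type. This is a sketch under assumptions, not the PR's code: `normalize_criteria` is a hypothetical helper, `CriteriaList` stands in for transformers' StoppingCriteriaList (which is a list subclass), and criteria are modeled as plain callables over token ids.

```python
from typing import Callable, List

# Stand-in for a HuggingFace StoppingCriteria: token ids in, stop decision out.
Criterion = Callable[[List[int]], bool]


class CriteriaList(list):
    """Stand-in for transformers.StoppingCriteriaList."""


def normalize_criteria(criteria) -> CriteriaList:
    """Accept None, a bare criterion, a list/tuple, or a CriteriaList."""
    if criteria is None:
        return CriteriaList()  # no-op default: nothing can fire
    if isinstance(criteria, CriteriaList):
        return criteria
    if isinstance(criteria, (list, tuple)):
        return CriteriaList(criteria)
    return CriteriaList([criteria])  # bare criterion


def any_criterion_fires(criteria, token_ids: List[int]) -> bool:
    """True if at least one normalized criterion asks to stop."""
    return any(c(token_ids) for c in normalize_criteria(criteria))
```

Each accepted shape then takes the same path through the stop check, so a test per shape mainly guards the normalization step.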

Closes #595

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

@RecreationalMath

Contributor Author

Both red checks are known flakes unrelated to this PR: Notebook Checks (BERT) (import-cell stderr mismatch, passes locally) and Docstring Test (HF-429 on the model-loading doctests).

@jlarson4

jlarson4 commented Jun 9, 2026

Collaborator

Excellent work on this! The spuriously failing CI checks passed on a re-run, so merging as is.

@jlarson4 jlarson4 merged commit a5f1193 into TransformerLensOrg:dev Jun 9, 2026
48 of 50 checks passed