fix Gemma 4 multimodal chat-template markers in processor_gemma4 by copybara-service[bot] · Pull Request #4158 · AI-Hypercomputer/maxtext

copybara-service · 2026-06-12T22:49:02Z

fix Gemma 4 multimodal chat-template markers in processor_gemma4

The Gemma 4 multimodal SFT path was emitting Gemma 3 chat-template markers
("<start_of_turn>", "<end_of_turn>") which are NOT special tokens in the
Gemma 4 tokenizer. They BPE-tokenize into 7-token noise sequences each, so a
training label like "A<end_of_turn>" became an 8-token sequence
([236776 'A', 236820 '<', 643 'end', 236779 '_', 1340 'of', 236779 '_',
887 'turn', 236813 '>']).
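
The fragmentation can be reproduced with a toy encoder. This is only a sketch: the piece ids are the ones quoted in this description, while `toy_encode` and its greedy longest-match rule are hypothetical stand-ins for the real tokenizer, which is not shown here.

```python
# Toy illustration only: this is NOT the real Gemma 4 tokenizer.
# Piece ids are copied from this description; greedy longest-match
# is an assumed stand-in for the actual BPE merge procedure.
PIECES = {"A": 236776, "<": 236820, "end": 643, "_": 236779,
          "of": 1340, "turn": 887, ">": 236813}


def toy_encode(text: str) -> list[int]:
    """Split text into the longest known pieces, left to right."""
    ids, i = [], 0
    while i < len(text):
        match = max((p for p in PIECES if text.startswith(p, i)),
                    key=len, default=None)
        if match is None:
            raise ValueError(f"no piece covers {text[i:]!r}")
        ids.append(PIECES[match])
        i += len(match)
    return ids


label_ids = toy_encode("A<end_of_turn>")
marker_ids = toy_encode("<end_of_turn>")
print(len(label_ids), len(marker_ids))  # 8 7
```

Because the marker is not a single vocabulary entry, it costs seven ordinary tokens, and the one-token answer becomes an eight-token label.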

With sft_train_on_completion_only=true the model learned to reproduce this
noise sequence after every answer, producing severe response-format collapse
post-SFT (e.g. "A<B<C<D<...").
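
A rough sketch of why completion-only training amplifies the problem: once prompt tokens are masked out, the marker fragments make up most of what the loss sees. `build_loss_mask` and the placeholder prompt ids are hypothetical, and this is not MaxText's actual masking code.

```python
# Sketch of completion-only loss masking; NOT MaxText's implementation.
# build_loss_mask and the prompt ids below are hypothetical.
def build_loss_mask(prompt_ids: list[int], completion_ids: list[int]) -> list[int]:
    """Return 1 for tokens that contribute to the loss, 0 for prompt tokens."""
    return [0] * len(prompt_ids) + [1] * len(completion_ids)


prompt_ids = [11, 12, 13, 14]  # placeholder ids for an arbitrary prompt
# Completion "A<end_of_turn>" as tokenized in this description:
completion_ids = [236776, 236820, 643, 236779, 1340, 236779, 887, 236813]
ANSWER_LEN = 1  # only the first completion token ('A') is the real answer

mask = build_loss_mask(prompt_ids, completion_ids)
supervised = sum(mask)
noise = supervised - ANSWER_LEN
print(f"{noise} of {supervised} supervised tokens are marker fragments")
```

For a one-token answer, seven of the eight supervised positions teach the model to emit the fragmented marker.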

The Gemma 4 chat template uses different special tokens:
<bos> (id 2)
<|turn> (id 105)
<turn|> (id 106)
This CL switches the prompt and response formatters to use them.
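
One way to guard against a regression is to check that every marker a formatter emits is a registered special token. The sketch below is hypothetical: `unknown_markers`, the regex, and the assumption that a response is simply terminated by `<turn|>` are illustrations, not code from this change. The ids are the three listed in this description.

```python
# Hypothetical regression check; the helper name and regex are assumptions.
# Special-token ids are the three listed in this description.
import re

GEMMA4_SPECIAL = {"<bos>": 2, "<|turn>": 105, "<turn|>": 106}
MARKER_RE = re.compile(r"<[^<>\s]+>")  # anything shaped like an <angle-bracket> marker


def unknown_markers(formatted: str, special: dict[str, int]) -> list[str]:
    """Return marker-like substrings that are not registered special tokens."""
    return [m for m in MARKER_RE.findall(formatted) if m not in special]


old_response = "A<end_of_turn>"  # Gemma 3 style end marker
new_response = "A<turn|>"        # assumed Gemma 4 style end marker
print(unknown_markers(old_response, GEMMA4_SPECIAL))  # ['<end_of_turn>']
print(unknown_markers(new_response, GEMMA4_SPECIAL))  # []
```

A real test would pass the tokenizer's full special-token set instead of this three-entry table, since multimodal prompts may carry other markers that this table does not list.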

codecov · 2026-06-12T22:53:58Z

Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/maxtext/multimodal/processor.py	0.00%	1 Missing ⚠️
src/maxtext/multimodal/processor_gemma4.py	0.00%	1 Missing ⚠️

PiperOrigin-RevId: 931396545

copybara-service Bot force-pushed the test_931310443 branch from 01d9a51 to 6462021 Compare June 12, 2026 22:58

copybara-service Bot force-pushed the test_931310443 branch from 6462021 to 5f3dc2b Compare June 12, 2026 23:44

copybara-service Bot merged commit 5f3dc2b into main Jun 12, 2026

copybara-service Bot deleted the test_931310443 branch June 12, 2026 23:45

copybara-service[bot] merged 1 commit into main from test_931310443
