Batching and generate_until special tokens #2723

Open
sjmielke opened this issue Feb 21, 2025 · 0 comments
@sjmielke (Contributor)

models.huggingface.generate_until has a step that cuts off response suffixes appearing after stop tokens, marked by this comment:

# use secondary stop seqs to cut off should-have-been-stopped content post-hoc

...which should help in the batched case, where the batch generation call

cont = self._model_generate(

...would otherwise leave some sequences too long (sequences that hit their stop condition early keep generating until the whole batch finishes).
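
That post-hoc cutoff amounts to a substring split on each stop sequence; roughly (a hand-written equivalent of the step, not the harness's exact code):

```python
def cut_at_stop_seqs(decoded: str, until: list[str]) -> str:
    """Post-hoc truncation: cut a decoded continuation at the first
    occurrence of any stop sequence (simplified stand-in for the
    logic in generate_until)."""
    for term in until:
        if term:  # ignore empty stop strings
            decoded = decoded.split(term)[0]
    return decoded

# cut_at_stop_seqs("foo<|eot_id|>bar", ["<|eot_id|>"])  -> "foo"
```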

The problem is that when using, say, a special token like <|eot_id|> as the stop/EOS token, we would want this decode call to preserve that token:

s = self.tok_decode(cont_toks)

...but it doesn't, because skip_special_tokens defaults to True, so the stop token never appears in the decoded string and the post-hoc truncation never triggers. Should it be set to False here? I don't have the bigger picture to judge whether that would break other things, but we are hitting this issue of outputs not being truncated, so if you have any other ideas for how to remedy it cleanly, please let me know!
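
Here is a minimal sketch of the behavior, using gpt2 purely as a stand-in tokenizer with <|eot_id|> registered as an additional special token for illustration (our actual model already has it as a special token):

```python
from transformers import AutoTokenizer

# gpt2 is only a stand-in; <|eot_id|> is added as a special token to
# mimic a Llama-3-style tokenizer.
tok = AutoTokenizer.from_pretrained("gpt2")
tok.add_special_tokens({"additional_special_tokens": ["<|eot_id|>"]})

until = ["<|eot_id|>"]
# pretend this is a continuation that ran past its own stop token because
# another sequence in the batch was still generating
text = "The answer is 42.<|eot_id|> And here is extra text we never wanted."
cont_toks = tok(text, add_special_tokens=False)["input_ids"]

# current behavior: the special token is dropped during decoding, so the
# post-hoc split can never find it and nothing is truncated
s = tok.decode(cont_toks, skip_special_tokens=True)
for term in until:
    s = s.split(term)[0]
print(repr(s))  # still contains the extra text

# with skip_special_tokens=False the stop token survives decoding and the
# split truncates as intended
s = tok.decode(cont_toks, skip_special_tokens=False)
for term in until:
    s = s.split(term)[0]
print(repr(s))  # -> 'The answer is 42.' (truncated as intended)
```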
