For example, I ran into difficulties when integrating the needle_in_a_haystack benchmark into lm-evaluation-harness. To preprocess the documents, I need to tokenize the haystack document first and then insert the needle at different positions, which requires a tokenizer. However, the process_docs and doc_to_text functions each take only a single doc parameter, so there is no way to pass in the tokenizer.
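To make the preprocessing step concrete, here is a minimal sketch of inserting a needle at a relative depth measured in tokens. It is not harness code: `ToyTokenizer` and `insert_needle` are hypothetical names, and the whitespace tokenizer only stands in for whatever real tokenizer the model uses.

```python
from typing import Dict, List


class ToyTokenizer:
    """Hypothetical whitespace tokenizer standing in for a real one."""

    def __init__(self) -> None:
        self._token_to_id: Dict[str, int] = {}
        self._id_to_token: List[str] = []

    def encode(self, text: str) -> List[int]:
        ids = []
        for token in text.split():
            if token not in self._token_to_id:
                # Assign the next free id to an unseen token.
                self._token_to_id[token] = len(self._id_to_token)
                self._id_to_token.append(token)
            ids.append(self._token_to_id[token])
        return ids

    def decode(self, ids: List[int]) -> str:
        return " ".join(self._id_to_token[i] for i in ids)


def insert_needle(haystack: str, needle: str, depth: float, tokenizer: ToyTokenizer) -> str:
    """Insert `needle` into `haystack` at a relative token depth in [0, 1]."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    haystack_ids = tokenizer.encode(haystack)
    needle_ids = tokenizer.encode(needle)
    # Position is counted in tokens, which is why a tokenizer is required.
    position = int(len(haystack_ids) * depth)
    return tokenizer.decode(haystack_ids[:position] + needle_ids + haystack_ids[position:])


tokenizer = ToyTokenizer()
print(insert_needle("a b c d", "X Y", 0.5, tokenizer))
```

The point of the sketch is that the insertion position depends on the token count, so the function cannot be written with only a doc argument.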
Hi! I've been working on the RULER benchmark in #2629. You can now use download_dataset: !function utils... in the config, and that function will have access to the tokenizer or pretrained value from model_args, as well as any custom arguments passed to --metadata.
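As a rough illustration of what such a function might look like, here is a sketch that assumes the values arrive as keyword arguments. The function names, the keyword names `tokenizer` and `pretrained`, and the `depths` metadata key are all assumptions made for the example, not the harness's confirmed interface; check #2629 for the actual signature.

```python
from typing import Any, Dict, List, Optional


def resolve_tokenizer_name(**kwargs: Any) -> Optional[str]:
    """Prefer an explicit tokenizer name and fall back to the pretrained model name.

    Assumption: both values are passed as plain keyword arguments.
    """
    return kwargs.get("tokenizer") or kwargs.get("pretrained")


def build_dataset(**kwargs: Any) -> List[Dict[str, Any]]:
    """Hypothetical dataset builder that needs to know which tokenizer to use."""
    name = resolve_tokenizer_name(**kwargs)
    if name is None:
        raise ValueError("expected a 'tokenizer' or 'pretrained' argument")
    # 'depths' is a made-up custom argument, standing in for a --metadata value.
    depths = kwargs.get("depths", [0.0, 0.5, 1.0])
    # A real builder would load the tokenizer here and construct the documents.
    return [{"tokenizer_name": name, "depth": depth} for depth in depths]


print(build_dataset(pretrained="some-model", depths=[0.25, 0.75]))
```

In a real task the builder would load the named tokenizer and use it to place the needle, rather than returning placeholder records.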