ValueError: offset must be non-negative and no greater than buffer length #5543
Comments
And here is the whole traceback:
2024-09-23 14:53:13 | INFO | fairseq_cli.train | task: TranslationTask
I wanted to offer my assistance regarding the ValueError: offset must be non-negative and no greater than buffer length error you encountered while training with Fairseq.
Summary of the Issue:
Approach:
Hi! In my case, this problem was caused by an integer precision issue when binarizing long files in the corpus. It can be solved by adding the following line here:
And then processing the corpus again with
You could also avoid the problem by splitting your big files into smaller ones.
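For the splitting workaround, a minimal sketch in Python might look like the following. It is not part of fairseq: the function name `split_parallel`, the `lines_per_shard` parameter, and the `.shardNNN` naming scheme are all hypothetical, and it assumes the source and target files are line-aligned plain text in UTF-8. Both sides are split in lockstep so that sentence pairs stay aligned. How the resulting shards are binarized and passed to training is left out here.

```python
from itertools import zip_longest
from pathlib import Path


def split_parallel(src_path, tgt_path, lines_per_shard, out_dir):
    """Split two line-aligned files into numbered shard pairs (hypothetical helper).

    Returns a list of (source_shard, target_shard) paths in order.
    """
    if lines_per_shard <= 0:
        raise ValueError("lines_per_shard must be positive")
    src_path, tgt_path, out_dir = Path(src_path), Path(tgt_path), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    shards = []
    src_out = tgt_out = None
    try:
        with src_path.open(encoding="utf-8") as src, \
                tgt_path.open(encoding="utf-8") as tgt:
            # zip_longest pads the shorter file with None, which lets us
            # detect misaligned inputs instead of silently truncating.
            for i, (s, t) in enumerate(zip_longest(src, tgt)):
                if s is None or t is None:
                    raise ValueError("source and target have different line counts")
                if i % lines_per_shard == 0:
                    # Start a new shard pair; close the previous one first.
                    if src_out is not None:
                        src_out.close()
                        tgt_out.close()
                    tag = f".shard{len(shards):03d}"
                    src_shard = out_dir / f"{src_path.stem}{tag}{src_path.suffix}"
                    tgt_shard = out_dir / f"{tgt_path.stem}{tag}{tgt_path.suffix}"
                    src_out = src_shard.open("w", encoding="utf-8")
                    tgt_out = tgt_shard.open("w", encoding="utf-8")
                    shards.append((src_shard, tgt_shard))
                src_out.write(s)
                tgt_out.write(t)
    finally:
        # Closing an already closed file is a no-op, so this is safe.
        if src_out is not None:
            src_out.close()
        if tgt_out is not None:
            tgt_out.close()
    return shards
```

For example, calling `split_parallel("train.de", "train.en", 1_000_000, "shards")` (file names made up for illustration) would write `train.shard000.de`, `train.shard000.en`, and so on into `shards/`. The right shard size depends on your data; the point is only to keep each binarized file small enough that the offsets do not overflow.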
Hi,
I'm training fairseq with the following script and get the error ValueError: offset must be non-negative and no greater than buffer length.
fairseq-train data-bin --arch transformer \
    --max-epoch 10 \
    --max-tokens 2048 \
    --num-workers 20 \
    --max-sentences 5000 \
    --fp16 \
    --optimizer adam --lr-scheduler inverse_sqrt --lr 0.0007 \
    --criterion label_smoothed_cross_entropy