Thank you for your excellent work. I trained a 410M LLaMA model on FineWeb-EDU-10B and conducted evaluations, with the results as follows.
I noticed that the metrics for the FDA, SWDE, and SQuAD_completion tasks were abnormal, while the performance on other evaluations was normal. Upon analysis, I found that a large number of spaces were being prepended to the inputs of certain tasks. To address this, I made the following modifications:
and re-evaluated the model. The updated results are as follows:
The results now look much more normal, so I’d like to know whether we should apply `.strip()` to the inputs of all tasks.
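
To make the question concrete, here is a minimal sketch of what stripping an input could look like. The helper names and the `text` field are hypothetical and do not come from the evaluation code, so the actual modification may differ.

```python
def count_leading_spaces(text: str) -> int:
    """Count the space characters at the start of the prompt."""
    return len(text) - len(text.lstrip(" "))


def strip_input(doc: dict, key: str = "text") -> dict:
    """Return a copy of `doc` with surrounding whitespace removed from doc[key].

    `doc` and `key` are illustrative assumptions about how a task stores its prompt.
    """
    cleaned = dict(doc)  # shallow copy so the original document is left untouched
    cleaned[key] = doc[key].strip()
    return cleaned


# Hypothetical example document with spaces prepended to the prompt.
example = {"text": "      Title: Example page. Author:"}
padding = count_leading_spaces(example["text"])
stripped = strip_input(example)
```

Note that `str.strip()` removes whitespace from both ends of the string. If a task relies on trailing whitespace in the prompt, `str.lstrip()` would remove only the prepended spaces.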