Bump Transformer v4.49.0 #1735
Conversation
Force-pushed from 2d8bcb8 to a0a1eaa
Basically LGTM! Please update the PR description to note that after this PR, the MPT models on the Hugging Face Hub will no longer be usable with trust_remote_code=True, and older foundry versions will need to be used for that. Please also adjust the MPT error message to say that MPT is no longer supported by foundry.
Change the models in the YAMLs to 8b; otherwise LGTM!
🚨🚨🚨 MPT models on the Hugging Face Hub will no longer be usable with trust_remote_code=True, and older foundry versions will need to be used for that. 🚨🚨🚨
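For anyone who still needs those checkpoints with remote code, a minimal sketch, assuming an environment pinned to a pre-4.49 transformers and a correspondingly older foundry (the version pin below is illustrative, not from this PR):

```python
# Sketch only: run in an environment pinned before this bump, e.g.:
#   pip install "transformers<4.49"  # illustrative pin
from transformers import AutoModelForCausalLM

# mosaicml/mpt-7b is one of the hub MPT checkpoints affected by this change.
model = AutoModelForCausalLM.from_pretrained(
    'mosaicml/mpt-7b',
    trust_remote_code=True,  # executes the modeling code stored with the checkpoint
)
```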
A couple of interesting notes:
Transformers tokenizers have added an extra_special_tokens attribute and a new AddedToken object for special tokens.

Seems we have to set torch_dtype explicitly, because transformers converts torch_dtype to a string for JSON serialization.
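A minimal sketch of both behaviors, assuming transformers >= 4.49 (the gpt2 checkpoint and the sentinel token string are just examples):

```python
import torch
from transformers import AddedToken, AutoConfig, AutoTokenizer

# AddedToken controls how a special token is matched and normalized.
tokenizer = AutoTokenizer.from_pretrained('gpt2')
tokenizer.add_special_tokens(
    {'additional_special_tokens': [AddedToken('<my_sentinel>', special=True)]}
)

# PretrainedConfig serializes torch_dtype as a plain string for JSON,
# so code that round-trips configs must set the dtype back explicitly.
config = AutoConfig.from_pretrained('gpt2')
config.torch_dtype = torch.bfloat16
print(config.to_dict()['torch_dtype'])  # 'bfloat16' (a str, not torch.bfloat16)
```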
Looks like they've gotten rid of LlamaFlashAttention2 and refactored attention: huggingface/transformers#35235
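A hedged sketch of what this looks like after the refactor: backend-specific classes like LlamaFlashAttention2 are gone, and the attention backend is selected via attn_implementation (the checkpoint name is illustrative, not tied to this PR):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    'meta-llama/Llama-2-7b-hf',  # example checkpoint
    torch_dtype=torch.bfloat16,
    attn_implementation='sdpa',  # or 'eager' / 'flash_attention_2'
)
print(model.config._attn_implementation)  # 'sdpa'
```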
Seems like our test was actually hitting this strange interaction: huggingface/transformers#30305