Hello,
The quantized files have the suffix ".safetensor". However, when I run ./bin/llama_example, it prints the warning below.
Where can I get these ".bin" files?
Thank you!
Hello, where did you get the W8A16 model weights? The checkpoints we open-sourced are simulated-quantization checkpoints, which are intended only for accuracy verification. To run actual inference, the weights first need to go through a packing preprocessing step.
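For illustration only, here is a rough sketch of what a packing step for W8A16 (8-bit weights, 16-bit activations) could look like: floating-point weights are quantized to int8 with one scale per row, and the raw bytes are written to a ".bin" file. This is not the project's actual preprocessing script. The function names, the per-row symmetric scheme, and the raw file layout are all assumptions made for the example, so the output will not necessarily match what ./bin/llama_example expects.

```python
from array import array


def pack_row_int8(row):
    """Symmetric int8 quantization of one weight row (assumed scheme)."""
    max_abs = max((abs(x) for x in row), default=0.0)
    # Use a scale of 1.0 for an all-zero row to avoid dividing by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(x / scale))) for x in row]
    return array("b", quantized), scale


def pack_matrix(rows):
    """Pack every row; returns flat int8 values and one float scale per row."""
    packed = array("b")
    scales = array("f")
    for row in rows:
        q, scale = pack_row_int8(row)
        packed.extend(q)
        scales.append(scale)
    return packed, scales


def write_bin(path, values):
    """Write the packed values as raw bytes (hypothetical file layout)."""
    with open(path, "wb") as f:
        values.tofile(f)


if __name__ == "__main__":
    weights = [[1.0, -1.0, 0.0], [0.5, 0.25, -0.5]]
    packed, scales = pack_matrix(weights)
    write_bin("example_weight.int8.bin", packed)  # hypothetical file name
    print(len(packed), len(scales))
```

A real converter would also need to read the tensors from the checkpoint and follow whatever tensor names and memory layout the inference binary requires.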