
onnxruntime version #127

Open · waquey opened this issue Oct 25, 2024 · 3 comments

@waquey commented Oct 25, 2024

Dear Authors,
Thanks for the great work.
After installing "ryzen-ai-1.2.0-20240726.msi", I can run on the NPU on the target platform.
However, there are some questions I would like to verify.

(1) What is the version of the Vitis onnxruntime.dll provided in the MSI package?

(2) I downloaded the official onnxruntime release 1.8.0 and ran it with CPUExecutionProvider.
It is faster than the DLL from (1) in CPU mode (without launching the NPU). Is that expected?

(3) I tried one model with the "Exp" operator. However, after quantizing and running inference, that operator does not run in NPU mode. Is that expected?

Thanks

@uday610 (Collaborator) commented Oct 27, 2024

Hi,

After activating the conda environment created by the MSI installer (conda activate ryzen-ai-1.2.0), you can check the onnxruntime version with pip list. It should be 1.17.0.
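
For a quick check from Python instead, a minimal sketch (the expected values are taken from the reply above):

```python
# Run inside the conda environment created by the MSI installer:
#   conda activate ryzen-ai-1.2.0
import onnxruntime as ort

print(ort.__version__)                # expected: 1.17.0 for ryzen-ai-1.2.0
print(ort.get_available_providers())  # the Ryzen AI build should include
                                      # 'VitisAIExecutionProvider'
```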

The 1.8.0 release you tried is very old, so it is possible that it is slow.

I am not sure what you mean by having tried one model with the Exp operator. What type of model is this? We cannot run a single operator on the NPU; Ryzen AI supports complete CNN models.

Thanks
Uday

@waquey (Author) commented Oct 28, 2024

Thanks for replying!

It is a transformer-based (ViT) ONNX model. After quantization, it appears the "Erf" operator is not supported on the NPU, so I tried another implementation that uses the "Exp" operator to imitate the GELU activation (see the sketch below). The compiled model shows "Exp" running on the CPU while "Conv/Div/Mul..." run on the NPU.

Is there a recommended way to quantize transformer-based models for DPU mode? I currently use "RyzenAI_quant_tutorial/onnx_example" with modified calibration images to do the quantization. Thanks
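
For reference, a minimal sketch of the kind of substitution described above, assuming PyTorch as the export frontend: the sigmoid approximation GELU(x) ≈ x · sigmoid(1.702x), written with an explicit exp() so the ONNX export emits Exp/Add/Div/Mul rather than Erf. The module name is illustrative, not the exact code used.

```python
import torch
import torch.nn as nn

class ExpGELU(nn.Module):
    """Approximate GELU(x) ~= x * sigmoid(1.702 * x), written with an
    explicit exp() so the exported ONNX graph contains Exp/Add/Div/Mul
    instead of the Erf operator."""
    def forward(self, x):
        return x / (1.0 + torch.exp(-1.702 * x))
```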

@uday610 (Collaborator) commented Oct 29, 2024

With the current software, some operators can run on the CPU depending on the model.

We expect more offloading to the NPU in future software versions.
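
For anyone hitting the same question: one way to see which nodes are assigned to the CPU versus the NPU is to enable verbose session logging when creating the session. A minimal sketch, assuming the Ryzen AI onnxruntime build; the model path is hypothetical, and the config_file provider option refers to the vaip_config.json shipped with the installer (check the docs of your installed version for the exact option names):

```python
import onnxruntime as ort

so = ort.SessionOptions()
so.log_severity_level = 0  # VERBOSE: ORT then logs per-node provider placement

session = ort.InferenceSession(
    "model_quantized.onnx",  # hypothetical path to the quantized model
    sess_options=so,
    providers=["VitisAIExecutionProvider", "CPUExecutionProvider"],
    provider_options=[{"config_file": "vaip_config.json"}, {}],
)
```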
