Runpod template is broken #17
Comments
For anyone looking for a temporary fix: run pip install --upgrade exllamav2, then find the PID of the process 'python3 server.py --listen --extensions openai' (e.g. with ps aux) and kill that PID so the server restarts with the upgraded package.
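A minimal shell sketch of that workaround (pkill -f is swapped in here for the manual PID lookup and kill; it assumes nothing else on the pod matches the pattern, and that the template's supervisor restarts the server after it exits):

```bash
# Upgrade the package that broke, then restart the server so it
# picks up the new version.
pip install --upgrade exllamav2
# pkill -f matches against the full command line of each process.
pkill -f "python3 server.py --listen --extensions openai"
```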
Thanks @WouterGlorieux for the workaround. I checked before running the upgrade: the exllamav2 version in use was v0.11. I've made a PR to ensure the same version of exllamav2 required by text-generation-webui is installed. Hopefully that resolves this issue.
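A sketch of what that pinning approach looks like in practice (the requirements path is assumed from the tracebacks below; text-generation-webui pins its exllamav2 version in its requirements file):

```bash
# Install exactly the versions text-generation-webui pins, instead of
# whatever `pip install --upgrade` resolves to today.
pip install -r /workspace/text-generation-webui/requirements.txt
```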
Did you type "exellamav2" instead of "exllamav2"?
For anyone having issues with just pip install --upgrade exllamav2, try pip install --upgrade --no-deps exllamav2 instead. It seems like the plain upgrade command is now attempting to update a lot of dependencies and breaking things. For now the --no-deps command works, but eventually it probably won't; hopefully by then it's solved. Edit: do keep in mind I only learned how to use pip and Docker the other day to try and solve this, so if it does break again I'll try to fix it, but no guarantees.
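To see why the plain upgrade breaks, you can preview what pip would pull in before committing to it (a sketch; the --dry-run flag needs pip 22.2 or newer):

```bash
# Preview the full dependency resolution without installing anything:
pip install --dry-run --upgrade exllamav2
# Upgrade only exllamav2 itself; torch and the other compiled
# dependencies stay at the versions the image shipped with:
pip install --no-deps --upgrade exllamav2
```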
Looks like a recent update to exllamav2 now triggers an update to PyTorch, which causes everything to explode with an ABI mismatch:

ImportError: /usr/local/lib/python3.10/dist-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops9_pad_enum4callERKNS_6TensorEN3c108ArrayRefINS5_6SymIntEEElNS5_8optionalIdEE

I'm not going to try to figure that out, but for anyone else still struggling along with RunPod, you can work around this by specifying the previous version instead of saying --upgrade: pip install exllamav2==0.0.15 (Edit: looks like this is the same thing invictus mentioned above. Use whichever works; I'm guessing sticking with the older version may work for longer, since eventually exllamav2 may start actually requiring the newer one.)
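If you want to confirm you're hitting this particular mismatch, a quick check of the installed versions helps (a sketch; the distribution names torch, flash_attn, and exllamav2 are assumed to be what the image uses):

```python
# Print installed versions of the packages involved in the ABI clash.
# flash-attn's compiled extension must match the torch it was built
# against, so a torch bump after the image was built is the usual culprit.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("torch", "flash_attn", "exllamav2"):
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```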
Hi, thanks invictus, I tried to fix it as well in the last few days but I don't really know what I'm doing. I can't find your template on RunPod, is it up already?
Yeah, it should be up. I don't know if I have to contact RunPod or something first, because I just clicked the public toggle for my template and thought that would be enough. I contacted them though, so I guess we'll see what they say.
I made a new template on RunPod that should be working now; it is called text-generation-webui-oneclick-UI-and-API: https://runpod.io/console/gpu-cloud?template=00y0qvimn6&ref=2vdt3dn9 Seeing as TheBloke has not posted any models for months, it is likely that this repo will not get updated anytime soon, if at all, so I forked this repo and will try to maintain it further here:
Thx a lot WouterGlorieux. |
Thanks WouterGlorieux, I used your template and it worked great! So you don't have to bother with RunPod, Invictus, but thanks anyway.
@WouterGlorieux So we delisted the official bloke template and added yours to the Discovery page :)
Awesome, thanks! |
Can't load any models anymore. Here is the traceback:
Traceback (most recent call last):
  File "/workspace/text-generation-webui/modules/ui_model_menu.py", line 243, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
  File "/workspace/text-generation-webui/modules/models.py", line 87, in load_model
    output = load_func_map[loader](model_name)
  File "/workspace/text-generation-webui/modules/models.py", line 378, in ExLlamav2_HF_loader
    from modules.exllamav2_hf import Exllamav2HF
  File "/workspace/text-generation-webui/modules/exllamav2_hf.py", line 7, in <module>
    from exllamav2 import (
ImportError: cannot import name 'ExLlamaV2Cache_Q4' from 'exllamav2' (/usr/local/lib/python3.10/dist-packages/exllamav2/__init__.py)
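For context on this last error: ExLlamaV2Cache_Q4 only exists in newer exllamav2 releases, so an older pinned install fails on this import. A version-tolerant import along these lines (a hypothetical sketch, not the webui's actual code) would fall back to the regular FP16 cache:

```python
# Hypothetical sketch: prefer the Q4 (4-bit quantized) cache when the
# installed exllamav2 provides it, otherwise fall back to the FP16 cache.
try:
    from exllamav2 import ExLlamaV2Cache_Q4 as CacheClass
except ImportError:
    from exllamav2 import ExLlamaV2Cache as CacheClass

print("Using cache class:", CacheClass.__name__)
```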