Runpod template is broken #17
Comments
For anyone looking for a temporary fix: run pip install --upgrade exllamav2, then find the PID of the process 'python3 server.py --listen --extensions openai' (e.g. with ps aux) and kill that PID so the server restarts with the upgraded package.
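A minimal shell sketch of that workaround (pkill -f is swapped in here for the manual PID lookup and kill; it assumes nothing else on the pod matches the pattern, and that the template's supervisor restarts the server after it exits):

```bash
# Upgrade the package that broke, then restart the server so it
# picks up the new version.
pip install --upgrade exllamav2
# pkill -f matches against the full command line of each process.
pkill -f "python3 server.py --listen --extensions openai"
```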
Thanks @WouterGlorieux for the workaround. I checked before running the upgrade: the exllamav2 version in use was v0.11. I've made a PR to ensure the same version of exllamav2 required by text-generation-webui is installed. Hopefully that resolves this issue.
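A sketch of what that pinning approach looks like in practice (the requirements path is assumed from the tracebacks below; text-generation-webui pins its exllamav2 version in its requirements file):

```bash
# Install exactly the versions text-generation-webui pins, instead of
# whatever `pip install --upgrade` resolves to today.
pip install -r /workspace/text-generation-webui/requirements.txt
```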
Did you type "exellamav2" instead of "exllamav2"?
For anyone having issues with just pip install --upgrade exllamav2, try pip install --upgrade --no-deps exllamav2 instead. It seems like the plain upgrade command is now attempting to update a lot of dependencies and breaking things. For now the --no-deps command works, but eventually it probably won't; hopefully by then it's solved. Edit: do keep in mind I only learned how to use pip and Docker the other day to try and solve this, so if it does break again I'll try to fix it, but no guarantees.
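To see why the plain upgrade breaks, you can preview what pip would pull in before committing to it (a sketch; the --dry-run flag needs pip 22.2 or newer):

```bash
# Preview the full dependency resolution without installing anything:
pip install --dry-run --upgrade exllamav2
# Upgrade only exllamav2 itself; torch and the other compiled
# dependencies stay at the versions the image shipped with:
pip install --no-deps --upgrade exllamav2
```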
Looks like a recent update to exllamav2 now triggers an update to PyTorch, which causes everything to explode with an ABI mismatch:

ImportError: /usr/local/lib/python3.10/dist-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops9_pad_enum4callERKNS_6TensorEN3c108ArrayRefINS5_6SymIntEEElNS5_8optionalIdEE

I'm not going to try to figure that out, but for anyone else still struggling along with RunPod, you can work around this by specifying the previous version instead of saying --upgrade: pip install exllamav2==0.0.15 (Edit: looks like this is the same thing invictus mentioned above. Use whichever works; I'm guessing sticking with the older version may work for longer, since eventually exllamav2 may start actually requiring the newer one.)
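If you want to confirm you're hitting this particular mismatch, a quick check of the installed versions helps (a sketch; the distribution names torch, flash_attn, and exllamav2 are assumed to be what the image uses):

```python
# Print installed versions of the packages involved in the ABI clash.
# flash-attn's compiled extension must match the torch it was built
# against, so a torch bump after the image was built is the usual culprit.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("torch", "flash_attn", "exllamav2"):
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```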
Hi, thanks invictus, I tried to fix it as well in the last few days but I don't really know what I'm doing. I can't find your template on RunPod, is it up already?
Yeah, it should be up. I don't know if I have to contact RunPod or something first, because I just clicked the public toggle for my template and thought that would be enough. I contacted them though, so I guess we'll see what they say.
I made a new template on RunPod that should be working now; it is called text-generation-webui-oneclick-UI-and-API: https://runpod.io/console/gpu-cloud?template=00y0qvimn6&ref=2vdt3dn9 Seeing as TheBloke has not posted any models for months, it is likely that this repo will not get updated anytime soon, if at all, so I forked this repo and will try to maintain it further here:
Thx a lot WouterGlorieux. |
Thanks WouterGlorieux, I used your template and it worked great! So you don't have to bother with RunPod, Invictus, but thanks anyway.
@WouterGlorieux So we delisted the official bloke template and added yours to the Discovery page :)
Awesome, thanks! |
Can't load any models anymore. Here is the traceback:
Traceback (most recent call last):
  File "/workspace/text-generation-webui/modules/ui_model_menu.py", line 243, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
  File "/workspace/text-generation-webui/modules/models.py", line 87, in load_model
    output = load_func_map[loader](model_name)
  File "/workspace/text-generation-webui/modules/models.py", line 378, in ExLlamav2_HF_loader
    from modules.exllamav2_hf import Exllamav2HF
  File "/workspace/text-generation-webui/modules/exllamav2_hf.py", line 7, in <module>
    from exllamav2 import (
ImportError: cannot import name 'ExLlamaV2Cache_Q4' from 'exllamav2' (/usr/local/lib/python3.10/dist-packages/exllamav2/__init__.py)
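For context on this last error: ExLlamaV2Cache_Q4 only exists in newer exllamav2 releases, so an older pinned install fails on this import. A version-tolerant import along these lines (a hypothetical sketch, not the webui's actual code) would fall back to the regular FP16 cache:

```python
# Hypothetical sketch: prefer the Q4 (4-bit quantized) cache when the
# installed exllamav2 provides it, otherwise fall back to the FP16 cache.
try:
    from exllamav2 import ExLlamaV2Cache_Q4 as CacheClass
except ImportError:
    from exllamav2 import ExLlamaV2Cache as CacheClass

print("Using cache class:", CacheClass.__name__)
```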