Pull requests: HabanaAI/vllm-fork

- Update requirements-hpu.txt (#945, opened Mar 21, 2025 by michalkuligowski)
- Update hpu_worker.py (#943, opened Mar 21, 2025 by michalkuligowski)
- add ScaleToHwAligned for loading fp8 vllm model (#941, opened Mar 21, 2025 by changwangss)
- Enable Delayed Sampling by default (#937, opened Mar 20, 2025 by mswiniarsk)
- Add VLLM_T_COMPILE_FULLGRAPH flag (#932, opened Mar 19, 2025 by anko-intel); sketch below
- multi-image support for llama3.2 [1/N] (#926, opened Mar 18, 2025 by zhouyu5)
- Enable embedding online serving benchmark test (#922, opened Mar 17, 2025 by yeonsily)
- Make lazy mode autodetection more robust (#921, opened Mar 17, 2025 by kzawora-intel); sketch below
- Enable split qkv for LLama and GPTBigCode (#914, opened Mar 14, 2025 by kdamaszk); sketch below
- Fix spec decoding warmup (#906, opened Mar 11, 2025 by yangw1234)
- Bump jinja2 from 3.1.4 to 3.1.6 (#891, opened Mar 6, 2025 by dependabot[bot]; labels: dependencies, python)
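
PR #932 adds a VLLM_T_COMPILE_FULLGRAPH flag. The diff is not shown here, but a flag with this name most plausibly gates torch.compile's fullgraph argument via an environment variable; the sketch below shows that generic pattern, with the helper name, default value, and call site all assumed for illustration.

```python
import os

import torch
import torch.nn as nn


def maybe_compile(module: nn.Module) -> nn.Module:
    """Hypothetical helper: compile a module, optionally forcing one graph.

    Only the flag name VLLM_T_COMPILE_FULLGRAPH comes from PR #932; the
    default ("0") and this wrapper are assumptions for illustration.
    """
    fullgraph = os.environ.get("VLLM_T_COMPILE_FULLGRAPH", "0") == "1"
    # With fullgraph=True, torch.compile raises on graph breaks instead
    # of silently falling back to eager for the offending region.
    return torch.compile(module, fullgraph=fullgraph)
```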
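
PR #921 makes lazy mode autodetection more robust. On Gaudi, the PyTorch HPU bridge's execution mode is commonly selected through the PT_HPU_LAZY_MODE environment variable; the sketch below is a deliberately naive env-only check of the kind such a fix would presumably harden, with the helper name and default assumed.

```python
import os


def is_lazy_mode() -> bool:
    """Hypothetical, naive check: trust PT_HPU_LAZY_MODE alone.

    Assumes lazy mode is the default when the variable is unset and
    ignores what the loaded HPU backend actually reports, which is
    exactly the fragility an autodetection fix would target.
    """
    return os.environ.get("PT_HPU_LAZY_MODE", "1") == "1"
```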
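
PR #914 enables split QKV for LLama and GPTBigCode. In general, splitting a fused QKV projection replaces one wide linear layer with separate q/k/v projections; the sketch below shows that generic transformation on a plain nn.Linear and is not the fork's implementation (the row layout and bias handling are assumptions).

```python
import torch
import torch.nn as nn


def split_qkv(qkv: nn.Linear, num_q_heads: int, num_kv_heads: int,
              head_dim: int) -> tuple[nn.Linear, nn.Linear, nn.Linear]:
    """Split a fused, bias-free QKV projection into three linears.

    Illustrative only: assumes the fused weight rows are laid out as
    [Q; K; V], one common convention, and that there is no bias term.
    """
    q_size = num_q_heads * head_dim
    kv_size = num_kv_heads * head_dim
    assert q_size + 2 * kv_size == qkv.out_features
    q_w, k_w, v_w = qkv.weight.split([q_size, kv_size, kv_size], dim=0)

    def as_linear(weight: torch.Tensor) -> nn.Linear:
        proj = nn.Linear(qkv.in_features, weight.shape[0], bias=False)
        proj.weight = nn.Parameter(weight.clone())
        return proj

    return as_linear(q_w), as_linear(k_w), as_linear(v_w)
```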