Pull requests: HabanaAI/vllm-fork

- Update requirements-hpu.txt (#945, opened Mar 21, 2025 by michalkuligowski)
- Update hpu_worker.py (#943, opened Mar 21, 2025 by michalkuligowski)
- add ScaleToHwAligned for loading fp8 vllm model (#941, opened Mar 21, 2025 by changwangss)
- Enable Delayed Sampling by default (#937, opened Mar 20, 2025 by mswiniarsk)
- Add VLLM_T_COMPILE_FULLGRAPH flag (#932, opened Mar 19, 2025 by anko-intel); sketch below
- multi-image support for llama3.2 [1/N] (#926, opened Mar 18, 2025 by zhouyu5)
- Enable embedding online serving benchmark test (#922, opened Mar 17, 2025 by yeonsily)
- Make lazy mode autodetection more robust (#921, opened Mar 17, 2025 by kzawora-intel); sketch below
- Enable split qkv for LLama and GPTBigCode (#914, opened Mar 14, 2025 by kdamaszk); sketch below
- Fix spec decoding warmup (#906, opened Mar 11, 2025 by yangw1234)
- Bump jinja2 from 3.1.4 to 3.1.6 (#891, opened Mar 6, 2025 by dependabot[bot]; labels: dependencies, python)
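
PR #932 adds a VLLM_T_COMPILE_FULLGRAPH flag. The diff is not shown here, but a flag with this name most plausibly gates torch.compile's fullgraph argument via an environment variable; the sketch below shows that generic pattern, with the helper name, default value, and call site all assumed for illustration.

```python
import os

import torch
import torch.nn as nn


def maybe_compile(module: nn.Module) -> nn.Module:
    """Hypothetical helper: compile a module, optionally forcing one graph.

    Only the flag name VLLM_T_COMPILE_FULLGRAPH comes from PR #932; the
    default ("0") and this wrapper are assumptions for illustration.
    """
    fullgraph = os.environ.get("VLLM_T_COMPILE_FULLGRAPH", "0") == "1"
    # With fullgraph=True, torch.compile raises on graph breaks instead
    # of silently falling back to eager for the offending region.
    return torch.compile(module, fullgraph=fullgraph)
```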
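
PR #921 makes lazy mode autodetection more robust. On Gaudi, the PyTorch HPU bridge's execution mode is commonly selected through the PT_HPU_LAZY_MODE environment variable; the sketch below is a deliberately naive env-only check of the kind such a fix would presumably harden, with the helper name and default assumed.

```python
import os


def is_lazy_mode() -> bool:
    """Hypothetical, naive check: trust PT_HPU_LAZY_MODE alone.

    Assumes lazy mode is the default when the variable is unset and
    ignores what the loaded HPU backend actually reports, which is
    exactly the fragility an autodetection fix would target.
    """
    return os.environ.get("PT_HPU_LAZY_MODE", "1") == "1"
```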
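
PR #914 enables split QKV for LLama and GPTBigCode. In general, splitting a fused QKV projection replaces one wide linear layer with separate q/k/v projections; the sketch below shows that generic transformation on a plain nn.Linear and is not the fork's implementation (the row layout and bias handling are assumptions).

```python
import torch
import torch.nn as nn


def split_qkv(qkv: nn.Linear, num_q_heads: int, num_kv_heads: int,
              head_dim: int) -> tuple[nn.Linear, nn.Linear, nn.Linear]:
    """Split a fused, bias-free QKV projection into three linears.

    Illustrative only: assumes the fused weight rows are laid out as
    [Q; K; V], one common convention, and that there is no bias term.
    """
    q_size = num_q_heads * head_dim
    kv_size = num_kv_heads * head_dim
    assert q_size + 2 * kv_size == qkv.out_features
    q_w, k_w, v_w = qkv.weight.split([q_size, kv_size, kv_size], dim=0)

    def as_linear(weight: torch.Tensor) -> nn.Linear:
        proj = nn.Linear(qkv.in_features, weight.shape[0], bias=False)
        proj.weight = nn.Parameter(weight.clone())
        return proj

    return as_linear(q_w), as_linear(k_w), as_linear(v_w)
```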