The stack deployment of vLLM
What's Changed
- [CI/Build][Router] Make semantic caching optional by @Shaoting-Feng in #218
- [Benchmark] Add router config in tutorial by @Shaoting-Feng in #223
- refactor: standard fastapi project structure for better main… by @BrianPark314 in #217
- Added lora support proposal by @wangchen615 in #216
- [Feat] Added `initContainer` to `modelSpec` by @AbelHristodor in #221
- [Router] Fix semantic cache check in chat completion url by @Shaoting-Feng in #224
- [Doc] Change repo in tutorial 08 naive k8s by @Shaoting-Feng in #225
- [Doc] Update community meeting calendar invite by @YuhanLiu11 in #231
- [Doc] Fix `startupProbe` indentation in `values-07` tutorial file by @AbelHristodor in #226
- [Doc] Initial docs structure by @Siddhant-Ray in #234
- [Doc] Update endpoint in 01 tutorial by @Shaoting-Feng in #236
- [Doc] add example page and readme by @Siddhant-Ray in #241
- [Doc] Fix typo of model name and output len in AIBrix by @Shaoting-Feng in #242
- [Doc] Add doc page for benchmark qa by @Siddhant-Ray in #243
- [Doc] add doc on gcp.rst by @EaminC in #249
- [Feat] add vllm-api-key by @JustinDuy in #194
- [CI/Build] Add concurrency to functionality test by @Shaoting-Feng in #219
- [Doc] update tutorial and user manual docs by @Siddhant-Ray in #257
- [Doc] Add docs for router CRD config and dev, some small tweaks by @Siddhant-Ray in #259
- [FEAT] Terraform Quickstart Tutorials for Google GKE by @falconlee236 in #250
- [Feat] add requestGPUType to modelSpec by @Hexoplon in #253
- [Doc][CI/Build] Minor fix by @Shaoting-Feng in #258
- [Doc] dev api docs, bug fixes by @Siddhant-Ray in #266
- [Feat] add explicit resource limit values by @Hexoplon in #255
- [DOC] format unified gcp.rst adding trouble shooting by @EaminC in #263
- [Doc] Minor fix in tutorial by @YuhanLiu11 in #272
- [Doc] Minor fix in benchmarking scripts by @YuhanLiu11 in #273
- [Tutorial] Deployment on Azure AKS by @surajssd in #247
- [Feat] add model label on engine deployments by @Hexoplon in #269
- [Misc] Add `schedulerName` in servingEngineSpec by @hongkunyoo in #275
- [Feat] Remove sudo requirement for kubectl and helm by @Romero027 in #256
- [Benchmark] Minor fix in benchmark script by @YuhanLiu11 in #284
- [Benchmark] Minor updates to benchmark script by @YuhanLiu11 in #286
- [Doc] Minor fix in tutorials by @YuhanLiu11 in #288
- [Feat] add extraVolume and extraVolumeMount helm variables by @Hexoplon in #280
- Update 09-lora-enabled-installation.md by @wangchen615 in #287
- chore: use extra deps to optionally install additional pkg by @rootfs in #289
- [Feat] Request rewriter interface in router by @ApostaC in #230
- [Feat] add security context to servingEngineSpec by @Hexoplon in #282
- [Doc] add docs link to readme by @Siddhant-Ray in #290
- chore: update e2e test to use python 3.12 to match setup.py requirements by @rootfs in #295
- [CI/Build] Github action for building docs pipeline by @Siddhant-Ray in #291
- Add `.readthedocs.yaml` by @hmellor in #296
- Hotfix readthedocs build by @hmellor in #298
- Update docs link in README by @hmellor in #299
- feat: support PII detection in http request by @rootfs in #235
- [Bugfix]: add missing v1 prefix by @Xunzhuo in #302
- [Misc] Bumping version to 0.1.1 by @YuhanLiu11 in #308
New Contributors
- @AbelHristodor made their first contribution in #221
- @JustinDuy made their first contribution in #194
- @falconlee236 made their first contribution in #250
- @Hexoplon made their first contribution in #253
- @surajssd made their first contribution in #247
- @hongkunyoo made their first contribution in #275
- @Romero027 made their first contribution in #256
- @Xunzhuo made their first contribution in #302
Full Changelog: vllm-stack-0.1.0...vllm-stack-0.1.1