Commit 830cb31

Update 2025-03-06-benchmark.md
1 parent 641199d commit 830cb31

File tree

1 file changed (+1, −1)


_posts/2025-03-06-benchmark.md

@@ -26,7 +26,7 @@ image: /assets/img/benchmark_e2e_brix.png
 <!-- - Recently, **AIBrix**, released by **ByteDance**, boasts various necessary features built out for production settings. -->

 - Since real-world performance numbers are not public (which will be a future blog!), today we released a **benchmark** that everyone can use to test vLLM Production Stack. In particular, it shows that vLLM Production Stack is **10X** faster and more cost-efficient in prefill-heavy workloads than the baseline vLLM deployment method as well as AIBrix, another full-stack system recently released by **ByteDance**. <!-- Moreover, we show that **AIBrix** performs even **worse** than a **naive vLLM** + K8s setup. -->
-We made public both the benchmark [*scripts*](https://github.com/vllm-project/production-stack/blob/main/tutorials/08-benchmark-multi-round-qa-multi-gpu.md) and [*tutorial*](https://github.com/vllm-project/production-stack/blob/main/tutorials/08-benchmark-multi-round-qa-multi-gpu.md).
+We made public both the benchmark [*scripts*](https://github.com/vllm-project/production-stack/tree/main/benchmarks/multi-round-qa) and [*tutorial*](https://github.com/vllm-project/production-stack/blob/main/tutorials/08-benchmark-multi-round-qa-multi-gpu.md).
 <!-- - In order to make it easy for everyone to reproduce the results and test with more benchmarks, we release our [scripts](https://github.com/vllm-project/production-stack/blob/main/tutorials/07-benchmark-multi-round-qa-multi-gpu.md) and [tutorial](https://github.com/vllm-project/production-stack/blob/main/tutorials/07-benchmark-multi-round-qa-multi-gpu.md) to further facilitate the development of open-source LLM serving solutions. -->

 ##### [[vLLM Production Stack Github]](https://github.com/vllm-project/production-stack) | [[Get In Touch]](https://forms.gle/Jaq2UUFjgvuedRPV8) | [[Slack]](https://vllm-dev.slack.com/archives/C089SMEAKRA) | [[Linkedin]](https://www.linkedin.com/company/lmcache-lab) | [[Twitter]](https://x.com/lmcache)
