Update 2025-03-06-benchmark.md

Shaoting-Feng · web-flow · commit 830cb311cd33 · 2025-03-06T20:13:01.000-06:00
diff --git a/_posts/2025-03-06-benchmark.md b/_posts/2025-03-06-benchmark.md
@@ -26,7 +26,7 @@ image: /assets/img/benchmark_e2e_brix.png
 <!-- - Recently, **AIBrix**, released by **ByteDance**, boasts various necessary features built out for production settings. -->
 
 - Since real-world performance numbers are not public (those will come in a future blog!), today we released a **benchmark** that everyone can use to test vLLM Production Stack. In particular, it shows that, in prefill-heavy workloads, vLLM Production Stack is **10X** faster and more cost-efficient than both the baseline vLLM deployment method and AIBrix, another full-stack system recently released by **ByteDance**. <!-- Moreover, we show that **AIBrix** perform even **worse** than a **naive vLLM** + K8s setup.-->
-We made public both the benchmark [*scripts*](https://github.com/vllm-project/production-stack/blob/main/tutorials/08-benchmark-multi-round-qa-multi-gpu.md) and [*tutorial*](https://github.com/vllm-project/production-stack/blob/main/tutorials/08-benchmark-multi-round-qa-multi-gpu.md). 
+We made public both the benchmark [*scripts*](https://github.com/vllm-project/production-stack/tree/main/benchmarks/multi-round-qa) and [*tutorial*](https://github.com/vllm-project/production-stack/blob/main/tutorials/08-benchmark-multi-round-qa-multi-gpu.md). 
 <!-- - In order to make it easy for everyone to reproduce the results and test with more benchmarks, we relase our [scripts](https://github.com/vllm-project/production-stack/blob/main/tutorials/07-benchmark-multi-round-qa-multi-gpu.md) and [tutorial](https://github.com/vllm-project/production-stack/blob/main/tutorials/07-benchmark-multi-round-qa-multi-gpu.md) to further facilitate the development of open-source LLM serving solutions.-->
 
 ##### [[vLLM Production Stack Github]](https://github.com/vllm-project/production-stack) | [[Get In Touch]](https://forms.gle/Jaq2UUFjgvuedRPV8) | [[Slack]](https://vllm-dev.slack.com/archives/C089SMEAKRA) | [[Linkedin]](https://www.linkedin.com/company/lmcache-lab) | [[Twitter]](https://x.com/lmcache)