
Commit 2d7f9ae

docs: fix links in docs (ai-dynamo#256)

Co-authored-by: Anant Sharma <[email protected]>

Parent: 27abe13

10 files changed: +14 -14 lines

README.md (+1 -1)

@@ -41,7 +41,7 @@ The following examples require a few system level packages.
 apt-get update
 DEBIAN_FRONTEND=noninteractive apt-get install -yq python3-dev libucx0
-pip install ai-dynamo nixl vllm==0.7.2+dynamo
+pip install ai-dynamo[all]
 ```

 > [!NOTE]
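To sanity-check the consolidated install command above, the standard pip introspection commands will show the meta-package and everything the `[all]` extra pulled in (purely illustrative; exact package contents vary by release):

```bash
# Confirm the ai-dynamo meta-package is installed.
pip show ai-dynamo

# List the dynamo-related packages the [all] extra brought along.
pip list | grep -i dynamo
```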

components/metrics/README.md (+2 -2)

@@ -65,7 +65,7 @@ metrics --component my_component --endpoint my_endpoint
 ### Real Worker

 To run a more realistic deployment to gather metrics from,
-see the examples in [deploy/examples/llm](deploy/examples/llm).
+see the examples in [examples/llm](../../examples/llm).

 For example, for a VLLM + KV Routing based deployment that
 exposes statistics on an endpoint labeled
@@ -88,7 +88,7 @@ endpoint name used for python-based workers that register a `KvMetricsPublisher`

 To visualize the metrics being exposed on the Prometheus endpoint,
 see the Prometheus and Grafana configurations in
-[deploy/metrics](deploy/metrics):
+[deploy/metrics](../../deploy/metrics):
 ```bash
 docker compose -f deploy/docker-compose.yml --profile metrics up -d
 ```
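Once the metrics profile is up, a quick reachability check against Prometheus confirms the stack is healthy. The port below is an assumption (Prometheus's default), so substitute whatever `deploy/docker-compose.yml` actually maps:

```bash
# Prometheus ships a built-in health route; 9090 is its default port
# and an assumption about this compose file's mapping.
curl -s http://localhost:9090/-/healthy
```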

deploy/dynamo/sdk/docs/sdk/README.md (+1 -1)

@@ -11,7 +11,7 @@

 # Introduction

-Dynamo is a flexible and performant distributed inferencing solution for large-scale deployments. It is an ecosystem of tools, frameworks, and abstractions that makes the design, customization, and deployment of frontier-level models onto datacenter-scale infrastructure easy to reason about and optimized for your specific inferencing workloads. Dynamo's core is written in Rust and contains a set of well-defined Python bindings. Docs and examples for those can be found [here](../../../../README.md).
+Dynamo is a flexible and performant distributed inferencing solution for large-scale deployments. It is an ecosystem of tools, frameworks, and abstractions that makes the design, customization, and deployment of frontier-level models onto datacenter-scale infrastructure easy to reason about and optimized for your specific inferencing workloads. Dynamo's core is written in Rust and contains a set of well-defined Python bindings. Docs and examples for those can be found [here](../../../../../README.md).

 Dynamo SDK is a layer on top of the core. It is a Python framework that makes it easy to create inference graphs and deploy them locally and onto a target K8s cluster. The SDK was heavily inspired by [BentoML's](https://github.com/bentoml/BentoML) open source deployment patterns and leverages many of its core primitives. The Dynamo CLI is a companion tool that allows you to spin up an inference pipeline locally, containerize it, and deploy it. You can find a toy hello-world example [here](../../README.md).

docs/guides/README.md (+1 -1)

@@ -64,7 +64,7 @@ Distributed deployment where prefill and decode are done by separate workers that

 ### Prerequisites

-Start required services (etcd and NATS) using [Docker Compose](/deploy/docker-compose.yml)
+Start required services (etcd and NATS) using [Docker Compose](../../deploy/docker-compose.yml)
 ```bash
 docker compose -f deploy/docker-compose.yml up -d
 ```
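Before moving past the prerequisites, `docker compose ps` (a standard Compose subcommand) will confirm both services came up; the exact service names come from the compose file itself:

```bash
# List the compose services and their state; the etcd and NATS
# containers should both report running/healthy.
docker compose -f deploy/docker-compose.yml ps
```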

docs/kv_cache_manager.md (+1 -1)

@@ -12,7 +12,7 @@ The Dynamo KV Cache Manager feature addresses this challenge by enabling the off
 The Dynamo KV Cache Manager uses advanced caching policies that prioritize placing frequently accessed data in GPU memory, while less accessed data is moved to shared CPU memory, SSDs, or networked object storage. It incorporates eviction policies that strike a balance between over-caching (which can introduce lookup latencies) and under-caching (which leads to missed lookups and KV cache re-computation).
 Additionally, this feature can manage KV cache across multiple GPU nodes, supporting both distributed and disaggregated inference serving, and offers hierarchical caching capabilities, creating offloading strategies at the GPU, node, and cluster levels.

-The Dynamo KV Cache Manager is designed to be framework-agnostic to support various backends, including TensorRT-LLM, vLLM, and SGLang, and to facilitate the scaling of KV cache storage across large, distributed clusters using NVLink, NVIDIA Quantum switches, and NVIDIA Spectrum switches. It integrates with [NIXL](https://github.com/ai-dynamo/nixl/blob/omrik/documentation/docs/nixl.md) to enable data transfers across different worker instances and storage backends.
+The Dynamo KV Cache Manager is designed to be framework-agnostic to support various backends, including TensorRT-LLM, vLLM, and SGLang, and to facilitate the scaling of KV cache storage across large, distributed clusters using NVLink, NVIDIA Quantum switches, and NVIDIA Spectrum switches. It integrates with [NIXL](https://github.com/ai-dynamo/nixl/blob/main/docs/nixl.md) to enable data transfers across different worker instances and storage backends.

 ## Design
examples/llm/README.md (+1 -1)

@@ -64,7 +64,7 @@ sequenceDiagram

 ### Prerequisites

-Start required services (etcd and NATS) using [Docker Compose](/deploy/docker-compose.yml)
+Start required services (etcd and NATS) using [Docker Compose](../../deploy/docker-compose.yml)
 ```bash
 docker compose -f deploy/docker-compose.yml up -d
 ```

launch/README.md (+1 -1)

@@ -77,7 +77,7 @@ E.g. https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/blob/main/Llama

 Download model file:
 ```
-curl -L -o Llama-3.2-3B-Instruct-Q4_K_M.gguf "https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/blob/main/Llama-3.2-3B-Instruct-Q4_K_M.gguf?download=true"
+curl -L -o Llama-3.2-3B-Instruct-Q4_K_M.gguf "https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/resolve/main/Llama-3.2-3B-Instruct-Q4_K_M.gguf?download=true"
 ```

 ## Run a model from local file
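The `blob/` form of the URL serves Hugging Face's HTML viewer page, while `resolve/` serves the raw file, which is what this hunk corrects. Checking the first four bytes of the download is a quick way to confirm you got the model and not a web page, since every GGUF file opens with the ASCII magic `GGUF`:

```bash
# A real GGUF download prints "GGUF"; an accidentally saved
# HTML page would start with "<!DOCTYPE" or "<html" instead.
head -c 4 Llama-3.2-3B-Instruct-Q4_K_M.gguf; echo
```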

lib/bindings/python/README.md (+1 -1)

@@ -50,7 +50,7 @@ maturin develop --uv

 ## Pre-requisite

-See [README.md](/lib/runtime/README.md).
+See [README.md](../../runtime/README.md#️-prerequisites).

 ## Hello World Example

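The hunk header shows this change sits just below the bindings build step. For context, a typical build-and-smoke-test loop looks roughly like the following; the `import dynamo` module name is an assumption, not taken from this diff:

```bash
cd lib/bindings/python
# Build the Rust extension and install it into the active virtualenv.
maturin develop --uv
# Smoke-test the bindings; the module name here is an assumption.
python -c "import dynamo"
```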
lib/runtime/README.md (+2 -2)

@@ -44,7 +44,7 @@ cargo test

 The simplest way to deploy the pre-requisite services is using
 [docker-compose](https://docs.docker.com/compose/install/linux/),
-defined in the project's root [docker-compose.yml](docker-compose.yml).
+defined in [deploy/docker-compose.yml](../../deploy/docker-compose.yml).

 ```
 docker-compose up -d
@@ -109,7 +109,7 @@ Annotated { data: Some("d"), id: None, event: None, comment: None }

 #### Python

-See the [README.md](/lib/bindings/python/README.md) for details
+See the [README.md](../bindings/python/README.md) for details

 The Python and Rust `hello_world` client and server examples are interchangeable,
 so you can start the Python `server.py` and talk to it from the Rust `client`.
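A sketch of that mix-and-match, with the caveat that the script path and the Cargo binary name `client` are assumptions about the example layout rather than details from this diff:

```bash
# Start the Python hello_world server in the background
# (path is an assumption about the examples layout).
python server.py &

# Drive the same endpoint from the Rust client; the binary
# name "client" is likewise an assumption.
cargo run --bin client
```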

support_matrix.md (+3 -3)

@@ -39,12 +39,12 @@ If you are using a **GPU**, the following GPU models and architectures are suppo
 | **Dependency** | **Version** |
 |------------------|-------------|
 |**Base Container**| 25.01 |
-| **vLLM** |0.7.2+dynamo*|
+|**ai-dynamo-vllm**| 0.7.2* |
 |**TensorRT-LLM** | 0.19.0** |
 |**NIXL** | 0.1.0 |

 > **Note**:
-> - *v0.7.2+dynamo is a customized patch of v0.7.2 from vLLM.
+> - *ai-dynamo-vllm v0.7.2 is a customized patch of v0.7.2 from vLLM.
 > - **The specific version of TensorRT-LLM (planned v0.19.0) that will be supported by Dynamo is subject to change.

@@ -54,4 +54,4 @@ If you are using a **GPU**, the following GPU models and architectures are suppo
 - **Wheels**: Pre-built Python wheels are only available for **x86_64 Linux**. No wheels are available for other platforms at this time.
 - **Container Images**: We distribute only the source code for container images, and only **x86_64 Linux** is supported for these. Users must build the container image from source if they require it.

-Once you've confirmed that your platform and architecture are compatible, you can install **Dynamo** by following the instructions in the [Quick Start Guide](https://github.com/ai-dynamo/dynamo/?tab=readme-ov-file#quick-start).
+Once you've confirmed that your platform and architecture are compatible, you can install **Dynamo** by following the instructions in the [Quick Start Guide](https://github.com/ai-dynamo/dynamo/blob/main/README.md#installation).
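Since both the wheels and the container images are x86_64 Linux only, a two-line pre-flight check before following the Quick Start Guide saves a failed install:

```bash
# Both values must match the support matrix:
uname -s   # expect: Linux
uname -m   # expect: x86_64
```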
