
Commit 5950f55

[Doc] Group examples into categories (vllm-project#11782)

Authored by Harry Mellor
Signed-off-by: Harry Mellor <[email protected]>
1 parent a4e2b26 commit 5950f55

File tree: 13 files changed (+240 −62 lines)

.gitignore (+1 −4)

```diff
@@ -79,10 +79,7 @@ instance/
 
 # Sphinx documentation
 docs/_build/
-docs/source/getting_started/examples/*.rst
-!**/*.template.rst
-docs/source/getting_started/examples/*.md
-!**/*.template.md
+docs/source/getting_started/examples/
 
 # PyBuilder
 .pybuilder/
```

docs/Makefile (+4)

```diff
@@ -18,3 +18,7 @@ help:
 # "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
 %: Makefile
 	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+clean:
+	@$(SPHINXBUILD) -M clean "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+	rm -rf "$(SOURCEDIR)/getting_started/examples"
```

docs/requirements-docs.txt (+1)

```diff
@@ -3,6 +3,7 @@ sphinx-book-theme==1.0.1
 sphinx-copybutton==0.5.2
 myst-parser==3.0.1
 sphinx-argparse==0.4.0
+sphinx-togglebutton==0.3.2
 msgspec
 cloudpickle
```

docs/source/conf.py (+4)

```diff
@@ -43,6 +43,10 @@
     "sphinx.ext.autosummary",
     "myst_parser",
     "sphinxarg.ext",
+    "sphinx_togglebutton",
+]
+myst_enable_extensions = [
+    "colon_fence",
 ]
 
 # Add any paths that contain templates here, relative to this directory.
```
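The newly enabled `colon_fence` extension lets MyST parse `:::`-fenced directives, which is the markup the rewritten generator emits. A minimal sketch of building one such block as a string — the helper name and the example path are hypothetical, only the directive syntax is taken from the diff:

```python
def literalinclude_block(include_path: str) -> str:
    # Build a colon-fenced MyST literalinclude directive (illustrative
    # helper, not part of the commit). Doubled braces in the f-string
    # produce the literal {literalinclude} in the output.
    return f":::{{literalinclude}} {include_path}\n:::\n"

# Hypothetical example path, relative to the docs source tree.
block = literalinclude_block("../../../../examples/offline_inference.py")
print(block)
```

With `colon_fence` active, Sphinx renders such a block as an included source listing rather than leaving the `:::` markers as plain text.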

docs/source/generate_examples.py (+222 −42)

````diff
@@ -1,54 +1,234 @@
+import itertools
 import re
+from dataclasses import dataclass, field
 from pathlib import Path
 
+ROOT_DIR = Path(__file__).parent.parent.parent.resolve()
+ROOT_DIR_RELATIVE = '../../../..'
+EXAMPLE_DIR = ROOT_DIR / "examples"
+EXAMPLE_DOC_DIR = ROOT_DIR / "docs/source/getting_started/examples"
+
 
 def fix_case(text: str) -> str:
-    subs = [
-        ("api", "API"),
-        ("llm", "LLM"),
-        ("vllm", "vLLM"),
-        ("openai", "OpenAI"),
-        ("multilora", "MultiLoRA"),
-    ]
-    for sub in subs:
-        text = re.sub(*sub, text, flags=re.IGNORECASE)
+    subs = {
+        "api": "API",
+        "cpu": "CPU",
+        "llm": "LLM",
+        "tpu": "TPU",
+        "aqlm": "AQLM",
+        "gguf": "GGUF",
+        "lora": "LoRA",
+        "vllm": "vLLM",
+        "openai": "OpenAI",
+        "multilora": "MultiLoRA",
+        "mlpspeculator": "MLPSpeculator",
+        r"fp\d+": lambda x: x.group(0).upper(),  # e.g. fp16, fp32
+        r"int\d+": lambda x: x.group(0).upper(),  # e.g. int8, int16
+    }
+    for pattern, repl in subs.items():
+        text = re.sub(rf'\b{pattern}\b', repl, text, flags=re.IGNORECASE)
     return text
 
 
-def generate_title(filename: str) -> str:
-    # Turn filename into a title
-    title = filename.replace("_", " ").title()
-    # Handle acronyms and names
-    title = fix_case(title)
-    return f"# {title}"
+@dataclass
+class Index:
+    """
+    Index class to generate a structured document index.
+
+    Attributes:
+        path (Path): The path to save the index file to.
+        title (str): The title of the index.
+        description (str): A brief description of the index.
+        caption (str): An optional caption for the table of contents.
+        maxdepth (int): The maximum depth of the table of contents. Defaults to 1.
+        documents (list[str]): A list of document paths to include in the index. Defaults to an empty list.
+
+    Methods:
+        generate() -> str:
+            Generates the index content as a string in the specified format.
+    """  # noqa: E501
+    path: Path
+    title: str
+    description: str
+    caption: str
+    maxdepth: int = 1
+    documents: list[str] = field(default_factory=list)
+
+    def generate(self) -> str:
+        content = f"# {self.title}\n\n{self.description}\n\n"
+        content += "```{toctree}\n"
+        content += f":caption: {self.caption}\n:maxdepth: {self.maxdepth}\n"
+        content += "\n".join(sorted(self.documents)) + "\n```\n"
+        return content
+
+
+@dataclass
+class Example:
+    """
+    Example class for generating documentation content from a given path.
+
+    Attributes:
+        path (Path): The path to the main directory or file.
+        category (str): The category of the document.
+        main_file (Path): The main file in the directory.
+        other_files (list[Path]): List of other files in the directory.
+        title (str): The title of the document.
+
+    Methods:
+        __post_init__(): Initializes the main_file, other_files, and title attributes.
+        determine_main_file() -> Path: Determines the main file in the given path.
+        determine_other_files() -> list[Path]: Determines other files in the directory excluding the main file.
+        determine_title() -> str: Determines the title of the document.
+        generate() -> str: Generates the documentation content.
+    """  # noqa: E501
+    path: Path
+    category: str = None
+    main_file: Path = field(init=False)
+    other_files: list[Path] = field(init=False)
+    title: str = field(init=False)
+
+    def __post_init__(self):
+        self.main_file = self.determine_main_file()
+        self.other_files = self.determine_other_files()
+        self.title = self.determine_title()
+
+    def determine_main_file(self) -> Path:
+        """
+        Determines the main file in the given path.
+        If the path is a file, it returns the path itself. Otherwise, it searches
+        for Markdown files (*.md) in the directory and returns the first one found.
+        Returns:
+            Path: The main file path, either the original path if it's a file or the first
+            Markdown file found in the directory.
+        Raises:
+            IndexError: If no Markdown files are found in the directory.
+        """  # noqa: E501
+        return self.path if self.path.is_file() else list(
+            self.path.glob("*.md")).pop()
+
+    def determine_other_files(self) -> list[Path]:
+        """
+        Determine other files in the directory excluding the main file.
+
+        This method checks if the given path is a file. If it is, it returns an empty list.
+        Otherwise, it recursively searches through the directory and returns a list of all
+        files that are not the main file.
+
+        Returns:
+            list[Path]: A list of Path objects representing the other files in the directory.
+        """  # noqa: E501
+        if self.path.is_file():
+            return []
+        is_other_file = lambda file: file.is_file() and file != self.main_file
+        return [file for file in self.path.rglob("*") if is_other_file(file)]
+
+    def determine_title(self) -> str:
+        return fix_case(self.path.stem.replace("_", " ").title())
+
+    def generate(self) -> str:
+        # Convert the path to a relative path from __file__
+        make_relative = lambda path: ROOT_DIR_RELATIVE / path.relative_to(
+            ROOT_DIR)
+
+        content = f"Source <gh-file:{self.path.relative_to(ROOT_DIR)}>.\n\n"
+        if self.main_file.suffix == ".py":
+            content += f"# {self.title}\n\n"
+        include = "include" if self.main_file.suffix == ".md" else \
+            "literalinclude"
+        content += f":::{{{include}}} {make_relative(self.main_file)}\n:::\n\n"
+
+        if not self.other_files:
+            return content
+
+        content += "## Example materials\n\n"
+        for file in self.other_files:
+            include = "include" if file.suffix == ".md" else "literalinclude"
+            content += f":::{{admonition}} {file.relative_to(self.path)}\n"
+            content += ":class: dropdown\n\n"
+            content += f":::{{{include}}} {make_relative(file)}\n:::\n"
+            content += ":::\n\n"
+
+        return content
 
 
 def generate_examples():
-    root_dir = Path(__file__).parent.parent.parent.resolve()
-
-    # Source paths
-    script_dir = root_dir / "examples"
-    script_paths = sorted(script_dir.glob("*.py"))
-
-    # Destination paths
-    doc_dir = root_dir / "docs/source/getting_started/examples"
-    doc_paths = [doc_dir / f"{path.stem}.md" for path in script_paths]
-
-    # Generate the example docs for each example script
-    for script_path, doc_path in zip(script_paths, doc_paths):
-        # Make script_path relative to doc_path and call it include_path
-        include_path = '../../../..' / script_path.relative_to(root_dir)
-        content = (f"{generate_title(doc_path.stem)}\n\n"
-                   f"Source: <gh-file:examples/{script_path.name}>.\n\n"
-                   f"```{{literalinclude}} {include_path}\n"
-                   ":language: python\n"
-                   ":linenos:\n```")
+    # Create the EXAMPLE_DOC_DIR if it doesn't exist
+    if not EXAMPLE_DOC_DIR.exists():
+        EXAMPLE_DOC_DIR.mkdir(parents=True)
+
+    # Create empty indices
+    examples_index = Index(
+        path=EXAMPLE_DOC_DIR / "examples_index.md",
+        title="Examples",
+        description=
+        "A collection of examples demonstrating usage of vLLM.\nAll documented examples are autogenerated using <gh-file:docs/source/generate_examples.py> from examples found in <gh-file:examples>.",  # noqa: E501
+        caption="Examples",
+        maxdepth=1)  # TODO change to 2 when examples start being categorised
+    category_indices = {
+        "offline_inference":
+        Index(
+            path=EXAMPLE_DOC_DIR / "examples_offline_inference_index.md",
+            title="Offline Inference",
+            description=
+            "Offline inference examples demonstrate how to use vLLM in an offline setting, where the model is queried for predictions in batches.",  # noqa: E501
+            caption="Examples",
+        ),
+        "online_serving":
+        Index(
+            path=EXAMPLE_DOC_DIR / "examples_online_serving_index.md",
+            title="Online Serving",
+            description=
+            "Online serving examples demonstrate how to use vLLM in an online setting, where the model is queried for predictions in real-time.",  # noqa: E501
+            caption="Examples",
+        ),
+        "other":
+        Index(
+            path=EXAMPLE_DOC_DIR / "examples_other_index.md",
+            title="Other",
+            description=
+            "Other examples that don't strongly fit into the online or offline serving categories.",  # noqa: E501
+            caption="Examples",
+        ),
+    }
+
+    examples = []
+    # Find categorised examples
+    for category in category_indices:
+        category_dir = EXAMPLE_DIR / category
+        py = category_dir.glob("*.py")
+        md = category_dir.glob("*.md")
+        for path in itertools.chain(py, md):
+            examples.append(Example(path, category))
+        # Find examples in subdirectories
+        for path in category_dir.glob("*/*.md"):
+            examples.append(Example(path.parent, category))
+    # Find uncategorised examples
+    py = EXAMPLE_DIR.glob("*.py")
+    md = EXAMPLE_DIR.glob("*.md")
+    for path in itertools.chain(py, md):
+        examples.append(Example(path))
+    # Find examples in subdirectories
+    for path in EXAMPLE_DIR.glob("*/*.md"):
+        # Skip categorised examples
+        if path.parent.name in category_indices:
+            continue
+        examples.append(Example(path.parent))
+
+    # Generate the example documentation
+    for example in examples:
+        doc_path = EXAMPLE_DOC_DIR / f"{example.path.stem}.md"
         with open(doc_path, "w+") as f:
-            f.write(content)
-
-    # Generate the toctree for the example scripts
-    with open(doc_dir / "examples_index.template.md") as f:
-        examples_index = f.read()
-    with open(doc_dir / "examples_index.md", "w+") as f:
-        example_docs = "\n".join(path.stem + ".md" for path in script_paths)
-        f.write(examples_index.replace(r"%EXAMPLE_DOCS%", example_docs))
+            f.write(example.generate())
+        # Add the example to the appropriate index
+        index = category_indices.get(example.category, examples_index)
+        index.documents.append(example.path.stem)
+
+    # Generate the index files
+    for category_index in category_indices.values():
+        if category_index.documents:
+            examples_index.documents.insert(0, category_index.path.name)
+            with open(category_index.path, "w+") as f:
+                f.write(category_index.generate())
+
+    with open(examples_index.path, "w+") as f:
+        f.write(examples_index.generate())
````
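The heart of the new title logic is `fix_case`, which now maps substitutions through a dict whose values may be either replacement strings or callables — both forms that `re.sub` accepts. A trimmed-down sketch of the same technique, using only a subset of the substitution table from the diff:

```python
import re


def fix_case(text: str) -> str:
    # Subset of the substitution table from the diff above. Plain-string
    # values are used verbatim; callables receive the match object, which
    # lets one pattern (e.g. fp\d+) cover fp8, fp16, fp32, ...
    subs = {
        "vllm": "vLLM",
        "openai": "OpenAI",
        r"fp\d+": lambda m: m.group(0).upper(),  # fp16 -> FP16
    }
    for pattern, repl in subs.items():
        # \b anchors keep "llm" from matching inside unrelated words.
        text = re.sub(rf"\b{pattern}\b", repl, text, flags=re.IGNORECASE)
    return text


# Filenames are title-cased first ("vllm_openai_fp16_example" ->
# "Vllm Openai Fp16 Example"), then fix_case repairs the acronyms.
print(fix_case("Vllm Openai Fp16 Example"))
```

The dict form replaces the old list-of-tuples approach, which unpacked each pair directly into `re.sub` and could not express pattern-based replacements like the `fp\d+` rule.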

docs/source/getting_started/examples/examples_index.template.md (−8)

This file was deleted.

examples/fp8/README.md (+3 −3)

````diff
@@ -56,7 +56,7 @@ python3 examples/fp8/extract_scales.py --quantized_model <QUANTIZED_MODEL_DIR> -
 ```
 ### 4. Load KV Cache Scaling Factors into VLLM.
 This script evaluates the inference throughput of language models using various backends such as vLLM. It measures the time taken to process a given number of prompts and generate sequences for each prompt. The recently generated KV cache scaling factors are now integrated into the benchmarking process and allow for KV cache scaling factors to be utilized for FP8.
-```python
+```
 # prerequisites:
 # - LLaMa 2 kv_cache_scales.json file
 
@@ -90,7 +90,7 @@ optional arguments:
 --kv-cache-dtype {auto,fp8} Data type for kv cache storage. If "auto", will use model data type. FP8_E5M2 (without scaling) is only supported on cuda version greater than 11.8. On ROCm (AMD GPU), FP8_E4M3 is instead supported for common inference criteria.
 --quantization-param-path QUANT_PARAM_JSON Path to the JSON file containing the KV cache scaling factors. This should generally be supplied, when KV cache dtype is FP8. Otherwise, KV cache scaling factors default to 1.0, which may cause accuracy issues. FP8_E5M2 (without scaling) is only supported on cuda version greater than 11.8. On ROCm (AMD GPU), FP8_E4M3 is instead supported for common inference criteria.
 ```
-```
 Example:
+```console
 python3 benchmarks/benchmark_throughput.py --input-len <INPUT_LEN> --output-len <OUTPUT_LEN> -tp <TENSOR_PARALLEL_SIZE> --kv-cache-dtype fp8 --quantization-param-path <path/to/kv_cache_scales.json> --model <path-to-llama2>
-```python
+```
````
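The `--quantization-param-path` flag mentioned in that README points at a JSON file of KV cache scaling factors. As a rough sketch of consuming such a file — the JSON layout shown here is an illustrative assumption, not the exact schema emitted by `examples/fp8/extract_scales.py`:

```python
import json

# Assumed layout: factors keyed by tensor-parallel rank, then layer index.
# The real file produced by extract_scales.py may differ.
raw = '{"kv_cache": {"scaling_factor": {"0": {"0": 0.0381, "1": 0.0413}}}}'

scales = json.loads(raw)["kv_cache"]["scaling_factor"]
# Convert string layer keys to ints for rank 0.
rank0 = {int(layer): factor for layer, factor in scales["0"].items()}
print(rank0)
```

If a factor is missing it defaults to 1.0 at load time, which, as the README warns, may cause accuracy issues with FP8 KV cache.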
File renamed without changes.

examples/production_monitoring/README.md → examples/prometheus_grafana/README.md (+5 −5)

````diff
@@ -1,12 +1,12 @@
-# vLLM + Prometheus/Grafana
+# Prometheus and Grafana
 
 This is a simple example that shows you how to connect vLLM metric logging to the Prometheus/Grafana stack. For this example, we launch Prometheus and Grafana via Docker. You can checkout other methods through [Prometheus](https://prometheus.io/) and [Grafana](https://grafana.com/) websites.
 
 Install:
 - [`docker`](https://docs.docker.com/engine/install/)
 - [`docker compose`](https://docs.docker.com/compose/install/linux/#install-using-the-repository)
 
-### Launch
+## Launch
 
 Prometheus metric logging is enabled by default in the OpenAI-compatible server. Launch via the entrypoint:
 ```bash
@@ -35,19 +35,19 @@ python3 ../../benchmarks/benchmark_serving.py \
 
 Navigating to [`http://localhost:8000/metrics`](http://localhost:8000/metrics) will show the raw Prometheus metrics being exposed by vLLM.
 
-### Grafana Dashboard
+## Grafana Dashboard
 
 Navigate to [`http://localhost:3000`](http://localhost:3000). Log in with the default username (`admin`) and password (`admin`).
 
-#### Add Prometheus Data Source
+### Add Prometheus Data Source
 
 Navigate to [`http://localhost:3000/connections/datasources/new`](http://localhost:3000/connections/datasources/new) and select Prometheus.
 
 On Prometheus configuration page, we need to add the `Prometheus Server URL` in `Connection`. For this setup, Grafana and Prometheus are running in separate containers, but Docker creates DNS name for each containers. You can just use `http://prometheus:9090`.
 
 Click `Save & Test`. You should get a green check saying "Successfully queried the Prometheus API.".
 
-#### Import Dashboard
+### Import Dashboard
 
 Navigate to [`http://localhost:3000/dashboard/import`](http://localhost:3000/dashboard/import), upload `grafana.json`, and select the `prometheus` datasource. You should see a screen that looks like the following:
````
