
Commit e6659da

Update and Organize documentation
- Revise the documentation for clarity and readability
- Organize documentation into distinct, clearly labeled sections
- Follow Markdown standards
1 parent 97a69cf commit e6659da

File tree

1 file changed: +93 -85 lines changed


Diff for: experiments/README.md

@@ -1,135 +1,143 @@

# Running Experiments Guide

To run the experiments, you need to update the script paths and install `fire`, `pandas`, and `tqdm`.

## Model Checkpoints

You'll need to obtain model checkpoints from the [facebookresearch/segment-anything](https://github.com/facebookresearch/segment-anything) repository. Use the following commands to download them:

```bash
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth
```

## COCO2017 Dataset

To run the experiments, you'll also need the COCO2017 dataset. Download it using these commands:

```bash
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
```

## Folder Structure of Experimental Data

Here's the folder structure you should set up for your experimental data:

```plaintext
experiments_data/
├── tmp/
│   ├── sam_coco_mask_center_cache/
│   └── sam_eval_masks_out/
├── datasets/
│   └── coco2017/
│       ├── val2017/
│       └── annotations/
└── checkpoints/
```
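
If it helps, the layout can be created and the downloads unpacked with a short script. The following is only a convenience sketch; it assumes the two COCO zip archives and the two `.pth` checkpoints were downloaded into the current working directory:

```python
import shutil
import zipfile
from pathlib import Path

root = Path("experiments_data")

# Create the expected directory layout.
for sub in ["tmp/sam_coco_mask_center_cache",
            "tmp/sam_eval_masks_out",
            "datasets/coco2017",
            "checkpoints"]:
    (root / sub).mkdir(parents=True, exist_ok=True)

# The COCO archives contain top-level val2017/ and annotations/ folders,
# so extracting them under coco2017/ yields the structure above.
for archive in ["val2017.zip", "annotations_trainval2017.zip"]:
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(root / "datasets" / "coco2017")

# Move the downloaded SAM checkpoints into place.
for ckpt in ["sam_vit_h_4b8939.pth", "sam_vit_b_01ec64.pth"]:
    shutil.move(ckpt, root / "checkpoints" / ckpt)
```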

## Environment Details

### Hardware

These experiments were conducted on an Amazon `p4d.24xlarge` instance with the following specifications:

- 8 A100 GPUs with 40960 MiB of memory each, running at 400 W
- 96 vCPUs
- 1152 GiB of RAM

### Software Versions

- **PyTorch**: Use the latest nightly build of PyTorch.
- **Python**: Version 3.10.
- **Segment-Anything Repositories**:
  - **Main Repository**: The original [facebookresearch/segment-anything](https://github.com/facebookresearch/segment-anything) provides the foundational code and functionality.
  - **Fork with Enhancements**: The fork at [cpuhrsch/segment-anything](https://github.com/cpuhrsch/segment-anything) includes additional commits needed to reproduce the baseline and the first few experiments.
  - **Optimized Version**: The optimized code lives in [pytorch-labs/segment-anything-fast](https://github.com/pytorch-labs/segment-anything-fast); it contains the performance improvements and is used for the later experiments.

### Installation Instructions

Follow these steps to set up the required environment:

```bash
conda create -n nightly20231117py310
conda activate nightly20231117py310
conda install python=3.10
pip install https://download.pytorch.org/whl/nightly/cu121/torch-2.2.0.dev20231117%2Bcu121-cp310-cp310-linux_x86_64.whl
pip install https://download.pytorch.org/whl/nightly/cu121/torchvision-0.17.0.dev20231117%2Bcu121-cp310-cp310-linux_x86_64.whl
git clone https://github.com/cpuhrsch/segment-anything.git
cd segment-anything
pip install -e .
cd ..
git clone https://github.com/pytorch-labs/segment-anything-fast.git
cd segment-anything-fast
pip install -e .
```

If you plan to run the experiment scripts from segment-anything-fast, install the segment-anything fork in editable mode so that the scripts can switch between different commits of the fork automatically.

### How to Run Experiments

Use this command to run the experiments:

```bash
python run_experiments.py 16 vit_b <pytorch_github> <segment-anything_github> <path_to_experiments_data> --run-experiments --num-workers 32
```

If you run into issues, you can increase verbosity by adding `--capture_output False` to the command above. Also, please don't hesitate to open an issue.

### Data

We use the COCO2017 Validation (Val images) dataset for these experiments. It provides a somewhat realistic distribution of input images against which we measure (a) accuracy and (b) performance.

### Measurement

#### Accuracy

Our primary goal is to verify that the performance optimizations do not degrade the accuracy of the model. We do not aim to reproduce any paper results or to make claims about the model's accuracy on this dataset; this measurement serves as an additional integration test alongside numerous unit tests and other integration tests.

We calculate the center points of the mask annotations using a rudimentary version of the point sampling described in section D.1 ("Point Sampling") of the [Segment Anything paper](https://arxiv.org/pdf/2304.02643.pdf) ([code](https://github.com/pytorch-labs/segment-anything-fast/blob/67d5c894569e99b9fdba55cfcf2f724be9f68994/experiments/data.py#L10-L120)). These center points serve as the annotations for each image; note that the number of masks, and therefore the number of annotations, varies per image.
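
The repository's actual implementation is linked above; purely as an illustration of the idea, a minimal sketch might look like the following. It assumes `scipy` is available and that each mask is a boolean NumPy array (both are assumptions for this example, not requirements stated elsewhere in this README).

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def mask_center_point(mask: np.ndarray) -> tuple[int, int]:
    """Pick an interior point of a binary mask: the pixel farthest from the
    mask boundary, found with a Euclidean distance transform."""
    # For every foreground pixel, distance_transform_edt gives the distance
    # to the nearest background pixel, i.e. to the mask boundary.
    dist = distance_transform_edt(mask)
    row, col = np.unravel_index(np.argmax(dist), dist.shape)
    return int(col), int(row)  # (x, y) in image coordinates
```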

These images and annotations are provided to the `predict_torch` method of a `SamPredictor` instance to predict masks. The predictions are then compared to the ground-truth masks using the Intersection over Union (IoU) metric ([code](https://github.com/pytorch-labs/segment-anything-fast/blob/67d5c894569e99b9fdba55cfcf2f724be9f68994/experiments/metrics.py#L4-L22)). We calculate the mean IoU (mIoU) over the entire 5000 images of the validation dataset to track accuracy.
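
For reference, the IoU computation and the mIoU aggregation reduce to a few lines. This is only a sketch over boolean NumPy masks; the repository's own metric code is linked above:

```python
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over Union between two boolean masks of the same shape."""
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(intersection) / float(union) if union > 0 else 0.0

def mean_iou(per_image_ious: list[float]) -> float:
    """mIoU: the mean of the per-image IoU scores over the dataset."""
    return float(np.mean(per_image_ious))
```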

#### Performance

Our goal is to measure the runtime of the PyTorch models. We intentionally exclude data movement and metric calculation from the measurements. Specifically, we measure the GPU execution time of running the image encoder (e.g., `vit_h`) and `SamPredictor.predict_torch` ([code](https://github.com/pytorch-labs/segment-anything-fast/blob/67d5c894569e99b9fdba55cfcf2f724be9f68994/experiments/eval_combo.py#L127-L165), [code](https://github.com/pytorch-labs/segment-anything-fast/blob/67d5c894569e99b9fdba55cfcf2f724be9f68994/experiments/eval_combo.py#L68-L99)).

Each experiment runs in a separate Python process created from scratch, and we run three batches of warm-up before each experiment. This also means that compilation time is excluded from the benchmarks.

We measure the execution time and report the number of images processed per second (img/s). We also record the maximum amount of memory allocated at the end of the process using `torch.cuda.max_memory_allocated`.
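
Purely as an illustration of this measurement style (warm-up excluded from timing, GPU-side timing, img/s, peak memory), here is a minimal sketch. The `run_batch` callable and the batch size are hypothetical placeholders, not part of the repository's benchmarking code:

```python
import torch

def benchmark(run_batch, num_warmup=3, num_batches=10, batch_size=16):
    """Time GPU work with CUDA events and report images/s and peak memory."""
    # Warm-up batches are not timed, so compilation time is excluded.
    for _ in range(num_warmup):
        run_batch()
    torch.cuda.synchronize()

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(num_batches):
        run_batch()
    end.record()
    torch.cuda.synchronize()

    seconds = start.elapsed_time(end) / 1000.0  # elapsed_time returns milliseconds
    images_per_second = num_batches * batch_size / seconds
    peak_memory_bytes = torch.cuda.max_memory_allocated()
    return images_per_second, peak_memory_bytes
```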

#### Tracing

We collect kernel and memory traces using PyTorch's native tooling and analyze them with [Perfetto UI](https://perfetto.dev/). When collecting traces and profiles, we typically limit ourselves to a few batches; otherwise the files can become very large and difficult to load.

##### Kernel Traces

You can write a simple wrapper that runs a function under the tracer context and writes the result out to a compressed JSON file. The resulting Chrome trace can then be analyzed with Perfetto UI. For example:

```python
def profiler_runner(path, fn, *args, **kwargs):
    with torch.profiler.profile(
            activities=[torch.profiler.ProfilerActivity.CPU,
                        torch.profiler.ProfilerActivity.CUDA],
            record_shapes=True) as prof:
        result = fn(*args, **kwargs)
    prof.export_chrome_trace(path)
    return result
```

It can be very useful to annotate certain regions in these traces to map pieces of the code onto the overall trace. For this we frequently use `record_function`. Consider the following example:

```python
with torch.autograd.profiler.record_function("timed region"):
    with torch.autograd.profiler.record_function("image encoder"):
        features_batch = encoder(input_image_batch)
        features_batch = features_batch[:orig_input_image_batch_size]

    with torch.autograd.profiler.record_function("nt predict_torch"):
        predictor.reset_image()
        [...]
```

##### Memory Profiles

We record the memory history and use `memory_viz.py` to convert the result into a human-readable HTML file. For example:

```python
def memory_runner(path, fn, *args, **kwargs):
    print("Start memory recording")
    torch.cuda.synchronize()
    torch.cuda.memory._record_memory_history(
        True,
        trace_alloc_max_entries=100000,
        trace_alloc_record_context=True
    )
    result = fn(*args, **kwargs)
    # [...] (lines omitted in the diff hunk)
    import pickle
    with open(path, 'wb') as f:
        pickle.dump(snapshot, f)
    # Use to convert pickle file into HTML
    # python torch/cuda/_memory_viz.py trace_plot <snapshot>.pickle -o <snapshot>.html
    return result
```
