```bash
pip install -r requirements.txt
```
- Option 1 (Recommended): Quantize the weights directly

  ```bash
  python quant.py --model_path /path/to/DeepSeek/R1/BF16/ --qmodel_path /path/to/DeepSeek/R1-Dynamic-FP8 --low_cpu_mem
  ```

- Option 2: Load the model using Transformers (requires ~700 GB of DRAM)

  ```bash
  python quant.py --model_path /path/to/DeepSeek/R1/BF16/ --qmodel_path /path/to/DeepSeek/R1/Dynamic-FP8
  ```
Note:
- The weight dtype is `torch.float8_e4m3fn` (full range is `-448` to `448`).
- `WEIGHT_BACKOFF = 0.5`
- `SCALE_DTYPE = torch.bfloat16`
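For intuition, here is a minimal sketch of how these constants could combine into a per-tensor dynamic FP8 weight quantization. The function name `quantize_weight` is illustrative only; the actual logic in `quant.py` may differ.

```python
import torch

FP8_MAX = torch.finfo(torch.float8_e4m3fn).max  # 448.0
WEIGHT_BACKOFF = 0.5
SCALE_DTYPE = torch.bfloat16

def quantize_weight(weight: torch.Tensor):
    # Back off from the full FP8 range to leave headroom, then rescale the
    # weight so its absolute maximum lands within the reduced range.
    scale = (weight.abs().max() / (FP8_MAX * WEIGHT_BACKOFF)).to(SCALE_DTYPE)
    qweight = (weight / scale).to(torch.float8_e4m3fn)
    return qweight, scale

# Example: quantize a random BF16 weight matrix.
w = torch.randn(5, 10, dtype=torch.bfloat16)
w_fp8, w_scale = quantize_weight(w)
```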
Since DeepSeek V3 and R1 are not yet supported by Transformers, we need to manually copy some model files.
```bash
python post_process.py --model_path /path/to/DeepSeek/R1/BF16/ --qmodel_path /path/to/DeepSeek/R1/Dynamic-FP8
```
- Name convention:
  - weight scale name: `prefix.scale_weight`
  - input scale name: `prefix.scale_input` (for static quantization only)
- A JSON file mapping each tensor name to the safetensors file that stores it.
For example, given the following toy model:

```python
import torch


class M(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.fc1 = torch.nn.Linear(10, 5, bias=False)

    def forward(self, inp):
        x1 = self.fc1(inp)
        return x1
```

the saved quantized results would look like:
1. State dict:

   ```python
   {
       "fc1.weight": torch.Tensor(...),
       "fc1.scale_weight": torch.Tensor(...),
       "fc1.scale_input": torch.Tensor(...),
   }
   ```
2. JSON index file, `model.safetensors.index.json`:

   ```json
   {
       "fc1.weight": "qmodel.safetensors",
       "fc1.scale_weight": "qmodel.safetensors",
       "fc1.scale_input": "qmodel.safetensors"
   }
   ```
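As a quick sanity check, the saved tensors can be read back with the `safetensors` library and dequantized on the fly. This is a hedged sketch: the file and tensor names follow the toy example above, and a real multi-shard checkpoint would be resolved through `model.safetensors.index.json`.

```python
import torch
from safetensors.torch import load_file

state_dict = load_file("qmodel.safetensors")
weight_fp8 = state_dict["fc1.weight"]          # torch.float8_e4m3fn weights
scale_weight = state_dict["fc1.scale_weight"]  # torch.bfloat16 per-tensor scale
# Dequantize back to BF16 by applying the stored scale.
weight_bf16 = weight_fp8.to(torch.bfloat16) * scale_weight
```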