
TorchSharp.BitsAndBytes

TorchSharp.BitsAndBytes is a C# binding library for the bitsandbytes library from Hugging Face. It provides 4-bit and 8-bit quantization for TorchSharp models.

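The library is distributed on NuGet; a minimal install command is shown below (the package id is assumed to match the repository name):

# package id assumed to match the repository name
dotnet add package TorchSharp.BitsAndBytes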

Usage

4Bit Quantization && Dequantization

Note

4-bit quantization is only available on CUDA devices.
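Because the 4-bit kernels require a CUDA device, it can help to guard the example below with a device check. A minimal sketch using TorchSharp's torch.cuda.is_available():

using TorchSharp;

// Fail fast when no CUDA device is present.
if (!torch.cuda.is_available())
{
    throw new InvalidOperationException("4-bit quantization requires a CUDA device.");
}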

using TorchSharp;
using TorchSharp.BitsAndBytes; // namespace assumed to match the package name
using static TorchSharp.torch;

int dim = 1024; // example dimension

var input = torch.rand([dim * 4, dim], dtype: ScalarType.Float32).cuda(); // FP32 tensor, must be on a CUDA device
string quantizedDType = "fp4"; // available options: "fp4", "nf4"
int blockSize = 64; // can be 64, 128, 256, 512, or 1024

// Quantize to 4-bit
(var quantizedTensor, var absMax, blockSize, var n) = BitsAndByteUtils.Quantize4Bit(input, quantizedDType, blockSize);

// Dequantize back to FP32
var dequantizedTensor = BitsAndByteUtils.Dequantize4Bit(quantizedTensor, absMax, input.dtype, quantizedDType, n, input.shape, blockSize);
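
To sanity-check the roundtrip, you can compare the dequantized tensor against the original with plain TorchSharp ops. A minimal sketch; the acceptable error depends on the block size and quantized data type:

// Mean absolute quantization error; expected to be small but non-zero for 4-bit types.
var meanAbsError = (dequantizedTensor - input).abs().mean().item<float>();
Console.WriteLine($"mean |x - dequant(quant(x))| = {meanAbsError}");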

For more examples, please refer to the Benchmark section.

Benchmark


BenchmarkDotNet v0.14.0, Windows 11 (10.0.26100.3037)
Intel Core i9-14900K, 1 CPU, 32 logical and 24 physical cores
Memory: 64GB
GPU: RTX4090
.NET SDK 9.0.102
  [Host]     : .NET 8.0.12 (8.0.1224.60305), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.12 (8.0.1224.60305), X64 RyuJIT AVX2


| Method         | Mean        | Error     | StdDev    |
|--------------- |------------:|----------:|----------:|
| Quantize4Bit   |   536.35 μs | 12.164 μs | 35.290 μs |
| Dequantize4Bit | 2,257.89 μs | 44.542 μs | 51.294 μs |
| GEMV_4Bit_FP4  |    84.16 μs |  1.673 μs |  3.223 μs |
| GEMV_4Bit_NF4  |    82.69 μs |  4.329 μs | 12.629 μs |
| GEMV_FP32      |    49.59 μs |  0.975 μs |  2.035 μs |
| GEMM_INT8      | 2,994.86 μs | 12.144 μs | 11.360 μs |
| GEMM_FP32      | 4,495.49 μs | 35.264 μs | 32.986 μs |