
TorchSharp.BitsAndBytes

TorchSharp.BitsAndBytes is a C# binding library for the bitsandbytes library from Hugging Face. It provides 4-bit and 8-bit quantization for TorchSharp models.

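The library is distributed on NuGet; a minimal install command is shown below (the package id is assumed to match the repository name):

# package id assumed to match the repository name
dotnet add package TorchSharp.BitsAndBytes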

Usage

4Bit Quantization && Dequantization

Note

4-bit quantization is only available on CUDA devices.
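Because the 4-bit kernels require a CUDA device, it can help to guard the example below with a device check. A minimal sketch using TorchSharp's torch.cuda.is_available():

using TorchSharp;

// Fail fast when no CUDA device is present.
if (!torch.cuda.is_available())
{
    throw new InvalidOperationException("4-bit quantization requires a CUDA device.");
}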

using TorchSharp;
using TorchSharp.BitsAndBytes; // namespace assumed to match the package name
using static TorchSharp.torch;

int dim = 1024; // example dimension

var input = torch.rand([dim * 4, dim], dtype: ScalarType.Float32).cuda(); // FP32 tensor, must be on a CUDA device
string quantizedDType = "fp4"; // available options: "fp4", "nf4"
int blockSize = 64; // can be 64, 128, 256, 512, or 1024

// Quantize to 4-bit
(var quantizedTensor, var absMax, blockSize, var n) = BitsAndByteUtils.Quantize4Bit(input, quantizedDType, blockSize);

// Dequantize back to FP32
var dequantizedTensor = BitsAndByteUtils.Dequantize4Bit(quantizedTensor, absMax, input.dtype, quantizedDType, n, input.shape, blockSize);
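
To sanity-check the roundtrip, you can compare the dequantized tensor against the original with plain TorchSharp ops. A minimal sketch; the acceptable error depends on the block size and quantized data type:

// Mean absolute quantization error; expected to be small but non-zero for 4-bit types.
var meanAbsError = (dequantizedTensor - input).abs().mean().item<float>();
Console.WriteLine($"mean |x - dequant(quant(x))| = {meanAbsError}");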

For more examples, please refer to the Benchmark section.

Benchmark


BenchmarkDotNet v0.14.0, Windows 11 (10.0.26100.3037)
Intel Core i9-14900K, 1 CPU, 32 logical and 24 physical cores
Memory: 64GB
GPU: RTX4090
.NET SDK 9.0.102
  [Host]     : .NET 8.0.12 (8.0.1224.60305), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.12 (8.0.1224.60305), X64 RyuJIT AVX2


| Method         | Mean        | Error     | StdDev    |
|--------------- |------------:|----------:|----------:|
| Quantize4Bit   |   536.35 μs | 12.164 μs | 35.290 μs |
| Dequantize4Bit | 2,257.89 μs | 44.542 μs | 51.294 μs |
| GEMV_4Bit_FP4  |    84.16 μs |  1.673 μs |  3.223 μs |
| GEMV_4Bit_NF4  |    82.69 μs |  4.329 μs | 12.629 μs |
| GEMV_FP32      |    49.59 μs |  0.975 μs |  2.035 μs |
| GEMM_INT8      | 2,994.86 μs | 12.144 μs | 11.360 μs |
| GEMM_FP32      | 4,495.49 μs | 35.264 μs | 32.986 μs |