[GSoC] Add block quantized models #270

DaniAffCH · 2024-08-15T20:53:23Z

This PR introduces block-quantized versions for most of the opencv_zoo models.
All the models have been quantized using a block size of 64, as this configuration demonstrated good performance empirically.

Additionally, the block quantization tool has been enhanced to handle more cases:

- The tool now supports weights stored as constants in ONNX (previously, it only supported initializers).
- It now properly handles blocks where all elements are identical.
- Data type saturation has been added to prevent overflow during quantization.
- The error metric has been updated to be independent of the weights' order of magnitude.
- A new verbose mode has been added to assist in troubleshooting.
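
To make the points above concrete, here is a minimal, dependency-free sketch of block quantization with a block size of 64. It is not the code of the tool in this PR: the asymmetric uint8 scheme, the function names, the way constant blocks are encoded, and the relative-error formula are all assumptions chosen for illustration.

```python
import math

BLOCK_SIZE = 64      # block size used for the models in this PR
QMIN, QMAX = 0, 255  # assumption: unsigned 8-bit storage


def saturate(value):
    """Clamp an integer to [QMIN, QMAX] so it cannot overflow the target type."""
    return max(QMIN, min(QMAX, value))


def quantize_block(block):
    """Return (quantized values, scale, zero_point) for one block of floats."""
    lo, hi = min(block), max(block)
    if lo == hi:
        # All elements identical: the usual scale (hi - lo) / 255 would be 0.
        # Store the magnitude in the scale so the value is recovered exactly.
        scale = abs(lo) or 1.0
        zero_point = 1 if lo < 0 else 0
        q = 1 if lo > 0 else 0
        return [q] * len(block), scale, zero_point
    # Extend the range to include 0 so the zero point fits in [QMIN, QMAX].
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / (QMAX - QMIN)
    zero_point = saturate(round(-lo / scale))
    quantized = [saturate(round(x / scale) + zero_point) for x in block]
    return quantized, scale, zero_point


def dequantize_block(quantized, scale, zero_point):
    """Map quantized integers back to floats."""
    return [(q - zero_point) * scale for q in quantized]


def block_quantize(weights, block_size=BLOCK_SIZE):
    """Quantize a flat list of weights block by block (last block may be short)."""
    return [quantize_block(weights[i:i + block_size])
            for i in range(0, len(weights), block_size)]


def relative_error(original, restored):
    """Sum of absolute errors divided by the sum of absolute weights.

    Dividing by the weight magnitude is one way (an assumption here) to keep
    the metric independent of the weights' order of magnitude.
    """
    num = sum(abs(a - b) for a, b in zip(original, restored))
    den = sum(abs(a) for a in original)
    return num / den if den else num


# Usage: quantize 130 synthetic weights (two full blocks and one short block).
weights = [math.sin(i) for i in range(130)]
blocks = block_quantize(weights)
restored = [v for q, s, z in blocks for v in dequantize_block(q, s, z)]
print(f"blocks: {len(blocks)}, relative error: {relative_error(weights, restored):.4f}")
```

The constant-block branch avoids a division by zero when computing the scale, and `saturate` keeps every stored value inside the 8-bit range even when rounding lands one step outside it.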

Finally, the benchmark tool has been modified to support block quantized models.

The following table compares the file size of each model before and after block quantization:

Model	Original Size (KB)	Block Quantized Size (KB)
face_detection_yunet	227.14	119.62
face_recognition_sface	37,789.41	10,417.82
facial_expression_recognition_mobilefacenet	4,679.58	1,344.44
handpose_estimation_mediapipe	4,003.54	1,199.43
human_segmentation_pphumanseg	6,019.47	1,694.07
image_classification_mobilenetv1	16,494.27	4,491.59
image_classification_mobilenetv2	13,637.28	3,782.18
image_classification_ppresnet50	100,163.12	27,435.20
license_plate_detection_lpd_yunet	4,049.04	1,158.07
object_detection_nanodet	3,711.87	1,097.62
object_detection_yolox	35,017.58	9,516.03
object_tracking_vittrack	697.97	264.97
optical_flow_estimation_raft	62,616.54	47,700.30
palm_detection_mediapipe	3,814.19	1,141.94
person_detection_mediapipe	11,709.14	3,400.44
person_reid_youtu	104,373.44	28,518.79
pose_estimation_mediapipe	5,426.99	1,655.17
text_detection_en_ppocrv3	2,366.69	835.33
text_detection_cn_ppocrv3	2,366.69	835.33
text_recognition_CRNN_CN	71,100.74	28,346.08
text_recognition_CRNN_CH	63,385.71	26,257.37
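
As a rough sanity check on the table above, the sketch below computes the compression ratio for a few rows and compares it with an idealized upper bound for a block size of 64. The storage layout behind that bound (8-bit weights plus one fp32 scale and one 8-bit zero point per block) is an assumption, not something stated in this PR, and the function names are hypothetical.

```python
# Sizes in KB copied from the table above: (original, block quantized).
SIZES_KB = {
    "face_detection_yunet": (227.14, 119.62),
    "face_recognition_sface": (37789.41, 10417.82),
    "image_classification_ppresnet50": (100163.12, 27435.20),
    "optical_flow_estimation_raft": (62616.54, 47700.30),
}

BLOCK_SIZE = 64
FP32_BITS = 32
# Assumed storage layout (not confirmed by the PR):
WEIGHT_BITS = 8       # one 8-bit integer per weight
SCALE_BITS = 32       # one fp32 scale per block
ZERO_POINT_BITS = 8   # one 8-bit zero point per block


def ideal_ratio(block_size=BLOCK_SIZE):
    """Best-case ratio if the file held nothing but block-quantized weights."""
    bits_per_weight = WEIGHT_BITS + (SCALE_BITS + ZERO_POINT_BITS) / block_size
    return FP32_BITS / bits_per_weight


def measured_ratio(original_kb, quantized_kb):
    """Ratio of original file size to block-quantized file size."""
    return original_kb / quantized_kb


ratios = {name: measured_ratio(*sizes) for name, sizes in SIZES_KB.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x (ideal {ideal_ratio():.2f}x)")
```

Under these assumptions the larger models land close to, but below, the idealized bound. The much smaller ratios for YuNet and RAFT suggest that a larger share of those files is not block-quantized weights, although the PR does not state the cause.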

The tables below compare evaluation metrics for the original fp32 model, the block-quantized version ("block"), and the int8-quantized version ("quant"):

Models	Accuracy
SFace	0.9940
SFace block	0.9942
SFace quant	0.9932

Models	Easy AP	Medium AP	Hard AP
YuNet	0.8844	0.8656	0.7503
YuNet block	0.8845	0.8652	0.7504
YuNet quant	0.8810	0.8629	0.7503

Models	Accuracy	mIoU
PPHumanSeg	0.9656	0.9164
PPHumanSeg block	0.9655	0.9162
PPHumanSeg quant	0.7285	0.3642

Models	Top-1 Accuracy	Top-5 Accuracy
MobileNet V1	67.64	87.97
MobileNet V1 block	67.21	87.62
MobileNet V1 quant	55.53	78.74
MobileNet V2	69.44	89.23
MobileNet V2 block	68.66	88.90
MobileNet V2 quant	68.37	88.56

Models	Top-1 Accuracy	Top-5 Accuracy
PP-ResNet	82.28	96.15
PP-ResNet block	82.27	96.15
PP-ResNet quant	0.22	0.96

The following models haven't been quantized:

- image_segmentation_efficientsam: Not compliant with the ONNX standard (see #269, "Efficient SAM is not compliant with onnx standard").
- text_recognition_CRNN_EN: Even a minor quantization error prevents the model from correctly predicting the characters, despite accurately predicting the text bounding box. The reason for this issue remains unclear, as the Chinese version of the model works properly.


DaniAffCH added 7 commits August 4, 2024 22:35

Gemm and MatMul block quantization support

4698403

refactoring

10cbbeb

fix indentation

974d32e

node name independent

00b8dde

Merge branch 'main' of https://github.com/DaniAffCH/opencv_zoo

fb54356

Block quantization tool:

- constant weight category supported
- add data type saturation
- handled the case in which all the elements within a block are the same

benchmark script modified to support block quantized models
block quantized some models

fd7c9fb

add missing block quantized models

5639eba

DaniAffCH changed the title ~~Add block quantized models~~ [GSoC] Add block quantized models Aug 15, 2024

formatting

0009ab6

fengyuentau self-requested a review August 16, 2024 07:20

fengyuentau self-assigned this Aug 16, 2024

fengyuentau added the GSoC Google Summer of Code projected related label Aug 16, 2024

DaniAffCH added 4 commits August 17, 2024 14:28

add blocked models to eval script. Evaluation yunet

0fbcdaf

Add sface and pphumanseg evaluation, block quantization tool fix, handpose blocked model fix, removed blocked CRNN EN,

1642774

changed evaluation metric in block_quantize script and add verbose mode

11806d7

Add evaluation for PP-ResNet and Mobilenet

4f59fc7

DaniAffCH marked this pull request as ready for review August 18, 2024 16:21
