[GSoC] dnn: Blockwise quantization support #25644

DaniAffCH · 2024-05-25T12:20:50Z

This PR adds blockwise quantization support to the DNN module, allowing it to parse ONNX models quantized in blockwise style. In particular, it modifies the `Quantize` and `Dequantize` operations. The related PR opencv/opencv_extra#1181 contains the test data.

Additional notes:

- The original quantization issue has been fixed. Previously, for 1D scale and zero-point, the operation applied was $y = int8(x/s - z)$ instead of $y = int8(x/s + z)$. The operation was already implemented correctly when the scale and zero-point were scalars. The previous implementation failed the ONNX test cases; all of them now pass. [Reference](https://github.com/onnx/onnx/blob/main/docs/Operators.md#QuantizeLinear)
- The function `block_repeat` broadcasts the scale and zero-point to the input shape by repeating every element along a given axis n times. It generalizes `repeat` from the core module, which is defined only for two axes and assumes a 2D `Mat`. If appropriate and useful, you might consider moving `block_repeat` to the core module.
- The scale and zero-point can now be taken as layer inputs. This increases ONNX layer coverage and lets us run the previously disabled ONNX test cases in full compliance with the ONNX standard. Since they are now supported, I have enabled the following test cases on the CPU backend only: `test_dequantizelinear`, `test_dequantizelinear_axis`, `test_dequantizelinear_blocked`, `test_quantizelinear`, `test_quantizelinear_axis`, `test_quantizelinear_blocked`. All of them pass.
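To make the first two notes concrete, here is a minimal sketch in plain C++ that does not use OpenCV types. The names `blockRepeatSketch` and `quantizeOne` are hypothetical and are not the functions added by this PR; the sketch assumes a row-major flat buffer and int8 output, and follows the ONNX QuantizeLinear convention of rounding half to even and saturating.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for block_repeat: repeats every element along `axis`
// `repeat` times. `shape` describes the row-major layout of `src`.
std::vector<float> blockRepeatSketch(const std::vector<float>& src,
                                     const std::vector<size_t>& shape,
                                     size_t axis, size_t repeat)
{
    assert(axis < shape.size() && repeat > 0);
    size_t outer = 1, inner = 1;
    for (size_t i = 0; i < axis; ++i) outer *= shape[i];
    for (size_t i = axis + 1; i < shape.size(); ++i) inner *= shape[i];
    const size_t axisLen = shape[axis];
    assert(src.size() == outer * axisLen * inner);

    std::vector<float> dst;
    dst.reserve(src.size() * repeat);
    for (size_t o = 0; o < outer; ++o)
        for (size_t a = 0; a < axisLen; ++a)
            for (size_t r = 0; r < repeat; ++r)      // emit the same slice `repeat` times
                for (size_t i = 0; i < inner; ++i)
                    dst.push_back(src[(o * axisLen + a) * inner + i]);
    return dst;
}

// y = saturate(round(x / s) + z). Note the plus sign in front of z.
// std::nearbyint rounds half to even under the default rounding mode.
int8_t quantizeOne(float x, float scale, int zeroPoint)
{
    const int q = static_cast<int>(std::nearbyint(x / scale)) + zeroPoint;
    return static_cast<int8_t>(std::min(127, std::max(-128, q)));
}
```

With a 2x2 input `{1, 2, 3, 4}`, repeating twice along axis 0 duplicates each row, while repeating along axis 1 duplicates each column entry. For `x = 1`, `s = 0.5`, `z = 3` the corrected formula gives 5, whereas the old sign would have given -1.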

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

DaniAffCH · 2024-05-29T08:52:20Z

@fengyuentau feel free to review

modules/dnn/src/onnx/onnx_importer.cpp

modules/dnn/test/test_onnx_conformance.cpp

modules/dnn/src/int8layers/quantization_utils.cpp

modules/dnn/test/test_onnx_conformance_layer_parser_denylist.inl.hpp

modules/dnn/src/int8layers/quantization_utils.cpp

DaniAffCH · 2024-07-01T18:30:44Z

All previous comments have been addressed. For multiple inferences, the scale and zero point are now cached and reused in subsequent executions when they are inputs.
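The caching idea can be sketched as follows. This is an illustration only, not the code in this PR: `QuantParamCache` and `forwardDequantize` are hypothetical names, the tensor is 1D, and the sketch assumes the scale and zero-point inputs do not change between runs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cache for quantization parameters that arrive as layer inputs.
// Assumption: the scale and zero-point inputs stay the same across runs.
struct QuantParamCache
{
    std::vector<float> scale;
    std::vector<int> zeroPoint;
    bool ready = false;
    int conversions = 0;  // how many times the inputs were actually processed

    void update(const std::vector<float>& scaleInput,
                const std::vector<int>& zeroPointInput)
    {
        if (ready)
            return;  // reuse the values stored by the first run
        assert(scaleInput.size() == zeroPointInput.size());
        scale = scaleInput;
        zeroPoint = zeroPointInput;
        ready = true;
        ++conversions;
    }
};

// Blockwise dequantization of a 1D tensor: x = (y - z) * s, where each group
// of `blockSize` consecutive elements shares one scale and one zero-point.
std::vector<float> forwardDequantize(const std::vector<int8_t>& y,
                                     const std::vector<float>& scaleInput,
                                     const std::vector<int>& zeroPointInput,
                                     size_t blockSize,
                                     QuantParamCache& cache)
{
    assert(blockSize > 0);
    cache.update(scaleInput, zeroPointInput);
    assert(cache.scale.size() * blockSize >= y.size());

    std::vector<float> x(y.size());
    for (size_t i = 0; i < y.size(); ++i)
    {
        const size_t b = i / blockSize;  // block that element i belongs to
        x[i] = static_cast<float>(y[i] - cache.zeroPoint[b]) * cache.scale[b];
    }
    return x;
}
```

Calling `forwardDequantize` twice with the same cache processes the parameter inputs only once; the second call goes straight to the arithmetic.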

fengyuentau

Actually, these comments were made a week ago, but I forgot to submit them... Anyway, your PR looks good to me.

fengyuentau · 2024-07-02T08:32:04Z

modules/dnn/src/int8layers/quantization_utils.cpp

+
+    copyVecToMat(tmpMat,data);
+
+    block_repeat(tmpMat, axis, block_size, mat);


Can we avoid creating a new Mat here? Repeating amounts to reusing the same piece of memory multiple times. We could rewrite the forward implementation as a loop that advances with the correct step on each iteration, which would save the time and memory spent on creating a new Mat.

Also, if this ends up being implemented, could you put the core functions, e.g. quantize(...), in modules/dnn/src/layers/cpu_kernels? I think that at the importing stage we will need to call the quantize or dequantize functions from modules/dnn/src/onnx/onnx_graph_simplifier.cpp.

This can be done later, when we need to fuse QDQ.
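One way to read this suggestion is sketched below. It is a hypothetical plain C++ illustration, not the kernel proposed for `cpu_kernels`: instead of materializing a broadcast copy of the scale and zero-point, the loop indexes the compact parameter arrays with `a / blockSize`. The function name `quantizeBlockwise`, the row-major layout, and the int8 output are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical blockwise quantization without a broadcast temporary.
// `scale` and `zeroPoint` have the shape of `x`, except that the extent along
// `axis` is numBlocks = ceil(shape[axis] / blockSize).
std::vector<int8_t> quantizeBlockwise(const std::vector<float>& x,
                                      const std::vector<size_t>& shape,
                                      size_t axis, size_t blockSize,
                                      const std::vector<float>& scale,
                                      const std::vector<int>& zeroPoint)
{
    assert(axis < shape.size() && blockSize > 0);
    size_t outer = 1, inner = 1;
    for (size_t i = 0; i < axis; ++i) outer *= shape[i];
    for (size_t i = axis + 1; i < shape.size(); ++i) inner *= shape[i];
    const size_t axisLen = shape[axis];
    const size_t numBlocks = (axisLen + blockSize - 1) / blockSize;
    assert(x.size() == outer * axisLen * inner);
    assert(scale.size() == outer * numBlocks * inner);
    assert(zeroPoint.size() == scale.size());

    std::vector<int8_t> y(x.size());
    for (size_t o = 0; o < outer; ++o)
        for (size_t a = 0; a < axisLen; ++a)
            for (size_t i = 0; i < inner; ++i)
            {
                const size_t xi = (o * axisLen + a) * inner + i;
                // All elements of one block read the same parameter entry.
                const size_t pi = (o * numBlocks + a / blockSize) * inner + i;
                const int q = static_cast<int>(std::nearbyint(x[xi] / scale[pi]))
                              + zeroPoint[pi];
                y[xi] = static_cast<int8_t>(std::min(127, std::max(-128, q)));
            }
    return y;
}
```

For a 1D input `{1, 2, 4, 6}` with block size 2, scales `{0.5, 2}`, and zero-points `{0, 1}`, the first two elements use the first parameter pair and the last two use the second. The trade-off is an integer division per element in exchange for no extra allocation.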

fengyuentau · 2024-07-16T05:42:36Z

@vpisarev @asmorkalov @dkurt Feel free to review this PR. I propose to merge it because it is needed in the second stage of the GSoC project.

modules/dnn/src/int8layers/quantization_utils.cpp

Support Blockwise Quantization #1181 This PR contains the test data necessary to verify the correctness of blockwise quantization introduced in opencv/opencv#25644


DaniAffCH added 2 commits May 25, 2024 10:46

block quantize and dequantize support

bbd05f2

Merge branch 'opencv:4.x' into blockwise-quantization

75735d9

DaniAffCH changed the title ~~[GSoC] Blockwise quantization support~~ [GSoC] dnn: Blockwise quantization support May 25, 2024

fengyuentau added the GSoC, category: dnn, and category: dnn (onnx) labels May 27, 2024

fengyuentau self-requested a review May 27, 2024 12:05

fengyuentau self-assigned this May 27, 2024

DaniAffCH added 2 commits May 29, 2024 09:43

add test cases and fixed quantization

200c0ce

fix whitespace

1f22e4b

DaniAffCH mentioned this pull request May 29, 2024

Support Blockwise Quantization opencv/opencv_extra#1181

Merged

DaniAffCH marked this pull request as ready for review May 29, 2024 08:52

fengyuentau reviewed May 30, 2024

modules/dnn/src/onnx/onnx_importer.cpp (outdated, resolved)

modules/dnn/test/test_onnx_conformance.cpp (outdated, resolved)

fengyuentau added the category:dnn_timvx label May 31, 2024

DaniAffCH added 2 commits May 31, 2024 10:59

support scale and zerop from input

f7007eb

fix white spaces once more

a54645e

fengyuentau reviewed Jun 6, 2024

removed comments and adjusted asserts

6cdff0b

fengyuentau reviewed Jun 17, 2024

add CV_Checks and input caching

93e6ac5

removed debug cout

4a7d04d

fengyuentau requested review from dkurt, vpisarev and asmorkalov July 3, 2024 08:00

fengyuentau added this to the 4.11.0 milestone Jul 3, 2024

removed redundant checks

7325983

DaniAffCH mentioned this pull request Jul 7, 2024

[GSoC] Blockwise Quantization Tool opencv/opencv_zoo#265

Merged

fengyuentau reviewed Jul 8, 2024

This comment was marked as resolved.

DaniAffCH added 2 commits July 13, 2024 10:19

filter out unsupported backends

66a6351

Merge branch '4.x' into blockwise-quantization

ba7777e

fengyuentau approved these changes Jul 16, 2024

asmorkalov reviewed Jul 22, 2024

DaniAffCH added 3 commits July 25, 2024 16:55

add variable description

7266fc6

removed unnecessary function

a7329a4

removed redundant copy

116fdb6

asmorkalov reviewed Jul 30, 2024

modules/dnn/src/int8layers/quantization_utils.cpp (outdated, resolved)

modules/dnn/src/int8layers/quantization_utils.cpp (resolved)

DaniAffCH added 2 commits July 30, 2024 11:50

add sanity check and passed parameter by reference

ed9c208

just another sanity check

b5629db

asmorkalov approved these changes Jul 30, 2024

vpisarev approved these changes Jul 30, 2024

asmorkalov merged commit 2a333a6 into opencv:4.x Jul 30, 2024
28 of 30 checks passed

DaniAffCH deleted the blockwise-quantization branch July 30, 2024 12:55

asmorkalov mentioned this pull request Aug 6, 2024

(5.x) Merge 4.x #25998

Merged

