8-bit quantization in dnn module and int8 layers #20228
Conversation
@vpisarev MaxPool int8 tests pass now. Tests with max pooling as the last layer are commented out because they compute max indices, which is not supported in the int8 version. Although I still don't understand why max pooling as the last layer should compute the max index by default.
opencv/modules/dnn/src/layers/pooling_layer.cpp Line 1314 in 4c2dff8
I actually suggest supporting computing indices (in FP32, as before) together with computing the max value in INT8. Yes, it will be slower, but it will let us provide 100% compatibility.
This is not possible right now, because a single variable determines the datatype of all the outputs of a layer: both outputs[0] and outputs[1] can be either CV_32F or CV_8S, but one as CV_32F and the other as CV_8S is not currently possible.
opencv/modules/dnn/src/dnn.cpp Line 582 in c2c67c2
opencv/modules/dnn/src/dnn.cpp Line 980 in c2c67c2
Of course, changing `dtype` to a `std::vector` would solve it, but that would introduce a lot of complexity in blob allocation and a lot of work for a feature that is rarely used. Maybe we can keep it as low priority and look at it later.
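For illustration only, a minimal sketch of the idea being discussed; the struct and member names here are hypothetical and are not the actual dnn.cpp internals:

```cpp
// Hypothetical sketch: per-output dtypes instead of a single per-layer dtype.
#include <opencv2/core.hpp>
#include <vector>

struct LayerOutputsSketch
{
    // Today: one dtype shared by every output blob of the layer, so
    // outputs[0] and outputs[1] must both be CV_32F or both be CV_8S.
    int dtype = CV_32F;

    // Possible extension: one dtype per output blob, which would let a
    // quantized MaxPool return values as CV_8S and indices as CV_32F.
    std::vector<int> dtypes{ CV_8S, CV_32F };
};
```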
Some build issues:
OK, sounds good to me. Let's keep it as a low-priority item.
I suggest commenting out the AVX-512 branches for now (in your newly added code, not everywhere).
Well, you need to figure that out. If you get stuck on that, we can look at it together.
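Purely as an illustration of that suggestion, one way to keep only the newly added AVX-512 branch compiled out while the existing path stays active; the switch and function below are placeholders, not actual OpenCV code:

```cpp
#include <cstring>
#include <cstdint>

#define NEW_INT8_AVX512_ENABLED 0  // hypothetical local switch, off until the build issues are fixed

static void copyActivations(const int8_t* src, int8_t* dst, size_t n)
{
#if NEW_INT8_AVX512_ENABLED
    // The new AVX-512 kernel would be dispatched here (commented off for now).
#else
    std::memcpy(dst, src, n);  // existing generic path
#endif
}
```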
@vpisarev As discussed, the int8 layers that had mostly duplicated code (concat, flatten, padding) have been removed, and the original fp32 layers have been modified to support 8-bit inputs as well. In some parallel_for() calls I have used templates to support multiple datatypes; please check the latest commit to see if any changes need to be made.
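A rough sketch of that pattern (not the exact PR code): a single templated parallel_for_ body shared by the FP32 (float) and INT8 (int8_t) paths, with the helper name chosen here for illustration:

```cpp
#include <opencv2/core.hpp>
#include <cstring>
#include <cstdint>

// Copy each row of src into dst at a column offset, in parallel, for any element type T.
template <typename T>
static void copyRowsParallel(const cv::Mat& src, cv::Mat& dst, int dstColOffset)
{
    cv::parallel_for_(cv::Range(0, src.rows), [&](const cv::Range& r)
    {
        for (int i = r.start; i < r.end; i++)
        {
            const T* srcRow = src.ptr<T>(i);
            T* dstRow = dst.ptr<T>(i) + dstColOffset;
            std::memcpy(dstRow, srcRow, src.cols * sizeof(T));
        }
    });
}

// The caller picks the instantiation from the blob depth, e.g.:
//   src.depth() == CV_8S ? copyRowsParallel<int8_t>(src, dst, off)
//                        : copyRowsParallel<float>(src, dst, off);
```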
@SamFC10, could you please fix the merge conflicts once again and then squash the commits? We will try to merge your pull request quickly.
Looks like I messed something up.
You can roll back the changes:
👍
dnn cleanup: On-fly-quantization removal #2498
On-fly quantization was first introduced via #20228. We decided to remove it but keep the int8 layer implementations, because on-fly quantization is less practical given that there are now many dedicated tools for model quantization.
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is an accuracy test, performance test and test data in the opencv_extra repository, if applicable. Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
PR for GSoC'21 project on quantization in the DNN module. This PR adds functions to quantize FP32 models and int8 versions of some layers, along with tests for the new layers.
A second PR is planned later this summer to load 8-bit quantized models from other frameworks (ONNX/TensorFlow) and perform inference using int8 layers and weights, without converting them to FP32 (as is done currently).
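A rough usage sketch of the quantization workflow described above; the exact signature may differ, and "model.onnx" / "sample.jpg" are placeholder files:

```cpp
#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>
#include <vector>

int main()
{
    using namespace cv;
    using namespace cv::dnn;

    Net net = readNetFromONNX("model.onnx");  // FP32 model (placeholder path)

    // A few representative inputs used as calibration data.
    Mat img = imread("sample.jpg");
    Mat blob = blobFromImage(img, 1.0 / 255.0, Size(224, 224));
    std::vector<Mat> calibData = { blob };

    // Quantize the network to 8-bit, keeping FP32 blobs at the input/output boundary.
    Net qnet = net.quantize(calibData, /*inputsDtype=*/CV_32F, /*outputsDtype=*/CV_32F);

    qnet.setInput(blob);
    Mat out = qnet.forward();
    return 0;
}
```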
Mentor: @vpisarev
Relates: #16633, #20188
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.