8-bit quantization in dnn module and int8 layers #20228
Conversation
@vpisarev MaxPool int8 tests pass now. Tests with max pooling as the last layer are commented out because they compute max indices, which is not supported in the int8 version. Although I still don't understand why max pooling as the last layer should compute the max index by default.
opencv/modules/dnn/src/layers/pooling_layer.cpp Line 1314 in 4c2dff8
I actually suggest supporting computing indices (in FP32, as before) together with computing the max value in INT8. Yes, it will be slower, but it will let us provide 100% compatibility.
This is not possible right now, because a single variable determines the datatype of all the outputs of a layer: both outputs[0] and outputs[1] can be either CV_32F or CV_8S, but one as CV_32F and the other as CV_8S is not currently possible.
opencv/modules/dnn/src/dnn.cpp Line 582 in c2c67c2
opencv/modules/dnn/src/dnn.cpp Line 980 in c2c67c2
Of course, changing `dtype` to a `std::vector` would solve it, but that would introduce a lot of complexity in blob allocation and a lot of work for a feature that is rarely used. Maybe we can keep it as low priority and look at it later.
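For illustration only, a minimal sketch of the idea being discussed; the struct and member names here are hypothetical and are not the actual dnn.cpp internals:

```cpp
// Hypothetical sketch: per-output dtypes instead of a single per-layer dtype.
#include <opencv2/core.hpp>
#include <vector>

struct LayerOutputsSketch
{
    // Today: one dtype shared by every output blob of the layer, so
    // outputs[0] and outputs[1] must both be CV_32F or both be CV_8S.
    int dtype = CV_32F;

    // Possible extension: one dtype per output blob, which would let a
    // quantized MaxPool return values as CV_8S and indices as CV_32F.
    std::vector<int> dtypes{ CV_8S, CV_32F };
};
```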
Some build issues:
OK, sounds good to me. Let's keep it as a low-priority item.
I suggest commenting out the AVX-512 branches for now (in your newly added code, not everywhere).
Well, you need to figure that out. If you get stuck on that, we can look at it together.
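Purely as an illustration of that suggestion, one way to keep only the newly added AVX-512 branch compiled out while the existing path stays active; the switch and function below are placeholders, not actual OpenCV code:

```cpp
#include <cstring>
#include <cstdint>

#define NEW_INT8_AVX512_ENABLED 0  // hypothetical local switch, off until the build issues are fixed

static void copyActivations(const int8_t* src, int8_t* dst, size_t n)
{
#if NEW_INT8_AVX512_ENABLED
    // The new AVX-512 kernel would be dispatched here (commented off for now).
#else
    std::memcpy(dst, src, n);  // existing generic path
#endif
}
```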
@vpisarev As discussed, the int8 layers that had mostly duplicated code (concat, flatten, padding) have been removed, and the original fp32 layers have been modified to support 8-bit inputs as well. In some parallel_for() calls I have used templates to support multiple datatypes; please check the latest commit to see if any changes need to be made.
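A rough sketch of that pattern (not the exact PR code): a single templated parallel_for_ body shared by the FP32 (float) and INT8 (int8_t) paths, with the helper name chosen here for illustration:

```cpp
#include <opencv2/core.hpp>
#include <cstring>
#include <cstdint>

// Copy each row of src into dst at a column offset, in parallel, for any element type T.
template <typename T>
static void copyRowsParallel(const cv::Mat& src, cv::Mat& dst, int dstColOffset)
{
    cv::parallel_for_(cv::Range(0, src.rows), [&](const cv::Range& r)
    {
        for (int i = r.start; i < r.end; i++)
        {
            const T* srcRow = src.ptr<T>(i);
            T* dstRow = dst.ptr<T>(i) + dstColOffset;
            std::memcpy(dstRow, srcRow, src.cols * sizeof(T));
        }
    });
}

// The caller picks the instantiation from the blob depth, e.g.:
//   src.depth() == CV_8S ? copyRowsParallel<int8_t>(src, dst, off)
//                        : copyRowsParallel<float>(src, dst, off);
```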
@SamFC10, could you please fix the merge conflicts once again and then squash the commits? We will try to merge your pull request quickly.
Looks like I messed something up.
You can roll back the changes:
👍
dnn cleanup: On-fly-quantization removal #2498
On-fly quantization was first introduced via #20228. We decided to remove it but keep the int8 layer implementations, because on-fly quantization is less practical given that there are now many dedicated tools for model quantization.
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is an accuracy test, performance test and test data in the opencv_extra repository, if applicable. Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
PR for GSoC'21 project on quantization in the DNN module. This PR adds functions to quantize FP32 models and int8 versions of some layers, along with tests for the new layers.
A second PR is planned later this summer to load 8-bit quantized models from other frameworks (ONNX/TensorFlow) and perform inference using int8 layers and weights, without converting them to FP32 (as is done currently).
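A rough usage sketch of the quantization workflow described above; the exact signature may differ, and "model.onnx" / "sample.jpg" are placeholder files:

```cpp
#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>
#include <vector>

int main()
{
    using namespace cv;
    using namespace cv::dnn;

    Net net = readNetFromONNX("model.onnx");  // FP32 model (placeholder path)

    // A few representative inputs used as calibration data.
    Mat img = imread("sample.jpg");
    Mat blob = blobFromImage(img, 1.0 / 255.0, Size(224, 224));
    std::vector<Mat> calibData = { blob };

    // Quantize the network to 8-bit, keeping FP32 blobs at the input/output boundary.
    Net qnet = net.quantize(calibData, /*inputsDtype=*/CV_32F, /*outputsDtype=*/CV_32F);

    qnet.setInput(blob);
    Mat out = qnet.forward();
    return 0;
}
```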
Mentor: @vpisarev
Relates: #16633, #20188
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.