dnn : int8 quantized layers support in onnx importer #20535
Conversation
int depth = CV_32F;
checkQuantizedLayer(node_proto, layerParams, depth);
if (depth == CV_8S)
    CV_Error(Error::StsNotImplemented, "Int8 resize layer is not supported");
IMHO, we should fall back in such cases - reuse the 32F implementation, which will always have better coverage.
BTW, #20228 should NOT block the implementation of new layers without the "int8" stuff. That experimental code must be optional.
Regarding fallback for current layers without int8 support and for new layers added in the future:
- The functions in #20228 can fall back to the FP32 version automatically if the int8 version of a layer is unavailable. The tests that were added check this logic. Adding an int8 version of a new layer is optional.
- Nodes in this PR which don't have an int8 version do not fall back to the FP32 version; that should probably be added. Will try to add it in the coming days. The logic is slightly tricky, as quantize/dequantize nodes have to be inserted before and after the unsupported layer (see the sketch below).
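For readers following along, here is a minimal self-contained sketch of that wrapping idea, with all names made up for illustration (this is not the actual importer code): the unsupported op runs in FP32 between an explicit dequantize step and a requantize step.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Dequantize int8 activations to float using scale and zero point.
static std::vector<float> dequantize(const std::vector<int8_t>& q, float scale, int zp)
{
    std::vector<float> out(q.size());
    for (size_t i = 0; i < q.size(); ++i)
        out[i] = scale * (static_cast<int>(q[i]) - zp);
    return out;
}

// Quantize float values back to int8 with saturation to [-128, 127].
static std::vector<int8_t> quantize(const std::vector<float>& x, float scale, int zp)
{
    std::vector<int8_t> out(x.size());
    for (size_t i = 0; i < x.size(); ++i)
    {
        int v = static_cast<int>(std::round(x[i] / scale)) + zp;
        out[i] = static_cast<int8_t>(std::min(127, std::max(-128, v)));
    }
    return out;
}

// The "unsupported" op runs in FP32 between the two conversion steps.
std::vector<int8_t> runWrappedRelu(const std::vector<int8_t>& q, float scale, int zp)
{
    std::vector<float> x = dequantize(q, scale, zp);
    for (float& v : x)
        v = std::max(0.f, v);   // FP32 ReLU stands in for any FP32-only layer
    return quantize(x, scale, zp);
}
```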
Marking the pull request as ready for review, as the coding part is done. Only a short tutorial on using quantization and quantized onnx models remains.

@alalek Automatic FP32 fallback for an unsupported INT8 node does not look possible right now. Instead, adding an INT8 path for that unsupported node is much easier, and that's what I did with the resize layer. The INT8 path for the resize layer has been added in the FP32 version of the layer itself to avoid code duplication. But looking at it now, maybe the changes are too invasive and may affect performance of the FP32 layer. If required, I can move the int8 implementation to int8layers/resize_layer.cpp and revert the changes in layers/resize_layer.cpp.
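As background on why an int8 resize path is cheap: nearest-neighbour resize only copies existing int8 values, so the input's scale and zero point remain valid for the output and no requantization is needed. A minimal illustrative sketch (not the actual OpenCV layer code):

```cpp
#include <cstdint>

// Nearest-neighbour resize over a single-channel int8 plane. Values are
// copied unchanged, so the input's quantization parameters carry over.
void resizeNearestInt8(const int8_t* src, int srcH, int srcW,
                       int8_t* dst, int dstH, int dstW)
{
    for (int y = 0; y < dstH; ++y)
    {
        int sy = y * srcH / dstH;          // nearest source row
        for (int x = 0; x < dstW; ++x)
        {
            int sx = x * srcW / dstW;      // nearest source column
            dst[y * dstW + x] = src[sy * srcW + sx];
        }
    }
}
```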
Any solution for this build failure?
I have modified …
@@ -967,6 +967,112 @@ TEST_P(Test_ONNX_layers, ConvResizePool1d)
    testONNXModels("conv_resize_pool_1d");
}

TEST_P(Test_ONNX_layers, Quantized_Convolution)
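The test body is collapsed in the diff view; judging by the neighbouring tests it plausibly follows the same testONNXModels pattern (the basename and tolerances below are placeholders, not the PR's actual values):

```cpp
TEST_P(Test_ONNX_layers, Quantized_Convolution)
{
    // Placeholder basename and thresholds: the real test loads a quantized
    // conv model from opencv_extra with relaxed int8 tolerances.
    testONNXModels("quantized_conv", npy, 0.006, 0.025);
}
```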
Test_ONNX_layers
This test subset is parametrized to run on all available backends/targets.
That doesn't make sense for now.
Create a separate test fixture that runs on the OpenCV/CPU target only and move all "quantized" cases there.
> This doesn't make sense for now

While it's true that other backends don't have quantized layer implementations, we still need to ensure quantized networks fall back to a supported backend. Keeping the tests enabled for all backends helps test this fallback:

[ RUN      ] Test_ONNX_nets.ResNet50_Int8/0, where GetParam() = NGRAPH/CPU
[ WARN:0] global /home/jebastin/opencv_build/opencv/modules/dnn/src/dnn.cpp (4562) setPreferableBackend DNN: Only default backend supports quantized networks
FALLBACK: Layer [Quantize]:[data_quantized] is expected to has backend implementation
...
[       OK ] Test_ONNX_nets.ResNet50_Int8/0 (168 ms)

Internally, the code falls back to the OpenCV/CPU target for unsupported backends, so I don't see a point in limiting the backends of these tests (except to reduce the total time for running the dnn module tests).
If you still think a new test fixture is needed, I will start working on it soon (will probably need some reference code).
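For anyone wanting to reproduce that fallback outside the test suite, something along these lines triggers the same warning (the model path is a placeholder):

```cpp
#include <opencv2/dnn.hpp>

int main()
{
    cv::dnn::Net net = cv::dnn::readNetFromONNX("resnet50_int8.onnx");  // placeholder path
    // A backend without int8 support: dnn prints the warning above and
    // silently falls back to the default OpenCV/CPU implementation.
    net.setPreferableBackend(cv::dnn::DNN_BACKEND_INFERENCE_ENGINE);
    cv::Mat img(224, 224, CV_32FC3, cv::Scalar::all(0));
    net.setInput(cv::dnn::blobFromImage(img));
    cv::Mat out = net.forward();
    return 0;
}
```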
@@ -1103,6 +1209,11 @@ TEST_P(Test_ONNX_nets, ResNet50v1)
    testONNXModels("resnet50v1", pb, default_l1, default_lInf, true, target != DNN_TARGET_MYRIAD);
}

TEST_P(Test_ONNX_nets, ResNet50_Int8)
Test_ONNX_nets
The same. We need a separate test fixture for quantized tests with a limited set of backends.
Let's put it in to have some usable tests for the int8 feature.
Thank you 👍
dnn : int8 quantized layers support in onnx importer
* added quantized layers support in onnx importer
* added more cases in eltwise node, some more checks
* added tests for quantized nodes
* relax thresholds for failed tests, address review comments
* refactoring based on review comments
* added support for unsupported cases and pre-quantized resnet50 test
* relax thresholds due to int8 resize layer
merge with extra : opencv/opencv_extra#896
Final PR for the GSoC'21 project on 8-bit quantization support in the dnn module. This PR adds the new layers currently supported by ONNX quantization.
Docs - https://onnxruntime.ai/docs/how-to/quantization.html
Supported layers - https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/python/tools/quantization/registry.py#L34-L53
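Until the planned tutorial lands, loading a pre-quantized model works like any other ONNX file; a minimal sketch, with placeholder file names and input size:

```cpp
#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>

int main()
{
    // Model produced by onnxruntime's quantization tooling (see docs above).
    cv::dnn::Net net = cv::dnn::readNetFromONNX("resnet50_int8.onnx");  // placeholder
    cv::Mat img = cv::imread("input.jpg");                              // placeholder
    // Preprocessing is unchanged: quantize/dequantize nodes inside the
    // graph convert between the float I/O and the int8 layers.
    cv::Mat blob = cv::dnn::blobFromImage(img, 1.0 / 255, cv::Size(224, 224));
    net.setInput(blob);
    cv::Mat scores = net.forward();
    return 0;
}
```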
resolves : #20188
replaces #20264
TODO:
- fallback to FP32 for unsupported cases (automatic fallback does not look possible)

Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.