dnn onnx: add instance norm layer #24378

fengyuentau · 2023-10-09T12:12:55Z

Resolves #24377
Relates #24092 (comment)

| Perf | multi-thread | single-thread |
| - | - | - |
| x: [2, 64, 180, 240] | 3.95ms | 11.12ms |

Todo:

- [x] speed up by multi-threading
- [x] add perf
- [x] add backend: OpenVINO
- [x] add backend: CUDA
- [x] add backend: OpenCL (no fp16)
- [ ] add backend: CANN (will be done via #24462, "dnn cann backend: add hardswish, layernorm and instance norm for cann and bug fix")

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

force_builders=Linux OpenCL,Win64 OpenCL,Custom
buildworker:Custom=linux-4
build_image:Custom=ubuntu:18.04
modules_filter:Custom=none
disable_ipp:Custom=ON

dkurt · 2023-10-09T12:19:25Z

This is the third layer that performs the same computation. MVN and LayerNorm already exist; can they be fixed instead?

fengyuentau · 2023-10-09T12:22:54Z

> can they be fixed?

I prefer to keep all of those (keep mvn for caffe models). Dedicated layers are more flexible when it comes to backend support and optimization. And we don't have to break a single layer into several layers, for example,

https://github.com/opencv/opencv/blob/590f150d5e032165e27d81294c9b7ac710b77f11/modules/dnn/src/onnx/onnx_importer.cpp#L1870C6-L1883
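To make the "one dedicated layer versus several chained layers" trade-off concrete, here is a minimal, self-contained sketch in plain C++ (not OpenCV code) of instance normalization on an NCHW tensor written two ways: as an MVN-style normalization followed by a separate per-channel scale and shift, and as a single fused pass. The function names, the flat `std::vector<float>` layout and the `eps` default are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: normalize each contiguous slice of `inner` floats
// to zero mean and unit variance, as an MVN step would do per channel.
void mvnPerSlice(std::vector<float>& x, std::size_t outer, std::size_t inner, float eps) {
    assert(x.size() == outer * inner);
    for (std::size_t o = 0; o < outer; ++o) {
        float* p = x.data() + o * inner;
        float mean = 0.f, var = 0.f;
        for (std::size_t i = 0; i < inner; ++i) mean += p[i];
        mean /= static_cast<float>(inner);
        for (std::size_t i = 0; i < inner; ++i) var += (p[i] - mean) * (p[i] - mean);
        var /= static_cast<float>(inner);
        const float invStdev = 1.f / std::sqrt(var + eps);
        for (std::size_t i = 0; i < inner; ++i) p[i] = (p[i] - mean) * invStdev;
    }
}

// Decomposed form: MVN first, then a separate per-channel scale and shift.
std::vector<float> instanceNormDecomposed(std::vector<float> x, std::size_t N, std::size_t C,
                                          std::size_t HW, const std::vector<float>& scale,
                                          const std::vector<float>& bias, float eps = 1e-5f) {
    assert(scale.size() == C && bias.size() == C);
    mvnPerSlice(x, N * C, HW, eps);
    for (std::size_t nc = 0; nc < N * C; ++nc)
        for (std::size_t i = 0; i < HW; ++i)
            x[nc * HW + i] = x[nc * HW + i] * scale[nc % C] + bias[nc % C];
    return x;
}

// Fused form: a single dedicated pass per (n, c) slice.
std::vector<float> instanceNormFused(const std::vector<float>& x, std::size_t N, std::size_t C,
                                     std::size_t HW, const std::vector<float>& scale,
                                     const std::vector<float>& bias, float eps = 1e-5f) {
    assert(x.size() == N * C * HW && scale.size() == C && bias.size() == C);
    std::vector<float> y(x.size());
    for (std::size_t nc = 0; nc < N * C; ++nc) {
        const float* p = x.data() + nc * HW;
        float mean = 0.f, var = 0.f;
        for (std::size_t i = 0; i < HW; ++i) mean += p[i];
        mean /= static_cast<float>(HW);
        for (std::size_t i = 0; i < HW; ++i) var += (p[i] - mean) * (p[i] - mean);
        var /= static_cast<float>(HW);
        const float invStdev = 1.f / std::sqrt(var + eps);
        const std::size_t c = nc % C;  // channel index of this slice
        for (std::size_t i = 0; i < HW; ++i)
            y[nc * HW + i] = (p[i] - mean) * invStdev * scale[c] + bias[c];
    }
    return y;
}
```

Both functions produce the same values; the difference is that the fused form keeps the layer type and its parameters in one place, which is the property argued for above.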

dkurt · 2023-10-09T12:25:39Z

@fengyuentau, MVN already provides OpenCL, CUDA and OpenVINO optimizations. Let's rename it instead and use it in the ONNX importer.

fengyuentau · 2023-10-09T12:29:16Z

> MVN already provides OpenCL, CUDA and OpenVINO optimizations

We still have CANN and some other backends that have their own graphs. Merging several layers into one may make it harder to detect the layer type and its corresponding parameters. Dedicated layers are more maintainable.

dkurt · 2023-10-09T12:33:20Z

@fengyuentau, agree, but what about fixing LayerNorm? Can it be replaced in the ONNX importer instead of MVN?

fengyuentau · 2023-10-09T12:37:21Z

> @fengyuentau, agree, but what about fixing LayerNorm? Can it be replaced in the ONNX importer instead of MVN?

I propose to keep it as-is. There is a plan to add CANN backend support for it (#23550). If the layer type is missing, it will most likely fool the CANN graph simplifier and lead to a performance drop.

dkurt · 2023-10-09T12:48:41Z

This way, can InstanceNorm replace LayerNorm in the simplifier completely?

opencv/modules/dnn/src/onnx/onnx_graph_simplifier.cpp

Line 310 in 590f150

class LayerNormSubGraph : public Subgraph

If this PR removes LayerNorm, that looks fine. Otherwise, it duplicates the math.

modules/dnn/src/onnx/onnx_importer.cpp

fengyuentau · 2023-10-09T13:06:39Z

Speaking of OpenVINO, is there documentation on its operator spec? Something like the ONNX operator spec documentation.

dkurt · 2023-10-09T13:08:58Z

> Speaking of OpenVINO, is there a documentation on its operator spec?

https://docs.openvino.ai/2023.1/openvino_docs_operations_specifications.html

fengyuentau · 2023-10-09T13:10:02Z

> Speaking of OpenVINO, is there a documentation on its operator spec?
>
> https://docs.openvino.ai/2023.1/openvino_docs_operations_specifications.html

Thank you!

vpisarev · 2023-10-11T06:30:48Z

@dkurt, @fengyuentau, I think ONNX with its InstanceNormalization is much more popular nowadays than Caffe and MVN. It would be nice, I think, to have a dedicated layer with the conventional name (InstanceNormalization), but internally it can call the MVN kernel (which could be declared in some internal headers).

fengyuentau · 2023-10-11T06:50:31Z

In this way, we can extract the kernel, put it in layers/cpu_kernels/, call it anywhere that we need it (mvn, layer norm, instance norm, ...).
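As a rough illustration of the idea, the sketch below shows one shared normalization routine that several thin wrappers call with different slicing. It is plain C++ rather than OpenCV code; `normalizeSlices`, the wrapper names and the `Shape` alias are hypothetical and do not reflect the actual signatures in layers/cpu_kernels/. Scale and bias handling is left out to keep the focus on how the three layers differ only in how they split the tensor into slices.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;

// Product of the dimensions in [begin, end).
std::size_t total(const Shape& s, std::size_t begin, std::size_t end) {
    assert(begin <= end && end <= s.size());
    std::size_t n = 1;
    for (std::size_t i = begin; i < end; ++i) n *= s[i];
    return n;
}

// Hypothetical shared kernel: zero-mean, unit-variance normalization of
// `outer` contiguous slices holding `inner` elements each.
void normalizeSlices(std::vector<float>& x, std::size_t outer, std::size_t inner, float eps) {
    assert(x.size() == outer * inner);
    for (std::size_t o = 0; o < outer; ++o) {
        float* p = x.data() + o * inner;
        float mean = 0.f, var = 0.f;
        for (std::size_t i = 0; i < inner; ++i) mean += p[i];
        mean /= static_cast<float>(inner);
        for (std::size_t i = 0; i < inner; ++i) var += (p[i] - mean) * (p[i] - mean);
        var /= static_cast<float>(inner);
        const float invStdev = 1.f / std::sqrt(var + eps);
        for (std::size_t i = 0; i < inner; ++i) p[i] = (p[i] - mean) * invStdev;
    }
}

// MVN across channels: one slice per batch item.
void mvnAcrossChannels(std::vector<float>& x, const Shape& s, float eps = 1e-5f) {
    normalizeSlices(x, s[0], total(s, 1, s.size()), eps);
}

// Instance norm core: one slice per (batch, channel) pair.
void instanceNormCore(std::vector<float>& x, const Shape& s, float eps = 1e-5f) {
    normalizeSlices(x, s[0] * s[1], total(s, 2, s.size()), eps);
}

// Layer norm core: normalize over all dimensions from `axis` onward.
void layerNormCore(std::vector<float>& x, const Shape& s, std::size_t axis, float eps = 1e-5f) {
    normalizeSlices(x, total(s, 0, axis), total(s, axis, s.size()), eps);
}
```

With a shape of `{1, 2, 2}`, `instanceNormCore` normalizes each pair of values separately, while `mvnAcrossChannels` and `layerNormCore(..., 1)` both normalize all four values together.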

asmorkalov · 2023-10-24T14:15:55Z

@fengyuentau Please pay attention to the merge conflict.

asmorkalov · 2023-10-30T09:39:04Z

@dkurt could you take a look again?

dkurt · 2023-10-30T09:53:36Z

@asmorkalov, I expected that we should merge #24409 first

dnn: add shared fastNorm kernel for mvn, instance norm and layer norm #24409

Relates #24378 (comment)

TODO:

- [x] add fastNorm
- [x] refactor layer norm with fastNorm
- [x] refactor mvn with fastNorm
- [ ] add onnx mvn in importer (in a new PR?)
- [ ] refactor instance norm with fastNorm (in another PR #24378, need to merge this one first though)

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

asmorkalov · 2023-11-01T11:34:33Z

@dkurt Done.

asmorkalov · 2023-11-01T11:35:27Z

@fengyuentau please rebase to get PR 24409 included.

dkurt · 2023-11-01T11:38:13Z

modules/dnn/src/layers/instance_norm_layer.cpp

+
+    virtual bool supportBackend(int backendId) CV_OVERRIDE {
+        return backendId == DNN_BACKEND_OPENCV;
+    }


This breaks coverage of the other backends that are initialized in the MVN layer (OpenVINO, CUDA).

fengyuentau · 2023-11-02

I will try to add them back.

fengyuentau · 2023-11-03

The OpenVINO and CUDA backends have been added.

dkurt · 2023-11-03

Can you handle the OpenCL backend as well?

opencv/modules/dnn/src/layers/mvn_layer.cpp

Line 208 in ea47cb3

bool forward_ocl(InputArrayOfArrays inputs_, OutputArrayOfArrays outputs_, OutputArrayOfArrays internals_)

fengyuentau · 2023-11-03

I can give it a shot.

fengyuentau · 2023-11-03

May I ask why forward_ocl in mvn_layer.cpp returns false when the input UMat type is int16?

opencv/modules/dnn/src/layers/mvn_layer.cpp

Lines 229 to 230 in fa56623

    if (inputs[0].depth() == CV_16S)
        return false;

Initially I thought it was CV_16F, but it is indeed CV_16S. So does the OpenCL backend support int8 quantization as well? What is odd is that we do not support int16 quantization, right?

dkurt · 2023-11-03

CV_16S is the original data type for FP16 in OpenCV (it predates CV_16F, which dnn uses for FP16 inference on CPU). However, every OpenCL kernel uses CV_16S, which is actually just a short int buffer holding float16 values. See convertFp16.
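To illustrate what "a short int buffer for float16" means, here is a small standalone sketch that reinterprets 16-bit integers as IEEE 754 half-precision bit patterns and decodes them to `float`. It does not use OpenCV; `halfBitsToFloat` and `decodeHalfBuffer` are hypothetical names introduced for this example.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <vector>

// Decode one IEEE 754 half-precision value stored in a signed 16-bit integer.
float halfBitsToFloat(std::int16_t stored) {
    const std::uint16_t h = static_cast<std::uint16_t>(stored);
    const bool negative = (h >> 15) != 0;   // 1 sign bit
    const int exponent = (h >> 10) & 0x1F;  // 5 exponent bits, bias 15
    const int mantissa = h & 0x3FF;         // 10 mantissa bits
    float value;
    if (exponent == 0)          // zero or subnormal: mantissa * 2^-24
        value = std::ldexp(static_cast<float>(mantissa), -24);
    else if (exponent == 0x1F)  // infinity or NaN
        value = mantissa ? std::numeric_limits<float>::quiet_NaN()
                         : std::numeric_limits<float>::infinity();
    else                        // normal: (1 + mantissa / 1024) * 2^(exponent - 15)
        value = std::ldexp(1.0f + static_cast<float>(mantissa) / 1024.0f, exponent - 15);
    return negative ? -value : value;
}

// Decode a whole buffer of half-precision bit patterns.
std::vector<float> decodeHalfBuffer(const std::vector<std::int16_t>& buf) {
    std::vector<float> out;
    out.reserve(buf.size());
    for (std::int16_t bits : buf) out.push_back(halfBitsToFloat(bits));
    return out;
}
```

The integer value of each element is meaningless as a number; for example, the bit pattern `0x3C00` (15360 as an integer) decodes to 1.0. This is why a CV_16S depth check in an OpenCL path can refer to FP16 data rather than to int16 quantization.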

fengyuentau · 2023-11-04

Done adding the OpenCL backend (no fp16 though).

fengyuentau · 2023-11-02T13:20:15Z

Update: Resolved.

@dkurt Do you know how memory is allocated in a CUDA op? For example, here in cuda4dnn/primitives/mvn.hpp:

            csl::WorkspaceBuilder builder;
            builder.require<float>(max_outer_size);
            if (normalize_variance)
                builder.require<float>(max_outer_size);
            scratch_mem_in_bytes = builder.required_workspace_size();
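The quoted snippet only records how many bytes of scratch memory the op will need. The toy model below illustrates the general "request sizes first, allocate once later" pattern in plain C++. It is not cuda4dnn's implementation: `ToyWorkspaceBuilder`, `requiredBytes`, `scratchBytesForNorm` and the 256-byte alignment are all assumptions made for this sketch, and the comment about what the two buffers hold is a guess based on the `normalize_variance` condition.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a workspace builder; NOT cuda4dnn's csl::WorkspaceBuilder.
class ToyWorkspaceBuilder {
public:
    explicit ToyWorkspaceBuilder(std::size_t alignment = 256) : alignment_(alignment) {
        assert(alignment_ > 0);
    }

    // Reserve room for `count` elements of T, rounded up to the alignment.
    template <class T>
    void require(std::size_t count) {
        const std::size_t bytes = count * sizeof(T);
        total_ += (bytes + alignment_ - 1) / alignment_ * alignment_;
    }

    // Total number of bytes a single scratch allocation would need.
    std::size_t requiredBytes() const { return total_; }

private:
    std::size_t alignment_;
    std::size_t total_ = 0;
};

// Mirrors the shape of the quoted snippet: one float per slice (presumably
// for the mean), plus one more per slice when the variance is normalized.
std::size_t scratchBytesForNorm(std::size_t maxOuterSize, bool normalizeVariance) {
    ToyWorkspaceBuilder builder;
    builder.require<float>(maxOuterSize);
    if (normalizeVariance)
        builder.require<float>(maxOuterSize);
    return builder.requiredBytes();
}
```

In this model, the op never allocates memory itself; it reports a byte count up front and receives a slice of one shared scratch buffer when it runs.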

fengyuentau · 2023-11-04T09:28:14Z

All tests, including the ones on the other backends (OpenVINO, CUDA and OpenCL), pass 👍

fengyuentau · 2023-11-04T09:36:40Z

By the way, I thought dnn with OpenCL on macOS was broken because it just did not work, but I found that it is actually disabled, and no log message is printed about this when building OpenCV:

opencv/modules/dnn/CMakeLists.txt

Line 20 in e95c005

    
           ocv_option(OPENCV_DNN_OPENCL "Build with OpenCL support" HAVE_OPENCL AND NOT APPLE)

It was disabled back in 2018 with a comment saying the change was temporary, but it is still in place today. Is it time to enable it? cc @vpisarev


fengyuentau added the feature, category: dnn, and category: dnn (onnx) labels Oct 9, 2023

fengyuentau added this to the 4.9.0 milestone Oct 9, 2023

fengyuentau requested a review from dkurt October 9, 2023 12:12

fengyuentau assigned vpisarev Oct 9, 2023

dkurt reviewed Oct 9, 2023


fengyuentau mentioned this pull request Oct 10, 2023

GSoC Add ONNX Support for GatherElements #24092

Merged

6 tasks

fengyuentau mentioned this pull request Oct 16, 2023

dnn: add shared fastNorm kernel for mvn, instance norm and layer norm #24409

Merged

11 tasks

fengyuentau force-pushed the instance_norm branch from b216e6d to 74723a2 Compare October 25, 2023 03:00

dkurt reviewed Nov 1, 2023


add test case

9009097

fengyuentau added 5 commits November 2, 2023 15:22

quickfix: fix vector values 1

05cc087

move inv_stdev outside of inner loop

7cdb07a

try to add cuda backend

f310b52

quickfix: fix type

d23c0fe

quickfix: fix header

d9869ee

fengyuentau added 6 commits November 2, 2023 13:36

try fix memory issue

74f5eab

debug InstanceNormOp

ddfabf3

try fix memory 1

897d585

try fix memory issue 2

fad70e6

cuda: fix acc issue

fc68ca6

format initCUDA; add opencl

002b5ca

dkurt approved these changes Nov 7, 2023


asmorkalov merged commit ee0822d into opencv:4.x Nov 7, 2023

asmorkalov assigned dkurt and unassigned vpisarev Nov 7, 2023

fengyuentau deleted the instance_norm branch November 7, 2023 10:03

fengyuentau mentioned this pull request Nov 13, 2023

dnn onnx test: remove cuda backend filter for instance norm test #24531

Merged

6 tasks

dkurt mentioned this pull request Nov 16, 2023

dnn: add openvino, opencl and cuda backends for layer normalization layer #24552

Merged

9 tasks

dkurt mentioned this pull request Nov 30, 2023

dnn onnx: add group norm layer #24610

Merged

12 tasks

asmorkalov mentioned this pull request Jan 19, 2024

5.x merge 4.x #24862

Merged

fengyuentau mentioned this pull request Feb 21, 2024

ONNX conformance test results #21078

Open

48 tasks
