Added MSA implementations for mips platforms. #15422

mipsopen-fwu · 2019-08-30T08:41:43Z

Implemented toolchain configurations for building mips32 and mips64 platforms under the platforms/linux/ directory. The MIPS toolchain can be found at: https://www.mips.com/develop/tools/codescape-mips-sdk/ .
Implemented MIPS SIMD (MSA) intrinsics under the modules/core/include/opencv2/core/ directory.
Made some optimizations with MSA functions in the dnn/imgproc/video/videoio modules.

Materials:

MIPS SIMD Developer guide: https://www.mips.com/products/architectures/ase/simd/
GCC builtin functions: https://gcc.gnu.org/onlinedocs/gcc/MIPS-SIMD-Architecture-Built-in-Functions.html

force_builders=linux,docs,Custom
buildworker:Custom=linux-1
build_image:Custom=mips64el

mipsopen-fwu · 2019-08-30T09:00:08Z

I have added MSA implementations for MIPS platforms. The changes include:

The implementation of toolchain configurations for building mips32 and mips64 platforms under the platforms/linux/ directory. The MIPS toolchain can be found at: https://www.mips.com/develop/tools/codescape-mips-sdk/ .
The implementation of MIPS SIMD (MSA) intrinsics under modules/core/include/opencv2/core/ directory.
Some optimizations with MSA functions in dnn/imgproc/video/videoio modules.

Performance and functionality test suites have been run locally for verification. According to the opencv_perf test results, the MSA-optimized version is 39% faster than the non-MSA version.

alalek

Great job 👍

General notes:

please split PR: SIMD backend + build scripts should be reviewed and merged first. Other parts please move into separate PRs. Please check related comment below about RAW MSA code.
as an "optimization" activity please re-target this PR onto 3.4 branch

alalek · 2019-08-30T09:45:29Z

3rdparty/libpng/CMakeLists.txt

+  set(PNG_MIPS_MSA "on" CACHE STRING "Enable MIPS_MSA optimizations:
+     off: disable the optimizations")
+  set_property(CACHE PNG_MIPS_MSA PROPERTY STRINGS
+     ${PNG_MIPS_MSA_POSSIBLE_VALUES})


-...
+set(PNG_MIPS_MSA ON CACHE BOOL ...)
-if(${PNG_MIPS_MSA} STREQUAL "on")
+if(PNG_MIPS_MSA)

instead of string manipulation.

Also it is better to add this pre-check:

if(";${CPU_BASELINE_FINAL};" MATCHES ";MSA;")

alalek · 2019-08-30T09:50:20Z

cmake/OpenCVCompilerOptimizations.cmake

@@ -97,6 +98,8 @@ ocv_optimization_process_obsolete_option(ENABLE_FMA3 FMA3 ON)
 ocv_optimization_process_obsolete_option(ENABLE_VFPV3 VFPV3 OFF)
 ocv_optimization_process_obsolete_option(ENABLE_NEON NEON OFF)

+ocv_optimization_process_obsolete_option(ENABLE_MSA MSA OFF)


Please don't introduce obsolete options.
CPU_BASELINE=MSA should be used instead (it defaults to this value below, on line 350).

alalek · 2019-08-30T09:52:38Z

modules/core/include/opencv2/core/cvdef.h

@@ -256,6 +256,8 @@ namespace cv { namespace debug_build_guard { } using namespace debug_build_guard
 #define CV_CPU_AVX512_CEL       261
 #define CV_CPU_AVX512_ICL       262

+#define CV_CPU_MSA              300


Please use value "150"

alalek · 2019-08-30T09:54:55Z

modules/core/include/opencv2/core/msa_macros.h

+// of this distribution and at https://opencv.org/license.html.
+
+#ifndef OPENCV_CORE_HAL_MSA_MACROS_H
+#define OPENCV_CORE_HAL_MSA_MACROS_H


Please move this file into core/hal directory.

alalek · 2019-08-30T09:59:29Z

platforms/linux/mips.toolchain.cmake

+
+if(USE_MSA)
+  message(WARNING "You use obsolete variable USE_MSA to enable MSA instruction set. Use -DENABLE_MSA=ON instead." )
+  set(ENABLE_MSA TRUE)


Please remove obsolete options.

alalek · 2019-08-30T10:10:30Z

modules/core/include/opencv2/core/hal/intrin_msa.hpp

+}
+
+inline v_uint32x4 v_interleave_pairs(const v_uint32x4& vec) { return v_reinterpret_as_u32(v_interleave_pairs(v_reinterpret_as_s32(vec))); }
+inline v_float32x4 v_interleave_pairs(const v_float32x4& vec) { return v_reinterpret_as_f32(v_interleave_pa


Please remove this call. It is gone (handled by compile time checks).

GitHub code reference is buggy again - correct line is 1748 about hasSIMD128().

Please remove static inline bool hasSIMD128() - it is not used anymore in OpenCV
(the wrong code is linked due to a GitHub bug, but the file is correct)

alalek · 2019-08-30T10:11:19Z

modules/core/include/opencv2/core/hal/intrin_msa.hpp

+}
+
+inline v_uint32x4 v_interleave_pairs(const v_uint32x4& vec) { return v_reinterpret_as_u32(v_interleave_pairs(v_reinterpret_as_s32(vec))); }
+inline v_float32x4 v_interleave_pairs(const v_float32x4& vec) { return v_reinterpret_as_f32(v_interleave_pa


alalek · 2019-08-30T10:38:34Z

3rdparty/libpng/mips/filter_msa_intrinsics.c

+#define SLDI_B2_0(RTYPE, in0, in1, out0, out1, slide_val)                          \
+{                                                                                  \
+    out0 = (RTYPE) __msa_sldi_b((v16i8) __msa_fill_b(0), (v16i8) in0, slide_val);  \
+    out1 = (RTYPE) __msa_sldi_b((v16i8) __msa_fill_b(0), (v16i8) in1, slide_val);  \


Please use the source from libpng 1.6.37.
Add a separate patch file if required (to avoid losing fixes after a future 3rdparty upgrade).

alalek · 2019-08-30T11:01:03Z

modules/dnn/src/layers/convolution_layer.cpp

+                                    vs11 = v_fma(w1, r1, vs11);
+                                    vs12 = v_fma(w1, r2, vs12);
+                                    vs13 = v_fma(w1, r3, vs13);
+#else
                                    vs00 += w0*r0;


Perhaps this can be replaced with v_fma() unconditionally for all platforms.

alalek · 2019-08-30T11:21:59Z

modules/imgproc/src/demosaicing.cpp

+        {
+            v8u16 r0 = msa_ld1q_u16((const ushort*)bayer);
+            v8u16 r1 = msa_ld1q_u16((const ushort*)(bayer + bayer_step));
+            v8u16 r2 = msa_ld1q_u16((const ushort*)(bayer + bayer_step * 2));


Main principle of "universal intrinsics" (SIMD) approach is "code once, run anywhere".

You need to write code once (on some platform), debug it with some backend (there is even C++ backend for SIMD for reference) and then it should run on other backends with reasonable speed.

So we are strictly against huge code blocks using native intrinsics if this code can be replaced by universal SIMD code without a significant loss of speed.
Universal SIMD code is regularly tested via the x86/ARM backends, while native RAW MIPS code has very low testing coverage.

See discussion here: #9763 (comment)

Could you please move these changes into a separate PR?
Let's first merge the addition of the SIMD backend and the related build scripts, so we can add cross-platform builds for that into our infrastructure (but no tests).
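The "code once, run anywhere" idea above can be illustrated with a minimal sketch. It assumes a hypothetical 4-lane wrapper (demo_f32x4, demo_load, demo_store, demo_fma, demo_mul_add are made-up names, not OpenCV's universal intrinsics API) and shows only a plain C++ reference backend. The algorithm is written once against the wrapper, so a platform backend could replace the three small helpers without touching the loop.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 4-lane float vector with a plain C++ reference backend.
// All demo_* names are illustrative and are not part of OpenCV.
struct demo_f32x4 { float lane[4]; };

inline demo_f32x4 demo_load(const float* p)
{
    demo_f32x4 v;
    for (int i = 0; i < 4; ++i) v.lane[i] = p[i];
    return v;
}

inline void demo_store(float* p, const demo_f32x4& v)
{
    for (int i = 0; i < 4; ++i) p[i] = v.lane[i];
}

// Per-lane a * b + acc; a native backend would map this to its own instruction.
inline demo_f32x4 demo_fma(const demo_f32x4& a, const demo_f32x4& b, const demo_f32x4& acc)
{
    demo_f32x4 r;
    for (int i = 0; i < 4; ++i) r.lane[i] = a.lane[i] * b.lane[i] + acc.lane[i];
    return r;
}

// Algorithm written once against the wrapper: y[i] += w[i] * x[i].
inline void demo_mul_add(const float* w, const float* x, float* y, std::size_t n)
{
    assert(n % 4 == 0);  // the sketch skips scalar tail handling
    for (std::size_t i = 0; i < n; i += 4)
        demo_store(y + i, demo_fma(demo_load(w + i), demo_load(x + i), demo_load(y + i)));
}
```

Only the helper functions would differ per platform; the loop in demo_mul_add stays the same, which is why the generic path gets test coverage from every backend.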

mipsopen-fwu · 2019-08-31T02:55:31Z

Hi Alexander,

Thanks a lot for your quick response and detailed comments. I will adjust and optimize the source code as you have commented, then submit separate PRs.

Best Regards,
Fei Wu

mipsopen-fwu · 2019-09-02T07:03:16Z

@alalek Since I need to separate the PR into two, should I close this pull request first?
Next Steps I'm planning to do:

1. Create a development branch from 3.4 and commit the changes for the SIMD backend + build scripts.
2. Submit a pull request for the commits in Step 1.
3. After the PR in Step 2 is merged, commit the optimizations for other modules and create another PR.

alalek · 2019-09-02T11:08:08Z

Let's keep the basic part of the MSA integration in this PR (at least to track the conversation above).

You can "force push" commits in your branch to update PR ("mipsopen-fwu:msa-dev").

mipsopen-fwu · 2019-09-03T03:15:13Z

@alalek
I'm trying to rebase this pull request from 3.4 by following the steps specified in:
https://github.com/opencv/opencv/wiki/Branches#rebase-pull-request-from-master-to-34
Conflicts are reported when I changed base branch to '3.4'.
Need I resolve these conflicts first before I do the following steps?
rebase your commits from master onto 3.4 branch
git checkout
(optional) create backup branch: git branch -backup
git remote update upstream (assuming upstream is pointing to opencv/opencv GitHub repository)
git rebase -i --onto upstream/3.4 upstream/master
editor will be opened, check list of commits - there should be only your commits, save and exit
git push --force origin (assuming origin is pointing to /opencv GitHub repository)

alalek · 2019-09-03T07:34:09Z

Need I resolve these conflicts

Yes. You can resolve conflicts during rebase.

Before that you may enable diff3 conflict style for more information about changes: https://stackoverflow.com/questions/27417656/should-diff3-be-default-conflictstyle-on-git
Edit conflicts. Add file into index (git add -i). Then continue rebase for next commit (git rebase --continue).

To abort rebase process use git rebase --abort

alalek · 2019-09-03T07:48:22Z

I rebased PR onto 3.4 branch - fetch changes from your fork.
It looks like the GitHub conflicts appear after switching the base branch (they are not related to your patch, just differences between our 3.4 and master branches).

Old saved commits are here: https://github.com/alalek/opencv/commits/pr15422_0903

mipsopen-fwu · 2019-09-03T08:20:45Z

@alalek Thanks for rebasing! So what I need to do for the next step is to separate SIMD backend & build scripts from my previous commits, then commit and 'force push' to my branch (msa-dev) to update PR, right?

…build scripts for MIPS platforms are added. Signed-off-by: Fei Wu <fwu@wavecomp.com>

mipsopen-fwu · 2019-09-04T07:40:24Z

@alalek I have separated MSA backend and build scripts from the old commit in this PR which has been rebased to 3.4 branch.

alalek · 2019-09-04T11:21:48Z

I have added a builder with cross-compilation from Ubuntu 18.04 x64:

see here "Custom" column
it looks like the toolchain file is not able to detect the cross-compiler properly (perhaps we can't use this one - we need to fix it or create a new one)
list of installed packages and corresponding scripts are here

Update:

adjusted some build options, but we still need to remove strange code block to unblock compiler detection

mipsopen-fwu · 2019-09-05T01:32:46Z

@alalek Can you build the project for mips platform successfully now?
Judging from the cmake errors, it seems that none of the mips compile options were recognized. I think you may need to install the MIPS Linux GNU toolchain. The build-image name should be mips64r6el or mips32r5el.
Here are the links to the toolchain installers:
Online installer: https://codescape.mips.com/sdk/v2.0.0k/mipssdk.v2.0.0k.linux-x64.x64.webinstall.run
Offline installer: https://codescape.mips.com/sdk/v2.0.0k/mipssdk.v2.0.0k.linux-x64.x64.offline.run
Usually we use cmake-gui to configure and generate the build scripts by selecting /platform/linux/mips64r6el-gnu.toolchain.cmake .
I will remove the unused code block you referred to and commit it.

Signed-off-by: Fei Wu <fwu@wavecomp.com>

mipsopen-fwu · 2019-09-06T01:54:04Z

@alalek
I reviewed the build result. There is one failure and there are some warnings.

https://pullrequest.opencv.org/buildbot/builders/precommit_linux64/builds/22854/steps/Generate%20ABI%20dump/logs/log
The GCC parameters:
gcc -fdump-translation-unit -fkeep-inline-functions -c -x c++ -fpermissive -w -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wno-delete-non-virtual-dtor -Wno-comment -Wno-missing-field-initializers -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG -DNDEBUG -DOPENCV_ABI_CHECK=1 "/tmp/iJM73d75Px/dump1.h" -I/build/precommit_linux64/install/include -I/build/precommit_linux64/install/include/opencv -I/usr/include/vtk-6.0 -I/build/precommit_linux64/build

In file included from /tmp/iJM73d75Px/dump1.h:55:0:
/build/precommit_linux64/install/include/opencv2/core/hal/msa_macros.h:8:17: fatal error: msa.h: No such file or directory
#include "msa.h"
^
compilation terminated.
-- There is no -mmsa compiling option in the gcc parameters, instead -msse is used.

Warnings: https://pullrequest.opencv.org/buildbot/builders/precommit_custom_linux/builds/2596/steps/compile%20release/logs/warnings%20%2815%29
/build/precommit_custom_linux/3.4/opencv/3rdparty/libpng/mips/filter_msa_intrinsics.c:315:11: warning: declaration of 'zero_m' shadows a previous local [-Wshadow]
/build/precommit_custom_linux/3.4/opencv/3rdparty/libpng/mips/filter_msa_intrinsics.c:315:11: warning: declaration of 'zero_m' shadows a previous local [-Wshadow]
/build/precommit_custom_linux/3.4/opencv/3rdparty/libpng/mips/filter_msa_intrinsics.c:315:11: warning: declaration of 'zero_m' shadows a previous local [-Wshadow]
/build/precommit_custom_linux/3.4/opencv/3rdparty/libpng/mips/filter_msa_intrinsics.c:338:11: warning: declaration of 'zero' shadows a previous local [-Wshadow]
/build/precommit_custom_linux/3.4/opencv/3rdparty/libpng/mips/filter_msa_intrinsics.c:338:11: warning: declaration of 'zero' shadows a previous local [-Wshadow]
/build/precommit_custom_linux/3.4/opencv/3rdparty/libpng/mips/filter_msa_intrinsics.c:338:11: warning: declaration of 'zero' shadows a previous local [-Wshadow]
/build/precommit_custom_linux/3.4/opencv/3rdparty/libpng/mips/filter_msa_intrinsics.c:338:11: warning: declaration of 'zero' shadows a previous local [-Wshadow]
/build/precommit_custom_linux/3.4/opencv/3rdparty/libpng/mips/filter_msa_intrinsics.c:315:11: warning: declaration of 'zero_m' shadows a previous local [-Wshadow]
/build/precommit_custom_linux/3.4/opencv/3rdparty/libpng/mips/filter_msa_intrinsics.c:315:11: warning: declaration of 'zero_m' shadows a previous local [-Wshadow]
/build/precommit_custom_linux/3.4/opencv/3rdparty/libpng/mips/filter_msa_intrinsics.c:315:11: warning: declaration of 'zero_m' shadows a previous local [-Wshadow]
/build/precommit_custom_linux/3.4/opencv/3rdparty/libpng/mips/filter_msa_intrinsics.c:338:11: warning: declaration of 'zero' shadows a previous local [-Wshadow]
/build/precommit_custom_linux/3.4/opencv/3rdparty/libpng/mips/filter_msa_intrinsics.c:338:11: warning: declaration of 'zero' shadows a previous local [-Wshadow]
/build/precommit_custom_linux/3.4/opencv/3rdparty/libpng/mips/filter_msa_intrinsics.c:338:11: warning: declaration of 'zero' shadows a previous local [-Wshadow]
/build/precommit_custom_linux/3.4/opencv/3rdparty/libpng/mips/filter_msa_intrinsics.c:338:11: warning: declaration of 'zero' shadows a previous local [-Wshadow]

--These warnings come from the original filter_msa_intrinsics.c file in libpng-1.6.37. We didn't make any changes to it. If needed, I can make a fix.

cc1plus: warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' forming offset [25, 32] is out of the bounds [0, 24] of object 'scharr' with type 'const int [6]' [-Warray-bounds]

--This warning is not introduced by my commit.

alalek · 2019-09-06T04:56:30Z

Thank you for the recommended build instructions!
I believe it makes sense to add them to the PR description and/or add a comment in the r6 toolchain file with the SDK link + this PR link.

Currently on CI we prefer using cross-compilation mode through builtin Ubuntu compiler kits. We believe it is enough for basic validation builds (no tests). It is easy enough to integrate and upgrade dependencies from our side.

1. It is a Linux x86-64 build. No -mmsa option is expected here.
   Each header is included directly into the ABI checker. A guard needs to be added in hal/msa_macros.h:

   #ifdef __mips_msa
   ...
   #endif

2. In 3rdparty/libpng/CMakeLists.txt, near add_definitions(-DPNG_MIPS_MSA_OPT=2), add this line:

   ocv_warnings_disable(CMAKE_C_FLAGS -Wshadow)

3. OK, I will try to check this with GCC-8 on other archs and find out how to work around that.
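The header guard requested for hal/msa_macros.h can be sketched as follows. This is a minimal illustration, not the actual OpenCV header: DEMO_HAS_MSA and demo_add4 are hypothetical names, and the MSA branch assumes the GCC built-ins __msa_ld_w, __msa_addv_w and __msa_st_w from msa.h behave as described in the GCC documentation linked in the PR description. On a non-MIPS compiler (such as the x86-64 ABI checker) the include is skipped and the scalar fallback is used.

```cpp
#include <cstdint>

#ifdef __mips_msa
#include <msa.h>          // only reached when the compiler targets MSA (e.g. with -mmsa)
#define DEMO_HAS_MSA 1    // hypothetical helper macro for this sketch
#else
#define DEMO_HAS_MSA 0
#endif

// Adds four 32-bit integers lane by lane.
inline void demo_add4(const std::int32_t* a, const std::int32_t* b, std::int32_t* out)
{
#if DEMO_HAS_MSA
    // MSA path (assumption: built-in signatures as listed in the GCC docs).
    v4i32 va = __msa_ld_w(a, 0);
    v4i32 vb = __msa_ld_w(b, 0);
    __msa_st_w(__msa_addv_w(va, vb), out, 0);
#else
    // Portable fallback, compiled on every other target.
    for (int i = 0; i < 4; ++i) out[i] = a[i] + b[i];
#endif
}
```

With the guard in place, a tool that includes the header on an x86-64 host no longer fails on the missing msa.h.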

…g warnings for libpng. Signed-off-by: Fei Wu <fwu@wavecomp.com>

mipsopen-fwu · 2019-09-08T07:12:42Z

@alalek I checked the failed build log for 'Android pack' : https://pullrequest.opencv.org/buildbot/builders/precommit_pack_android/builds/11090/steps/build%20sdk/logs/stdio

The error "opcode not supported on this processor: mips32 (mips32) `pause'" is not introduced by my commit.

I think this error can be fixed by adding a __mips_isa_rev check in modules/core/src/parallel_impl.cpp:
# elif defined __GNUC__ && defined __mips__
#   define CV_PAUSE(v) do { for (int __delay = (v); __delay > 0; --__delay) { asm volatile("pause" ::: "memory"); } } while (0)
------>
# elif defined __GNUC__ && defined __mips__ && __mips_isa_rev >= 3
#   define CV_PAUSE(v) do { for (int __delay = (v); __delay > 0; --__delay) { asm volatile("pause" ::: "memory"); } } while (0)
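A self-contained sketch of the proposed check is shown below. DEMO_PAUSE, DEMO_MIN_ISA_REV and demo_spin are hypothetical names, and the threshold simply reuses the value from the proposal above; treat it as an assumption rather than a verified minimum. When the condition does not hold, the macro expands to an empty statement, so the code still builds on older MIPS revisions and on other architectures.

```cpp
// Threshold taken from the proposal above; the exact value is an assumption.
#define DEMO_MIN_ISA_REV 3

#if defined(__GNUC__) && defined(__mips__) && defined(__mips_isa_rev) && (__mips_isa_rev >= DEMO_MIN_ISA_REV)
// The "pause" opcode is emitted only when the ISA revision is high enough.
#  define DEMO_PAUSE() asm volatile("pause" ::: "memory")
#else
// Fallback: no instruction at all, so the assembler never sees "pause".
#  define DEMO_PAUSE() do { } while (0)
#endif

// Busy-waits for the given number of iterations and returns how many ran.
inline int demo_spin(int iterations)
{
    int done = 0;
    for (int i = 0; i < iterations; ++i)
    {
        DEMO_PAUSE();
        ++done;
    }
    return done;
}
```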

… is less than 2. Signed-off-by: Fei Wu <fwu@wavecomp.com>

mipsopen-fwu · 2019-09-09T02:27:06Z

@alalek I have committed the fix for compiling error in 'Android pack' build

mipsopen-fwu · 2019-09-10T00:32:19Z

Hi Alexander,

PR #15422 is currently stuck in the 'waiting for builds' state. How can I move it forward? I'll submit another PR for the other parts of the MSA optimizations after #15422 is merged.

Thanks.
Best Regards,
Fei Wu

alalek

I added 'patch' files for '3rdparty' code.

Please take a look at the comments below.

alalek · 2019-09-10T09:25:03Z

cmake/OpenCVCompilerOptimizations.cmake

+  ocv_update(CPU_MSA_TEST_FILE "${OpenCV_SOURCE_DIR}/cmake/checks/cpu_msa.cpp")
+  ocv_update(CPU_KNOWN_OPTIMIZATIONS "MSA")
+  ocv_update(CPU_MSA_FLAGS_ON "")
+  ocv_update(CPU_FP16_IMPLIES "MSA")


FP16 is not enabled in current builds.
What did you mean here?

modules/core/include/opencv2/core/cv_cpu_dispatch.h

alalek · 2019-09-10T09:28:24Z

modules/core/include/opencv2/core/cvdef.h

@@ -319,6 +321,8 @@ enum CpuFeatures {
    CPU_AVX512_CEL      = 261, //!< Cascade Lake with AVX-512F/CD/BW/DQ/VL/IFMA/VBMI/VNNI
    CPU_AVX512_ICL      = 262, //!< Ice Lake with AVX-512F/CD/BW/DQ/VL/IFMA/VBMI/VNNI/VBMI2/BITALG/VPOPCNTDQ

+    CPU_MSA             = 300,


alalek · 2019-09-10T09:31:04Z

modules/core/include/opencv2/core/hal/intrin_msa.hpp

+}
+
+inline v_uint32x4 v_interleave_pairs(const v_uint32x4& vec) { return v_reinterpret_as_u32(v_interleave_pairs(v_reinterpret_as_s32(vec))); }
+inline v_float32x4 v_interleave_pairs(const v_float32x4& vec) { return v_reinterpret_as_f32(v_interleave_pa


Please remove static inline bool hasSIMD128() - it is not used anymore in OpenCV
(the wrong code is linked due to a GitHub bug, but the file is correct)

…ptimizations.cmake. 2. Use CV_CPU_COMPILE_MSA instead of __mips_msa for MSA feature check in cv_cpu_dispatch.h. 3. Removed hasSIMD128() in intrin_msa.hpp. 4. Define CPU_MSA as 150. Signed-off-by: Fei Wu <fwu@wavecomp.com>

mipsopen-fwu · 2019-09-11T08:07:23Z

@alalek

Thanks a lot for committing patches for the '3rdparty' code.
The FP16-related option seems to be legacy code. Since it's not supported, and MIPS doesn't support 16-bit-wide floating-point registers either, I removed this line.
I replaced __mips_msa with CV_CPU_COMPILE_MSA for MSA feature checking in cv_cpu_dispatch.h.
CPU_MSA has been defined as 150 now.
hasSIMD128() has been removed.

modules/core/include/opencv2/core/hal/intrin_msa.hpp

terfendail · 2019-09-12T15:18:32Z

modules/core/include/opencv2/core/hal/intrin_msa.hpp

+OPENCV_HAL_IMPL_MSA_EXPAND(v_uint32x4, v_uint64x2, uint, u32, s32, v4u32, v4i32)
+OPENCV_HAL_IMPL_MSA_EXPAND(v_int32x4, v_int64x2, int, s32, s32, v4i32, v4i32)
+
+#if 0


Why is this version of the implementation disabled? Is it slow, inaccurate or something else?

The '#if 0' version is slower according to the test results. The reason is that MSA doesn't actually provide data type expanding instructions for vectors. The 'movl'-related conversions are done in C.
So it would take more steps to use the 'movl' interface than to do the conversion directly.

terfendail · 2019-09-12T15:42:12Z

modules/core/src/arithm.simd.hpp

@@ -393,7 +393,7 @@ static void bin_loop(const T1* src1, size_t step1, const T1* src2, size_t step2,
 #if CV_SIMD
    typedef bin_loader<OP, T1, Tvec> ldr;
    enum {wide_step = Tvec::nlanes};
-    #if !CV_NEON && CV_SIMD_WIDTH == 16
+    #if !CV_NEON && !CV_MSA && CV_SIMD_WIDTH == 16


Is it necessary? Does manual loop unrolling affect performance?

When the data width is less than wide_step * 2, performance is better with this change. Otherwise, it shouldn't be better. Considering that 'width' should be larger in most cases, this change is not necessary. I will remove it.
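The width argument above can be made concrete with a simplified sketch. It is not the actual bin_loop code: demo_add and demo_loop_stats are hypothetical, plain scalar loops stand in for vector loads, and wide_step plays the role of Tvec::nlanes. The function counts how many iterations each tier executes, which shows that the unrolled tier never runs when width is below wide_step * 2.

```cpp
#include <cassert>
#include <cstddef>

// Iteration counts per loop tier (hypothetical helper type for this sketch).
struct demo_loop_stats { std::size_t unrolled; std::size_t single; std::size_t scalar; };

// Adds two arrays and reports how many iterations each tier executed.
inline demo_loop_stats demo_add(const int* a, const int* b, int* dst,
                                std::size_t width, std::size_t wide_step)
{
    assert(wide_step > 0);
    demo_loop_stats s = {0, 0, 0};
    std::size_t x = 0;
    // Tier 1: manually unrolled, two "vectors" per iteration.
    for (; x + 2 * wide_step <= width; x += 2 * wide_step, ++s.unrolled)
        for (std::size_t k = 0; k < 2 * wide_step; ++k) dst[x + k] = a[x + k] + b[x + k];
    // Tier 2: one "vector" per iteration.
    for (; x + wide_step <= width; x += wide_step, ++s.single)
        for (std::size_t k = 0; k < wide_step; ++k) dst[x + k] = a[x + k] + b[x + k];
    // Tier 3: scalar tail.
    for (; x < width; ++x, ++s.scalar)
        dst[x] = a[x] + b[x];
    return s;
}
```

For example, with wide_step = 8 a row of 20 elements spends one iteration in the unrolled tier, while a row of 12 elements skips it entirely and only uses the single-vector tier plus the tail.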

terfendail · 2019-09-12T15:51:43Z

modules/core/src/matmul.simd.hpp

@@ -2390,6 +2390,45 @@ double dotProd_8u(const uchar* src1, const uchar* src2, int len)
            i += blockSize;
        }
    }
+#elif CV_MSA


Does the performance of this implementation differ substantially from the generic universal intrinsics version? I suppose the generic UI version should work quite fast due to the availability of a native intrinsic for v_dotprod. If the generic version provides similar performance, I prefer to avoid platform-specific code wherever possible.
BTW, this code is disabled anyway, since CV_SIMD is available and is checked prior to this.

I will remove the CV_MSA-related code block in dotProd_8u(); it's not necessary.

mipsopen-fwu · 2019-09-16T06:17:33Z

@terfendail
Sorry for late response.

Yes, you are right. CV_SIMD128_64F guarding is not needed since it's unconditionally supported in this implementation. I will remove the guards from intrin_msa.hpp; only the define is kept.
The '#if 0' version is slower according to the test results. The reason is that MSA doesn't actually provide data type expanding instructions for vectors. The 'movl'-related conversions are done in C.
So it would take more steps to use the 'movl' interface than to do the conversion directly.
When the data width is less than wide_step * 2, performance is better with this change. Otherwise, it wouldn't be better. Considering that 'width' should be larger in most cases, this change is not necessary. I will remove it.
I will remove the CV_MSA-related code block in dotProd_8u(); it's not necessary.
Thanks.

2. Removed unnecessary CV_MSA related code block in dotProd_8u(). Signed-off-by: Fei Wu <fwu@wavecomp.com>

alalek

Looks good to me 👍

alalek · 2019-09-17T12:26:31Z

modules/core/src/matmul.simd.hpp

@@ -2476,6 +2476,45 @@ double dotProd_8s(const schar* src1, const schar* src2, int len)
            i += blockSize;
        }
    }
+#elif CV_MSA


#if CV_SIMD blocks execution (and compilation) of this code here too. The same note applies to the #elif CV_NEON code block (dead code too).
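The dead-code remark above follows from how #if / #elif chains work: only the first branch whose condition is non-zero is compiled. A minimal sketch, with hypothetical DEMO_SIMD and DEMO_MSA macros standing in for CV_SIMD and CV_MSA:

```cpp
#define DEMO_SIMD 1   // stands in for CV_SIMD (assumed enabled when a SIMD backend exists)
#define DEMO_MSA  1   // stands in for CV_MSA

enum demo_path { DEMO_PATH_UNIVERSAL, DEMO_PATH_MSA_SPECIFIC, DEMO_PATH_SCALAR };

// Reports which branch of the preprocessor chain was actually compiled.
inline demo_path demo_selected_path()
{
#if DEMO_SIMD
    return DEMO_PATH_UNIVERSAL;
#elif DEMO_MSA
    return DEMO_PATH_MSA_SPECIFIC;   // never compiled while DEMO_SIMD is non-zero
#else
    return DEMO_PATH_SCALAR;
#endif
}
```

Because the MSA backend also enables the generic SIMD macro, the platform-specific branch is never reached, regardless of how well it performs.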

@mipsopen-fwu Have you compared performance of this MSA specific implementation against generic UI implementation?

@terfendail
This CV_MSA-related code block was added by mistake when I merged the MSA implementations from the 4.0.1 branch into the 4.1.1 and 3.4 branches. On our previous 4.0.1 development branch, this MSA-specific implementation doesn't exist. It's correct to remove it from the pull request.

Even so, to make the performance difference clear, I ran the MatType_Length_dot test on mips32r5 hardware with testparam=(8UC1, 1024) and samples=15. The result shows that the generic UI implementation (using CV_SIMD) performs better than the CV_MSA-specific implementation that was added by mistake in this code block.
The following is the test result:

The total running time for CV_MSA implementation is:
[----------] 18 tests from MatType_Length_dot (2177 ms total)

The total running time for CV_SIMD implementations is:
[----------] 18 tests from MatType_Length_dot (618 ms total)

mshabunin · 2019-09-17T15:52:58Z

cmake/OpenCVCompilerOptimizations.cmake

+elseif(MIPS)
+  ocv_update(CPU_MSA_TEST_FILE "${OpenCV_SOURCE_DIR}/cmake/checks/cpu_msa.cpp")
+  ocv_update(CPU_KNOWN_OPTIMIZATIONS "MSA")
+  ocv_update(CPU_MSA_FLAGS_ON "")


Why is it empty? It should contain "-mmsa" or something similar.

This option doesn't seem to be used elsewhere, but it would be better to define it as '-mmsa' for future use. I will modify it.

This option is used, but it is not obvious:

there is a list of CPU_KNOWN_OPTIMIZATIONS which contain MSA

cmake first tries to compile file CPU_${OPT}_TEST_FILE with option CPU_${OPT}_FLAGS_ON, where OPT=MSA

if compilation succeeded and MSA is requested by user (via CPU_BASELINE), then these options will be added to compiler

Is the -mmsa option the only one required by the implemented intrinsics? What about -mhard-float or others? We should move these options from the toolchain file to CompilerOptimization.cmake.

Yes. -mmsa option is enough. When '-mmsa' is included, '-mfp64' and '-mhard-float' will be enabled by default. '-mmsa' has the same effect as '-mmsa -mfp64 -mhard-float'.

terfendail · 2019-09-17T16:10:19Z

modules/core/include/opencv2/core/hal/intrin_msa.hpp

+inline bool v_check_any(const v_float32x4& a)
+{ return v_check_any(v_reinterpret_as_u32(a)); }
+
+#if CV_SIMD128_64F


v_check_all(const v_int64x2&) shouldn't be guarded by CV_SIMD128_64F as well.

I will remove the CV_SIMD128_64F guarding from intrin_msa.hpp; only the define is kept.

2. Removed CV_SIMD128_64F guardings in intrin_msa.hpp. Signed-off-by: Fei Wu <fwu@wavecomp.com>

Signed-off-by: Fei Wu <fwu@wavecomp.com>

alalek reviewed Aug 30, 2019

mipsopen-fwu changed the base branch from master to 3.4 September 2, 2019 06:45

Added MSA implementations for mips platforms. Intrinsics for MSA and …

7cd5726

…build scripts for MIPS platforms are added. Signed-off-by: Fei Wu <fwu@wavecomp.com>

mipsopen-fwu force-pushed the msa-dev branch from 31292d0 to 7cd5726 Compare September 4, 2019 07:33

Removed some unused code in mips.toolchain.cmake.

427a3fb

Signed-off-by: Fei Wu <fwu@wavecomp.com>

Added comments for mips toolchain configuration and disabled compilin…

054af0d

…g warnings for libpng. Signed-off-by: Fei Wu <fwu@wavecomp.com>

Fixed the build error of unsupported opcode 'pause' when mips isa_rev…

228df71

… is less than 2. Signed-off-by: Fei Wu <fwu@wavecomp.com>

alalek reviewed Sep 10, 2019

mipsopen-fwu force-pushed the msa-dev branch from bb24134 to f75fcb6 Compare September 11, 2019 08:00

terfendail reviewed Sep 12, 2019

terfendail reviewed Sep 12, 2019

1. Removed unnecessary CV_SIMD128_64F guarding in intrin_msa.hpp.

8d2f42d

2. Removed unnecessary CV_MSA related code block in dotProd_8u(). Signed-off-by: Fei Wu <fwu@wavecomp.com>

alalek approved these changes Sep 17, 2019

alalek assigned terfendail Sep 17, 2019

mshabunin reviewed Sep 17, 2019

terfendail reviewed Sep 17, 2019

mipsopen-fwu added 2 commits September 18, 2019 10:32

1. Defined CPU_MSA_FLAGS_ON as "-mmsa".

23c826a

2. Removed CV_SIMD128_64F guardings in intrin_msa.hpp. Signed-off-by: Fei Wu <fwu@wavecomp.com>

Removed unused msa_mlal_u16() and msa_mlal_s16 from msa_macros.h.

362407d

Signed-off-by: Fei Wu <fwu@wavecomp.com>

alalek merged commit b1ea91d into opencv:3.4 Sep 20, 2019

This was referenced Sep 20, 2019

3rdparty patch files for PR #15422 #15554

Merged

Merge 3.4 #15557

Merged

mipsopen-fwu deleted the msa-dev branch September 23, 2019 06:05

mipsopen-fwu mentioned this pull request Sep 26, 2019

Adding some MSA specific optimizations for imgproc/video/vidoeio/dnn … #15599

Closed
