Add support for v_exp (exponential) #24941

WanliZhong · 2024-01-30T15:14:49Z

This PR aims to implement v_exp(v_float16 x), v_exp(v_float32 x) and v_exp(v_float64 x).
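
For readers new to this kind of routine: a float32 exponential is commonly built from three steps. Clamp the input, split x into n·ln2 + r, then evaluate a short polynomial in r and scale by 2^n. The scalar sketch below illustrates that structure using the well-known Cephes `expf` constants. It is an assumption about the general technique, not the code in this PR. The name `expSketch` and the symmetric upper clamp are hypothetical; only the lower clamp value appears in the diff fragment quoted later in this thread.

```cpp
#include <algorithm>
#include <cmath>

// Scalar sketch of a Cephes-style expf (illustrative only, not the PR's code).
static float expSketch(float x)
{
    // Clamp so that the result stays inside the float range.
    // The upper bound is an assumption chosen symmetric to the lower one.
    const float lo = -88.3762626647949f;
    const float hi = 88.3762626647949f;
    x = std::min(std::max(x, lo), hi);

    // n = round(x / ln 2), so that x = n*ln2 + r with |r| <= ln2 / 2.
    const float n = std::floor(x * 1.44269504088896341f + 0.5f);

    // r = x - n*ln2, with ln2 split into a high and a low part to limit rounding error.
    float r = x - n * 0.693359375f;
    r -= n * -2.12194440e-4f;

    // Polynomial approximation of exp(r) on the reduced range (Horner form).
    float p = 1.9875691500e-4f;
    p = p * r + 1.3981999507e-3f;
    p = p * r + 8.3334519073e-3f;
    p = p * r + 4.1665795894e-2f;
    p = p * r + 1.6666665459e-1f;
    p = p * r + 5.0000001201e-1f;
    p = p * r * r + r + 1.0f;

    // Scale by 2^n.
    return std::ldexp(p, static_cast<int>(n));
}
```

A vectorized version would typically replace each scalar operation with the matching lane-wise operation and build 2^n by writing n into the exponent bits instead of calling `std::ldexp`.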

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

force_builders=Linux AVX2

WanliZhong · 2024-02-01T09:25:07Z

The accuracy of the AVX512 instruction set will be worse when the result is too large.

WanliZhong · 2024-02-01T11:38:58Z

> The accuracy of the AVX512 instruction set will be worse when the result is too large.

The difference comes from the approximation error of the Remes algorithm. Different implementations return different results.

e^83.4375 =
math.h: 1.7236370391800008e+36
ours: 1.7236371976363258e+36
other calculators: 1.7236371016733e+36, 1.7236371016732233e+36, 1.7236371017e+36

All of them differ, but the first 6 significant digits are the same. By definition, the error is always less than 1 ulp (unit in the last place), and the larger the number, the larger the ulp. So I think this is acceptable, and I propose relaxing the tests. I use EXPECT_FLOAT_EQ to compare float numbers.
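
To make the ulp argument concrete, here is a minimal sketch that counts how many representable floats lie between two values by comparing their bit patterns. The helper names `orderedBits` and `ulpDistance` are hypothetical and are not part of OpenCV or GoogleTest. For reference, GoogleTest's EXPECT_FLOAT_EQ treats two floats as equal when they are within 4 ULPs of each other.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Map a float's bit pattern to an integer that grows monotonically with the float's value.
static std::int64_t orderedBits(float v)
{
    std::uint32_t u;
    std::memcpy(&u, &v, sizeof(u));
    // Floats are sign-magnitude: negate the magnitude for negative values.
    return (u & 0x80000000u) ? -static_cast<std::int64_t>(u & 0x7fffffffu)
                             : static_cast<std::int64_t>(u);
}

// Distance between a and b measured in representable float steps (ULPs).
static std::int64_t ulpDistance(float a, float b)
{
    assert(!std::isnan(a) && !std::isnan(b));
    const std::int64_t d = orderedBits(a) - orderedBits(b);
    return d < 0 ? -d : d;
}
```

Applied to the `math.h` and `ours` values above, taken as float32, `ulpDistance(1.7236370391800008e+36f, 1.7236371976363258e+36f)` gives 1: at this magnitude one float step is 2^97, about 1.58e+29, which is exactly the gap between the two printed numbers.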

WanliZhong · 2024-02-02T06:06:29Z

I need help identifying the error in the Android test.
The Linux AVX2 failure appears to be related to other PRs that modified the elementwise code. I created an issue: OCL_FP16 target tests failed in CI linux64-avx2 #24954

modules/core/include/opencv2/core/hal/intrin.hpp

modules/core/test/test_intrin_utils.hpp

vpisarev · 2024-02-20T19:01:01Z

@WanliZhong, I think we really need to hurry up. Please finalize this PR as soon as possible, then submit another PR with v_log, v_sin and v_cos implemented.

opencv-alalek

Take a look at how to run the SIMD emulator configuration (intrin_cpp.hpp):

https://pullrequest.opencv.org/buildbot/builders/4_x_etc-simd-emulator-lin64/builds/100378

modules/core/include/opencv2/core/hal/intrin.hpp

asmorkalov · 2024-02-27T13:08:49Z

@WanliZhong friendly reminder.

fengyuentau · 2024-03-01T08:35:31Z

Is it possible to merge this soon? It is needed by v_erf, which is needed for gelu acceleration.

modules/core/include/opencv2/core/hal/intrin_cpp.hpp

WanliZhong · 2024-06-26T12:05:11Z

@asmorkalov Hi Alexander, I think we can review this PR again. I have finalized the code according to the comments that you, Vadim and Alekin left earlier.

Add support for v_log (Natural Logarithm) #25781

This PR aims to implement `v_log(v_float16 x)`, `v_log(v_float32 x)` and `v_log(v_float64 x)`.
Merged after #24941

TODO:
- [x] double and half float precision
- [x] tests for them
- [x] doc to explain the implementation

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake

opencv-alalek · 2024-07-08T07:45:30Z

A regression was detected by the weekly build: https://pullrequest.opencv.org/buildbot/builders/4_x_etc-simd-emulator-lin64/builds/100410

asmorkalov · 2024-07-08T08:54:43Z

I was able to reproduce the regression locally on a Linux host with `cmake -DOPENCV_EXTRA_FLAGS="-DCV_FORCE_SIMD128_CPP=1" ../opencv-master`

WanliZhong · 2024-07-09T11:16:44Z

I have reproduced this error too and am trying to fix it.

opencv-alalek · 2024-07-12T13:38:26Z

modules/core/include/opencv2/core/hal/intrin_math.hpp

+#endif
+
+    inline v_float32 v_exp(const v_float32 &x) {
+        const v_float32 _vexp_lo_f32 = vx_setall_f32(-88.3762626647949f);


In general, OpenCV SIMD provides each compilation unit (.cpp file) with all of the following that are available:

SIMD128 types and v_ functions

SIMD256 types and v256_ functions

SIMD512 types and v512_ functions

aliases for SIMDMAX types and necessary vx_ functions

Here we have just one definition, plus one SIMD128 implementation in intrin_cpp.hpp (which conflicts in the SIMD128_CPP case).

See simd_utils.impl.hpp, which provides 4 implementations inside. If you put your header next to it, you should follow a similar scheme (or achieve the same result).

Note that we should have 4 #include statements of that file (to avoid code duplication).

WanliZhong · 2024-07-12

So should I provide v_, v256_, v512_ and vx_ implementations in intrin_math.hpp, as simd_utils.impl.hpp does? That will really produce a lot of duplicated code.

vpisarev · 2024-07-19

There must be a better way of doing it without duplication.
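
One possible way to avoid the duplication, sketched here under assumptions, is to write the math once as a template over the vector type and expose thin per-width wrappers. `LaneVec`, `splat`, `expGeneric` and the `*_expSketch` names are hypothetical stand-ins, not OpenCV types or functions, and the per-lane `std::exp` call is only a placeholder for a real SIMD kernel.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for a SIMD register with N float lanes (not an OpenCV type).
template <std::size_t N>
struct LaneVec
{
    std::array<float, N> lane;
};

// Helper: broadcast one value to every lane.
template <std::size_t N>
LaneVec<N> splat(float v)
{
    LaneVec<N> r;
    r.lane.fill(v);
    return r;
}

// The math is written once, generically over the lane count.
template <std::size_t N>
LaneVec<N> expGeneric(const LaneVec<N>& x)
{
    LaneVec<N> r;
    for (std::size_t i = 0; i < N; ++i)
        r.lane[i] = std::exp(x.lane[i]); // placeholder for the real polynomial kernel
    return r;
}

// Thin per-width entry points; the names only mimic the v_/v256_/v512_ prefixes.
inline LaneVec<4>  v_expSketch(const LaneVec<4>& x)     { return expGeneric(x); }
inline LaneVec<8>  v256_expSketch(const LaneVec<8>& x)  { return expGeneric(x); }
inline LaneVec<16> v512_expSketch(const LaneVec<16>& x) { return expGeneric(x); }
```

With this layout each width costs one line, and the body that needs review and testing exists only once. Whether this fits the existing header structure is a separate question that the sketch does not answer.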

modules/core/test/test_intrin_utils.hpp

WanliZhong added the optimization and feature labels Jan 30, 2024

WanliZhong added this to the 4.10.0 milestone Jan 30, 2024

WanliZhong requested review from vpisarev, asmorkalov and opencv-alalek January 30, 2024 15:14

asmorkalov requested review from mshabunin and removed request for asmorkalov January 30, 2024 15:35

WanliZhong force-pushed the v_exp branch from 83f2c45 to 4bbe1b5 Compare February 1, 2024 13:06

asmorkalov reviewed Feb 2, 2024

modules/core/include/opencv2/core/hal/intrin.hpp (outdated)

vpisarev reviewed Feb 2, 2024

modules/core/include/opencv2/core/hal/intrin.hpp (outdated)

vpisarev reviewed Feb 2, 2024

modules/core/test/test_intrin_utils.hpp

vpisarev mentioned this pull request Feb 13, 2024

New CPU HAL for OpenCV 5.0 #25019

Open

opencv-alalek reviewed Feb 20, 2024

modules/core/include/opencv2/core/hal/intrin.hpp (outdated)

modules/core/include/opencv2/core/hal/intrin.hpp (outdated)

vpisarev mentioned this pull request Mar 4, 2024

added in-place support for cartToPolar and polarToCart #24893

Merged

6 tasks

asmorkalov modified the milestones: 4.10.0, 4.11.0 May 16, 2024

WanliZhong force-pushed the v_exp branch 3 times, most recently from 60b02a0 to f9d3e58 Compare June 17, 2024 15:27

WanliZhong mentioned this pull request Jun 18, 2024

Add support for v_log (Natural Logarithm) #25781

Merged

9 tasks

add support for v_exp (exponential)

976a817

WanliZhong force-pushed the v_exp branch from f9d3e58 to 976a817 Compare June 25, 2024 18:47

vpisarev reviewed Jun 26, 2024

modules/core/include/opencv2/core/hal/intrin_cpp.hpp

WanliZhong added 4 commits June 26, 2024 14:47

move license above all functions

59c2323

insert the macros for HAL MATH FUNC

3269ca9

refine test

772cfb6

change test avoid underflow tests mistake

535e99e

vpisarev approved these changes Jun 27, 2024

asmorkalov approved these changes Jul 2, 2024

asmorkalov merged commit 6e1864e into opencv:4.x Jul 2, 2024
30 checks passed

WanliZhong mentioned this pull request Jul 9, 2024

Resolve Compilation Error for v_func Function in SIMD Emulator #25891

Merged

6 tasks

opencv-alalek reviewed Jul 12, 2024

asmorkalov mentioned this pull request Jul 16, 2024

(5.x) Merge 4.x #25915

Merged

WanliZhong Jul 12, 2024

vpisarev Jul 19, 2024
