CARVIEW |
Navigation Menu
-
-
Notifications
You must be signed in to change notification settings - Fork 56.2k
Add RISC-V HAL implementation for minMaxIdx #26789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Use a macro generator for all data types of minMaxIdx. Add the mask_step argument for minMaxIdx, check mask.isContinuous() before calling cv_hal_minMaxIdx. Including two algorithms, one read source data twice while the other only read once but with more calculation. EMUL is configurable, use EMUL=4 when EEW>=32. Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
c697fd0
to
82903e3
Compare
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HAL integration looks good to me 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me.
Though, I agree with @opencv-alalek - it would be better to reimplement macros as templates.
@asmorkalov , did we decide anything on adding new function variabnt with mask_step?
Let's add. The current version presumes compatibility. I vote for template instead of macros. |
My solution to similar problems is to add some template specialization. And use implicit (overloaded) intrinsics in other places, however this solution does not work for integer/floating-point and signed/unsigned version yet, we may have to add more member functions template <typename T, int LOGM> struct rvv_helper;
#define RVV_HELPER(T, NAME, SIZE, EEW, EMUL, LOGM) \
template <> struct rvv_helper<T, LOGM> { \
using type = v##NAME##EMUL##_t; \
static size_t setvl(size_t n) { return __riscv_vsetvl_e##SIZE##EMUL(n); } \
static size_t setvlmax() { return __riscv_vsetvlmax_e##SIZE##EMUL(); } \
static type le(const T *ptr, size_t vl) { \
return __riscv_vle##SIZE##_v_##EEW##EMUL(ptr, vl); \
} \
... \
}
...
RVV_HELPER(unsigned char, uint8, 8, u8, mf2, -1);
RVV_HELPER(unsigned char, uint8, 8, u8, m1, 0);
RVV_HELPER(unsigned char, uint8, 8, u8, m2, 1);
...
...
RVV_HELPER(float, float32, 32, f32, mf2, -1);
RVV_HELPER(float, float32, 32, f32, m1, 0);
RVV_HELPER(float, float32, 32, f32, m2, 1);
...
|
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
cc @asmorkalov @mshabunin. |
Co-authored-by: Liutong HAN <liutong2020@iscas.ac.cn>
@amane-ame Thanks a lot for the effort. Please fix conflict and I'll merge it. |
@asmorkalov Ready to be merged. |
Add RISC-V HAL implementation for cv::norm and cv::normalize #26804 This patch implements `cv::norm` with norm types `NORM_INF/NORM_L1/NORM_L2/NORM_L2SQR` and `Mat::convertTo` function in RVV_HAL using native intrinsic, optimizing the performance for `cv::norm(src)`, `cv::norm(src1, src2)`, and `cv::normalize(src)` with data types `8UC1/8UC4/32FC1`. `cv::normalize` also calls `minMaxIdx`, #26789 implements RVV_HAL for this. Tested on MUSE-PI for both gcc 14.2 and clang 20.0. ``` $ opencv_test_core --gtest_filter="*Norm*" $ opencv_perf_core --gtest_filter="*norm*" --perf_min_samples=300 --perf_force_samples=300 ``` The head of the perf table is shown below since the table is too long. View the full perf table here: [hal_rvv_norm.pdf](https://github.com/user-attachments/files/18468255/hal_rvv_norm.pdf) <img width="1304" alt="Untitled" src="https://github.com/user-attachments/assets/3550b671-6d96-4db3-8b5b-d4cb241da650" /> ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake
Add RISC-V HAL implementation for minMaxIdx opencv#26789 On the RISC-V platform, `minMaxIdx` cannot benefit from Universal Intrinsics because the UI-optimized `minMaxIdx` only supports `CV_SIMD128` (and does not accept `CV_SIMD_SCALABLE` for RVV). https://github.com/opencv/opencv/blob/1d701d1690b8cc9aa6b86744bffd5d9841ac6fd3/modules/core/src/minmax.cpp#L209-L214 This patch implements `minMaxIdx` function in RVV_HAL using native intrinsic, optimizing the performance for all data types with one channel. Tested on MUSE-PI for both gcc 14.2 and clang 20.0. ``` $ opencv_test_core --gtest_filter="*MinMaxLoc*" $ opencv_perf_core --gtest_filter="*minMaxLoc*" ``` <img width="1122" alt="Untitled" src="https://github.com/user-attachments/assets/6a246852-87af-42c5-a50b-c349c2765f3f" /> See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake
Add RISC-V HAL implementation for cv::norm and cv::normalize opencv#26804 This patch implements `cv::norm` with norm types `NORM_INF/NORM_L1/NORM_L2/NORM_L2SQR` and `Mat::convertTo` function in RVV_HAL using native intrinsic, optimizing the performance for `cv::norm(src)`, `cv::norm(src1, src2)`, and `cv::normalize(src)` with data types `8UC1/8UC4/32FC1`. `cv::normalize` also calls `minMaxIdx`, opencv#26789 implements RVV_HAL for this. Tested on MUSE-PI for both gcc 14.2 and clang 20.0. ``` $ opencv_test_core --gtest_filter="*Norm*" $ opencv_perf_core --gtest_filter="*norm*" --perf_min_samples=300 --perf_force_samples=300 ``` The head of the perf table is shown below since the table is too long. View the full perf table here: [hal_rvv_norm.pdf](https://github.com/user-attachments/files/18468255/hal_rvv_norm.pdf) <img width="1304" alt="Untitled" src="https://github.com/user-attachments/assets/3550b671-6d96-4db3-8b5b-d4cb241da650" /> See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [ ] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake
On the RISC-V platform,
minMaxIdx
cannot benefit from Universal Intrinsics because the UI-optimizedminMaxIdx
only supportsCV_SIMD128
(and does not acceptCV_SIMD_SCALABLE
for RVV).opencv/modules/core/src/minmax.cpp
Lines 209 to 214 in 1d701d1
This patch implements
minMaxIdx
function in RVV_HAL using native intrinsic, optimizing the performance for all data types with one channel.Tested on MUSE-PI for both gcc 14.2 and clang 20.0.
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.