This patch implements the static cv::DFT function in RVV_HAL using native intrinsics, improving the performance of cv::dft and cv::dct for the data types 32FC1/64FC1/32FC2/64FC2.
The reason I chose to create a new cv_hal_dftOcv interface is that if I were to use the existing interfaces (cv_hal_dftInit1D and cv_hal_dft1D), it would require handling and parsing the dft flags within HAL, as well as performing preprocessing operations such as handling unit roots. Since these operations are not performance hotspots and do not require optimization, reusing the existing interfaces would result in copying approximately 300 lines of code from core/src/dxt.cpp into HAL, which I believe is unnecessary.
Moreover, if I insert the new interface into static cv::DFT, both static cv::RealDFT and static cv::DCT can be optimized as well. The processing performed before and after calling static cv::DFT in these functions is also not a performance hotspot.
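The hook-then-fallback pattern described above can be sketched as follows. This is an illustration only, not the code in this patch: `hal_dft_hook`, `generic_dft` and `dft_dispatch` are hypothetical names, the real `cv_hal_dftOcv` signature is not reproduced here, and the toy hook only handles length 2. The return codes mirror OpenCV's `CV_HAL_ERROR_OK` and `CV_HAL_ERROR_NOT_IMPLEMENTED` values.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

// Same numeric values as OpenCV's CV_HAL_ERROR_OK / CV_HAL_ERROR_NOT_IMPLEMENTED.
constexpr int HAL_OK = 0;
constexpr int HAL_NOT_IMPLEMENTED = 1;

// Hypothetical stand-in for a HAL hook placed inside the inner DFT kernel.
// It receives already-prepared data, so it needs no flag parsing of its own.
// This toy version only accelerates n == 2 and declines everything else.
int hal_dft_hook(const std::complex<float>* src, std::complex<float>* dst, int n)
{
    if (n != 2)
        return HAL_NOT_IMPLEMENTED;
    dst[0] = src[0] + src[1];
    dst[1] = src[0] - src[1];
    return HAL_OK;
}

// Generic fallback: a naive O(n^2) forward DFT standing in for the library path.
void generic_dft(const std::complex<float>* src, std::complex<float>* dst, int n)
{
    const double pi = 3.14159265358979323846;
    for (int k = 0; k < n; ++k) {
        std::complex<double> acc(0.0, 0.0);
        for (int j = 0; j < n; ++j) {
            const double angle = -2.0 * pi * j * k / n;
            acc += std::complex<double>(src[j]) * std::polar(1.0, angle);
        }
        dst[k] = std::complex<float>(static_cast<float>(acc.real()),
                                     static_cast<float>(acc.imag()));
    }
}

// Try the hook first; fall back to the generic path if it declines.
// Returns true when the hook handled the call.
bool dft_dispatch(const std::complex<float>* src, std::complex<float>* dst, int n)
{
    assert(n > 0);
    if (hal_dft_hook(src, dst, n) == HAL_OK)
        return true;
    generic_dft(src, dst, n);
    return false;
}
```

Because the hook sits at the innermost kernel, any caller that performs its own pre- and post-processing around that kernel benefits without further changes, which is the argument made above for static cv::RealDFT and static cv::DCT.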
Tested on MUSE-PI (Spacemit X60) with both gcc 14.2 and clang 20.0.
This optimization ran into the same problem described in #26923 (comment). I strongly recommend updating clang to at least 18.1.0, because the vcreate family of intrinsics will be very important for optimizing algorithms that operate on multi-channel images.
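To illustrate why the vcreate intrinsics matter for multi-channel data, here is a minimal sketch (not taken from this patch) that interleaves separate real and imaginary planes into 2-channel, 32FC2-style memory. The function name `interleave_complex` is hypothetical. The RVV branch assumes a toolchain that provides `__riscv_vcreate_v_f32m1x2`; other targets use the scalar branch.

```cpp
#include <cstddef>
#include <vector>
#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

// Interleave re[i], im[i] into dst[2*i], dst[2*i + 1].
// dst must have room for 2 * n floats.
void interleave_complex(const float* re, const float* im, float* dst, std::size_t n)
{
#if defined(__riscv_vector)
    for (std::size_t i = 0; i < n; ) {
        // Number of elements processed in this iteration.
        std::size_t vl = __riscv_vsetvl_e32m1(n - i);
        vfloat32m1_t vre = __riscv_vle32_v_f32m1(re + i, vl);
        vfloat32m1_t vim = __riscv_vle32_v_f32m1(im + i, vl);
        // vcreate builds the two-field tuple that the segment store expects.
        vfloat32m1x2_t pair = __riscv_vcreate_v_f32m1x2(vre, vim);
        // Segment store writes the fields interleaved: re0, im0, re1, im1, ...
        __riscv_vsseg2e32_v_f32m1x2(dst + 2 * i, pair, vl);
        i += vl;
    }
#else
    // Scalar fallback for targets without the RVV intrinsics.
    for (std::size_t i = 0; i < n; ++i) {
        dst[2 * i] = re[i];
        dst[2 * i + 1] = im[i];
    }
#endif
}
```

Without vcreate, the tuple has to be assembled field by field before the segment store, which is the kind of workaround the compiler recommendation above is meant to avoid.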
The head of the perf table is shown below since the table is too long.
View the full perf table here: hal_rvv_dxt.pdf
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.