CARVIEW |
Navigation Menu
-
-
Notifications
You must be signed in to change notification settings - Fork 56.2k
3rdparty: NDSRVP - A New 3rdparty Library with Optimizations Based on RISC-V P Extension v0.5.2 - Part 1: Basic Functions #25167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@Junyan721113 Could you add more details on hardware configuration and how to reproduce the result? |
No problem, here are the details for accuracy and performance tests. RISC-V P Extension v0.5.2Envexport RISCV=/opt/andes
export PATH=$PATH:/opt/andes/bin ToolchainPrebuilt Releases: Andes-Development-Kit Suggested Version: v5_1_1 ./build_linux_toolchain.sh TARGET=riscv64-linux
PREFIX=/opt/andes
ARCH=rv64imafdcxandes
ABI=lp64d
CPU=andes-25-series
XLEN=64
BUILD=`pwd`/build-nds64le-linux-glibc-v5d Qemushell ./build ../configure --prefix=/opt/andes --target-list=riscv32-linux-user,riscv64-linux-user --disable-werror --static BoardThe development board used for performance tests is TinkerV with Andes AX45. Upload the installed toolchain's sysroot at /etc/ld.so.conf include /etc/ld.so.conf.d/*.conf
/path/to/the/sysroot/library shell ldconfig -v After that the sysroot library should appear in the result. OpenCV Testshell ./build cmake -D CMAKE_BUILD_TYPE=Debug -D CMAKE_INSTALL_PREFIX=/opt/andes -D BUILD_SHARED_LIBS=OFF -D CMAKE_TOOLCHAIN_FILE=../platforms/linux/riscv64-andes-gcc.toolchain.cmake .. Qemushell ./build/bin qemu-riscv64 -cpu andes-ax25 -L /opt/andes/sysroot opencv_test_core BoardDirectly upload and run the test, and it would perform properly. |
ec66a3e
to
9d1a0fb
Compare
Considering the Todo List of this PR might be too long, would it be better to divide this PR into smaller ones? |
@Junyan721113 , you can finalize current state more or less (HAL integration, several core functions implementation). And extend supported functions list in future PRs. |
Considering the relation between HAL functions, this PR might be ready for review now.
The rest of HAL functions are related to convolution, thus left for another PR. |
e41c22b
to
73130d0
Compare
Besides, I've noticed that some optimizations could be better if several functions required is also opened as HAL interface, such as:
Meanwhile, I wonder how will the HAL inferface change in the coming OpenCV 5.0. The changes may affect the next PR related to this 3rdparty library. |
@Junyan721113 Thanks a lot for the contribution!
|
Thank you! This helps me a lot. |
The mentioned PR contains
Meanwhile, the to-do list of "Part 1" is finished, other new features will be in "Part 2". This PR is ready for review now. |
32f stands to mapx and mapy are floats, but bot fixed point. source and destination may be any OpenCV supported type. Sorry for the confusion. |
Currently there are several warnings regarding strict aliasing in the new HAL library (warpAffine and warpPerspective). Are they serious issues or not? Can we somehow avoid these constructions (maybe with some reinterpret intrinsics)?
|
It was a mistake. They've been replaced with safer explicit type conversions. |
Strict-aliasing warnings have been fixed. Are there any other suggested changes? |
f5ba8f1
to
e69d8b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is possible to open different types of remap functions as different HAL interfaces, called
cv_hal_remapNN8u
cv_hal_remapNN16s
for example? Currently there is only one related inferface calledcv_hal_remap32f
.
You can implement only one case and return NOT_IMPLEMENTED for others.
Overall looks good to me. We need to fix cmp operation before merge. Refactoring and some cleanup (warpAffine/warpPerspective/headers) can be postponed to the next PR.
e69d8b6
to
f77b05a
Compare
I've made a logical mistake. Fixed. |
Summary
Previous context
From PR #24556:
Progress
Part 1 (This PR)
Part 2 (Next PR)
Rough Estimate. Todo List May Change.
Performance Tests
The optimization does not contain floating point opreations.
Absolute Difference
Geometric mean (ms)
Bitwise Operation
Geometric mean (ms)
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.