🐛 Fix CUDA for old GPUs without FP16 support #25880

Jamim · 2024-07-07T14:48:33Z

~~This is a build-time solution that reflects https://github.com/opencv/opencv/blob/4.10.0/modules/dnn/src/cuda4dnn/init.hpp#L68-L82.~~
~~We shouldn't add an invalid target while building with CUDA_ARCH_BIN < 53.~~
(please see this discussion)

This is a run-time solution that basically reverts these lines.

I've debugged these changes, coupled with other fixes, on Gentoo Linux, and the related tests passed on my laptop with a GeForce GTX 960M.
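
For illustration, here is a minimal sketch of the kind of run-time check this PR relies on. It is not the actual OpenCV code: `Target`, `supportsFp16`, and `availableCudaTargets` are hypothetical names. It assumes that FP16 needs compute capability 5.3 or newer, in line with the `CUDA_ARCH_BIN < 53` threshold mentioned above. A GeForce GTX 960M reports compute capability 5.0, so it would be offered the FP32 target only.

```cpp
#include <vector>

// Hypothetical stand-in for OpenCV's DNN_TARGET_CUDA / DNN_TARGET_CUDA_FP16.
enum class Target { CUDA, CUDA_FP16 };

// Assumption: FP16 support requires compute capability >= 5.3.
bool supportsFp16(int major, int minor) {
    return major * 10 + minor >= 53;
}

// Build the list of CUDA targets for one device.
// The FP16 target is registered only when the device can run it.
std::vector<Target> availableCudaTargets(int major, int minor) {
    std::vector<Target> targets{Target::CUDA};
    if (supportsFp16(major, minor))
        targets.push_back(Target::CUDA_FP16);
    return targets;
}
```

With these definitions, `availableCudaTargets(5, 0)` yields only the FP32 target, while a device with compute capability 6.1 gets both.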

Alternative solution:

Do not accept FP16 target on old, incompatible Nvidia cards #21462

Best regards!

Pull Request Readiness Checklist

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
n/a There is accuracy test, performance test and test data in opencv_extra repository, if applicable
n/a The feature is well documented and sample code can be built with the project CMake

cmake/OpenCVDetectCUDA.cmake

asmorkalov · 2024-07-09T07:10:54Z

modules/dnn/src/registry.cpp

+            if (cuda4dnn::doesDeviceSupportFP16())
+                backends.push_back(std::make_pair(DNN_BACKEND_CUDA, DNN_TARGET_CUDA_FP16));


It works well with a single-GPU configuration. However, the current GPU may be old and lack FP16 support while a second GPU supports it. This is a common case when one GPU is used for rendering and another one for compute.

It is fine to check the CURRENT CUDA device.
Let's leave it to the caller to select the device properly via cudaSetDevice before these calls.
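
To make the multi-GPU point concrete, here is a small sketch that models the "current device" idea without touching the real CUDA runtime. `GpuInfo` and `FakeRuntime` are hypothetical. `setDevice` only plays the role that `cudaSetDevice` plays in a real program, and the 5.3 threshold is the same assumption as in the PR description.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical description of one GPU: its compute capability.
struct GpuInfo {
    int major;
    int minor;
};

// Hypothetical model of per-process "current device" state.
class FakeRuntime {
public:
    explicit FakeRuntime(std::vector<GpuInfo> devices)
        : devices_(std::move(devices)) {}

    // Stands in for cudaSetDevice; rejects out-of-range indices.
    bool setDevice(std::size_t index) {
        if (index >= devices_.size())
            return false;
        current_ = index;
        return true;
    }

    // The FP16 answer depends only on the device that is current right now.
    bool currentDeviceSupportsFp16() const {
        if (devices_.empty())
            return false;
        const GpuInfo& d = devices_[current_];
        return d.major * 10 + d.minor >= 53;
    }

private:
    std::vector<GpuInfo> devices_;
    std::size_t current_ = 0;
};
```

In a machine with an old rendering GPU at index 0 and a newer compute GPU at index 1, the check reports no FP16 support until the caller switches to device 1, which is why the device has to be selected before the targets are queried.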

BTW, the OpenCL backend has similar problems, but it looks like they are handled in another place.

I pushed a fix for the issue and extended the check to target management. @opencv-alalek could you take a look again?
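
The thread does not show what the extended target-management check looks like. As a hedged sketch of one possible policy, the function below downgrades an FP16 request to FP32 when the current device cannot run FP16. `CudaTarget` and `resolveTarget` are hypothetical names, and the fallback behaviour is an assumption rather than something confirmed by this PR.

```cpp
// Hypothetical stand-in for the two CUDA targets discussed in this PR.
enum class CudaTarget { FP32, FP16 };

// Assumed policy: never hand out FP16 on a device that lacks it;
// fall back to FP32 instead of failing later at inference time.
CudaTarget resolveTarget(CudaTarget requested, bool deviceSupportsFp16) {
    if (requested == CudaTarget::FP16 && !deviceSupportsFp16)
        return CudaTarget::FP32;
    return requested;
}
```

Under this policy, a request for FP16 on a GTX 960M-class device resolves to FP32, while the same request on an FP16-capable device is passed through unchanged.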

asmorkalov · 2024-07-09T08:25:09Z

@Jamim Could you pull the branch from GitHub and test the last commit with your GPU?

Jamim · 2024-07-09T08:34:46Z

> Could you pull the branch from Github and test the last commit with your GPU.

Sorry, I'm not at home now. I'll be able to do so in the evening.

Jamim

Hello @asmorkalov,

I've tested your changes and they work well on my system.
Also, I have a couple of minor suggestions. Please take a look.

modules/dnn/src/registry.cpp

modules/dnn/src/cuda4dnn/init.hpp

Co-authored-by: Aliaksei Urbanski <aliaksei.urbanski@gmail.com>

This was referenced Jul 7, 2024

Do not report for FP16 compatibility on old Nvidia cards #21461

Closed

Do not accept FP16 target on old, incompatible Nvidia cards #21462

Closed

⬆️🚩🦺🐛 media-libs/opencv: add 4.10.0 gentoo/gentoo#37479

Open

asmorkalov added the category: gpu/cuda (contrib) and category: dnn labels Jul 8, 2024

asmorkalov requested changes Jul 8, 2024

cmake/OpenCVDetectCUDA.cmake (outdated, resolved)

🐛 Fix CUDA for old GPUs without FP16 support

5115dc6

Jamim force-pushed the fix/cuda-no-fp16 branch from 07c67c0 to 5115dc6 July 8, 2024 23:11

Jamim requested review from opencv-alalek and asmorkalov July 8, 2024 23:54

asmorkalov added this to the 4.11.0 milestone Jul 9, 2024

asmorkalov self-assigned this Jul 9, 2024

asmorkalov reviewed Jul 9, 2024

Added CUDA FP16 availability check for target management.

cfb2bc3

asmorkalov approved these changes Jul 9, 2024

Jamim commented Jul 10, 2024

modules/dnn/src/registry.cpp (resolved)

modules/dnn/src/cuda4dnn/init.hpp (resolved)

Update modules/dnn/src/registry.cpp

cc91789

Co-authored-by: Aliaksei Urbanski <aliaksei.urbanski@gmail.com>

opencv-alalek approved these changes Jul 10, 2024

asmorkalov merged commit 35ca2f7 into opencv:4.x Jul 10, 2024
28 of 30 checks passed

Jamim deleted the fix/cuda-no-fp16 branch July 10, 2024 09:46

asmorkalov mentioned this pull request Jul 16, 2024

(5.x) Merge 4.x #25915

Merged
