Fixes VSO-2295101 "[RWC][prod/fe][Regression] Cutlass failed with error C2146: syntax error: missing ')' before identifier '_Dest'". It appears that CUDA 12.4 doesn't understand `__restrict`, so when it attempts to split code between the device (GPU) and host (MSVC) compilers, it mangles what's sent to MSVC.
Apparently this is only an issue when the template is instantiated, which is why it wasn't found by our CUDA test coverage (that includes all headers and does nothing with them). It was only encountered in our Real World Code test suite where we wisely harnessed NVIDIA's Cutlass project.
I am slightly nervous that bad kitties have grabbed `_RESTRICT`, but we're following that form for `_LIKELY` etc., and I don't want to pre-emptively avoid issues when we have all legitimate rights to this identifier.
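For readers who want the shape of the workaround, here is a minimal sketch, not the actual diff. It assumes the macro expands to nothing when `__CUDACC__` is defined and to `__restrict` otherwise. The names `MY_RESTRICT`, `copy_no_alias`, and `copy_and_sum` are hypothetical and used only for illustration (user code must not define `_RESTRICT` itself, since that identifier is reserved for the implementation).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the library's macro. Assumption: under nvcc
// (which defines __CUDACC__) the qualifier is dropped entirely, so the
// code forwarded to the host compiler never contains __restrict.
#ifdef __CUDACC__
#define MY_RESTRICT
#else
#define MY_RESTRICT __restrict
#endif

// A function template whose pointer parameters promise not to alias.
// Merely including a header with such a template does not exercise it.
template <class T>
void copy_no_alias(T* MY_RESTRICT dest, const T* MY_RESTRICT src, std::size_t count) {
    assert(dest != nullptr && src != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        dest[i] = src[i];
    }
}

// Instantiating the template is what forces the compiler to process the
// qualified parameter list, which is the situation described above.
inline int copy_and_sum() {
    const int src[3] = {1, 2, 3};
    int dest[3] = {};
    copy_no_alias(dest, src, 3);
    return dest[0] + dest[1] + dest[2];
}
```

A test that only includes the header would compile the template definition without instantiating it. A call such as `copy_and_sum()` is the kind of use that Real World Code coverage provides.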
It's _LIKELY that someone else who considers themselves to be "implementation" is using _RESTRICT, but I suppose we may as well try. We should consider consistently using a common prefix (_STL_ seems the clear candidate) for our macro definitions. (Not for this PR.)
Followup to #4958.
Fixes AB#2295101.