[Inductor][CPP] Add oneDNN BRGEMM config for Half cpp gemm template #136255
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/136255
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 5070fdd with merge base 3672c68.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from c27521a to febedb3
Force-pushed from a2a7a0f to 399b92d
@@ -134,6 +134,14 @@ def _check_amx_counter(self, vec_amx):
        else:
            self.assertEqual(counters["inductor"]["cpp_micro_gemm_amx_counter"], 0)

    def _check_brgemm_counter(self, vec_amx):
        if vec_amx and (
            torch.cpu._is_amx_fp16_supported() or torch.cpu._is_avx512_fp16_supported()
Why do we count in AVX512-FP16? Are we relying on the FP16 FMA which is less accurate than AMX?
Checking avx512_fp16 is enough. Modified. Thanks.
No, my question is why we care about avx512_fp16 here. We only use AMX FP16, right?
Removed non-amx fp16 support. Only check AMX FP16 here.
Please explain why we need avx512 fp16 here.
Removed non-amx fp16 support.
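For context, a minimal sketch of what the revised helper could look like after dropping the AVX512-FP16 branch (the else-branch and exact assertions are assumptions modeled on `_check_amx_counter` above, with `counters` assumed in scope as in the surrounding test file; only the `torch.cpu._is_amx_fp16_supported()` gate is confirmed by this thread):

```python
def _check_brgemm_counter(self, vec_amx):
    # Sketch only: the BRGEMM FP16 micro-kernel is expected to be selected
    # solely when AMX FP16 is available, so the counter should fire only in
    # that case (counter name taken from the existing AMX check above).
    if vec_amx and torch.cpu._is_amx_fp16_supported():
        self.assertTrue(counters["inductor"]["cpp_micro_gemm_amx_counter"] > 0)
    else:
        self.assertEqual(counters["inductor"]["cpp_micro_gemm_amx_counter"], 0)
```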
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@pytorchbot rebase
@pytorchbot merge
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Successfully rebased |
Force-pushed from b9b2588 to 25f410f
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 mandatory check(s) failed. Dig deeper by viewing the failures on hud.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
[Inductor][CPP] Add oneDNN BRGEMM config for Half cpp gemm template (pytorch#136255)

`kernel_micro_gemm` generated using BRGEMM:

```
template <bool accum>
inline void kernel_micro_gemm(
    const half* __restrict__ A,
    const half* __restrict__ B,
    float* __restrict__ C,
    int64_t M,
    int64_t N,
    int64_t K,
    int64_t lda,
    int64_t ldb,
    int64_t ldc
) {
    at::native::cpublas::brgemm(
        M, N, K,
        lda, ldb, ldc,
        1.f, accum ? 1.f : 0.f,
        A, B, C);
}
```

Pull Request resolved: pytorch#136255
Approved by: https://github.com/jgong5, https://github.com/jansel
`kernel_micro_gemm` generated using BRGEMM:

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov @rec
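For readers who want to see the template exercised end to end, here is a rough usage sketch. The config knobs (`max_autotune`, `max_autotune_gemm_backends`) and the plain `nn.Linear` model are assumptions about how Inductor selects the C++ GEMM template, not something stated in this PR; the BRGEMM FP16 path additionally requires a CPU with AMX FP16 support.

```python
import torch
import torch._inductor.config as inductor_config

# Assumption: the C++ GEMM template (and hence the BRGEMM micro-kernel on
# AMX-FP16-capable CPUs) is chosen through Inductor's max-autotune path.
inductor_config.max_autotune = True
inductor_config.max_autotune_gemm_backends = "CPP,ATEN"


class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(512, 512, bias=False)

    def forward(self, x):
        return self.linear(x)


if __name__ == "__main__":
    with torch.no_grad():
        model = M().to(dtype=torch.half).eval()
        x = torch.randn(128, 512, dtype=torch.half)
        compiled = torch.compile(model)
        # On supported hardware the generated kernel_micro_gemm calls
        # at::native::cpublas::brgemm, as shown in the commit message above.
        out = compiled(x)
```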