[Inductor][CPP] Add oneDNN BRGEMM config for Half cpp gemm template #136255
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/136255
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 5070fdd with merge base 3672c68.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from c27521a to febedb3
Force-pushed from a2a7a0f to 399b92d
@@ -134,6 +134,14 @@ def _check_amx_counter(self, vec_amx):
        else:
            self.assertEqual(counters["inductor"]["cpp_micro_gemm_amx_counter"], 0)

    def _check_brgemm_counter(self, vec_amx):
        if vec_amx and (
            torch.cpu._is_amx_fp16_supported() or torch.cpu._is_avx512_fp16_supported()
Why do we count in AVX512-FP16? Are we relying on the FP16 FMA which is less accurate than AMX?
Checking avx512_fp16 is enough. Modified. Thanks.
No, my question is why we care about avx512_fp16 here. We only use AMX FP16, right?
Removed non-amx fp16 support. Only check AMX FP16 here.
Please explain why we need avx512 fp16 here.
Removed non-amx fp16 support.
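For context, a minimal sketch of what the revised helper could look like after dropping the AVX512-FP16 branch (the else-branch and exact assertions are assumptions modeled on `_check_amx_counter` above, with `counters` assumed in scope as in the surrounding test file; only the `torch.cpu._is_amx_fp16_supported()` gate is confirmed by this thread):

```python
def _check_brgemm_counter(self, vec_amx):
    # Sketch only: the BRGEMM FP16 micro-kernel is expected to be selected
    # solely when AMX FP16 is available, so the counter should fire only in
    # that case (counter name taken from the existing AMX check above).
    if vec_amx and torch.cpu._is_amx_fp16_supported():
        self.assertTrue(counters["inductor"]["cpp_micro_gemm_amx_counter"] > 0)
    else:
        self.assertEqual(counters["inductor"]["cpp_micro_gemm_amx_counter"], 0)
```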
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@pytorchbot rebase
@pytorchbot merge
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Successfully rebased |
Force-pushed from b9b2588 to 25f410f
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 mandatory check(s) failed. Dig deeper by viewing the failures on hud.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
[Inductor][CPP] Add oneDNN BRGEMM config for Half cpp gemm template (pytorch#136255)

`kernel_micro_gemm` generated using BRGEMM:

```
template <bool accum>
inline void kernel_micro_gemm(
    const half* __restrict__ A,
    const half* __restrict__ B,
    float* __restrict__ C,
    int64_t M,
    int64_t N,
    int64_t K,
    int64_t lda,
    int64_t ldb,
    int64_t ldc
) {
    at::native::cpublas::brgemm(
        M, N, K,
        lda, ldb, ldc,
        1.f, accum ? 1.f : 0.f,
        A, B, C);
}
```

Pull Request resolved: pytorch#136255
Approved by: https://github.com/jgong5, https://github.com/jansel
`kernel_micro_gemm` generated using BRGEMM:

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov @rec
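For readers who want to see the template exercised end to end, here is a rough usage sketch. The config knobs (`max_autotune`, `max_autotune_gemm_backends`) and the plain `nn.Linear` model are assumptions about how Inductor selects the C++ GEMM template, not something stated in this PR; the BRGEMM FP16 path additionally requires a CPU with AMX FP16 support.

```python
import torch
import torch._inductor.config as inductor_config

# Assumption: the C++ GEMM template (and hence the BRGEMM micro-kernel on
# AMX-FP16-capable CPUs) is chosen through Inductor's max-autotune path.
inductor_config.max_autotune = True
inductor_config.max_autotune_gemm_backends = "CPP,ATEN"


class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(512, 512, bias=False)

    def forward(self, x):
        return self.linear(x)


if __name__ == "__main__":
    with torch.no_grad():
        model = M().to(dtype=torch.half).eval()
        x = torch.randn(128, 512, dtype=torch.half)
        compiled = torch.compile(model)
        # On supported hardware the generated kernel_micro_gemm calls
        # at::native::cpublas::brgemm, as shown in the commit message above.
        out = compiled(x)
```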