[Inductor][CPU] Enable the oneDNN Linear fusion for special case #139172
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/139172
Note: Links to docs will display an error until the doc builds have completed.
✅ No failures as of commit 61d0eed with merge base 52c80f6.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
torch/_decomp/decompositions.py (Outdated)

```python
return all(
    st1 == st2 * s2 if s2 != 1 else True
    for (st1, st2, s2) in zip(t1_stride[:-2], t1_stride[1:-1], t1_shape[1:-1])
```

Line under review:

```python
(isinstance(size, utils.IntWithoutSymInt) and size == 1) or left == right
```
Checking `size == 1` is not enough?
Yes, in the PreCI we hit a UT where the size appears to be an unbacked int (the symbol `u0`), which failed in `_maybe_evaluate_static`.
@ezyang Is this the recommended way to check that `size` is concrete and equal to 1?
No it's not... usually `guard_size_oblivious(size == 1)` is what you want here.
Thanks for the comment; changed to use `guard_size_oblivious`.
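For context, a minimal sketch of the recommended pattern (the helper name and example shapes below are illustrative, not the exact code from this PR):

```python
from torch.fx.experimental.symbolic_shapes import guard_size_oblivious

def strides_match_ignoring_size1_dims(shape, stride, expected_stride) -> bool:
    # A dim of size 1 cannot affect the memory layout, so its stride is ignored.
    # guard_size_oblivious lets this work with unbacked SymInts (e.g. u0):
    # size-oblivious reasoning assumes such a size is not 0 or 1, so the
    # comparison resolves to False instead of raising a data-dependent error.
    return all(
        guard_size_oblivious(size == 1) or actual == expected
        for size, actual, expected in zip(shape, stride, expected_stride)
    )

# Concrete example: a (4, 1, 4096) activation whose size-1 dim carries an
# "odd" stride of 128 still matches the dense layout (4096, 4096, 1).
print(strides_match_ignoring_size1_dims(
    (4, 1, 4096), (4096, 128, 1), (4096, 4096, 1)
))  # True
```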
torch/_decomp/decompositions.py (Outdated)

```python
return all(
    st1 == st2 * s2 if s2 != 1 else True
    for (st1, st2, s2) in zip(t1_stride[:-2], t1_stride[1:-1], t1_shape[1:-1])
```

Line under review:

```python
(isinstance(size, utils.IntWithoutSymInt) and size == 1) or left == right
```
@ezyang Is this the recommended way to check that `size` is concrete and equal to 1?
nope don't do it this way
sympy is BANNED from decompositions.py; it should never be necessary to explicitly work with sympy there.
Changed to not use sympy.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
[Inductor][CPU] Enable the oneDNN Linear fusion for special case (pytorch#139172)

**Summary**
In the case of LLaMA2, a linear operation with an activation size of `(4, 1, 4096)` and a stride of `(4096, 128, 1)` is decomposed into `matmul`, and the decomposition of `matmul` falls back to `bmm` due to a strict contiguity check. We can align the contiguity check with ATen by skipping dims of size 1, enabling decomposition into `mm` instead.

**Test Plan**
```
python -u -m pytest -s -v test/inductor/test_mkldnn_pattern_matcher.py -k test_linear_input_non_contiguous_3D_wo_bias
```

Pull Request resolved: pytorch#139172
Approved by: https://github.com/jgong5, https://github.com/ezyang
Stack from ghstack (oldest at bottom):

**Summary**
In the case of LLaMA2, a linear operation with an activation size of `(4, 1, 4096)` and a stride of `(4096, 128, 1)` is decomposed into `matmul`, and the decomposition of `matmul` falls back to `bmm` due to a strict contiguity check. We can align the contiguity check with ATen by skipping dims of size 1, enabling decomposition into `mm` instead.

**Test Plan**
```
python -u -m pytest -s -v test/inductor/test_mkldnn_pattern_matcher.py -k test_linear_input_non_contiguous_3D_wo_bias
```
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov
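For illustration, a rough repro sketch of the activation pattern described in the summary, built with `as_strided`; the actual unit test in test_mkldnn_pattern_matcher.py may construct the input differently:

```python
import torch

# A (4, 1, 4096) activation with stride (4096, 128, 1): the size-1 middle dim
# carries an unusual stride, but the tensor is still dense in memory, so the
# matmul decomposition should be able to lower to mm rather than bmm.
base = torch.randn(4 * 4096)
x = base.as_strided((4, 1, 4096), (4096, 128, 1))

mod = torch.nn.Linear(4096, 4096, bias=False)
y = torch.compile(mod)(x)
print(y.shape)  # torch.Size([4, 1, 4096])
```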