[PyTorch] Use 128-bit vectors for ARM64 #137426
The correct vector length for ARM64 is 128 bits (16 bytes). We were previously using double this, apparently just because that would be the same length as AVX2. Differential Revision: [D63984039](https://our.internmc.facebook.com/intern/diff/D63984039/) [ghstack-poisoned]
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/137426
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 7370baf with merge base fb0da32.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D63984039
The correct vector length for ARM64 is 128 bits (16 bytes). We were previously using double this, apparently just because that would be the same length as AVX2. Differential Revision: [D63984039](https://our.internmc.facebook.com/intern/diff/D63984039/) ghstack-source-id: 246599433 Pull Request resolved: #137426
@pytorchbot label "ciflow/linux-aarch64"
The correct vector length for ARM64 is 128 bits (16 bytes). We were previously using double this, apparently just because that would be the same length as AVX2. Differential Revision: [D63984039](https://our.internmc.facebook.com/intern/diff/D63984039/) cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang [ghstack-poisoned]
Pull Request resolved: #137426 The correct vector length for ARM64 is 128 bits (16 bytes). We were previously using double this, apparently just because that would be the same length as AVX2. ghstack-source-id: 246714926 Differential Revision: [D63984039](https://our.internmc.facebook.com/intern/diff/D63984039/)
Pull Request resolved: #137426 The correct vector length for ARM64 is 128 bits (16 bytes). We were previously using double this, apparently just because that would be the same length as AVX2. ghstack-source-id: 246998171 Differential Revision: [D63984039](https://our.internmc.facebook.com/intern/diff/D63984039/)
… for ARM64" The correct vector length for ARM64 is 128 bits (16 bytes). We were previously using double this, apparently just because that would be the same length as AVX2. Differential Revision: [D63984039](https://our.internmc.facebook.com/intern/diff/D63984039/) cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang [ghstack-poisoned]
Pull Request resolved: #137426 The correct vector length for ARM64 is 128 bits (16 bytes). We were previously using double this, apparently just because that would be the same length as AVX2. ghstack-source-id: 247065546 Differential Revision: [D63984039](https://our.internmc.facebook.com/intern/diff/D63984039/)
The correct vector length for ARM64 is 128 bits (16 bytes). We were previously using double this, apparently just because that would be the same length as AVX2. Differential Revision: [D63984039](https://our.internmc.facebook.com/intern/diff/D63984039/) cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 malfet snadampal milpuz01 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang [ghstack-poisoned]
There is no guarantee that `len` here is enough for a full vector. This was causing at least one test failure on #137426. Differential Revision: [D64857786](https://our.internmc.facebook.com/intern/diff/D64857786/) [ghstack-poisoned]
Pull Request resolved: #137426 The correct vector length for ARM64 is 128 bits (16 bytes). We were previously using double this, apparently just because that would be the same length as AVX2. ghstack-source-id: 249693590 Differential Revision: [D63984039](https://our.internmc.facebook.com/intern/diff/D63984039/)
Pull Request resolved: #137426 The correct vector length for ARM64 is 128 bits (16 bytes). We were previously using double this, apparently just because that would be the same length as AVX2. ghstack-source-id: 249869364 Differential Revision: [D63984039](https://our.internmc.facebook.com/intern/diff/D63984039/)
This issue is one of two inductor bugs blocking the landing of #137426. It turned out to be simple. Differential Revision: [D64734116](https://our.internmc.facebook.com/intern/diff/D64734116/) Pull Request resolved: #138542 Approved by: https://github.com/jgong5, https://github.com/malfet ghstack dependencies: #138486 Co-authored-by: leslie-fang-intel <leslie.fang@intel.com>
There is no guarantee that `len` here is enough for a full vector. This was causing at least one test failure on #137426. Differential Revision: [D64857786](https://our.internmc.facebook.com/intern/diff/D64857786/) Pull Request resolved: #138744 Approved by: https://github.com/jgong5, https://github.com/malfet ghstack dependencies: #138486, #138542, #138655, #138716
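The `len` problem noted above is the usual tail-handling hazard in vectorized loops: a full-width load is only safe when at least one vector's worth of elements remains. The following is a minimal sketch of that pattern, not the actual fix from #138744. `Vec4f`, `load_full`, `load_partial`, and `sum_with_tail` are hypothetical names invented for illustration, and the 4-lane width assumes `float` elements in a 128-bit vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumption: 4 floats per 128-bit vector (16 bytes / 4 bytes per float).
constexpr std::size_t kLanes = 4;

// Hypothetical stand-in for a 4-lane SIMD vector; not a PyTorch type.
struct Vec4f {
  float v[kLanes];
};

// Full-width load: always reads kLanes elements, so the caller must
// guarantee that many elements are available at p.
inline Vec4f load_full(const float* p) {
  Vec4f r;
  std::memcpy(r.v, p, sizeof(r.v));
  return r;
}

// Partial load: reads only `count` elements and zero-fills the rest.
inline Vec4f load_partial(const float* p, std::size_t count) {
  assert(count <= kLanes);
  Vec4f r{};  // zero-initialized lanes
  std::memcpy(r.v, p, count * sizeof(float));
  return r;
}

// Sums `len` floats. Full-width loads are used only while a whole vector
// remains; any leftover elements go through the partial load.
inline float sum_with_tail(const float* data, std::size_t len) {
  assert(data != nullptr || len == 0);
  Vec4f acc{};
  std::size_t i = 0;
  for (; i + kLanes <= len; i += kLanes) {
    const Vec4f x = load_full(data + i);
    for (std::size_t k = 0; k < kLanes; ++k) acc.v[k] += x.v[k];
  }
  if (i < len) {
    const Vec4f x = load_partial(data + i, len - i);
    for (std::size_t k = 0; k < kLanes; ++k) acc.v[k] += x.v[k];
  }
  return acc.v[0] + acc.v[1] + acc.v[2] + acc.v[3];
}
```

With this structure, a call such as `sum_with_tail(data, 3)` never touches memory past the third element, because the full-width loop body is skipped entirely when `len` is smaller than one vector.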
@pytorchbot merge
Merge failed. Reason: Approvers from one of the following sets are needed:
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
Stack from ghstack (oldest at bottom):
The correct vector length for ARM64 is 128 bits (16
bytes). We were previously using double this, apparently just because
that would be the same length as AVX2.
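To make the size difference in the description above concrete, here is a small sketch of how the vector width in bytes translates into lanes per vector. It is illustrative only and does not reproduce PyTorch's code: `lanes`, `kNeonBytes`, and `kAvx2Bytes` are hypothetical names, and the only facts assumed are the 16-byte and 32-byte widths stated in the description.

```cpp
#include <cstddef>
#include <cstdint>

// Register widths in bytes, as stated in the PR description.
constexpr std::size_t kNeonBytes = 16;  // 128-bit ARM64 vector
constexpr std::size_t kAvx2Bytes = 32;  // 256-bit AVX2 vector

// Hypothetical helper: number of elements of type T that fit in a
// vector register of `vector_bytes` bytes.
template <typename T>
constexpr std::size_t lanes(std::size_t vector_bytes) {
  return vector_bytes / sizeof(T);
}

// A 128-bit vector holds half as many elements as a 256-bit one.
static_assert(lanes<float>(kNeonBytes) == 4, "4 floats per 128-bit vector");
static_assert(lanes<float>(kAvx2Bytes) == 8, "8 floats per 256-bit vector");
static_assert(lanes<double>(kNeonBytes) == 2, "2 doubles per 128-bit vector");
static_assert(lanes<std::int8_t>(kNeonBytes) == 16, "16 bytes per 128-bit vector");
```

Code that previously assumed 8 `float` lanes per vector on ARM64 would therefore see 4 after a change of this kind, which is one reason loops that assume a fixed lane count need re-checking.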
Differential Revision: D63984039
cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @malfet @snadampal @milpuz01 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov