[inductor] force contiguous layout for implicit fallback #140996
Conversation
🔗 Helpful links: see artifacts and rendered test results at hud.pytorch.org/pr/140996. Note: links to docs will display an error until the docs builds have completed.
❌ 1 new failure as of commit 5a54d8e with merge base eff2217.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Should we force them to be contiguous or force them to follow the FX strides?
The problem is we don't have the real eager strides; the strides stored in the FX graph may not match what eager actually produces. Richard also mentioned that having the real eager strides is needed for custom ops. That gives us more motivation to pass the real eager strides to Inductor.
@shunting314 If that's true then we need to avoid changing eager strides I think haha.
But we should normally be free to change the strides of saved activations since they are not user-visible. Keeping them on eager strides may be too restrictive and would leave little opportunity for padding/layout optimization.
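For readers following along, here is a small illustration (generic, not from the PR) of why this matters: a layout change such as channels-last keeps the same values but different strides, which is exactly what a stride-sensitive eager kernel can trip over.

```python
import torch

x = torch.randn(8, 3, 32, 32)                    # default contiguous layout
y = x.to(memory_format=torch.channels_last)      # same values, new layout

print(x.stride())        # (3072, 1024, 32, 1) -- row-major strides
print(y.stride())        # (3072, 1, 96, 3)    -- channels-last strides
print(torch.equal(x, y)) # True: values match, but a kernel that indexes raw
                         # memory assuming contiguity would misread y
```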
@pytorchbot merge -f Line error for nested tensors is unrelated.
❌ 🤖 pytorchbot command failed.
@pytorchbot merge -f "Line error for nested tensors is unrelated." |
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).
Pull Request resolved: pytorch#140996. Approved by: https://github.com/jansel
Btw, by default we want custom operators to match their strides in eager mode, so forcing .contiguous() is not the best choice. This unfortunately broke the vllm-compile integration, but at least there's a workaround (setting the tags on the custom ops).
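As a concrete illustration of that workaround, here is a sketch of registering a custom op with a stride tag. The op name `mylib::scale` is made up, and this assumes a recent PyTorch where `torch.library.define` accepts `tags` and `torch.library.register_fake` exists; `torch.Tag.needs_fixed_stride_order` is the tag that asks the compiler to preserve the op's input stride order instead of forcing contiguous inputs.

```python
import torch

# Hypothetical example op; the tag tells the compiler to keep the stride
# order this op sees in the graph rather than making inputs contiguous.
torch.library.define(
    "mylib::scale",
    "(Tensor x) -> Tensor",
    tags=(torch.Tag.needs_fixed_stride_order,),
)

@torch.library.impl("mylib::scale", "cpu")
def scale_cpu(x):
    # Eager kernel; imagine it is sensitive to input strides.
    return x * 2.0

@torch.library.register_fake("mylib::scale")
def scale_fake(x):
    # Shape/dtype propagation for compilation.
    return torch.empty_like(x)
```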
Stack from ghstack (oldest at bottom):

Fix #140462.

Horace found that when we implicitly fall back to eager, some eager kernels may not work correctly if Inductor provides non-contiguous inputs (due to padding etc.). The original issue was found for the backward op of weight_norm. The fix in this PR is a general one: we force the inputs to all implicit fallback kernels to be contiguous.

I had to refactor the code a bit to make this work. Previously we applied layout constraints in `GraphLowering.run_node`, but we look for implicit fallbacks in `call_function`. The problem is that when we set up an implicit fallback in `call_function` with a layout constraint, we no longer get a chance to apply that constraint. The refactor moves the code that applies layout constraints into `call_function`.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov
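To make the mechanism concrete, here is a minimal sketch of the idea described above. This is not the actual Inductor code; names like `force_contiguous`, `call_function`, and `lowerings` are illustrative. It just shows the shape of the logic: when an op has no lowering, force its tensor arguments contiguous at the point where the fallback is set up, so the constraint is never skipped.

```python
import torch

def force_contiguous(args, kwargs):
    # Layout constraint: make every tensor argument contiguous before
    # handing it to the eager kernel.
    def fix(a):
        return a.contiguous() if isinstance(a, torch.Tensor) else a
    return (
        tuple(fix(a) for a in args),
        {k: fix(v) for k, v in kwargs.items()},
    )

def call_function(op, args, kwargs, lowerings):
    if op not in lowerings:
        # Implicit fallback: apply the layout constraint here, where the
        # fallback is discovered, instead of earlier in run_node where it
        # could be missed.
        args, kwargs = force_contiguous(args, kwargs)
        return op(*args, **kwargs)  # run the eager kernel
    return lowerings[op](*args, **kwargs)
```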