Works around #136543.

This fix solves the issue only in the context of the ONNX exporter, but the issue also occurs in other contexts.
The bug happens when the method run_decompositions is called. The failing pattern is assumed to be view(transpose(x, ...)). This pattern is replaced by view(flatten(transpose(x, ...))). By changing the dimensions, the strides are updated as well, and run_decompositions no longer fails. This would be inefficient on a 1-D tensor, but transpose would not be used in that case. The extra node appears in the final ONNX graph but is removed after optimization. The final ONNX graph should not be affected, and no performance loss should be observed for the ONNX model.
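For illustration, a minimal standalone sketch of the eager-mode behavior behind this pattern (not code from this PR): transpose produces a non-contiguous tensor whose strides view cannot handle, and inserting contiguous (or flatten, which copies when it cannot return a view) restores a dense layout.

import torch

x = torch.randn(2, 3, 4)
t = x.transpose(1, 2)  # shape (2, 4, 3), strides (12, 1, 4): no longer contiguous
# t.view(2, 12)  # would raise: "view size is not compatible with input tensor's size and stride"
y = t.contiguous().view(2, 12)  # copy into a dense layout first, then view succeeds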
Is this intended to be merged in? If we can address the root cause, that would be best imo, since we have had issues with too many FX passes where we don't know which one to look at when an error is introduced. I can run a git bisect today to try to pin down the commit.
If we do decide to merge this as a temporary fix, I recommend moving the logic to _fx_passes, where all fx passes live.
The fix is just one pass over the FX graph, done in place, so it should be very fast. I can move the function to _fx_passes.py if that's what you mean.
I am more concerned about complexity and robustness than speed. But since this should be a temporary fix, it is useful to unblock us for now. I would just make it very clear that it should be removed when the issue is resolved, and that it should be removed before the 2.6 release.
with graph.inserting_after(node):
    new_node = graph.call_function(torch.ops.aten.contiguous.default, args=(node,))
    node.replace_all_uses_with(new_node)
    # new_node is replaced as well, so we manually revert the replacement
    new_node.update_arg(0, node)
    node.users = {new_node: None}
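For context, a hedged sketch of how this snippet could sit inside a full pass over the FX graph; the function name and the exact pattern check are illustrative assumptions, not necessarily what the PR does:

import torch
from torch.fx import GraphModule

def insert_contiguous_between_transpose_and_view(gm: GraphModule) -> GraphModule:
    # Rewrite view(transpose(x, ...)) into view(contiguous(transpose(x, ...)))
    # so that run_decompositions sees strides it can handle.
    graph = gm.graph
    for node in list(graph.nodes):
        if node.op != "call_function" or node.target != torch.ops.aten.transpose.int:
            continue
        # Only touch transposes that feed a view.
        if not any(
            user.op == "call_function" and user.target == torch.ops.aten.view.default
            for user in node.users
        ):
            continue
        with graph.inserting_after(node):
            new_node = graph.call_function(torch.ops.aten.contiguous.default, args=(node,))
            node.replace_all_uses_with(new_node)
            # new_node is replaced as well, so we manually revert the replacement
            new_node.update_arg(0, node)
            node.users = {new_node: None}
    graph.lint()
    gm.recompile()
    return gm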
justinchuby changed the title from "[ONNX] Insert flatten node between transpose and view before calling run_decompositions" to "[ONNX] Insert contiguous node between transpose and view before calling run_decompositions" on Oct 7, 2024.