Fix example: Address broadcasting error in the addition of `attn_bias… #135427
…` and `attn_mask`, and correct device assignment for newly created variables in the method.
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/135427
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit fecdc84 with merge base 8334cb2.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label "topic: not user facing"
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
pytorch#135427) …` and `attn_mask`, and correct device assignment for newly created variables in the method.

Fix example: Address broadcasting error in the addition of `attn_bias` and `attn_mask`, and correct device assignment for newly created variables in the method.

1. Adding `attn_bias += attn_mask` results in a broadcasting error. The expected shape of `attn_bias` is (L, S), so the output should also have the shape (L, S). However, when the input shape is (N, num_heads, L, S), broadcasting occurs, leading to an output shape of (N, num_heads, L, S), which is not desired.
2. `attn_bias` is a newly created variable within the method, but it is not assigned to the correct device.

**This is my retry of PR pytorch#130209. The PR has been merged into commit `d4a79d4a7c746068d25fe5cf9333495561f4ce1f`, but the modifications were overwritten by subsequent commits.**

Co-authored-by: mikaylagawarecki <mikaylagawarecki@gmail.com>

@mikaylagawarecki provided a more elegant implementation.

Pull Request resolved: pytorch#135427
Approved by: https://github.com/ezyang
Fix example: Address broadcasting error in the addition of `attn_bias` and `attn_mask`, and correct device assignment for newly created variables in the method.

1. Adding `attn_bias += attn_mask` results in a broadcasting error. The expected shape of `attn_bias` is (L, S), so the output should also have the shape (L, S). However, when the input shape is (N, num_heads, L, S), broadcasting occurs, leading to an output shape of (N, num_heads, L, S), which is not desired.
2. `attn_bias` is a newly created variable within the method, but it is not assigned to the correct device.

**This is my retry of PR #130209. The PR has been merged into commit `d4a79d4a7c746068d25fe5cf9333495561f4ce1f`, but the modifications were overwritten by subsequent commits.**

Co-authored-by: mikaylagawarecki <mikaylagawarecki@gmail.com>
@mikaylagawarecki provided a more elegant implementation.
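
For readers following along, here is a minimal sketch of the two issues described in this PR. The tensor sizes are made up for illustration. The out-of-place addition shown is one way to avoid the error and is an assumption, not necessarily the exact change that was merged.

```python
import torch

# Illustrative sizes only (assumptions, not taken from the PR).
N, num_heads, L, S = 2, 3, 4, 5

# An additive float mask that carries batch and head dimensions.
attn_mask = torch.randn(N, num_heads, L, S)

# The bias is created inside the method with shape (L, S). Passing device=
# keeps it on the same device as the inputs (taken from attn_mask here).
attn_bias = torch.zeros(L, S, dtype=attn_mask.dtype, device=attn_mask.device)

# In-place addition has to store the result in the (L, S) tensor, which
# cannot hold the broadcast (N, num_heads, L, S) result, so it raises.
try:
    attn_bias += attn_mask
    inplace_failed = False
except RuntimeError:
    inplace_failed = True

# Out-of-place addition allocates a new tensor with the broadcast shape.
combined = attn_mask + attn_bias

print(inplace_failed)
print(tuple(combined.shape))
```

Without the `device=` argument, `torch.zeros` allocates on the default device, so adding the bias to a mask that lives on another device would fail with a device mismatch.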