Fix RMSNorm doc per #136597 #136727
Helpful Links
See artifacts and rendered test results at hud.pytorch.org/pr/136727
Note: Links to docs will display an error until the docs builds have been completed.
No Failures
As of commit 25c7820 with merge base 13b0baf.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Fixes #136597
Thanks!
Do we actually have the epsilon inside the square root, or do we add it outside? We have had issues with this kind of thing in the optimizer, where the eps effectively became zero, so you might want to double-check with @janeyx99 on this (not for this PR, obviously).
It's really inside: pytorch/aten/src/ATen/native/layer_norm.cpp, line 295 in 2202eb5
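To make the distinction in the question above concrete, here is a minimal pure-Python sketch of the two possible placements of eps. It is not the ATen kernel referenced in the reply; the function names `rms_eps_inside` and `rms_eps_outside` are hypothetical and exist only for this illustration.

```python
import math

def rms_eps_inside(x, eps):
    """RMS with eps added under the square root: sqrt(eps + mean(x^2))."""
    return math.sqrt(eps + sum(v * v for v in x) / len(x))

def rms_eps_outside(x, eps):
    """Alternative placement, shown only for contrast: sqrt(mean(x^2)) + eps."""
    return math.sqrt(sum(v * v for v in x) / len(x)) + eps

# An all-zero input makes the difference between the two placements obvious.
x = [0.0, 0.0, 0.0, 0.0]
eps = 1e-6
inside = rms_eps_inside(x, eps)    # denominator is sqrt(eps)
outside = rms_eps_outside(x, eps)  # denominator is eps itself
```

With eps under the root, the smallest possible denominator is sqrt(eps) rather than eps, so the two conventions produce different outputs for inputs close to zero.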
@pytorchbot merge
Merge started: Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job waited more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
Merge started: Your change will be merged once all checks pass (ETA 0-4 hours).
Merge failed. Reason: Comment with id 2378345814 not found.
@pytorchbot merge
Merge started: Your change will be merged once all checks pass (ETA 0-4 hours).
@mikaylagawarecki @albanD I have noticed a small detail: It says
Using
$$y_i = \frac{x_i}{\text{RMS}(x)} \gamma_i, \quad \text{where} \quad \text{RMS}(x) = \sqrt{\epsilon + \frac{1}{n} \sum_{i=1}^{n} x_i^2}$$
What do you think about that?
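For readers who want to check the proposed formula numerically, the following is a small pure-Python sketch of it, with eps under the square root as written above. It is an illustration only and not the `torch.nn.RMSNorm` implementation; the function name `rms_norm` and the default `eps=1e-6` are assumptions chosen for this example.

```python
import math

def rms_norm(x, gamma, eps=1e-6):
    """Apply y_i = x_i / RMS(x) * gamma_i with RMS(x) = sqrt(eps + mean(x^2)).

    x and gamma are equal-length lists of floats; eps=1e-6 is an arbitrary
    illustrative default, not necessarily what any library uses.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    rms = math.sqrt(eps + mean_sq)
    return [v / rms * g for v, g in zip(x, gamma)]

# With gamma = 1 and eps = 0, the output has a mean square of 1.
x = [3.0, 4.0]
gamma = [1.0, 1.0]
y = rms_norm(x, gamma, eps=0.0)
```

Setting `eps=0.0` here is only for the sanity check; in practice a small positive eps keeps the denominator away from zero for all-zero inputs.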
When will this be uploaded to the website?
It's on
Ah, okay, thanks - I was wondering why it was not on stable yet:
Fixes #136597 (remove incorrect sqrt around `RMS(x)`)
Stack from ghstack (oldest at bottom):