CARVIEW |
Navigation Menu
-
Notifications
You must be signed in to change notification settings - Fork 24.7k
Ensure SWA boundary conditions w.r.t. definition #133773
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
đź”— Helpful Linksđź§Ş See artifacts and rendered test results at hud.pytorch.org/pr/133773
Note: Links to docs will display an error until the docs builds have been completed. âś… You can merge normally! (1 Unrelated Failure)As of commit bdf0be1 with merge base 73fde0d ( FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're looking to improve the UX of these functions, let's go all the way and expand the current documentation to include what decay should be used for in get_ema_multi_avg_fn
and get_ema_avg_fn
. Then, instead of using an assert, we can throw a ValueError if a bad decay was given as input, similar to how we handle bad inputs for optimizer constructors. https://github.com/pytorch/pytorch/blob/main/torch/optim/adamw.py#L55
Great idea, will do that. |
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as |
@SkBlaz do you still plan on bringing this over the finish line? |
yes, sorry for some delay. Will do it in the next 4 days
…On Thu, Oct 24, 2024, 18:56 Jane (Yuan) Xu ***@***.***> wrote:
@SkBlaz <https://github.com/SkBlaz> do you still plan on bringing this
over the finish line?
—
Reply to this email directly, view it on GitHub
<#133773 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACMSERA5N3PIOGM3GDKAX73Z5ERDHAVCNFSM6AAAAABMVTYTRWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIMZVG44TCMZQHA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
@janeyx99 please lmk if this is what you had in mind. |
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failedReason: 2 jobs have failed, first few of them are: trunk / macos-py3-arm64 / test (default, 1, 3, macos-m1-stable), trunk / macos-py3-arm64 / test (default, 3, 3, macos-m1-stable) Details for Dev Infra teamRaised by workflow job |
@janeyx99 hi, the failing builds don't seem related to this PR. Any idea? |
@pytorchbot merge -r |
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
According to the documentation, decay is a number in [0,1] range. An inspection of `swa_utils.py` indicates there are no checks for invalid values of `decay`. Adding asserts as suggested in this PR ensures valid compute range (one way to enforce correct behavior, there are perhaps more suitable ones). Papers `torch` cites for reference idea/implementation also consider exclusively this range (e.g., https://arxiv.org/pdf/2310.04415).
Successfully rebased |
bb58258
to
bdf0be1
Compare
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
According to the documentation, decay is a number in [0,1] range,[ i.e.](https://pytorch.org/docs/stable/optim.html) ``` Decay is a parameter between 0 and 1 that controls how fast the averaged parameters are decayed. If not provided to get_ema_multi_avg_fn, the default is 0.999. ``` An inspection of `swa_utils.py` indicates there are no checks for invalid values of `decay`. Adding asserts as suggested in this PR ensures valid compute range (one way to enforce correct behavior, there are perhaps more suitable ones). Papers `torch` cites for reference idea/implementation also consider exclusively this range (e.g., https://arxiv.org/pdf/2310.04415). Fixes pytorch#133772 Pull Request resolved: pytorch#133773 Approved by: https://github.com/janeyx99
According to the documentation, decay is a number in [0,1] range, i.e.
An inspection of
swa_utils.py
indicates there are no checks for invalid values ofdecay
. Adding asserts as suggested in this PR ensures valid compute range (one way to enforce correct behavior, there are perhaps more suitable ones). Paperstorch
cites for reference idea/implementation also consider exclusively this range (e.g., https://arxiv.org/pdf/2310.04415).Fixes #133772