Reported by Aditya Anand to our internal C++ mailing list.
On x86, it's barely possible for stable_sort() to trigger an integer overflow when repeatedly doubling a chunk size. This happens when sorting over a billion 1-byte elements; the test case below uses 2^30 + 2 bytes. Surprisingly, this bug has been present since the beginning of time; we've historically been pretty good about checking for overflow before multiplying (e.g. in geometric growth), but missed this case.
We can avoid this integer overflow by testing whether we're done before doubling the chunk size. As the comment explains, I've carefully transformed the test to avoid even/odd issues. ranges::stable_sort was equally affected.
I'm not altering the initial doubling in this loop. Because we start with 32, i.e. 2^5, the first doubling is always safe (e.g. from 2^29 to 2^30). It's only the second doubling that might overflow (e.g. from 2^30 to 2^31 here). At this time I simply don't care about pathological _Iter_diff_t that might have a limit other than a normal power of 2. (Note that this logic still works for 16-bit ultra-narrow difference types; it would have to be extremely pathological for this to be an issue.)
Manual Testing
Because this involves allocating an enormous amount of memory, automated test coverage isn't appropriate. I manually tested the fix, and the detailed comments should be sufficient to avoid regressions during future maintenance.
C:\Temp>type meow.cpp
#include <algorithm>
#include <print>
#include <vector>

using namespace std;

static_assert(sizeof(void*) == 4, "This test requires x86.");

int main() {
    println("_MSVC_STL_UPDATE: {}", _MSVC_STL_UPDATE);

    const int count = (1 << 30) + 2;
    vector<unsigned char> v(count);

    for (int i = 0; i < count; ++i) {
        v[i] = i % 10;
    }
    println("Before std::stable_sort: {}", is_sorted(v.begin(), v.end()));
    stable_sort(v.begin(), v.end());
    println(" After std::stable_sort: {}", is_sorted(v.begin(), v.end()));

    for (int i = 0; i < count; ++i) {
        v[i] = i % 10;
    }
    println("Before ranges::stable_sort: {}", is_sorted(v.begin(), v.end()));
    ranges::stable_sort(v);
    println(" After ranges::stable_sort: {}", is_sorted(v.begin(), v.end()));
}
C:\Temp>cl /EHsc /nologo /W4 /std:c++latest /MT /O2 meow.cpp /link /largeaddressaware
meow.cpp
Plain VS 2022 17.14.12 x86 simply crashes. After this PR:

C:\Temp>meow && echo SUCCESS || echo FAILURE
_MSVC_STL_UPDATE: 202508
Before std::stable_sort: false
After std::stable_sort: true
Before ranges::stable_sort: false
After ranges::stable_sort: true
SUCCESS
These tests are compiled with optimizations so they finish in a reasonable amount of time. The behavior is the same, just agonizingly slower, with /MTd /Od.
Is the check in '_Partition_point_unchecked' correct? It can still overflow.
Good catch, that's bogus. It's harder to repro in practice - because we skip over 2 + 4 + 8 + ... elements, we actually handle up to 2 billion. It would require an artificially small iterator difference type (usually only seen in library test suites).
It's trivial to express the check properly with numeric_limits so I've pushed a commit. Here it isn't critical that we endlessly double (we reach the correct asymptotic complexity even if we stop doubling early), so guarding the doubling alone is still fine.