Thanks! I pushed some fixes, the most significant being `INT_MAX`, please double-check.
⚠️ Note to self: I'll need to perform MSVC-internal changes to add the new file `regex.cpp`, and I will likely need to push changes to deal with `/clr:pure`.
Labels: bug (Something isn't working), LWG (Library Working Group issue), regex (meow is a substring of homeowner)
Fixes #5435. Fixes #5291.
The actual work is done in two new functions, `__std_regex_transform_primary_char/wchar_t`, which are basically 1:1 copies of `_Strxfrm()` and `_Wcsxfrm()` but pass different flags to `__crtLCMapStringA/W`. I also took the liberty to correct the SAL annotations.

`__crtLCMapStringA/W` are declared in `awint.hpp`, which includes `yvals.h`. I'm uncertain if this is the best approach, but I undefined `_ENFORCE_ONLY_CORE_HEADERS` so that `awint.hpp` can be included.

`transform_primary` has to check the types of the collate facets using RTTI, so I made the function always return an empty string when dynamic RTTI is disabled/`_CPPRTTI` is undefined. The implementation itself is heavily based on `collate::do_transform` (including the change in #5431). It also needs access to the internals of `collate`, so I made `_Regex_traits` a friend of it.

There is a behavior change for the C locale: As I explained in more detail in #5435, the traits requirement in [re.req]/20 is actually misleading, since it is wrong for precisely one locale: the C locale (or the POSIX locale, see the collation order definition here: https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap07.html#tag_07_03_02_06). Since the equivalence classes are derived from POSIX and the definition of `regex_traits::transform_primary` also alludes to "primary sort keys", which indirectly references terminology from the POSIX standard (https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap07.html#tag_07_03_02), I think we should do as POSIX says: "A" should not match `[[=a=]]`.

This has consequences:

- When I implemented #5392 (`<regex>`: Properly parse and match collating symbols and equivalences), I assumed [re.req]/20, so I didn't add any character translation using `translate` and `translate_nocase` when parsing equivalences. Now we have to add such logic in `_Parser::_Do_ex_class2` to handle potentially case-sensitive sort keys when case-insensitive regexes are used (else "A" would even fail to match `[[=A=]]`).
- A number of test cases (some of my own making) failed, because they all assumed that lower and upper case characters are equivalent in the C locale.
- Since matching and parsing of equivalences no longer go through `collate::transform`, related tests no longer have to be skipped under IDL mismatch.
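
To make the C-locale behavior change easier to probe, here is a minimal sketch using only the public `std::regex` interface. The helper name `full_match` is hypothetical and not part of this PR. The only result that should hold on every implementation is that a character matches its own equivalence class. Whether "A" matches `[[=a=]]` depends on the standard library and its version; with the change described above, the intent is that it does not match in the C locale unless the regex is case-insensitive.

```cpp
#include <regex>
#include <string>

// Hypothetical helper for experimentation (not part of the PR):
// returns true if `text` is matched in full by `pattern`.
// The regex picks up the global locale, which is the C locale by default.
bool full_match(const std::string& text, const std::string& pattern,
                std::regex_constants::syntax_option_type flags =
                    std::regex_constants::ECMAScript) {
    const std::regex re(pattern, flags);
    return std::regex_match(text, re);
}

// Cases worth checking against a given standard library:
//   full_match("a", "[[=a=]]")   a character and its own equivalence class
//   full_match("A", "[[=a=]]")   the case affected by the behavior change
//   full_match("A", "[[=A=]]",
//              std::regex_constants::ECMAScript | std::regex_constants::icase)
//                                the case that needs translate_nocase
//                                handling when parsing equivalences
```

Calling `full_match("A", "[[=a=]]")` before and after the change is a quick way to observe the difference, since the older behavior treated upper and lower case as equivalent in the C locale.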