LWG-3899 co_yielding elements of an lvalue generator is unnecessarily inefficient #5303

HNeitzel · 2025-02-18T01:05:42Z

Resolves #5111
I decided to look into this issue because I like coroutines and wanted to try contributing something.

It turns out that the implementation is quite straightforward: it just adds a yield_value overload that takes elements_of a generator by lvalue reference and is otherwise exactly the same as the existing overload that takes a generator by rvalue reference.

Since the LWG issue doesn't state specific numbers, and to test the implementation, I wrote a short benchmark and ran it against both the current STL (as shipped with VS 17.13.0) and my patched version, in both Debug and Release configurations.

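#include <algorithm>   // std::ranges::fold_left
#include <chrono>
#include <functional>  // std::plus
#include <generator>
#include <print>
#include <ranges>      // std::ranges::elements_of
#include <utility>     // std::move
constexpr int N = 100000;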
std::generator<int> f() {
  co_yield 1;
}
std::generator<int> g_rvalue()
{
  for (int i = 0; i < N; i++) {
    auto f1 = f();
    co_yield std::ranges::elements_of(std::move(f1));
  }
}
std::generator<int> g_lvalue()
{
  for (int i = 0; i < N; i++) {
    auto f1 = f();
    co_yield std::ranges::elements_of(f1);
  }
}
int main()
{
  {
    auto start = std::chrono::high_resolution_clock::now();
    int res = std::ranges::fold_left(g_rvalue(), 0, std::plus<int>());
    auto duration = std::chrono::high_resolution_clock::now() - start;
    std::println("With elements_of(rvalue generator)\nresult: {}, time: {}, per element: {}", res, std::chrono::duration_cast<std::chrono::microseconds>(duration), duration / N);
  }
  {
    auto start = std::chrono::high_resolution_clock::now();
    int res = std::ranges::fold_left(g_lvalue(), 0, std::plus<int>());
    auto duration = std::chrono::high_resolution_clock::now() - start;
    std::println("With elements_of(lvalue generator)\nresult: {}, time: {}, per element: {}", res, std::chrono::duration_cast<std::chrono::microseconds>(duration), duration / N);
  }
}

Results on my machine (time per element):

STL version   Debug, rvalue   Debug, lvalue   Release, rvalue   Release, lvalue
current       ~870ns          ~1700ns         ~49ns             91ns
patched       ~870ns          ~870ns          ~49ns             ~49ns

The results are a bit noisy across multiple runs, but they clearly show that the general overload of yield_value (which the current version uses for lvalue generators) takes almost twice as long per element as the generator-specialised overload, in both unoptimised and optimised builds. This is unsurprising, since the general overload wraps the range in an extra generator, resulting in two coroutine calls per element. The results also show that the difference disappears in the patched version, since the lvalue generator now also uses the specialised overload.

StephanTLavavej · 2025-02-18T15:39:39Z

Thanks, looks perfect! I've edited the PR title to remove the issue number - it doesn't get linked to anything there, and would appear in the git history of PR titles which otherwise only contain PR numbers.

I'll get this merged this week - we have a semi-manual process of merging simultaneously to the GitHub and MSVC-internal repos.

StephanTLavavej · 2025-02-19T09:14:27Z

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

StephanTLavavej · 2025-02-20T07:56:23Z

Thanks for implementing this LWG issue resolution, and congratulations on your first microsoft/STL commit! 💚 😻 🎉

This change is expected to ship in VS 2022 17.14 Preview 3.

LWG-3899 co_yielding elements of an lvalue generator is unnecessarily…

c8ed14a

… inefficient microsoft#5111

HNeitzel requested a review from a team as a code owner February 18, 2025 01:05

frederick-vs-ja approved these changes Feb 18, 2025

StephanTLavavej added the LWG (Library Working Group issue) and generator (C++23 generator) labels Feb 18, 2025

StephanTLavavej approved these changes Feb 18, 2025

StephanTLavavej changed the title ~~LWG-3899 co_yielding elements of an lvalue generator is unnecessarily inefficient #5111~~ LWG-3899 co_yielding elements of an lvalue generator is unnecessarily inefficient Feb 18, 2025

StephanTLavavej self-assigned this Feb 19, 2025

StephanTLavavej merged commit dfbe5ea into microsoft:main Feb 20, 2025
39 checks passed
