It seems really dumb that they are stackless. If you are saving/restoring the stack pointer anyway in your yield routine, it's trivial to point it at a block of memory you allocated when the coroutine was first created.
Is there no setjmp/longjmp happening? Are C++20 coroutines all just compiler sleight of hand, similar to Duff's device, with no real context switching?
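For what it's worth, the Duff's device comparison can be made concrete. Below is a rough hand-written sketch of the switch-based trick, not what any particular compiler actually emits for a C++20 coroutine. The `Counter` struct and its `resume()` method are hypothetical names made up for illustration. The point is that anything that has to survive a suspension lives in a heap- or caller-owned object rather than on a separate stack, and "resuming" is just an ordinary function call that jumps back to the saved position.

```cpp
// Hypothetical hand-rolled "stackless coroutine" that yields 0, 1, 2.
// State that must survive a suspension is stored in the object itself,
// so no second stack and no setjmp/longjmp is involved.
struct Counter {
    int state = 0;  // which resume point to jump to on the next call
    int i = 0;      // loop variable, hoisted out of the function body
    int value = 0;  // most recently yielded value

    // Returns true if a value was yielded, false once the sequence is done.
    bool resume() {
        switch (state) {
        case 0:
            for (i = 0; i < 3; ++i) {
                value = i;
                state = 1;
                return true;  // "yield": plain return to the caller
        case 1:;              // next call re-enters here, inside the loop
            }
        }
        state = 2;  // no case matches 2, so later calls skip the switch
        return false;
    }
};
```

Calling `resume()` repeatedly yields 0, 1, 2 and then returns false from then on. Each call runs on the caller's stack and returns normally, which is the sense in which there is no context switch in this style.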
Why? C and C++ already have stackful coroutines available as libraries. And that seems extremely wasteful unless you know you'll run the coroutines at the same time... With single-threaded stackful coroutines, you'd get two stacks and only ever use one at a time. That wastes megabytes, requires heavier context switching, makes cache misses more likely, etc.
Modern stackful coroutines don't (or shouldn't) use context switching (at least not anything like ucontext_t; if you're going to spend the cycles to preserve the signal mask, just use threads) or setjmp/longjmp. Those tricks are slow and hazardous, especially when code needs to cross language boundaries or remain exception safe.