There is no lightweight threading concept in standard C/C++, and one OS thread costs at least 1 MB (stack) + 4 KB (TLS) of reserved memory (at least by default, on modern Windows), so for 1M threads that will be 1 TB of RAM :)
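As a rough sanity check of that figure, here is the arithmetic as a small C++ sketch. The constants are simply the numbers quoted above (1 MiB of stack plus 4 KiB of TLS per thread), and `reserved_bytes` is a made-up helper name, not anything from a real API.

```cpp
#include <cstdint>

// Per-thread reservation, using the figures quoted in the comment above
// (assumed defaults; real values depend on the OS and linker settings).
constexpr std::uint64_t kStackBytes = 1ull << 20;  // 1 MiB stack
constexpr std::uint64_t kTlsBytes = 4ull << 10;    // 4 KiB TLS

// Total memory reserved for `threads` OS threads under those assumptions.
constexpr std::uint64_t reserved_bytes(std::uint64_t threads) {
    return threads * (kStackBytes + kTlsBytes);
}
```

With these assumptions, `reserved_bytes(1000000)` comes to 1,052,672,000,000 bytes, so roughly 1 TB of reserved memory.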
C++20 has async/await (but of course named co_await, co_yield and co_return, because C++ isn't the sort of language where you're allowed nice things, the beatings will continue until morale improves)
However, AIUI C++20 doesn't actually supply an executor out of the box, so you would need to choose one yourself, as with Rust, where the benchmark used both tokio and async-std and you can see they perform differently.
C only has threads, but you could presumably pull off the same trick in C that Rust does: get Linux to give you the bare minimum of actually resident memory for each thread. You needn't care about the notional "real" size of the thread, since we're 64-bit and address space is basically free.
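A small sketch of the "address space is free, resident memory is what counts" idea, using plain POSIX/Linux calls that are equally available from C. It reserves a region with `mmap`, touches one byte, and asks `mincore` how many pages are actually resident. The helper names `resident_pages` and `reserve_and_touch` are made up for this example, and the exact page count will vary (transparent huge pages or a sandboxed kernel can inflate it).

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <vector>

// Count how many pages of [addr, addr + len) are currently resident.
// Returns 0 if mincore fails.
std::size_t resident_pages(void* addr, std::size_t len) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    std::vector<unsigned char> vec((len + page - 1) / page);
    if (mincore(addr, len, vec.data()) != 0) return 0;
    std::size_t count = 0;
    for (unsigned char v : vec) count += v & 1u;  // low bit = page is resident
    return count;
}

// Reserve `len` bytes of anonymous memory (much like a thread stack would
// be reserved), touch only the first byte, and report resident pages.
std::size_t reserve_and_touch(std::size_t len) {
    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) return 0;
    static_cast<volatile char*>(p)[0] = 1;  // fault in the first page only
    const std::size_t n = resident_pages(p, len);
    munmap(p, len);
    return n;
}
```

On a typical Linux box, `reserve_and_touch(64u << 20)` should report far fewer resident pages than the 64 MiB reservation contains, since untouched pages are never backed by RAM.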
> (but of course named co_await, co_yield and co_return, because C++ isn't the sort of language where you're allowed nice things, the beatings will continue until morale improves)
Stack memory is allocated lazily on Linux. I thought this benchmark shows it pretty clearly. Otherwise 10k threads would already be in GB territory, when in fact it is below 100 MB.
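If you want to see this for yourself, something along these lines should do. It is only a sketch, assuming Linux with `/proc` mounted: it starts a number of pthreads with an explicit notional stack size, parks them all, and compares the process RSS before and while they are alive. `Gate`, `park`, `resident_bytes` and `rss_growth_for_threads` are names invented here, and RSS readings are approximate, so treat the result as an order-of-magnitude check.

```cpp
#include <pthread.h>
#include <unistd.h>
#include <condition_variable>
#include <cstddef>
#include <cstdio>
#include <mutex>
#include <vector>

// Resident set size of this process in bytes, from /proc/self/statm
// (Linux-specific). Returns 0 if the file cannot be read.
long resident_bytes() {
    long total_pages = 0, rss_pages = 0;
    std::FILE* f = std::fopen("/proc/self/statm", "r");
    if (!f) return 0;
    if (std::fscanf(f, "%ld %ld", &total_pages, &rss_pages) != 2) rss_pages = 0;
    std::fclose(f);
    return rss_pages * sysconf(_SC_PAGESIZE);
}

struct Gate {
    std::mutex m;
    std::condition_variable cv;
    int started = 0;
    bool release = false;
};

// Each thread reports in, then parks until released, so that all of
// them are alive at the moment RSS is sampled.
void* park(void* arg) {
    Gate* g = static_cast<Gate*>(arg);
    std::unique_lock<std::mutex> lock(g->m);
    ++g->started;
    g->cv.notify_all();
    g->cv.wait(lock, [g] { return g->release; });
    return nullptr;
}

// Start `count` threads with `stack_bytes` of notional stack each and
// return how much RSS grew while they were all parked.
long rss_growth_for_threads(int count, std::size_t stack_bytes) {
    Gate gate;
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setstacksize(&attr, stack_bytes);

    const long before = resident_bytes();
    std::vector<pthread_t> threads;
    for (int i = 0; i < count; ++i) {
        pthread_t t;
        if (pthread_create(&t, &attr, park, &gate) == 0) threads.push_back(t);
    }
    pthread_attr_destroy(&attr);

    {   // Wait until every thread that was created has checked in.
        std::unique_lock<std::mutex> lock(gate.m);
        const int expected = static_cast<int>(threads.size());
        gate.cv.wait(lock, [&] { return gate.started == expected; });
    }
    const long during = resident_bytes();

    {   // Let the parked threads exit.
        std::lock_guard<std::mutex> lock(gate.m);
        gate.release = true;
    }
    gate.cv.notify_all();
    for (pthread_t t : threads) pthread_join(t, nullptr);
    return during - before;
}
```

For example, `rss_growth_for_threads(64, 1u << 20)` reserves 64 MiB of stack in total, and with lazy allocation the RSS growth should come out well below that.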