Yeah, and the devil is really in the details there; not all context switches are created equal. If you're dealing with less than (rough ballpark) 1000 requests per core per second, there's just no way context switching is going to be anywhere near a bottleneck. Depending on all those details (app/OS/processor/lunar cycle), you may be able to deal with 10 to 100 times as many before context switching is an issue.
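If you want a rough feel for that cost on your own box, here's a minimal sketch (my own, not from the comment above) that ping-pongs two Python threads over a pair of events to force a round trip of context switches and times it. Note this measures Python-level switching including GIL handoff, so it overstates the raw OS switch cost; it's a ballpark, not a benchmark.

```python
import threading
import time

def measure_switch_cost(iterations=20_000):
    """Estimate the cost of a thread ping-pong round trip.

    Each iteration forces (at least) two context switches: main -> responder
    and responder -> main. Returns average seconds per round trip.
    """
    ping, pong = threading.Event(), threading.Event()

    def responder():
        for _ in range(iterations):
            ping.wait()    # sleep until main signals us
            ping.clear()
            pong.set()     # hand control back to main

    t = threading.Thread(target=responder)
    t.start()
    start = time.perf_counter()
    for _ in range(iterations):
        ping.set()         # wake the responder
        pong.wait()        # sleep until it answers
        pong.clear()
    elapsed = time.perf_counter() - start
    t.join()
    return elapsed / iterations

if __name__ == "__main__":
    cost = measure_switch_cost()
    print(f"~{cost * 1e6:.1f} microseconds per round trip")
```

On Linux you can cross-check against real switch rates with `vmstat 1` (the `cs` column) or `pidstat -w`.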
Those kinds of servers are simply quite rare (IME). Nothing wrong with interesting problems in niche spaces, but it's not something you should be worrying about by default.
(No idea which "big" servers publish stats, but e.g. https://nickcraver.com/blog/2016/02/17/stack-overflow-the-ar... ) lists a random day for Stack Overflow serving 209,420,973 HTTP requests, i.e. roughly 2,400 per second. I doubt context switching is going to matter for them (in the hypothetical world where all of this was served by one HTTP server, which of course it isn't).
I largely agree with the above, and I would say that 1000 reqs/sec is probably a decent threshold for considering when async IO is going to matter for performance. That said, the details of your particular workload may benefit from async IO at significantly lower levels.

As an example, one message routing application I worked on, with a typical workload of ~100 reqs/sec, increased its performance by about 5x when switched from a blocking thread-per-request model to async IO. The application typically maintained a larger number (500-1000) of open but usually idle network connections. With that particular workload and on that particular platform, the overhead of thread context switching became a significant factor at far fewer than 1000 reqs/sec.

One hint that this was the case was a relatively high percentage (30%+) of CPU time spent in kernel mode. Switching to async IO dropped kernel time to about 5% on this particular application.
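That kernel-time hint is cheap to check from inside the process itself. Here's a small sketch (my own illustration, Unix-only since it uses the `resource` module) that samples the kernel-mode share of CPU time the way the 30%+ figure above was observed externally; on a live box you'd more likely use `top` or `pidstat -u` (%usr vs %system).

```python
import resource

def kernel_time_fraction():
    """Return the fraction of this process's CPU time spent in kernel mode.

    A persistently high value under a thread-per-request design can hint
    that syscall and context-switch overhead, rather than application
    work, is consuming the CPU.
    """
    usage = resource.getrusage(resource.RUSAGE_SELF)
    total = usage.ru_utime + usage.ru_stime  # user + system CPU seconds
    return usage.ru_stime / total if total else 0.0

if __name__ == "__main__":
    # Burn some user-mode CPU so the ratio has something to measure.
    sum(i * i for i in range(10**6))
    print(f"kernel-mode share of CPU time: {kernel_time_fraction():.1%}")
```

The absolute number is only meaningful under realistic load, but watching it fall (here, from 30%+ to ~5%) after an architectural change is exactly the kind of before/after signal described above.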