
> [IOCP] is similar to epoll on linux and kqueue on BSDs and MacOS

Similar, but the differences matter. kqueue is readiness-oriented: it tells you that the driver is ready to queue more writes. But you still need to call fsync and friends to confirm that your data has actually been written out. There's no non-blocking fsync. Servers like Redis run an extra thread just to call fsync, and eat all the performance problems that entails.
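To make the "extra thread just to call fsync" pattern concrete, here's a minimal sketch in Python. It is not how Redis actually implements it; `fsync_worker`, `write_and_flush` and `demo` are names I made up for illustration. The write returns once the data is in the page cache, and the blocking fsync is handed to a worker thread that reports back through a callback.

```python
import os
import queue
import tempfile
import threading


def fsync_worker(requests):
    """Drain (fd, on_done) requests and run the blocking fsync off the main thread."""
    while True:
        item = requests.get()
        if item is None:  # sentinel: shut the worker down
            break
        fd, on_done = item
        try:
            os.fsync(fd)  # blocks until the kernel reports the flush finished
            on_done(None)
        except OSError as exc:
            on_done(exc)


def write_and_flush(path, data):
    """Write data, then wait for the worker thread to confirm the fsync.

    Hypothetical helper: a real server would keep doing other work here
    instead of waiting on the event. Assumes a small payload, so a single
    os.write call is enough.
    """
    requests = queue.Queue()
    worker = threading.Thread(target=fsync_worker, args=(requests,), daemon=True)
    worker.start()

    done = threading.Event()
    result = {"error": None}

    def on_done(err):
        result["error"] = err
        done.set()

    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)            # data is only in the page cache at this point
        requests.put((fd, on_done))   # hand the blocking call to the worker
        done.wait()                   # "completion" arrives via the callback
    finally:
        requests.put(None)
        worker.join()
        os.close(fd)
    return result["error"]


def demo():
    """Round-trip a small payload through a temporary file."""
    path = os.path.join(tempfile.mkdtemp(), "example.bin")
    err = write_and_flush(path, b"hello")
    with open(path, "rb") as fh:
        return err, fh.read()
```

Note that the callback only says "fsync returned", so all the bookkeeping about which writes that flush covered is still on you. That's the coordination cost being complained about.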

In contrast, IOCP is completion-oriented. There's no need for any blocking fsync calls in order to find out when data has been written. It's way more granular - the OS can reorder writes. And it doesn't suffer from fsync's awful non-local error handling problems.

Honestly I'm surprised there's no equivalent for Linux / BSD. (I mean, we have AIO, but it's really not good enough.) IOCP is a fantastic API for high performance databases and servers. It enables faster and simpler server code. We should collectively get on it.



Yeah, Unix async I/O is usually based on the reactor pattern, while IOCP follows the proactor pattern.

You can build a proactor pattern on top of a reactor pattern, but that implementation needs to provide quite a bit: more coordination of pending memory/state, and probably back pressure support (I never really understood whether Microsoft supplies that or you have to track it yourself).
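For what it's worth, here's a rough sketch of what "proactor on top of reactor" means, using Python's `selectors` module as the readiness layer. `MiniProactor`, `async_read`, `run_once` and `max_pending` are names I invented for this example, and it only handles one pending read per socket. The caller asks for a read and gets a callback with the data; the readiness wait, the actual recv and the pending-operation count (a crude form of back pressure) all live inside the wrapper.

```python
import selectors
import socket


class MiniProactor:
    """Completion-style reads layered over a readiness-style selector."""

    def __init__(self, max_pending=64):
        self._sel = selectors.DefaultSelector()
        self._pending = 0
        self._max_pending = max_pending

    def async_read(self, sock, nbytes, on_complete):
        """Queue one read; returns False when the caller should back off."""
        if self._pending >= self._max_pending:
            return False  # back pressure: too many operations in flight
        sock.setblocking(False)
        # The wrapper, not the caller, remembers what was requested.
        self._sel.register(sock, selectors.EVENT_READ, (nbytes, on_complete))
        self._pending += 1
        return True

    def run_once(self, timeout=None):
        """Wait for readiness, perform the reads, deliver completions."""
        for key, _events in self._sel.select(timeout):
            nbytes, on_complete = key.data
            sock = key.fileobj
            try:
                data = sock.recv(nbytes)
            except BlockingIOError:
                continue  # spurious wakeup; leave the read pending
            self._sel.unregister(sock)
            self._pending -= 1
            on_complete(data)  # the caller only ever sees finished reads


def demo():
    """Exercise the sketch over a local socket pair."""
    a, b = socket.socketpair()
    results = []
    proactor = MiniProactor(max_pending=1)
    accepted = proactor.async_read(a, 1024, results.append)
    rejected = proactor.async_read(b, 1024, results.append)  # over the limit
    b.sendall(b"ping")
    for _ in range(10):  # bounded loop so a failure cannot hang forever
        if results:
            break
        proactor.run_once(timeout=1.0)
    a.close()
    b.close()
    return accepted, rejected, results
```

Even in this toy version the wrapper has to own the per-operation state and decide what to do when too much is in flight, which is the extra work mentioned above.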

I know Microsoft used to have several IOCP-related patents; I believe I remember one for event scheduling for multi-threaded locality (preferring to dispatch a completion to the same thread that made the original request).


Write flushing is not a good example.

You still need FlushFileBuffers even with IOCP if files are opened in buffered mode (which is the default). While you will know when a write has completed, an explicit flush is still needed to make sure it's written through. So it's pretty much exactly the same case as with fsync(). Alternatively, you can use write-through mode, but that kills performance, especially with a random write pattern.

IOCP is basically a better, more cleanly abstracted epoll() that also doubles as a manager of a worker thread pool.



