
Why should a filesystem care about NVMe?

Because at some point the filesystem becomes a bottleneck. ZFS was designed with the assumption that CPUs would be way faster than storage. When you get speeds over 10GB/sec, [0] you are going to spend a lot of time checksumming all that data.

[0] http://www.seagate.com/ca/en/about-seagate/news/seagate-demo...



Fletcher checksums are very cheap and not a bottleneck.

https://github.com/zfsonlinux/zfs/issues/4789#issuecomment-2...
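
For anyone wondering why it's so cheap: the Fletcher-4 variant ZFS uses is just four cascaded 64-bit additions per 32-bit word of input, no multiplies and no tables. A minimal scalar sketch of the inner loop (the in-tree code in module/zcommon/zfs_fletcher.c also carries SIMD variants, which is what those benchmarks compare):

    /* Scalar sketch of ZFS's fletcher_4: three dependent adds per
     * 4-byte word. Assumes size is a multiple of 4 bytes, which
     * ZFS guarantees for its blocks. */
    #include <stddef.h>
    #include <stdint.h>

    struct zio_cksum { uint64_t zc_word[4]; };  /* 256-bit checksum */

    void fletcher_4(const void *buf, size_t size, struct zio_cksum *zcp)
    {
        const uint32_t *ip = buf;
        const uint32_t *ipend = ip + size / sizeof(uint32_t);
        uint64_t a = 0, b = 0, c = 0, d = 0;

        for (; ip < ipend; ip++) {
            a += *ip;   /* running sum of words */
            b += a;     /* sum of sums          */
            c += b;     /* third-order sum      */
            d += c;     /* fourth-order sum     */
        }
        zcp->zc_word[0] = a;
        zcp->zc_word[1] = b;
        zcp->zc_word[2] = c;
        zcp->zc_word[3] = d;
    }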


Maybe I'm reading those benchmarks wrong, but they appear to max out well under 10GB/s. That would mean you'd be CPU-bound on checksums alone with one of those Seagate cards.


That is a Xeon Phi; a modern Xeon with 10+ cores should do tens of GB/s:

https://cloud.githubusercontent.com/assets/472018/13333262/f...

https://github.com/zfsonlinux/zfs/pull/4330
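
If you'd rather get a ballpark number for your own CPU than read it off those graphs, here's a crude single-thread harness. It assumes the fletcher_4 sketch from the sibling comment is compiled alongside it; buffer size and round count are arbitrary choices:

    /* Single-core fletcher_4 throughput check. Build both files
     * with -O2; prints GB/s over a 256 MiB buffer. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    struct zio_cksum { uint64_t zc_word[4]; };
    void fletcher_4(const void *buf, size_t size, struct zio_cksum *zcp);

    #define BUF_SIZE (256 * 1024 * 1024UL)
    #define ROUNDS   16

    int main(void)
    {
        void *buf = malloc(BUF_SIZE);
        struct zio_cksum ck;
        struct timespec t0, t1;

        memset(buf, 0xab, BUF_SIZE);    /* fault pages in before timing */
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < ROUNDS; i++)
            fletcher_4(buf, BUF_SIZE, &ck);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec)
                    + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("%.2f GB/s\n", ROUNDS * (double)BUF_SIZE / secs / 1e9);
        free(buf);
        return 0;
    }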


Hmm, I for one don't want to tie up ten cores with IO.


If you aren't checksumming your data, is it truly your data?


Who checksums the checksummers? ;)


You can't handle the checksum


Mr Edward Checksum Checker.

Update: At least 1 person didn't get this joke :D


> Hmm, I for one don't want to tie up ten cores with IO.

What else are you going to use the two CPUs in a filer for?


That benchmark is on a single thread.

ZFS is well pipelined for multicore throughput.
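
Concretely, ZFS checksums per block, so the work splits across I/O threads with no coordination between them. A toy illustration of that shape (thread count and record layout here are my assumptions, not ZFS internals, and it reuses the fletcher_4 sketch from upthread; 128K is just the default recordsize):

    /* Checksum independent records on worker threads: per-block
     * checksums scale across cores with no shared state. */
    #include <pthread.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    struct zio_cksum { uint64_t zc_word[4]; };
    void fletcher_4(const void *buf, size_t size, struct zio_cksum *zcp);

    #define RECORDSIZE (128 * 1024)
    #define NTHREADS   4
    #define NRECORDS   1024

    struct work {
        const unsigned char *base;
        size_t first, count;          /* record range for this thread */
        struct zio_cksum *cksums;
    };

    static void *worker(void *arg)
    {
        struct work *w = arg;
        for (size_t i = w->first; i < w->first + w->count; i++)
            fletcher_4(w->base + i * RECORDSIZE, RECORDSIZE, &w->cksums[i]);
        return NULL;
    }

    int main(void)
    {
        unsigned char *data = malloc((size_t)NRECORDS * RECORDSIZE);
        struct zio_cksum *cksums = malloc(NRECORDS * sizeof(*cksums));
        pthread_t tid[NTHREADS];
        struct work w[NTHREADS];

        memset(data, 0xab, (size_t)NRECORDS * RECORDSIZE);
        for (int t = 0; t < NTHREADS; t++) {
            w[t] = (struct work){ data, t * (NRECORDS / NTHREADS),
                                  NRECORDS / NTHREADS, cksums };
            pthread_create(&tid[t], NULL, worker, &w[t]);
        }
        for (int t = 0; t < NTHREADS; t++)
            pthread_join(tid[t], NULL);
        free(data);
        free(cksums);
        return 0;
    }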


Although it's not recommended, you can turn checksums off. The granularity is per dataset, too, so you have a lot of flexibility.
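
For reference, it's an ordinary dataset property (dataset name below is a placeholder):

    zfs set checksum=off tank/scratch    # disable for one dataset
    zfs set checksum=on tank/scratch     # back to the default (fletcher4)
    zfs get checksum tank/scratch        # verify

Note it only applies to newly written blocks; existing blocks keep whatever checksum they were written with.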


What could ZFS do differently to solve that problem while maintaining data integrity?


Just one idea: offload checksum calculation to a DMA engine. Linux already has a generic DMA engine facility in the kernel, backed by e.g. I/OAT on some Intel hardware.
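
Rough sketch of what the client side of that looks like today. The stock dmaengine API exposes memcpy/XOR-style ops (which is what I/OAT advertises), not a Fletcher checksum, so this only shows the prep/submit/wait pattern a checksum offload would presumably reuse; it's illustrative, not working ZFS code:

    /* Kernel-side sketch: offload a copy to a generic DMA engine
     * channel so the CPU is free until completion. A checksum op
     * would follow the same pattern if the hardware exposed one. */
    #include <linux/dmaengine.h>
    #include <linux/dma-mapping.h>
    #include <linux/errno.h>

    static int offload_copy(void *dst, void *src, size_t len)
    {
        dma_cap_mask_t mask;
        struct dma_chan *chan;
        struct dma_async_tx_descriptor *tx;
        dma_addr_t dsrc, ddst;
        dma_cookie_t cookie;

        dma_cap_zero(mask);
        dma_cap_set(DMA_MEMCPY, mask);
        chan = dma_request_channel(mask, NULL, NULL);
        if (!chan)
            return -ENODEV;

        dsrc = dma_map_single(chan->device->dev, src, len, DMA_TO_DEVICE);
        ddst = dma_map_single(chan->device->dev, dst, len, DMA_FROM_DEVICE);

        tx = dmaengine_prep_dma_memcpy(chan, ddst, dsrc, len,
                                       DMA_PREP_INTERRUPT);
        if (!tx)
            goto out;               /* would fall back to a CPU copy */

        cookie = dmaengine_submit(tx);
        dma_async_issue_pending(chan);
        dma_sync_wait(chan, cookie);    /* CPU idle until completion */
    out:
        dma_unmap_single(chan->device->dev, dsrc, len, DMA_TO_DEVICE);
        dma_unmap_single(chan->device->dev, ddst, len, DMA_FROM_DEVICE);
        dma_release_channel(chan);
        return tx ? 0 : -EIO;
    }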


Assuming this is the same thing as hardware-assisted checksums, both the ZFS on Linux maintainer and Intel said they were working on it at various conferences last year.



