Author here. Keep in mind that most accesses in the swapping case still hit RAM, so we can't conclude from this that there's only a 3x difference between DRAM and NVMe flash.
I originally tried running the test with only 1GB RAM, but killed the job after 9 hours of churning.
I would not take this benchmark to draw general conclusions.
The spinning disk result is only 10x slower than RAM, but a spinning disk's throughput is 100-1000x lower than current RAM's, and its latency is worse still.
Similarly, the other ratios in the benchmark graph are way off from the underlying hardware ratios.
This benchmark is measuring how one specific program (the Haskell compiler building ShellCheck) scales with faster memory, and the answer is "not very well".
The overwhelming majority of accesses still happen in the 2GB of RAM the benchmark has. The disk is only hit to stash or load overflowing pages, not on every memory access. That's why the result doesn't mirror the raw hardware difference between DRAM and disk.
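A rough way to see this for yourself on a Linux box is to watch the swap traffic while such a workload runs:

~$ vmstat 1

The si/so columns show how much data actually moves to and from the swap device per second; every access served from RAM never shows up there, and that's most of them.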
Generally, in terms of transfer speed, NVMe is damn close. Latency is where it hits you: NVMe latencies are nowhere near as low, and there are no guarantees about the 99th percentile.
If your ops aren't latency sensitive, then NVMe might as well be RAM; if they are latency sensitive, then NVMe is not RAM (yet).
A modern NVMe drive on PCIe 4.0 can deliver up to 5GB/s, which is only about 4 times slower than RAM. You can go faster with RAID, and I believe some enterprise-class drives get a bit faster still at the expense of capacity. A PCIe 4.0 x4 link tops out at 8GB/s, so to go faster than that you'll need PCIe 5.0 (soon).
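If you want to sanity-check both numbers on your own drive, fio works; the filename and sizes below are just placeholders, and the 4k/iodepth=1 job is where the latency story shows up (fio prints clat percentiles, including the 99th, by default):

~$ fio --name=seqread --filename=testfile --size=4G --direct=1 --ioengine=libaio --rw=read --bs=1M --iodepth=16 --runtime=30 --time_based
~$ fio --name=randread --filename=testfile --size=4G --direct=1 --ioengine=libaio --rw=randread --bs=4k --iodepth=1 --runtime=30 --time_based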
RAM bandwidth scales with the number of DIMMs used, e.g. a current AMD EPYC machine can do 220 GB/s with 16 DIMMs per the spec sheet.
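For a crude check of what your own box manages (not a proper STREAM run, and the result depends a lot on block size and thread count):

~$ sysbench memory --memory-block-size=1M --memory-total-size=64G --threads=16 run

It prints the achieved MiB/sec at the end.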
How well does NVMe scale to multiple devices? That is, how many GB/s can you practically get today out of a server packed with NVMe drives before you hit a bottleneck (e.g. running out of PCIe lanes)?
An AMD Epyc has 128 PCIe 4.0 lanes at roughly 2GB/s each (about 8GB/s per x4 NVMe link), so it tops out around 250GB/s of total I/O bandwidth, and you can in fact get close to saturating that with the bigger Epycs. You will probably lose 4 lanes to your chipset and local disk setup, maybe some more depending on the server, but it stays close to that figure.
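You can also check what link a given NVMe device actually negotiated (the 01:00.0 address is just an example, find yours with lspci | grep -i nvme):

~$ sudo lspci -vv -s 01:00.0 | grep -E 'LnkCap|LnkSta'

LnkCap is what the device supports, LnkSta is the speed and lane width it is currently running at.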
I tested this on my own system somewhat recently, with a Ryzen 5950X, 64 GB of 3600 MHz CL 18 RAM and a 1TB Samsung 970 Evo, using the config file that ships with Fedora 33.
I created a ramdisk as follows:
~$ sudo mount -t tmpfs -o size=32g tmpfs ~/ramdisk/
~$ cp -r Downloads/linux-5.14-rc3 ramdisk/
~/ramdisk$ cp /boot/config-5.13.5-100.fc33.x86_64 linux-5.14-rc3/.config
~/ramdisk$ cd linux-5.14-rc3/
My compiler invocation was:
~/ramdisk/linux-5.14-rc3$ time make -j 32
And I got the following results:
Kernel: arch/x86/boot/bzImage is ready (#3)
real 6m2.575s
user 143m42.402s
sys 21m8.122s
When I compiled straight from the SSD I got a surprisingly similar number:
Kernel: arch/x86/boot/bzImage is ready (#1)
real 6m23.194s
user 154m24.760s
sys 23m26.304s
I drew the conclusion that for compiling Linux, NVMe might as well be RAM, though if I did something wrong I'd be happy to hear about it!
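One caveat: with 64 GB of RAM the kernel sources sit in the page cache after the first read anyway, so the "from SSD" run barely touches the disk either. A stricter comparison would drop the caches right before that build, something like:

~$ sync && echo 3 | sudo tee /proc/sys/vm/drop_caches

and then re-run make from the on-disk tree. I'd still expect the gap to be small, since the build is mostly CPU bound.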