The GPU itself can send PCIe 4.0 messages out. So why not have the GPU make I/O requests on behalf of itself? It's a bit obscure, but this feature has been around for a number of years now. The idea is to remove the CPU and DDR4 from the loop entirely, because those just bottleneck and slow down the GPU.
--------
From an absolute performance perspective, it seems good. But CPUs are really good at accessing I/O in efficient, standardized ways. I'm personally of the opinion that blocking and/or event-driven I/O from the CPU (with the full benefit of threads and OS-level concepts) would be easier to think about than high-performance GPU code.
But still, it's a neat concept, and it seems like there's a big demand for it (see PS5 / Xbox Series X).
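To illustrate the point about CPU-side event-driven I/O being easy to reason about: here is a minimal, hypothetical sketch using Python's standard `selectors` module (not tied to any GPU I/O API). The OS notifies us when a descriptor is ready, and plain sequential control flow handles the event.

```python
import selectors
import socket

def echo_once():
    # A socketpair stands in for any readable descriptor (file, pipe, NIC).
    sel = selectors.DefaultSelector()
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    sel.register(b, selectors.EVENT_READ)

    a.send(b"hello")                  # make data available on b
    events = sel.select(timeout=1.0)  # block until the OS says b is readable
    (key, _mask), = events
    data = key.fileobj.recv(1024)     # ordinary read, ordinary control flow

    sel.close()
    a.close()
    b.close()
    return data

print(echo_once())
```

The contrast with GPU-driven I/O is that here the scheduling, readiness notification, and error handling all come from the OS for free; a GPU kernel issuing its own requests has to reimplement those pieces.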
The CPU is still acting as the PCIe controller though (right?), which kind of makes the CPU act like a network switch. PCIe is a point-to-point protocol, kind of like Ethernet. Old-school PCI was a shared bus, so devices might have been able to talk to each other directly, but I don't think that was ever actually used.
As you can see, the GPU is attached to the x16 slot, and the 4x NVMe SSDs are attached to the GPU. When the CPU wants to store data on the SSDs, it communicates first with the GPU, which then passes the data through to the four SSDs.
Nvidia's GPUs would command the PCIe switch to fetch data directly, without the switch sending the data to the CPU (where it would most likely land in DDR4, or maybe L3 in an optimized situation).
My understanding matches yours, but it's worth noting that (IIUC) memory and PCIe are (last time I checked?) a separate I/O subsystem that just happens to reside within the same package as the CPU on modern chips. So P2PDMA avoids burning CPU cycles and RAM bandwidth shuffling data around that you never wanted to use on the CPU anyway. (Also see: https://lwn.net/Articles/767281/)
https://www.nvidia.com/en-us/geforce/news/rtx-io-gpu-acceler...
https://www.amd.com/en/products/professional-graphics/radeon...