We ran into the same ksoftirqd issue in our own bare-metal deployment. It turns out there's a performance regression in the Linux kernel that manifests when the system is configured with more receive queues than there are physical cores in a single socket.
We dropped the receive queues down to 12, from 48, and hit line rate. More info here:
https://github.com/coreos/bugs/issues/1275
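
For anyone who wants to check their own box, here is a rough Python sketch of the rule of thumb above: count physical cores per socket from `/proc/cpuinfo` text and cap the receive queue count at that number. The function names and the `eth0` device are hypothetical, and it assumes an x86-style `/proc/cpuinfo` with `physical id` and `core id` fields. It only builds the `ethtool -L` command as a string; whether your NIC wants `combined` or `rx` there depends on the driver, so check `ethtool -l` first.

```python
from collections import defaultdict


def cores_per_socket(cpuinfo_text):
    """Map each socket ("physical id") to its count of distinct physical cores."""
    cores = defaultdict(set)
    # /proc/cpuinfo lists one blank-line-separated block per logical CPU.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        # Hyperthread siblings share the same (physical id, core id) pair,
        # so collecting core ids in a set counts each physical core once.
        if "physical id" in fields and "core id" in fields:
            cores[fields["physical id"]].add(fields["core id"])
    return {socket: len(ids) for socket, ids in cores.items()}


def capped_queue_count(requested, cpuinfo_text):
    """Cap the requested queue count at the smallest per-socket core count."""
    counts = cores_per_socket(cpuinfo_text)
    if not counts:
        # No topology information found: leave the request unchanged.
        return requested
    return min(requested, min(counts.values()))


def suggested_command(device, requested, cpuinfo_text):
    """Build (but do not run) an ethtool command; 'combined' is an assumption."""
    n = capped_queue_count(requested, cpuinfo_text)
    return f"ethtool -L {device} combined {n}"
```

To try it against the live system, read `/proc/cpuinfo` into a string and pass it in, e.g. `suggested_command("eth0", 48, open("/proc/cpuinfo").read())`, then review the printed command before running it yourself.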