> It is not recommended to use RAID5 since the performance of this type of RAID for databases is not good.
This strikes me as a generalization that is, at best, questionable. (Also, these days, with spinning disks, I'd more likely do RAID6, but I expect it's comparable enough).
I'd tend to agree that, historically, it's been true, and is demonstrably true for certain situations, such as a read-heavy load or recovering from a failed disk.
However, is it really still true for modern RAID cards?
What about with SSDs? I'd expect that cost is still a significant consideration for anyone specifying hardware, making the difference between RAID5 and RAID10 important.
If you have 4 disks in RAID10 you get 4x the read IOPS of a single disk and 2x the write IOPS. RAID5 gets at most 1x the write IOPS (each write still needs reads for the parity update) and 4x the read IOPS. RAID6 is even worse. Details here: https://blog.storagecraft.com/raid-performance/
While the CPU is no longer a bottleneck, the number of achievable read/write IOPS still matters.
The same holds for SSDs, though here you could probably argue that a single SSD's worth of write IOPS on RAID5 is good enough for most use cases.
It really depends on your workload, data size and usage patterns...
But RAID10 scales linearly with more disks and you get more write IOPS - this does not work with RAID5 / RAIDZ.
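To put rough numbers on that, here is a toy sketch using the textbook write penalties and an assumed ~150 random IOPS per spinning disk (my assumptions, not figures from the linked article):

    # Naive small-random-write penalty model: reads scale with the number of
    # spindles, writes are divided by the per-level write penalty.
    # Penalties: RAID10 = 2 (two mirror copies), RAID5 = 4 (read old data,
    # read old parity, write data, write parity), RAID6 = 6 (two parity blocks).

    PER_DISK_IOPS = 150                     # assumed random IOPS of one 7.2k HDD
    PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6}

    def array_iops(level, disks):
        read_iops = disks * PER_DISK_IOPS
        write_iops = disks * PER_DISK_IOPS / PENALTY[level]
        return read_iops, write_iops

    for level in PENALTY:
        r, w = array_iops(level, 4)
        print(f"{level:6} with 4 disks: ~{r:.0f} read IOPS, ~{w:.0f} write IOPS")
    # RAID10 -> 600 read / 300 write (4x / 2x a single disk)
    # RAID5  -> 600 read / 150 write (4x / 1x)
    # RAID6  -> 600 read / 100 write
    # Note: a single RAIDZ vdev delivers roughly one disk's worth of random
    # write IOPS regardless of width, so this simple model overstates it.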
> If you have 4 disks in RAID10 you get 4x the read IOPS of a single disk and 2x the write IOPS. RAID5 gets at most 1x the write IOPS (each write still needs reads for the parity update) and 4x the read IOPS. RAID6 is even worse. Details here: https://blog.storagecraft.com/raid-performance/
It's articles like the linked one that make me ask, because they make assumptions that are themselves questionable. Specifically, RAID5 requiring a read to do a write is only true for non-full-stripe writes. Is that really likely with a sanely configured stripe width combined with a modern RAID card's cache (4GB)?
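To make the full-stripe point concrete, here is a rough sketch of the penalty arithmetic under my own assumptions (nothing in it comes from the article):

    # RAID5 write penalty per application write on a data_disks + 1 array.
    # Partial-stripe (read-modify-write): read old data + old parity,
    # write new data + new parity -> 4 disk I/Os per small write.
    # Full-stripe: parity is computed from the whole cached stripe, so
    # data_disks worth of writes cost data_disks + 1 disk I/Os in total.

    def raid5_write_penalty(data_disks, full_stripe):
        if full_stripe:
            return (data_disks + 1) / data_disks
        return 4.0

    for n in (3, 7, 11):                    # 4-, 8- and 12-disk RAID5 sets
        print(f"{n}+1 RAID5: partial-stripe penalty {raid5_write_penalty(n, False):.1f}, "
              f"full-stripe penalty {raid5_write_penalty(n, True):.2f}")
    # Whether the controller actually sees full stripes depends on stripe
    # width, cache size and the write pattern -- which is the open question.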
There's also an implicit assumption that the channel bandwidths (disk to expander, expander to RAID card, RAID card's PCIe) don't matter, but they do, especially when SSDs can saturate them.
Even relying exclusively on the notion of "IOPS" is (and always has been) suspect to me, especially without an expected data transfer size and some notion of randomness/locality.
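For example (assumed numbers), the same IOPS figure implies wildly different bus loads depending on transfer size, which is exactly where those channel bandwidths start to matter:

    # IOPS alone says little without a transfer size: bandwidth = IOPS * block size.
    IOPS = 50_000                           # assumed figure for one SATA SSD
    for bs_kib in (4, 8, 64, 256):
        mib_s = IOPS * bs_kib / 1024        # MiB/s
        print(f"{IOPS} IOPS at {bs_kib:3} KiB = {mib_s:7.0f} MiB/s")
    # 4 KiB -> ~195 MiB/s, 64 KiB -> ~3125 MiB/s: the same "IOPS" number is
    # trivial for a 6 Gb/s SATA link in one case and impossible in the other,
    # and several such drives behind one expander hit the shared link first.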
> The general direction should be similar to RAID cards.
I'd only be convinced of that if I were to see a few comparative benchmarks across a variety of workloads. Otherwise, ZFS does too much of its work on the CPU and in main memory, whereas a RAID card moves it down the PCIe bus. At least for RAID10, that would eliminate all the write amplification over PCIe.
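A rough way to see the write-amplification point, with assumed numbers and ignoring ZFS metadata and checksums: host-side RAID pushes every redundant copy (or the parity it computed) across the PCIe bus itself, while a controller doing RAID10 receives the data only once.

    # Payload bytes crossing the host's PCIe bus per application write
    # (illustrative only: ignores ZFS metadata/checksums and controller protocol).

    DATA_DISKS = 3                          # a hypothetical 3+1 RAID5 set

    def host_bus_bytes(payload, mode):
        if mode == "hw_raid":               # controller mirrors / computes parity itself
            return payload
        if mode == "sw_raid10":             # host writes both mirror copies
            return 2 * payload
        if mode == "sw_raid5_full_stripe":  # host writes data plus the parity it computed
            return payload * (1 + 1 / DATA_DISKS)
        raise ValueError(mode)

    payload = 1024**3                       # 1 GiB of application writes
    for mode in ("hw_raid", "sw_raid10", "sw_raid5_full_stripe"):
        print(f"{mode:22}: {host_bus_bytes(payload, mode) / 1024**2:7.0f} MiB over PCIe")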
Sure, I don't have much data for either problem - how NVMe works out in practice is also interesting. I've found some benchmarks for SATA SSDs - https://www.thomas-krenn.com/de/wiki/SSD_RAID_Performance-Te... - but I guess if you want or need to squeeze out the max performance on RAID5, you need to test your own system and run different benchmarks with different RAID configurations...
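As a sketch of what that testing could look like, assuming fio is installed and using placeholder mount points and job parameters you would adapt to your own arrays:

    # Minimal loop around fio (assumed installed); the file paths and job
    # parameters are placeholders -- point each at a file on the array
    # configuration you want to compare.
    import subprocess

    targets = {
        "raid10": "/mnt/raid10/fio.test",   # hypothetical mount points
        "raid5": "/mnt/raid5/fio.test",
    }

    for name, path in targets.items():
        print(f"--- {name} ---")
        subprocess.run([
            "fio", f"--name={name}", f"--filename={path}", "--size=4G",
            "--rw=randwrite", "--bs=8k",    # 8k roughly matches typical DB pages
            "--ioengine=libaio", "--direct=1",
            "--iodepth=32", "--numjobs=4",
            "--runtime=60", "--time_based", "--group_reporting",
        ], check=True)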
> Write-back caching at the controller level can reduce write latencies. This is especially true with RAID 5/50/6/60 as RAID parity calculations introduce additional latency.
> Also, write-back caching helps with merging smaller sequential write transfers into larger write transfers. Size of the resulting larger transfers equals the write-back cache element size, which typically equals stripe unit size. Merging write transfers is important when there are concurrent sequential write threads with small transfer sizes (for example, database transaction logs).
That is, it cites what appear to be exactly my earlier points regarding full-stripe writes.
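The cache-merging behaviour it describes is easy to picture with a toy model (assumed sizes, not the controller's actual algorithm):

    # Toy model of write-back coalescing: small sequential writes accumulate
    # in the controller cache and are flushed as stripe-unit-sized transfers.

    STRIPE_UNIT = 256 * 1024                # assumed 256 KiB stripe unit / cache element
    LOG_WRITE = 16 * 1024                   # e.g. small transaction-log writes
    WRITES = 1000                           # 1000 sequential log writes

    buffered = WRITES * LOG_WRITE
    flushes = buffered // STRIPE_UNIT       # transfers actually sent to the disks

    print(f"{WRITES} x {LOG_WRITE // 1024} KiB writes -> "
          f"~{flushes} x {STRIPE_UNIT // 1024} KiB transfers to the array")
    # 1000 small writes become ~62 large transfers, and those transfers are
    # far more likely to line up as full-stripe writes on RAID5/6.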
> I guess if you want or need to squeeze out the max performance on RAID5
Since the whole premise of the article is performance, max performance is the goal.
> test your own system
That still, however, does not address my original question, which is about the general advice that RAID5 performs poorly for databases.
My suspicion remains that this accepted wisdom is outdated (i.e. rendered irrelevant by modern technology).
Does anyone have a direct link that doesn't require signing up? Just put your logo in the PDF and brand it, but I'm not going to sign up just to get spammed with advertising.