
how "massive" is massive in your case?


I've had CitusDB running across 68 bare-metal machines (40 vCPUs, 768GiB RAM, 20TiB of storage each, plus 40Gb/s network links) and it ran decently well.

Not sure what your definition of massive is, but I think Spanner would easily beat it.

Also, it's very use-case dependent. You can't "just use" Citus for everything; it's not quite as flexible as a bog-standard pgsql install because of how it shards data, so you have to be a tad more careful with your data model.
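To illustrate the data-model care that's needed: every distributed table in Citus must declare a shard key up front, and joins are only cheap between tables co-located on that key. A minimal sketch, using a hypothetical users/events schema (the table names are made up, but `create_distributed_table` and its `colocate_with` parameter are real Citus UDFs):

```python
# Hypothetical schema showing the extra modeling step Citus requires.
ddl = [
    # Plain CREATE TABLE, exactly as in stock Postgres.
    "CREATE TABLE users (user_id bigint PRIMARY KEY, name text);",
    "CREATE TABLE events (user_id bigint, payload jsonb);",
    # Citus-specific: choose the distribution (shard) column up front.
    "SELECT create_distributed_table('users', 'user_id');",
    # colocate_with keeps users/events shards on the same nodes,
    # so user/event joins stay shard-local instead of crossing the network.
    "SELECT create_distributed_table('events', 'user_id', colocate_with => 'users');",
]

for stmt in ddl:
    print(stmt)
```

Joins on anything other than the shard key, or updates that would move a row between shards, are where a vanilla-Postgres data model tends to need rework.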


Is there a reason there's comparatively little storage in your machines in relation to RAM or even CPUs?

Do your machines do compute heavy loads or something?

For a DB I'd expect a lot more storage per node


NVMe SSDs aren't so large unfortunately.

A 1U server has capacity for 8 drives: we used 2 slots for the OS (RAID1) and 2 slots for the WAL volume, leaving only 4 slots for data in RAID10.

So I'm already cheating a little by counting the WAL volume as part of the total storage.
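The layout above pencils out to roughly the quoted per-node figure. A back-of-the-envelope check, assuming hypothetical 7.68 TB NVMe drives and a mirrored WAL volume (the post gives neither detail):

```python
# Assumed: 7.68 TB drives, RAID1 WAL -- neither is stated in the post.
TB = 1000**4   # decimal terabyte (drive vendors' unit)
TiB = 1024**4  # binary tebibyte

drive = 7.68 * TB
raid10_data = 4 * drive / 2  # 4 data slots; RAID10 halves usable space
wal = 2 * drive / 2          # 2 WAL slots, assumed mirrored (RAID1)

usable_tib = (raid10_data + wal) / TiB
print(f"~{usable_tib:.1f} TiB usable incl. WAL")  # close to the quoted 20TiB
```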


Shouldn't you try to get something with PCIe bifurcation in this case?

I doubt you're saturating the PCIe bus bandwidth on any of them?

I imagine your DB is extremely high performance, though!


What is your definition of "decently well", and is your primary cluster (without replicas) above 1PB?


They said 20TiB * 68, which works out to 1360TiB, or roughly 1.5PB.


That could be all of the nodes, or just the primaries without replicas.


Around ten heavily-updated (50-400k updated rows/min) tables ranging between 500M and 5B rows, with a couple tables over 40B rows each (5TB each IIRC).



