
So, we have found that that really IS correct, though perhaps not in the ways you might think. For example, a major problem with servers today is the accepted geometry: 1U or 2U x 19". In order to drive density, you are either doing dual socket in 2U or single socket in 1U -- but to make that geometry work you have small fans that really have to crank to ram air through. (And have we mentioned that the AC power supplies also need fans?) These little screamers are acoustically distasteful, but it's worse than that: they have to do much more work (and draw more power!) to move a fraction of the air because they are so small (air movement is ~cubic with respect to diameter). By changing the enclosure geometry (our sleds are 100mm high), we can use much larger fans (80mm). The upshot? Our fans move SO much more air that we can run them much more slowly (we worked with our fan provider to drop the RPM at 0% PWM from 5000 RPM to 2000 RPM) -- which means our rack is not only silent by comparison, but the power in the rack is going to compute rather than fans.
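To make the scaling argument concrete, here's a sketch using the standard fan affinity laws (airflow ~ RPM x D^3, shaft power ~ RPM^3 x D^5). The specific small-fan figures (a 40mm fan at 15000 RPM) are assumed for illustration; the 80mm/2000 RPM numbers come from the comment above.

```python
# Fan affinity laws for geometrically similar fans:
#   airflow Q  ∝ RPM * D^3
#   shaft power P ∝ RPM^3 * D^5
# Units are arbitrary; only the ratios matter.

def airflow(rpm, diameter_mm):
    """Relative airflow: Q ∝ RPM * D^3."""
    return rpm * diameter_mm**3

def fan_power(rpm, diameter_mm):
    """Relative shaft power: P ∝ RPM^3 * D^5."""
    return rpm**3 * diameter_mm**5

# Assumed figures for a typical screaming 1U fan vs. the 80mm fan
# a 100mm-high sled allows, at the lowered 2000 RPM floor.
small = dict(rpm=15000, diameter_mm=40)
large = dict(rpm=2000, diameter_mm=80)

flow_ratio = airflow(**large) / airflow(**small)
power_ratio = fan_power(**large) / fan_power(**small)
print(f"airflow ratio (large/small): {flow_ratio:.2f}")
print(f"power ratio  (large/small): {power_ratio:.3f}")
```

Under these assumed numbers the larger, slower fan moves slightly more air while drawing well under a tenth of the power, which is the whole point of changing the enclosure geometry.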

So, this isn't really chasing the 0.1% -- it's chasing much bigger wins, and it's doing it by changing the power distribution (a shelf of rectifiers feeding a 54V DC busbar vs. redundant AC power supplies in every 1U/2U), changing the cooling (the larger fans), changing the networking (we have a blindmated cabled backplane, obviating the need for operator cabling), etc. etc. These are not just multipliers on efficiency, but also on manageability: once physically installed, our rack is designed to go from power-on to provisioning VMs orders of magnitude faster than a traditional manual rack/stack/cable/software install.



> So, we have found that that really IS correct, though perhaps not in the ways you might think.

I was replying more to the idea that there's all this costly legacy IO.

I guess you aren't the first to try different geometries or power delivery or cabling either. There have been lots of little opencompute-type efforts and startups come and go. I'm skeptical there's a lot to be gained in any significant niche that doesn't already do these things, but you don't need to convince me. Although if you did want to, you could show some comparative density numbers. What can you fit -- 2048 cores, 32TB -- and push 15kW through a rack?


Yeah: 32 AMD 7713P (64 cores/128 threads), so 2048 cores/4096 threads, 32 TB of DRAM, 1PB of NVMe, with 2x100GbE to dual, redundant 6.4Tb/s switches -- all in 15 kW. In terms of other efforts, there have certainly been some industry initiatives (OCP, Open19), but no startups that I'm aware of (the smaller companies in the space have historically been integrators rather than doing their own de novo designs -- and they don't do/haven't done their own software at all); is there one in particular you're thinking of?
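The per-rack totals quoted above can be sanity-checked with simple arithmetic (a sketch; the sled count of 32 and the per-sled parts are taken from the comment):

```python
# Per-rack totals implied by 32 sleds of AMD 7713P.
sleds = 32
cores_per_sled = 64          # AMD 7713P: 64 cores / 128 threads
threads_per_core = 2         # SMT2
dram_tb_per_sled = 1         # 32 TB rack total / 32 sleds

total_cores = sleds * cores_per_sled            # 2048
total_threads = total_cores * threads_per_core  # 4096
total_dram_tb = sleds * dram_tb_per_sled        # 32

print(f"{total_cores} cores / {total_threads} threads, {total_dram_tb} TB DRAM")
```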


Well, the first OCP specification more than 10 years ago specced 13kW per rack. ORv3 is up to 30kW now, and some vendors are pushing 40 and more. So maybe I'm missing something; it didn't really seem like density was a major point of difference there.

And not one in particular, there's just a bunch that have sprung up around OCP over the past decade. None that I'm aware of that are doing everything that Oxide does, but we were talking more about the mechanical, electrical, and cooling side of it there -- they do seem to do okay with power density.


To be clear, the problem is in how the power budget is being spent (most enterprise DCs don't even have 15 kW going to the rack). The question on density is: how can you get the most compute elements into the space/power that you've got, and cramming towards highest possible density (i.e., 1U) actually decreases density because you spend so much of that power budget cranking fans and ramming air. And the challenge with OCP is: those systems aren't for sale (we tried to buy one!) and even if you got one, it has no software.


> To be clear, the problem is in how the power budget is being spent (most enterprise DCs don't even have 15 kW going to the rack).

I thought you were going for cloud DCs rather than enterprise. Seems like a big uphill battle to get software certified to run on your platform. Are any of the major Linux distros or Windows certified to run on your hypervisor platform? Any ISVs?

> The question on density is: how can you get the most compute elements into the space/power that you've got, and cramming towards highest possible density (i.e., 1U) actually decreases density because you spend so much of that power budget cranking fans and ramming air.

The question really is how much compute power you get, and electrical/thermal power ~= compute power. Sure, you could fit more CPUs and run them at a lower frequency or duty cycle.

> And the challenge with OCP is: those systems aren't for sale (we tried to buy one!) and even if you got one, it has no software.

OCP is a set of standards. There certainly are systems sold. I guess the nature of the beast is that buyers of one probably don't get taken very seriously, particularly not a competitor.


> electrical/thermal/support power ~= compute power.

Yes, and using larger fans decreases the proportion of power lost there. Centralized rectifiers reduce it further. Google can get overhead power all the way down to 10%. That's the point they are making.
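A minimal sketch of the overhead-power accounting in question (every number below is assumed for illustration; these are not Oxide or Google figures):

```python
# Overhead fraction: power spent on fans and rectifier losses
# rather than on compute, as a share of total rack power.
rack_power_kw = 15.0       # total power to the rack (from the thread)
fan_power_kw = 0.6         # assumed: large, low-RPM fans
rectifier_loss_kw = 0.75   # assumed: centralized 54V DC rectifier shelf losses

overhead = (fan_power_kw + rectifier_loss_kw) / rack_power_kw
compute_fraction = 1 - overhead
print(f"overhead: {overhead:.1%}, going to compute: {compute_fraction:.1%}")
```

With these assumed numbers the rack lands around 9% overhead, i.e., in the neighborhood of the 10% figure cited for Google; the argument is about shrinking that fraction, not about raw kW per rack.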


Yes, and the question I am asking is how much better they are doing than the competition there. For compute density, the real cloud racks are pushing 30-40kW per rack, so even if those were at 20% overhead and Oxide at 0%, Oxide doesn't seem to have a compute density advantage there.




