It's very interesting to see this. I wonder what Amazon, Facebook, Google, and Microsoft are going to do in the next few generations. Will they be moving to 2U1N as well? Or do they have liquid cooling in mind? Or maybe they'll be keeping lower power budgets? Or building data centers that run cooler and support higher power density (probably not great for the environment).
I knew fan power usage could be a substantial amount of a server's power draw, but I didn't realize they would need 320W to move enough air to cool 500W of dissipation.
I recently picked up an R730xd. By default, a non-Dell-branded PCI-E device will force the fans to ~16000 RPM, but there's an IPMI command to disable that and go back to regular temperature-driven fan control. Total server power came down by 70W after issuing that command.
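For anyone curious what that looks like in practice, here's a small sketch. The raw byte sequences below are the ones commonly passed around for iDRAC 8-era machines (R630/R730); they aren't officially documented by Dell and may differ on other generations, so treat them as an assumption and verify before sending them to your own hardware:

    # Sketch: toggle Dell's "third-party PCIe card default cooling response"
    # by shelling out to ipmitool. The raw payloads are the community-documented
    # ones for iDRAC 8-era servers (R630/R730) -- assumed, not Dell-official.
    import subprocess

    # Hypothetical iDRAC address and credentials -- replace with your own.
    IPMI = ["ipmitool", "-I", "lanplus", "-H", "idrac.example.local",
            "-U", "root", "-P", "calvin", "raw"]

    DISABLE = ["0x30", "0xce", "0x00", "0x16", "0x05", "0x00", "0x00", "0x00",
               "0x05", "0x00", "0x01", "0x00", "0x00"]  # fans go back to temperature-driven control
    ENABLE  = ["0x30", "0xce", "0x00", "0x16", "0x05", "0x00", "0x00", "0x00",
               "0x05", "0x00", "0x00", "0x00", "0x00"]  # restore the forced high-RPM response

    def set_third_party_fan_response(enabled: bool) -> None:
        """Send the raw IPMI command to the iDRAC."""
        subprocess.run(IPMI + (ENABLE if enabled else DISABLE), check=True)

    if __name__ == "__main__":
        set_third_party_fan_response(enabled=False)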
This is very uninteresting and way behind the big cloud vendors' high-density installs. They're just saying "when you're power/cooling limited, use a form factor that optimizes your remaining constraints, like RU." So yes, 2p & 3p edge sites are commonly limited to 5-10 kVA, and you end up with low-density racks and "extra" physical space you can use. In modern dense 1p deployments you'll see 20 kVA, or more, per position. The physical form factor always follows those constraints based on cooling/power/connectivity/weight/etc.
The latest Zen4c-based EPYC chips are reaching very high power densities. They're reportedly targeted at hyperscalers that are badly cramped for space in their datacenters but can be more flexible with respect to power budgets. In a way, these concerns must already have existed with HPC hardware, so there are probably well-established approaches for dealing with them too.
The major cloud vendors have their own data centers and do not need to deal with space or power limits at the rack level. They have the entire data center's worth of space and power supply to work with.
All data centers have space and power/cooling limits. If you're in someone else's DC, they'll usually quote you per-rack power, but if you really want, you can probably convince them to let you use two racks' worth of power in one rack and keep a second rack empty. In owned-and-operated DCs it's similar: if the DC was built a while ago, server power density has increased enough since then that a significant portion of the floor space may be unusable for racks.
10-20 years ago, 1U and blades made a lot of sense because space was a major constraint; now 2U or larger can make sense because 1U has a lot of tradeoffs for size, and power is the bigger constraint.
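To make that shift concrete, here's a back-of-the-envelope sketch. The per-rack budget and per-server wall draw are illustrative assumptions (the fan overhead is in the spirit of the 320W figure mentioned elsewhere in the thread), not anyone's real numbers:

    # Back-of-the-envelope rack planning with illustrative numbers.
    RACK_UNITS = 42                 # standard full-height rack
    RACK_POWER_BUDGET_W = 10_000    # assumed ~10 kVA feed, typical of a power-limited site

    def servers_per_rack(server_u: int, server_wall_w: int) -> int:
        """Deployable servers = min(space limit, power limit)."""
        by_space = RACK_UNITS // server_u
        by_power = RACK_POWER_BUDGET_W // server_wall_w
        return min(by_space, by_power)

    # Hypothetical wall draw: ~500W of compute plus fan overhead
    # (much higher fan overhead in a 1U chassis than in a 2U one).
    print(servers_per_rack(server_u=1, server_wall_w=820))  # 12 servers -- power-limited, 30U sit empty
    print(servers_per_rack(server_u=2, server_wall_w=650))  # 15 servers -- still power-limited, not space-limited

Once the rack hits its power cap long before it runs out of rack units, moving from 1U to 2U costs nothing in deployed capacity.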
If space density is still needed, 2 nodes in 2U or similar narrow-but-tall configurations provide a more square profile, which allows generating airflow with larger fans; that's more efficient than the 40mm fans needed in 1U systems.
Direct Connect locations are not where the main region data centers are. AWS has a network link from their DCs to Equinix, etc., to make it easier for customers who are in Equinix's colo to set up a Direct Connect to the AWS network.
Nice to see some of the hardware details behind Cloudflare!
One thing I found nice too: Cloudflare is able to quickly install GPUs everywhere because, for over 6 years, they left a sixth of the slots unused for that purpose.
I bought my last 1U server a few years ago. The space savings I thought I was gaining were quickly diminished by the fact that we weren't going to be able to fill a cabinet, simply because we'd run over our power envelope. It would have required a bigger circuit, an electrician to install it, a change in our colo billing, etc.
These newer CPUs, while being a lot more powerful, also draw a lot more power. In the end, squeezing 1000s of CFM through a 1.75" aperture wasn't worth the headache.
I'm glad to see cloud providers are still using datacenter standard 1U chassis. I assumed everyone had moved to Open Compute rack dimensions 537x44mm.
The neighboring cabinet to ours has 3x Walmart box fans. I thought that was pretty innovative. Just use big, silent, slow fans instead of these tiny, noisy air horns.
I’m a bit surprised the big datacenter / OpenCompute buyers haven’t moved toward a form factor that’s more square. Big, high-power, quiet desktop systems have been using monster fans for many years, and ISTM one could do better with systems that are maybe 4U high but only half width, enabling a proper fan in back. Or maybe even a form factor that packs a lot of individual machines in front of a full-width, truly immense fan (19” or more!).
40mm, 12.6W gets 21.56cfm (at zero pressure) or 3.416 inH2O (at zero flow). That's 1.7 cfm/W.
140mm, 46.8W gets 282.31cfm or 2.033 inH2O. That's 6.03 cfm/W, over 3x better.
Now, I'm being a bit unfair: I ignored pressure. I don't see fan curves, but a vaguely credible figure of merit, if we're just trying to estimate efficiency, is to multiply the flow and the pressure (converted to SI units, so the product comes out in watts) and divide by the electrical power. The little fan gets 0.68 and the big fan gets 1.44. That's still quite a lot better. (It does not mean that fan exceeds 100% efficiency. It's not delivering 282.31cfm at 2.033 inH2O.)
But, interestingly, these fans produce almost the same velocity. If the moderately lower available pressure from the large fan isn't a problem (which it shouldn't be -- we're talking about a regime where there is too much TDP per rack, so it's beneficial to spread things out, thus reducing resistance to airflow), these fans really can substitute for each other in the sense that covering the same area with big fans produces about the same airflow as covering the area with small fans.
Oh, and the big fan is louder, but not louder than 10 small fans!
So I stand by my claim: if you are cooling the contents of your rack by filling it with metal tubes and blowing air through the tubes using fans, use big fans, which you can achieve by using taller units. And 2U isn't enough for that 140mm fan.
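If anyone wants to reproduce those figures, the arithmetic fits in a few lines. The trick is that flow × pressure only means anything as a power (and hence as a pseudo-efficiency) once you convert CFM and inH2O to SI; small differences from the numbers above are just conversion rounding, and the datasheet values are the ones quoted above:

    # Reproduce the fan comparison: cfm/W, flow*pressure/power, and face velocity.
    CFM_TO_M3S = 0.000471947   # 1 cfm in m^3/s
    INH2O_TO_PA = 249.089      # 1 inch of water column in Pa

    def fan_stats(name, size_mm, watts, cfm_free, inh2o_stall):
        flow = cfm_free * CFM_TO_M3S             # m^3/s at zero back-pressure
        pressure = inh2o_stall * INH2O_TO_PA     # Pa at zero flow
        velocity = flow / (size_mm / 1000) ** 2  # rough face velocity over the frame area, m/s
        print(f"{name}: {cfm_free / watts:.2f} cfm/W, "
              f"figure of merit {flow * pressure / watts:.2f}, "
              f"~{velocity:.1f} m/s")

    fan_stats("40mm",   40, 12.6,  21.56, 3.416)   # 1.71 cfm/W, FoM ~0.69, ~6.4 m/s
    fan_stats("140mm", 140, 46.8, 282.31, 2.033)   # 6.03 cfm/W, FoM ~1.44, ~6.8 m/s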
I'm surprised. They mention that cooling is only going to get worse and that fans consume a lot of power, so shouldn't liquid cooling be something to look at? Maybe they did look and found it either unnecessary or impractical to do at a colo. Depending on how many servers they have at a given data center it might not be worthwhile, but if they have 10 or 20 servers there, surely it would be worth it, no?
If you're building your own facility, there's a lot of things you can do with design. In our case, these servers need to go into other people's facilities (e.g., ISPs around the world) so we need to live within their constraints. Things like liquid cooling either wouldn't be possible or would add significant complexity to install and maintain.
There are at least a couple vendors with self-contained liquid cooling systems, kind of like the “all-in-one” systems available for desktop and gaming use. I don’t know how effective they are.
You're right, and while it might take up valuable rack space, I would imagine there are some rack-based solutions that have most of the plumbing included so that you don't have to rely on the DC providing it. Pretty sure ServeTheHome has done a couple of reports on them.
AIUI this is just starting to take off. Look at Supermicro's 2U4N and 1U2N offerings, which need to use liquid cooling to reach that density. I know Alibaba has been running vertically mounted servers in a non-conductive oil bath for a while. I bet their generation 13 servers have liquid cooling.
I guess it must be, but does the performance per watt make sense going from 350 to 500W? That's a ~43% increase in power! I suppose the number of cores per watt per U makes this solution make sense, especially with the 1U cooling overhead, but it seems like it might be closer to a wash than one might think initially.
The 500W is an estimate for the next generation of CPUs, so we have to speculate, but yes, the performance per watt should make sense unless those products are awful. Thus far every EPYC generation has had substantially better performance per watt, but the top-end power draw has gone up because the number of cores has gone up. EPYC 3 (Milan) maxed out at 64 cores, but EPYC 4 maxes out at 96 (Genoa, Zen4) or 128 (Bergamo, Zen4c). So you get twice as many cores, faster cores, and much less than twice the max power draw (~240W → ~360W). The highest core count CPUs tend to be the most efficient as well, because you amortize the relatively static power draw of the IO die (and the other parts on the motherboard) and because they tend to run at lower, more efficient clock frequencies.
It's pretty safe to assume the situation will be the same with the next generation EPYC 5 CPUs: maybe 192 or 256 cores, higher power draw, but more efficiency.
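As a quick sanity check of the cores-per-watt point, using the approximate top-end figures quoted above (ballpark numbers, not exact TDPs):

    # Rough cores-per-watt across generations, using the figures from the comment above.
    generations = {
        "EPYC 3 (Milan)":          (64, 240),   # cores, approx. max socket power as quoted
        "EPYC 4 (Genoa, Zen4)":    (96, 360),
        "EPYC 4 (Bergamo, Zen4c)": (128, 360),
    }

    for name, (cores, watts) in generations.items():
        print(f"{name}: {cores / watts:.2f} cores/W")

    # Milan ~0.27, Genoa ~0.27, Bergamo ~0.36 cores/W -- so even before counting
    # faster cores, the top-end parts don't regress on cores per watt, and the
    # dense Zen4c part improves it outright despite the higher socket power.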
Nothing here is really new, at least for those who have some knowledge of server hardware. We live in a cloud era where 98% of web developers have very little knowledge of hardware.
What I don't understand is why they haven't done 2U2N, or 1N with a dual-socket design.
Everyone really thought the hardware would fail at 40+ C ambient. Then some people ran newer hardware at higher temperatures anyway and it was fine, and then everyone switched to get the savings.
I’ve been using fans bigger than 80mm in desktop machines for a long time. Is there a reason to prefer 2U1N over the equally dense 4U2N shape? Obviously 2U is widely available and 4U dual node is exotic, but I imagine that Cloudflare has plenty of scale for a custom form factor if it would benefit them.
Thank you for flagging! We've updated the post with a brief definition of 1N (as you and others have noted, it means there is one server node per chassis).
I mean this very positively, not condescendingly, and I'm very much not trying to preach or push, but I've found that this sort of question is exactly the sort that I benefit a lot from handing to GPT to answer for me:
>N: This refers to the redundancy of the power and cooling systems. "N" stands for "Need" or "Normal." In data center design, "N" is used to denote the capacity required to support the IT load. "1N" implies a single path for power and cooling, with no built-in redundancy. In other words, each component necessary to support the IT equipment is present, but there is no backup if one of these components fails.
I'm pretty sure this is wrong. N means "node", so a 2U4N is a 2U chassis that houses 4 servers. For this reason I'm pretty ambivalent about using ChatGPT for research on things I cannot easily verify. I will admit this is a scarily convincing hallucination.
This sort of thing honestly feels like the historical state of affairs with Wikipedia... paying too much attention to it early on was probably a bad idea, but as things progressed, many folks who had ossified into distrusting it were making the opposite mistake of paying too little attention.
I'm definitely trying to balance it out! Apologies to all for the mistake here!
Cores/RAM/Disk … even GPU, since they mention its importance.
TL;DR: newer CPUs' TDP is so high that they needed to change from 1U to 2U, and that changes the economics at a macro level per rack (since your density now halves).