Premium: Nebius's secret sauce
Part 1 covered Nebius's origin, cloud focus, and whale deals thus far. Before we dig deeper into their footprint and buildout efforts, I need to set some context.
Let's quickly cover Nebius's expertise in buildouts, its custom in-house racks and servers, and some areas needing improvement. The second part of this topic will extensively cover their footprint, past buildout efforts, and where they are building next.
- Nebius buys capacity in advance, before contracts are secured, knowing that it will either strengthen its internal fleet or get sold off in whale deals. (CoreWeave, by contrast, secures customer commits ahead of obtaining new sites.) As noted in the last post, the demand signals for these advance purchases are there; their fleet is continually at or near peak utilization.
- They are now shifting from smaller colos to larger build-to-suit sites and on to self-owned facilities. Nebius inherited a lot of system and rack design expertise from Yandex, and is highly opinionated on how data centers need to be built.
- They have designed their own proprietary racks and GPU, CPU, and storage systems. This positions them for stronger long-term margins than other neoclouds using out-of-the-box racks and systems from OEM providers like Dell and Supermicro.
- SemiAnalysis believes that this strategy shaves the normal 10-12% marginal costs down to more like 2%, and management insists it can save a further 20-25% in ongoing operating costs (from an improved thermal profile) on top of that.
- However, they've had to adjust these in-house designs over the past year to incorporate liquid cooling. They are adopting their own customized hybrid liquid-and-air cooling technique for NVL72 racks and on to NVL144 (Oberon architecture). They'll have to do the same retooling for future NVL576 (Kyber).
- They have published impressive PUE specs, but those came from older results at their one initial DC. Today's reality is much different.
- A key advantage lies in their virtualization layers. These give Nebius an edge over CoreWeave's cloud ambitions, helping them more easily manage multi-tenancy, cluster sizing, and provisioning. This architectural decision empowers their hyperscaler cloud ambitions, and, surprisingly, their virtual GPU performance is on par with bare metal.
- Nebius is considered a top GPUaaS provider with great pricing and terms, per ClusterMAX. But they lag CoreWeave in some areas, like MFU and early access.
- However, I believe they are on a path to Platinum tier with all their reliability and monitoring improvements over the past few months.
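
To put the cost bullet above in rough numbers, here is a minimal sketch of the arithmetic. It assumes the 10-12% and 2% figures behave like a vendor markup on top of a bill-of-materials cost, which is my reading and not something SemiAnalysis or Nebius has spelled out. The function names and the normalized baseline of 100 are hypothetical, chosen only for illustration.

```python
def purchase_price(bom_cost: float, markup: float) -> float:
    """Price paid for hardware, given its bill-of-materials cost and a markup fraction.

    Assumption: the quoted percentages act as a markup over BOM cost.
    """
    return bom_cost * (1.0 + markup)


def hardware_savings(oem_markup: float, inhouse_markup: float) -> float:
    """Fraction saved on hardware spend by paying the in-house markup instead of the OEM one."""
    return 1.0 - (1.0 + inhouse_markup) / (1.0 + oem_markup)


def reduced_opex(baseline_opex: float, reduction: float) -> float:
    """Ongoing operating cost after applying a fractional reduction."""
    return baseline_opex * (1.0 - reduction)


if __name__ == "__main__":
    BOM = 100.0   # hypothetical, normalized hardware cost
    OPEX = 100.0  # hypothetical, normalized operating cost

    # Low and high ends of the quoted 10-12% range versus the quoted 2%.
    for oem in (0.10, 0.12):
        saved = hardware_savings(oem, 0.02)
        print(f"OEM markup {oem:.0%}: pay {purchase_price(BOM, oem):.1f} vs "
              f"{purchase_price(BOM, 0.02):.1f} in-house ({saved:.1%} saved)")

    # Low and high ends of management's claimed 20-25% opex reduction.
    for cut in (0.20, 0.25):
        print(f"Opex reduction {cut:.0%}: {OPEX:.0f} -> {reduced_opex(OPEX, cut):.0f}")
```

Under that markup assumption, going from 10-12% to 2% works out to roughly 7% to 9% less spent per unit of hardware, before any of the claimed operating-cost savings are counted.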