Premium: Vera Rubin decoder ring

Now that we've covered the major Vera Rubin announcements, let's dive into the finer details of the new modular rack systems announced across GTC in March and GTC Taipei this week.

  • Grace Blackwell was their push into inference at scale. Vera Rubin is their push into agentic AI at scale, disaggregating its needs across different chips (GPU, CPU, DPU, networking) and rack systems (Vera, SPX, Spectrum-6, Groq LPX).
  • NVIDIA is drastically ramping up its capacity by doubling its supply chain for Vera Rubin and massively speeding up assembly across it. They redesigned & simplified their MGX modular architectures, cutting tray assembly time from over 2 hours to 5 minutes and greatly improving ongoing maintainability.
  • They are shifting into disaggregated inference with a new Groq-powered LPX rack that works in tandem with a Vera Rubin NVL72 rack to drastically improve latency and per-user interactivity. This leverages the strengths of LPU chips while circumventing their shortcomings.
  • These new Groq racks allow GPUaaS and AI providers to greatly expand the types of premium AI workloads they can offer, enabling higher pricing tiers.
  • They have silently dropped the Rubin CPX rack systems they announced in September in favor of this new LPX rack. Rising memory prices and TSMC bottlenecks likely shifted their focus towards Groq. (Those plans may be only temporarily shelved – we may see them return once DRAM prices settle down.)
  • NVIDIA is getting more serious about its Arm-based CPU. Their redesigned next-gen Vera CPU will now be sold as a standalone CPU and rack designed for agentic orchestration.
  • They are leveraging the new Vera & BlueField-4 chips in new AI storage (STX) and shared AI memory (CMX) systems, which will greatly improve data & cache access speeds from Vera Rubin clusters.
  • Spectrum-X Ethernet networking is now at 800GbE, with co-packaged optical (CPO) ports. This will be leveraged to increase the scope of scale-out and scale-across networking, with cluster sizes expected to exceed 500K in Vera Rubin. This week at GTC Taipei, they announced that it is now in production.
  • They have a new DSX suite of reference designs and tooling for the design and operation of AI data centers. They are stressing how DSX can help AI factory build-outs maximize the number of GPUs for a given power load and smooth ongoing power demand.

Part 2:

  • Vera Rubin rack & POD
  • Groq LPX rack
  • Silently dropping Rubin CPX plans
  • Vera CPU rack
  • Networking
  • BlueField-4/Storage
  • DGX software stack