NVIDIA Vera Rubin POD Framework for Agentic AI Supercomputing Systems with MGX Architecture and Specialized Rack Scale Systems

NVIDIA Vera Rubin POD delivers 60 exaflops of computing power for agentic AI through specialized rack-scale systems and the third-generation MGX architecture.


NVIDIA Vera Rubin POD: Specialized Supercomputing for the Agentic AI Era

Artificial intelligence has entered a new stage with the transition to agentic systems, in which AI agents interact primarily with other AI agents rather than with humans. This shift has pushed token usage beyond 10 quadrillion per year, demanding systems that can sustain extensive reasoning tasks and complex operational procedures. The NVIDIA Vera Rubin POD meets these requirements with a co-developed platform comprising seven chips and five specialized rack-scale systems.

The Vera Rubin POD operates as one unified AI supercomputer. It spans 40 racks built on the third-generation NVIDIA MGX architecture and comprises nearly 20,000 dies, including 1,152 NVIDIA Rubin GPUs, delivering 60 exaflops of compute and 10 PB/s of scaling bandwidth.
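As a rough sanity check on the headline figures, the per-GPU and per-rack averages can be derived from the numbers above (a back-of-envelope sketch; only the three inputs come from the article, the rest is simple division):

```python
# Back-of-envelope check of the POD-level figures quoted in the article.
total_exaflops = 60   # POD compute (presumably low-precision AI FLOPS)
gpus = 1152           # Rubin GPUs per POD
racks = 40            # MGX racks per POD

petaflops_per_gpu = total_exaflops * 1000 / gpus
gpus_per_rack = gpus / racks  # mean only; not every rack holds GPUs

print(f"~{petaflops_per_gpu:.0f} PFLOPS per GPU")  # ~52 PFLOPS
print(f"~{gpus_per_rack:.1f} GPUs per rack on average")  # 28.8
```

Note that the per-rack average is diluted by the non-GPU racks (CPU, storage, and networking systems described below), so GPU racks individually carry far more than the mean.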

The platform utilizes specific systems to optimize different AI workloads:

  • Vera Rubin NVL72: Pairs 72 Rubin GPUs with 36 Vera CPUs, delivering 10x better inference performance per watt than earlier Blackwell systems.
  • Groq 3 LPX: An inference accelerator rack with 256 Language Processing Units (LPUs) that removes latency bottlenecks for trillion-parameter models.
  • Vera CPU Rack: Contains 256 liquid-cooled Vera CPUs, optimized for reinforcement learning (RL) and sandboxed environments.
  • BlueField 4 STX: An AI-native storage solution that offloads KV cache to its high-bandwidth tier, achieving 5x higher tokens-per-second throughput than traditional systems.
  • Spectrum 6 SPX: A networking system built on 102.4 Tb/s switches with silicon photonics and co-packaged optics (CPO) to keep workloads synchronized.

The Vera Rubin POD introduces mechanical and electrical advances that improve data center operational efficiency and speed up system installation. A modular copper spine replaces traditional cabling with pre-validated cartridges, reducing tray assembly time by 20x.

The Intelligent Power Smoothing system helps MGX racks absorb the large power swings that occur during AI training. Rack-level energy storage, built from capacitors delivering 400 J per GPU, stabilizes power distribution while cutting peak current demand by 25%. Dynamic Max Q provisioning lets data centers reclaim unused power headroom, allowing up to 30% more GPUs within the same power capacity.
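The scale of that capacitor buffer can be sketched for an NVL72-class rack; the 400 J per GPU and 25% figures are from the article, while the 120 kW rack draw is purely an illustrative assumption:

```python
# Rack-level energy buffer for power smoothing, per the article's figures.
joules_per_gpu = 400   # capacitor energy per GPU (article figure)
gpus_per_rack = 72     # assuming an NVL72-class rack
peak_reduction = 0.25  # 25% lower peak current demand (article figure)

buffer_joules = joules_per_gpu * gpus_per_rack
print(f"Energy buffer per rack: {buffer_joules / 1000:.1f} kJ")  # 28.8 kJ

# Illustration only: for a hypothetical rack peaking at 120 kW,
# a 25% reduction in peak demand corresponds to roughly:
peak_kw = 120  # assumption, not an article figure
print(f"Peak shaved: ~{peak_kw * peak_reduction:.0f} kW")  # ~30 kW
```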

The third-generation MGX racks are designed to operate with warm water inlet temperatures up to 45°C (113°F). This lets data centers across a range of climates rely on dry coolers and ambient air instead of power-intensive compressors. Running at these higher temperatures frees enough facility energy to operate an additional 10% of compute racks within the same energy allocation.
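Applied to a single POD, the 10% efficiency claim works out as follows (a minimal sketch assuming a fixed facility power budget; the 10% figure is the article's):

```python
# Warm-water cooling (45 C inlet) reduces chiller power, freeing facility
# energy for compute. Using the article's 10% figure against one POD:
baseline_racks = 40     # racks per POD (article figure)
extra_fraction = 0.10   # additional compute capacity (article figure)

print(round(baseline_racks * (1 + extra_fraction)))  # 44 racks, same envelope
```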

The Vera Rubin Ultra NVL576 system pushes density further by connecting eight MGX NVL racks into a single 576-GPU NVLink domain. The upcoming NVIDIA Kyber rack, slated for late 2026 and 2027, will double GPU capacity per rack to 144 units, enabling the NVL1152 supercomputer configuration based on the Feynman architecture.
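The NVLink domain sizes quoted above follow directly from the per-rack GPU counts (simple arithmetic on the article's numbers):

```python
# NVLink domain sizes from per-rack GPU counts (article figures).
gpus_per_mgx_rack = 72
racks_per_ultra_domain = 8
print(gpus_per_mgx_rack * racks_per_ultra_domain)  # 576 -> NVL576

# Kyber doubles per-rack capacity to 144 GPUs; eight such racks reach NVL1152.
kyber_gpus_per_rack = 144
print(kyber_gpus_per_rack * racks_per_ultra_domain)  # 1152 -> NVL1152
```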

The Vera Rubin POD marks the shift from individual servers to complete AI factory operations. By integrating compute, storage, and networking through the MGX architecture, the platform maximizes token generation efficiency while lowering the total cost of ownership for large-scale agentic systems.

About the author

mgtid
Owner of Technetbook
