NVIDIA GB200 NVL72

Powering the new era of computing with the NVIDIA GB200 Grace Blackwell Superchip. 72 Blackwell GPUs delivering 30x faster real-time trillion-parameter LLM inference.

72 Blackwell GPUs
13.4 TB Fast Memory
130 TB/s NVLink Bandwidth
1,440 PFLOPS FP4
[Image: NVIDIA GB200 NVL72 rack]

Memberships and partnerships: Bitkom, German Data Center Association, eco Verband, NVIDIA Partner
Hosted in the EU, GDPR compliant
Made in Germany, quality infrastructure
Data sovereignty, full transparency

Unlocking Real-Time Trillion-Parameter Models

The GB200 NVL72 connects 36 Grace CPUs and 72 Blackwell GPUs in a rack-scale, liquid-cooled design. It boasts a 72-GPU NVLink domain that acts as a single, massive GPU and delivers 15x faster inference and 3x faster training compared to DGX H100 systems.

Configuration

72 NVIDIA Blackwell GPUs, 36 NVIDIA Grace CPUs, 2,592 Arm Neoverse V2 cores

Memory & Bandwidth

192 GB HBM3e per Blackwell GPU, 8 TB/s memory bandwidth per GPU, 130 TB/s NVLink bandwidth, 576 TB/s aggregate memory bandwidth across the system

Tensor Core Performance

1,440 PFLOPS FP4, 720 PFLOPS FP8/FP6, 360 PFLOPS FP16/BF16, 180 PFLOPS TF32

Networking

Fifth-generation NVLink, 130 TB/s bandwidth, seamless GPU-to-GPU communication

AI Performance

15x faster inference vs DGX H100, 3x faster training, 2.5x performance per B200 vs H200

System Architecture

Liquid-cooled rack-scale design, single massive 72-GPU domain, exascale computing
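
These rack-level headline numbers follow directly from the per-GPU figures above; a quick sanity check in Python (the 1.8 TB/s-per-GPU NVLink figure is NVIDIA's published fifth-generation NVLink bandwidth, and the 20 PFLOPS FP4 per GPU is implied by the rack total):

```python
# Rack-level totals derived from per-GPU figures.
NUM_GPUS = 72

HBM_BW_PER_GPU_TB_S = 8     # HBM3e bandwidth per GPU (from the spec above)
NVLINK_PER_GPU_TB_S = 1.8   # fifth-generation NVLink, per GPU
FP4_PER_GPU_PFLOPS = 20     # implied by 1,440 PFLOPS / 72 GPUs

print(f"Aggregate HBM bandwidth: {NUM_GPUS * HBM_BW_PER_GPU_TB_S} TB/s")      # 576 TB/s
print(f"NVLink domain bandwidth: {NUM_GPUS * NVLINK_PER_GPU_TB_S:.0f} TB/s")  # ~130 TB/s
print(f"FP4 throughput:          {NUM_GPUS * FP4_PER_GPU_PFLOPS:,} PFLOPS")   # 1,440 PFLOPS
```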

An exascale computer in a single rack, powering the new era of computing.

Request a Quote

Technological Breakthroughs

Revolutionary innovations that redefine the boundaries of AI computing performance, delivering unprecedented capabilities for the most demanding AI workloads.

Real-Time LLM Inference

GB200 NVL72 introduces cutting-edge capabilities and a second-generation Transformer Engine, which enables FP4 AI. When coupled with fifth-generation NVIDIA NVLink, it delivers 30x faster real-time LLM inference performance for trillion-parameter language models.
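
To illustrate the idea behind FP4 AI (this is a toy sketch of block-scaled 4-bit quantization, not NVIDIA's Transformer Engine implementation):

```python
import numpy as np

# Representable values of an E2M1-style FP4 format (1 sign, 2 exponent,
# 1 mantissa bit). Each block of tensor values shares one scale factor,
# which is what lets 4 bits per value retain useful dynamic range.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_GRID = np.concatenate([-FP4_GRID[::-1], FP4_GRID[1:]])  # add negatives

def fp4_round_trip(x: np.ndarray, block: int = 32) -> np.ndarray:
    """Quantize to block-scaled FP4 and dequantize back."""
    shape = x.shape
    x = x.reshape(-1, block)
    scale = np.abs(x).max(axis=1, keepdims=True) / FP4_GRID.max()
    scale = np.where(scale == 0, 1.0, scale)
    # Snap each scaled value to the nearest representable FP4 value.
    idx = np.abs((x / scale)[..., None] - FP4_GRID).argmin(axis=-1)
    return (FP4_GRID[idx] * scale).reshape(shape)

x = np.random.randn(4, 32).astype(np.float32)
print("max abs error:", np.abs(x - fp4_round_trip(x)).max())
```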

Massive-Scale Training

GB200 NVL72 includes a faster, second-generation Transformer Engine featuring FP8 precision, enabling a remarkable 4x faster training for large language models at scale. This breakthrough is complemented by fifth-generation NVLink, which provides the GPU-to-GPU bandwidth that training at this scale demands.
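
A minimal sketch of FP8 training with NVIDIA's Transformer Engine library (PyTorch API as documented in its quickstart; requires an FP8-capable GPU):

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# An FP8-capable linear layer from Transformer Engine.
layer = te.Linear(1024, 1024, bias=True).cuda()
x = torch.randn(16, 1024, device="cuda")

# Delayed-scaling recipe: E4M3 for activations/weights on the forward
# pass, E5M2 for gradients on the backward pass ("HYBRID" format).
fp8_recipe = recipe.DelayedScaling(
    fp8_format=recipe.Format.HYBRID,
    amax_history_len=16,
    amax_compute_algo="max",
)

# Matmuls inside this context execute in FP8 with automatic scaling.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    out = layer(x)
out.sum().backward()
```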

Fifth-Generation NVIDIA NVLink

Unlocking the full potential of exascale computing and trillion-parameter AI models requires swift, seamless communication between every GPU in a server cluster. The fifth generation of NVLink is a scale-up interconnect that delivers 1.8 TB/s of bandwidth per GPU, 130 TB/s across the full 72-GPU domain.
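
Applications typically exercise NVLink indirectly through collective-communication libraries such as NCCL; a minimal PyTorch sketch (launch with torchrun, one process per GPU; NCCL routes the traffic over NVLink where available):

```python
# Launch with: torchrun --nproc_per_node=<num_gpus> allreduce_demo.py
import os
import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# Each rank contributes its rank id; after the all-reduce, every rank
# holds the sum over all ranks, moved GPU-to-GPU by NCCL.
x = torch.full((1024 * 1024,), float(dist.get_rank()), device="cuda")
dist.all_reduce(x, op=dist.ReduceOp.SUM)

if dist.get_rank() == 0:
    expected = sum(range(dist.get_world_size()))
    print("all-reduce correct:", bool(x[0].item() == expected))
dist.destroy_process_group()
```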

Energy-Efficient Infrastructure

Liquid-cooled GB200 NVL72 racks reduce a data center's carbon footprint and energy consumption. Liquid cooling increases compute density, reduces floor space, and facilitates high-bandwidth, low-latency GPU communication.

NVIDIA Grace CPU

The NVIDIA Grace CPU is a breakthrough processor designed for modern data centers running AI, cloud, and HPC applications. It provides outstanding performance and memory bandwidth with 2x the energy efficiency of today's leading server processors.

Data Processing

Databases play a critical role in handling, processing, and analyzing large volumes of enterprise data. GB200 takes advantage of high-bandwidth memory, NVLink-C2C, and dedicated decompression engines to speed up key database queries by 18x compared to CPU-only systems.
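
Frameworks such as NVIDIA RAPIDS cuDF expose this acceleration through a familiar dataframe API; a minimal sketch (the file name and column names here are illustrative placeholders):

```python
import cudf

# Parquet decompression and the query below both run on the GPU.
df = cudf.read_parquet("transactions.parquet")  # placeholder dataset

# A typical analytical query: filter, group, aggregate.
result = (
    df[df["amount"] > 100.0]
    .groupby("customer_id")["amount"]
    .agg(["count", "sum", "mean"])
    .sort_values("sum", ascending=False)
)
print(result.head())
```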

NVIDIA Cloud Partner

NVIDIA Preferred Partner

Polarise has achieved NVIDIA Preferred Partner status and is listed as an official NVIDIA Cloud Service Provider (CSP), solidifying our position as a trusted leader in cloud innovation. This designation is reserved for select partners who operate large clusters built in coordination with NVIDIA and adhere to a tested, optimized reference architecture.

Ready to start your AI project?

Let's discuss your specific requirements in a personal conversation. I'll help you find the perfect AI infrastructure solution for your organization.


Nils Herhaus

Business Development

@Polarise