Nvidia GB300: Far Outperforms Hopper in Agentic AI Workloads!

06/15/2026

TECH NEWS – As the release of the Rubin platform approaches, the Blackwell-based GB300 leaves the Hopper platform far behind.

The Nvidia Blackwell GB300 set a new record in the AA-AgentPerf benchmark, which measures agent-based artificial intelligence workloads. The benchmark measures how many active agents an inference deployment can support under realistic loads.

The benchmark is built around five design elements. Real-world agent trajectories: the load consists of multi-turn coding sessions with interleaved reasoning, function calls, and variable context lengths. Sustained parallel loads: simulated agents keep requests flowing continuously, which taxes KV-cache reuse, speculative decoding, and scheduler behavior. Market SLO levels: performance thresholds are defined from Artificial Analysis’s serverless API benchmarking data and reflect the service quality observed among service providers. Continuous updates: results are refreshed as new hardware, software packages, and model versions become available. Production-ready status: models are tested with realistic optimizations and production-scale deployment topologies.

The AA-AgentPerf benchmark measures three key metrics that form the foundation of modern AI deployments. These are Time to First Token (TTFT), output rate (output tokens per request per second measured after receiving the first token), and system output throughput (aggregate output tokens per second across all concurrent agents).
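To make the three definitions concrete, here is a minimal sketch of how such metrics could be computed from per-request timestamps. It is not the AA-AgentPerf implementation; the `RequestTrace` record, its field names, and the choice to average system throughput over the whole measurement window are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestTrace:
    """Hypothetical per-request timing record (all times in seconds)."""
    sent_at: float         # when the request was issued
    first_token_at: float  # when the first output token arrived
    finished_at: float     # when the last output token arrived
    output_tokens: int     # total output tokens, including the first


def ttft(trace: RequestTrace) -> float:
    """Time to First Token: delay between sending and the first token."""
    return trace.first_token_at - trace.sent_at


def output_rate(trace: RequestTrace) -> float:
    """Output tokens per second for one request, measured after the first token."""
    decode_time = trace.finished_at - trace.first_token_at
    if decode_time <= 0:
        return 0.0
    # The first token is excluded because the clock starts when it arrives.
    return (trace.output_tokens - 1) / decode_time


def system_output_throughput(traces: List[RequestTrace]) -> float:
    """Aggregate output tokens per second across all concurrent requests."""
    window_start = min(t.sent_at for t in traces)
    window_end = max(t.finished_at for t in traces)
    total_tokens = sum(t.output_tokens for t in traces)
    return total_tokens / (window_end - window_start)


# Two made-up concurrent requests, purely for illustration.
traces = [
    RequestTrace(sent_at=0.0, first_token_at=0.5, finished_at=2.5, output_tokens=101),
    RequestTrace(sent_at=0.0, first_token_at=1.0, finished_at=5.0, output_tokens=201),
]
```

The first two metrics describe what a single agent experiences, while the third describes the deployment as a whole, which is why a system can post high aggregate throughput even when individual requests see a modest output rate.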

Nvidia has just published its first AgentPerf-based benchmark results, achieved with the DeepSeek V4 Pro model running on the GB300 NVL72 platform. The model is representative of the frontier models that power today’s agents. In this first round of testing, Nvidia’s GB300 hardware delivered the fastest performance, with a 20x advantage per megawatt over the older HGX H200 platform. The GB300 supports up to 60,000 concurrent agents per megawatt, a significant improvement over Hopper. According to Nvidia, the result demonstrates the ability of the GB300 NVL72 and Blackwell to run large-scale, agent-based coding workloads while keeping GPUs utilized across multiple concurrent sessions.
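The per-megawatt figures are a power normalization, which a short sketch can illustrate. The function names and the example deployment below are hypothetical. The implied baseline also rests on an assumption the article does not confirm: that the 20x advantage refers to the same concurrent-agents-per-megawatt metric as the 60,000 figure.

```python
def agents_per_megawatt(concurrent_agents: float, power_kw: float) -> float:
    """Normalize a supported-agent count by the deployment's power draw."""
    # 1 MW = 1000 kW, so multiply by 1000 before dividing by kilowatts.
    return concurrent_agents * 1000.0 / power_kw


def implied_baseline(value: float, advantage: float) -> float:
    """Baseline figure implied by a reported value and an 'N times' advantage."""
    return value / advantage


# Made-up deployment: 1,200 agents served at a 50 kW power draw.
example = agents_per_megawatt(1200, 50)

# ASSUMPTION: the 20x advantage applies to concurrent agents per megawatt.
# Under that reading only, the implied H200 baseline would be 60,000 / 20.
h200_if_same_metric = implied_baseline(60_000, 20)
```

If the 20x figure instead refers to throughput or another metric, the implied baseline above does not apply.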

The next generation, Nvidia Rubin, is expected to build on these advantages with an AI architecture delivering 50 PFLOPS of compute performance in NVFP4. Combined with the Vera CPU, Rubin is expected to significantly improve performance and efficiency for LLM service calls and edge computing.

Source: WCCFTech, Artificial Analysis, Nvidia