Accelerating AI Networking: NVIDIA Spectrum-X and Aviz ONES Integration

June 17, 2025

As AI workloads continue to scale, enterprises are quickly realizing that traditional networking isn’t built for the demands of modern, distributed GPU clusters. To address this, NVIDIA and Aviz Networks hosted a joint bootcamp showcasing the deep integration between NVIDIA Spectrum-X — the industry’s first Ethernet fabric optimized for AI — and the Aviz Open Networking Enterprise Suite (ONES), purpose-built for AI infrastructure orchestration and observability.

"AI had its iPhone moment with ChatGPT. Suddenly, enterprises everywhere wanted to deploy generative AI at scale — but Ethernet couldn’t keep up."

How is Ethernet evolving for AI clusters?

From InfiniBand to Ethernet: The AI Networking Shift

Enter Spectrum-X. NVIDIA took key capabilities from InfiniBand and extended them to Ethernet — enabling RDMA, adaptive routing, and congestion control, all with the governance and familiarity enterprises demand from Ethernet environments.

"The larger the AI cluster, the bigger the impact the network has on performance. Spectrum-X gives enterprises a purpose-built Ethernet fabric to unlock full GPU performance."

Figure 1: NVIDIA Spectrum-X Full Stack + Aviz ONES Network Orchestration Diagram

What is Spectrum-X RA 1.3.0 and how is it validated?

Figure 2: Spectrum-X Network Orchestration Ecosystem

Spectrum-X RA 1.3.0: Validated at Supercomputer Scale

Aranga Madipuri, Product Manager at NVIDIA, detailed the Spectrum-X Reference Architecture (RA 1.3.0), tested on real-world supercomputers like Israel-1. The RA offers a prescriptive blueprint combining SONiC/Cumulus NOS, NetQ telemetry, NVIDIA AIR digital twin simulation, and BlueField-accelerated switching — ensuring performance, reliability, and reproducibility for massive AI clusters.

How does Aviz ONES support Spectrum-X deployments?

Figure 3: ONES for Spectrum-X Monitoring & NetOps

ONES by Aviz: Purpose-Built for Multi-Tenant AI Infrastructure

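Aviz CTO Chit Perumal and Principal Engineer Kasi Nath demonstrated how ONES integrates with Spectrum-X RA 1.3.0. The capabilities ONES delivers (deployment automation, multi-tenant isolation, and agentless telemetry) are detailed in the Frequently Asked Questions below.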

"We wanted customers to scale GPU clusters effortlessly while maintaining network visibility and operational simplicity — ONES makes that possible."

What was covered in the live demo?

Real-World Topologies, Live Demo

The session closed with a detailed demo; the topics it covered are listed in the Frequently Asked Questions below.

Explore More

Whether you’re building a private AI cloud or launching GPU-as-a-Service, the NVIDIA + Aviz stack gives you the tools to scale with confidence — and visibility.

Frequently Asked Questions

1. What is NVIDIA Spectrum-X and how is it optimized for AI workloads?

Spectrum-X is the first Ethernet fabric designed specifically for AI clusters. It extends InfiniBand’s low-latency, lossless transport to Ethernet, delivering RDMA, adaptive routing, and congestion control — all with familiar enterprise Ethernet governance.

Traditional Ethernet often struggles with network congestion and packet loss during large-scale GPU training. Spectrum-X solves this by:

  • Enabling RDMA over Converged Ethernet (RoCE) for direct GPU communication.
  • Using adaptive routing to bypass congestion.
  • Delivering consistent throughput across thousands of GPUs.
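To make the adaptive routing point concrete, here is a toy Python sketch. It is not NVIDIA's algorithm: the spine names, flow sizes, and function names are all hypothetical, and it places whole flows rather than individual packets. It only contrasts static hash-based path selection, which ignores load, with a picker that sends each new flow to the least-loaded path.

```python
import zlib

def static_hash_path(flow_id: str, paths: list[str]) -> str:
    """Pick a path from a hash of the flow ID, ignoring current load."""
    return paths[zlib.crc32(flow_id.encode()) % len(paths)]

def adaptive_path(load_gbps: dict[str, float]) -> str:
    """Pick the path that currently carries the least traffic."""
    return min(load_gbps, key=load_gbps.get)

def place_flows(flows: dict[str, float], paths: list[str], adaptive: bool) -> dict[str, float]:
    """Place flows one at a time and return the resulting load per path."""
    load = {p: 0.0 for p in paths}
    for flow_id, gbps in flows.items():
        path = adaptive_path(load) if adaptive else static_hash_path(flow_id, paths)
        load[path] += gbps
    return load

# Hypothetical fabric: four spine paths, eight equal 100 Gb/s GPU-to-GPU flows.
paths = ["spine1", "spine2", "spine3", "spine4"]
flows = {f"gpu{i}->gpu{i + 8}": 100.0 for i in range(8)}

static_load = place_flows(flows, paths, adaptive=False)
adaptive_load = place_flows(flows, paths, adaptive=True)

print("static  :", static_load)
print("adaptive:", adaptive_load)
```

In this toy model the adaptive picker always ends with an even spread, while the hashed placement depends on how the flow IDs happen to collide. The busiest link sets the pace for a collective operation, which is why uneven placement hurts large training jobs.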

Spectrum-X RA 1.3.0 is a validated deployment blueprint tested on real supercomputers like Israel-1. It combines:

  • Open NOS (SONiC/Cumulus).
  • NetQ telemetry.
  • NVIDIA AIR for digital twin simulation.
  • BlueField DPUs for hardware acceleration.

This ensures predictable performance and easier scaling of AI networks.

ONES is a software layer for orchestration and observability. It connects directly with Spectrum-X fabrics to automate deployment, manage multi-tenant AI workloads, and deliver agentless, real-time telemetry — reducing operational complexity.

ONES simplifies deployment in three ways:

  • Declarative fabric design: Define your network layout upfront.
  • NVIDIA AIR simulation: Test configurations in a digital twin before deployment.
  • Zero-touch provisioning: Automatically configures switches and hosts using Spectrum-X RA templates.
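As a rough illustration of what "declarative" means here, the Python sketch below describes a small leaf-spine fabric as data and checks it for consistency before anything is deployed. The schema (`FabricSpec`, `Link`) and the `validate` function are hypothetical and are not the ONES or Spectrum-X RA template format.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Link:
    leaf: str
    spine: str

@dataclass
class FabricSpec:
    spines: list[str]
    leaves: list[str]
    links: list[Link] = field(default_factory=list)

def validate(spec: FabricSpec) -> list[str]:
    """Return a list of problems; an empty list means the design is consistent."""
    problems = []
    names = spec.spines + spec.leaves
    if len(names) != len(set(names)):
        problems.append("duplicate switch name")
    # Every link must reference switches that are declared in the spec.
    for link in spec.links:
        if link.leaf not in spec.leaves or link.spine not in spec.spines:
            problems.append(f"link references unknown switch: {link.leaf} -> {link.spine}")
    # In a full leaf-spine design, every leaf connects to every spine.
    wired = {(link.leaf, link.spine) for link in spec.links}
    for leaf in spec.leaves:
        for spine in spec.spines:
            if (leaf, spine) not in wired:
                problems.append(f"missing link {leaf} -> {spine}")
    return problems

# Hypothetical design with one uplink left out on purpose.
spec = FabricSpec(
    spines=["spine1", "spine2"],
    leaves=["leaf1", "leaf2"],
    links=[Link("leaf1", "spine1"), Link("leaf1", "spine2"), Link("leaf2", "spine1")],
)
problems = validate(spec)
print(problems)
```

Catching the missing uplink at design time, or in a digital twin, is cheaper than finding it after the switches are racked.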

ONES uses EVPN and VRF-based segmentation to isolate traffic between tenants. It also provisions GPU resources intelligently and applies policies to guarantee secure workload separation.
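A policy check for tenant isolation could look something like the following Python sketch. It assumes a simplified model in which each tenant has one VRF, one VXLAN network identifier (VNI), and a set of GPUs. The `Tenant` type and the `isolation_violations` function are hypothetical, not part of ONES.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tenant:
    name: str
    vrf: str
    vni: int
    gpus: frozenset[str]

def isolation_violations(tenants: list[Tenant]) -> list[str]:
    """Compare every pair of tenants and report anything they share."""
    violations = []
    for i, a in enumerate(tenants):
        for b in tenants[i + 1:]:
            if a.vrf == b.vrf:
                violations.append(f"{a.name} and {b.name} share VRF {a.vrf}")
            if a.vni == b.vni:
                violations.append(f"{a.name} and {b.name} share VNI {a.vni}")
            shared = a.gpus & b.gpus
            if shared:
                violations.append(f"{a.name} and {b.name} share GPUs {sorted(shared)}")
    return violations

# Hypothetical tenants: gpu1 has been assigned twice by mistake.
tenants = [
    Tenant("team-a", "vrf-a", 10100, frozenset({"gpu0", "gpu1"})),
    Tenant("team-b", "vrf-b", 10200, frozenset({"gpu1", "gpu2"})),
]
violations = isolation_violations(tenants)
print(violations)
```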

ONES leverages built-in telemetry from the NOS, hosts, and GPUs. It integrates with tools like Slack, ServiceNow, and Zabbix for automated alerting — without installing third-party agents that consume resources.
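The source does not describe how ONES decides when to raise an alert. As one simple possibility, the Python sketch below flags a telemetry sample as anomalous when it deviates sharply from a rolling baseline of the preceding samples. The window size, threshold, minimum deviation, and sample values are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

def find_anomalies(samples: list[float], window: int = 5,
                   threshold: float = 3.0, min_sigma: float = 1.0) -> list[int]:
    """Return indices of samples that lie more than `threshold` standard
    deviations from the mean of the preceding `window` samples."""
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        # Floor the deviation so a perfectly flat baseline does not
        # turn every tiny change into an alert.
        sigma = max(pstdev(baseline), min_sigma)
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical link utilization in percent, with one spike.
utilization = [40, 42, 41, 39, 40, 41, 95, 42]
print(find_anomalies(utilization))
```

The returned indices could then be turned into alert messages for whichever notification tool is in use.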

The live demo covered:

  • Orchestration of a two-switch Spectrum-X fabric
  • Tenant creation and GPU assignment
  • Policy validation for isolation
  • Real-time monitoring dashboards and anomaly detection
  • Config comparison and structured RMA workflows
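Config comparison is conceptually a diff between intended and running state. The Python sketch below shows the idea with the standard library's `difflib`. The config lines are illustrative and not actual SONiC or Cumulus syntax, and the `config_diff` helper is hypothetical rather than an ONES function.

```python
import difflib

def config_diff(intended: str, running: str) -> list[str]:
    """Return a unified diff between the intended and running configs."""
    return list(difflib.unified_diff(
        intended.splitlines(),
        running.splitlines(),
        fromfile="intended",
        tofile="running",
        lineterm="",
    ))

# Illustrative configs: the running switch has drifted to a smaller MTU.
intended = "interface swp1\nmtu 9216\n"
running = "interface swp1\nmtu 1500\n"

diff = config_diff(intended, running)
print("\n".join(diff))
```

An empty diff means the device matches its intended state. After a hardware replacement (RMA), the same comparison can help confirm that the new switch came up with the configuration of the one it replaced.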

The NVIDIA + Aviz stack is aimed at organizations running large AI training clusters, GPU-as-a-Service providers, and any enterprise building private AI clouds that require high performance, robust isolation, and full-stack observability.

Watch the bootcamp recording for a full walkthrough and explore Aviz ONES for detailed docs, case studies, and a deeper look at deployment best practices.
