Categories
Open Networking Enterprise Suite SONiC

Simplify Your SONiC RMA Experience with ONES Backup & Restore

In today’s fast-paced networking landscape, data is a critical asset. Unexpected failures can lead to downtime, operational disruptions, and misconfigurations. When a network device crashes, engineers need a reliable backup to restore it quickly. Without structured backup and restore mechanisms, organizations risk prolonged outages and inefficiencies. This overview underscores the importance of regular backups and explains how ONES Fabric Manager Backup & Restore streamlines the process, ensuring seamless recovery in multi-vendor environments.

The Importance of Backup & Restore in Network Resilience

Backup and restore processes ensure rapid recovery from failures by preserving critical network configurations.

In RMA scenarios, replacing faulty hardware is only the first step; the real challenge lies in restoring the original configuration. Without a recent backup, administrators must manually reconfigure the replacement switch, which extends downtime, raises the risk of errors, and increases recovery costs through additional troubleshooting and resource allocation.

ONES Backup & Restore: The Lifeline for Uninterrupted Networks

ONES Fabric Manager Backup & Restore ensures seamless recovery by securely storing configurations in a persistent Docker volume, enabling quick restoration, and eliminating manual reconfiguration. With pre-replacement snapshots for ZTP or upgrades, it offers a reliable rollback option. Designed for multi-vendor compatibility, it minimizes downtime, reduces risks, and streamlines RMA processes for efficient, error-free network management.
Figure 1: Backup Taken After Configuration

Streamlined Backup & Recovery Process

ONES Fabric Manager Backup & Restore captures essential configuration files (config_db.json, frr.conf, fmcli_db.cfg) by enabling both manual and automatic snapshot creation during key operations like reboot, ZTP, or image upgrades. Each snapshot is tagged with a timestamp or custom label for easy identification and restoration. In the event of a failure, users can quickly revert to a known-good configuration—minimizing downtime and eliminating the need for complex manual recovery steps.
Figure 2: Backup Management Page
Figure 3: Restore Management Page
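
As a rough illustration of the snapshot idea described above, the sketch below copies the three named files into a directory tagged with a custom label or a timestamp and later copies them back. It is not the ONES implementation: the function names, the directory layout, and the choice to skip missing files are all assumptions made for this example.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path

# File names come from the text above; everything else is an assumption.
CONFIG_FILES = ("config_db.json", "frr.conf", "fmcli_db.cfg")


def create_snapshot(config_dir, backup_root, label=None):
    """Copy the known config files into a new snapshot directory.

    The snapshot is named after `label`, or a UTC timestamp if none is given.
    """
    name = label or datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    snapshot = Path(backup_root) / name
    snapshot.mkdir(parents=True, exist_ok=False)
    for filename in CONFIG_FILES:
        source = Path(config_dir) / filename
        if source.exists():  # skip files this device does not have
            shutil.copy2(source, snapshot / filename)
    return snapshot


def restore_snapshot(snapshot, config_dir):
    """Copy every file in the snapshot back into the config directory."""
    restored = []
    for source in sorted(Path(snapshot).iterdir()):
        shutil.copy2(source, Path(config_dir) / source.name)
        restored.append(source.name)
    return restored
```

Calling `create_snapshot(cfg_dir, backups, label="pre-upgrade")` before an image upgrade and `restore_snapshot(...)` afterwards mirrors the rollback flow in miniature; refusing to overwrite an existing snapshot (`exist_ok=False`) is a deliberate choice so that a label always identifies one known-good state.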

Multi-Vendor Support for Diverse Environments

Designed for flexibility, ONES Fabric Manager Backup & Restore works seamlessly across various network devices. Its consistent and reliable backup and recovery capabilities make it an ideal solution for dynamic, multi-vendor infrastructures, ensuring uninterrupted network performance regardless of vendor diversity.
Book a demo today — because every second of network downtime costs more than you think.

FAQs

1. How does ONES Backup & Restore help reduce SONiC RMA downtime?

ONES Fabric Manager automates the backup of SONiC configurations and enables one-click restore, eliminating the need for manual reconfiguration during RMA. This drastically reduces downtime and speeds up recovery.

Yes, ONES supports multi-vendor compatibility, allowing seamless backup and restoration across SONiC and non-SONiC devices—making it ideal for hybrid data center infrastructures.

ONES captures critical configuration files like config_db.json, frr.conf, and fmcli_db.cfg, ensuring full restoration of routing, ACLs, QoS, interfaces, and more.

Yes, ONES allows both manual and automated snapshot creation before key operations like Zero Touch Provisioning (ZTP), image upgrades, and reboots, enabling quick rollback if needed.

Without a structured backup system, RMA recovery becomes error-prone and time-consuming. ONES Backup & Restore ensures operational continuity by enabling reliable, fast, and error-free recovery after hardware failures.

Categories
Open Networking Enterprise Suite SONiC

Cisco 8000 + SONiC with Aviz ONES Bootcamp – Why You Should Join?

Ready to unlock the future of networking? If you’re managing AI workloads, optimizing data center operations, or just curious about the power of SONiC on Cisco 8000, then this is the event you don’t want to miss.

AI is changing everything – from high-performance computing to real-time analytics, and your network needs to keep up. That’s where Cisco and Aviz Networks come in. Together, we’re redefining AI-ready infrastructure with SONiC-powered networking – delivering agility, scalability, observability, and orchestration.

Join us for the Cisco 8000 + Aviz ONES SONiC Bootcamp on April 3, 2025, where we’ll break down how to run AI-optimized networks with SONiC on Cisco hardware.

Why Should You Care?

Imagine a network that’s not just fast but intelligent: one that optimizes itself for AI workloads and gives you complete control over traffic flows. That’s the power of SONiC on Cisco 8000, supercharged by Aviz ONES.
If you’re a network architect, engineer, or AI infrastructure leader, this bootcamp is designed for you. Whether you’re considering SONiC for the first time or already deploying it, we’ll show you how to make it work seamlessly on Cisco 8000.

What’s on the Agenda?

Part 1: The Cisco 8000 SONiC Evolution

(Presented by Cisco)

Part 2: AI-Driven Operations & Observability with Aviz ONES

(Presented by Aviz)

AI Networking with SONiC: A Practical Guide

Join Us – Register Now!

Categories
SONiC

5 Myths About SONiC NOS

The rise of SONiC (Software for Open Networking in the Cloud) has disrupted the traditional networking industry, offering enterprises a vendor-agnostic, open-source alternative to proprietary NOS solutions. But with disruption comes misconceptions, and SONiC is no exception.

Let’s debunk the top five myths about SONiC NOS and set the record straight.

Myth #1: SONiC is Only for Hyperscalers

Reality: SONiC is Ready for Enterprises, AI Workloads, and Beyond

While hyperscalers like Microsoft pioneered SONiC, it’s no longer just for the tech giants. Enterprises across healthcare, retail, finance, and AI-driven data centers are deploying SONiC to cut costs, increase flexibility, and escape vendor lock-in.

👉 2025 PlugFest validated SONiC for enterprise-grade use cases, including Layer 2 networking, AI fabric, and PoE-enabled whitebox switches—proving its viability beyond hyperscalers.

Want to hear real-world enterprise SONiC success stories?

Join us for a compelling SDxCentral panel discussion featuring insights from our customers, Techevolution and 1984, alongside our partner, EPS Global. Explore how SONiC is transforming enterprise networking with unparalleled flexibility and efficiency.

Myth #2: SONiC is Difficult to Deploy

Reality: SONiC Adoption is Faster and Easier Than Ever

There’s a misconception that SONiC requires deep coding skills or hyperscaler-level engineering resources. That was true in its early days, but today 1-click SONiC fabric deployment is available, and enterprise-ready SONiC solutions come with automation, APIs, and user-friendly management tools.

💡 One-click SONiC migration guides and vendor support services have made the transition seamless—offering enterprises the same ease of deployment as traditional NOS solutions.

Myth #3: SONiC Lacks Vendor Support

Reality: A Thriving Ecosystem Backs SONiC

Some believe that using an open-source NOS means you’re on your own. That couldn’t be further from the truth. The SONiC ecosystem includes major vendors like Cisco, NVIDIA, Celestica, Edgecore, and Wistron, offering hardware compatibility, support services, and enterprise-ready integrations.

Aviz Networks actively enables SONiC interoperability and provides full-stack support, automation, and performance optimizations, ensuring enterprises get the help they need.

Myth #4: SONiC is Not Cost-Effective

Reality: SONiC Delivers Up to 40% TCO Savings

Proprietary NOS vendors often claim that SONiC isn’t cost-effective when factoring in hardware, support, and integration costs. But PlugFest’s TCO analysis tells a different story:

💰 Bottom line? SONiC saves enterprises millions while providing scalability for AI-driven workloads.

Myth #5: SONiC is Just a Trend—It Won’t Last

Reality: SONiC is the Future of Open Networking

Some skeptics believe SONiC is just another open-source experiment that will fade away. The reality? SONiC adoption is accelerating, with major enterprises, cloud providers, and AI data centers making it their NOS of choice.

With strong backing from industry leaders, ongoing community development, and real-world deployments, SONiC is shaping the future of open networking, AI infrastructure, and multi-vendor interoperability.

The Verdict: SONiC is Ready for Prime Time

SONiC NOS has matured beyond its hyperscaler origins into a battle-tested, cost-efficient, and enterprise-ready solution. As the 2025 PlugFest demonstrated, the myths surrounding SONiC are outdated, and the reality is clear:

Frequently Asked Questions:

1. Is SONiC only suitable for hyperscalers like Microsoft?

No. While Microsoft pioneered SONiC, enterprises across industries such as healthcare, finance, retail, and AI-driven data centers are actively deploying it. The 2025 PlugFest validated SONiC for enterprise-grade use cases, including Layer 2 networking, AI fabric, and PoE-enabled whitebox switches, proving its viability beyond hyperscalers.

2. Is deploying SONiC difficult?

Not anymore. Early versions of SONiC required deep coding expertise, but today, automation and enterprise-ready solutions have simplified deployment. One-click SONiC migration guides and vendor support services ensure a seamless transition, offering the same ease of deployment as traditional NOS solutions.

3. Does SONiC lack vendor support?

No. The SONiC ecosystem is backed by major vendors, including Cisco, NVIDIA, Celestica, Edgecore, and Wistron. Aviz Networks provides full-stack SONiC support, automation, and performance optimization, ensuring enterprises have access to the help they need for successful adoption.

4. Is SONiC cost-effective compared to proprietary NOS solutions?

Yes. According to PlugFest’s TCO analysis, SONiC delivers up to 40% lower total cost of ownership (TCO) compared to proprietary NOS solutions. By eliminating NOS licensing costs, reducing OpEx, and providing multi-vendor flexibility, SONiC enables enterprises to save millions while ensuring scalability for AI-driven workloads.

5. Is SONiC just a passing trend?

No. SONiC is shaping the future of open networking. With increasing adoption by major enterprises, cloud providers, and AI data centers, SONiC is here to stay. Its strong backing from industry leaders and ongoing community development ensure long-term viability and continuous innovation.

6. How can enterprises migrate to SONiC?

Enterprises can transition to SONiC using migration guides, vendor-supported deployment tools, and automation frameworks. With one-click SONiC fabric deployment and extensive documentation, the migration process is streamlined for efficiency and minimal disruption.

7. What are the key benefits of adopting SONiC?

8. Where can I learn more about real-world SONiC deployments?

Are you ready to break free from vendor lock-in and embrace open networking?
Learn More About SONiC and Migration Paths here.

FAQs

1. Is SONiC NOS only meant for hyperscalers like Microsoft or Google?

No. While SONiC originated with hyperscalers, it is now widely adopted by enterprises across industries—including healthcare, retail, finance, and AI-driven data centers—for its flexibility, open-source benefits, and vendor-agnostic architecture.

SONiC deployment is now simplified with one-click fabric orchestration, migration guides, intuitive GUI tools, and vendor-supported solutions—making it as easy as traditional network operating systems for enterprises to adopt.

Yes. SONiC is backed by a growing ecosystem of top vendors like NVIDIA, Cisco, Edgecore, and Celestica. Additionally, companies like Aviz Networks offer full-stack SONiC support, automation tools, and deployment services tailored for enterprise use.

Absolutely. SONiC can deliver up to 40% lower total cost of ownership (TCO) by eliminating NOS licensing fees, offering multi-vendor hardware flexibility, and reducing OpEx through automation and open-source efficiencies.

SONiC is here to stay. Backed by major industry players and community contributions, SONiC has evolved into a mainstream open networking platform with real-world deployments in AI fabrics, enterprise data centers, and multi-vendor environments.

Categories
Open Networking Enterprise Suite SONiC

ONES 3.1 Boosts SONiC Support: Key Enhancements for Smarter Infrastructure Troubleshooting

In today’s fast-moving digital world, maintaining a stable and well-monitored infrastructure is crucial. The latest release of ONES 3.1 introduces key updates, including enhanced support for SONiC (Software for Open Networking in the Cloud). These enhancements boost visibility, automate critical processes, and strengthen system health monitoring. The improved SONiC support streamlines issue detection and response, optimizing performance and minimizing downtime. IT teams can now focus on strategic tasks, knowing their infrastructure is continuously and intelligently monitored for peak performance.
Stay ahead of issues and ensure smooth operations with ONES 3.1.

System Health Monitoring

CPU-Intensive Services

Previously, identifying resource-heavy processes was challenging due to the lack of granular insights in system-wide CPU and memory metrics. Often, system-level data shows a spike in CPU usage without providing a quick way to pinpoint the cause. To address this challenge, ONES now provides detailed reports on the top 10 CPU-consuming services running on the host, along with their memory usage. This helps users easily identify high-impact processes like redis-server, agent, syncd, and dockerd. With this level of detail, users can diagnose performance issues more quickly, optimize system resources, and prevent potential bottlenecks, resulting in greater system efficiency.
Top CPU consuming services
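
To make the "top 10 CPU-consuming services" report concrete, here is a minimal sketch of the ranking step. The `ServiceSample` type, its field names, and the sample values are hypothetical; how ONES actually collects and stores these metrics is not described in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceSample:
    """One per-service measurement (hypothetical shape for this example)."""
    name: str
    cpu_percent: float
    memory_mb: float


def top_cpu_services(samples, limit=10):
    """Return the `limit` samples with the highest CPU usage, highest first."""
    return sorted(samples, key=lambda s: s.cpu_percent, reverse=True)[:limit]


# Illustrative values only, not real measurements.
samples = [
    ServiceSample("redis-server", 12.5, 80.0),
    ServiceSample("syncd", 41.0, 310.0),
    ServiceSample("dockerd", 7.2, 120.0),
]
for service in top_cpu_services(samples, limit=2):
    print(f"{service.name}: {service.cpu_percent}% CPU, {service.memory_mb} MB")
```

Keeping memory usage on the same record as CPU usage is what lets a single ranked list answer both "what is busy" and "how much memory is it holding".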

Unhealthy Devices with Failure Codes

ONES 3.1 introduces a new feature that highlights unhealthy devices, offering real-time failure detection for hardware (e.g., PSU, fan failures, LED alarms), software services, key processes, and containers. When a failure is detected, the device is marked as unhealthy, with detailed information readily available in the UI. This streamlined view helps operators quickly identify and resolve issues, simplifying troubleshooting. Notifications are also provided in the topology view and health summary page.
Unhealthy Device with details of failure

SONiC Docker Transitions

Docker containers are the backbone of the SONiC operating system, and ensuring their stable operation is crucial for switch performance. Previously, tracking container state changes, such as shifts from “up” to “down,” was difficult and time-consuming. Operators often struggled to detect these changes in real time, leading to delays in addressing service disruptions and unnoticed issues. ONES 3.1 introduces a new widget that visually highlights Docker container state transitions, allowing operators to quickly spot changes and respond to disruptions. The widget provides a “Connect” button for direct SSH access to the switch, enabling swift action when needed. It also offers a timeframe selection feature, allowing operators to view container state changes over a specified period.
Docker transitions in the managed network
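
The transition tracking described above can be sketched as a pass over time-ordered state samples that reports each change, optionally limited to a time window. The sample format (`timestamp, container, state`) and the function name are assumptions for illustration; this is not how the ONES widget is implemented.

```python
def find_transitions(samples, start=None, end=None):
    """Return (timestamp, container, old_state, new_state) for each change.

    `samples` is an iterable of (timestamp, container, state) tuples sorted
    by timestamp. `start` and `end` optionally bound the reported window.
    """
    last_state = {}
    transitions = []
    for timestamp, container, state in samples:
        previous = last_state.get(container)
        last_state[container] = state  # track state even outside the window
        if previous is None or previous == state:
            continue
        if start is not None and timestamp < start:
            continue
        if end is not None and timestamp > end:
            continue
        transitions.append((timestamp, container, previous, state))
    return transitions


# Illustrative samples: swss goes down at t=2 and recovers at t=3.
samples = [
    (1, "swss", "up"), (1, "bgp", "up"),
    (2, "swss", "down"),
    (3, "swss", "up"), (3, "bgp", "up"),
]
print(find_transitions(samples))
```

State is updated before the window check so that a transition just inside the window is still compared against the last state seen before it.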

Automatic IP Detection, Alerting, and Rediscovery

When a monitored device’s management IP changes, it’s crucial for the monitoring software to update the IP promptly to ensure smooth operations. Previously, detecting and updating a device’s management IP was a manual, time-consuming process, often causing communication breakdowns and delayed issue identification. ONES 3.1 introduces an automatic rediscovery mechanism that instantly detects when a device’s management IP changes and re-registers the switch with the controller. This enhancement eliminates manual intervention, ensuring continuous communication, real-time monitoring, and faster issue resolution, even when devices are reconfigured.

Additionally, the IP Transition widget lets operators track every IP change a device has undergone over a specific period and see whether any of those addresses conflicted with another IP in the monitored network. To further enhance visibility, an alert option in the ONES Rule Engine notifies operators of any management IP change, so they are always aware of network modifications and can respond swiftly to maintain seamless operations.

Device IP transitions summary
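
A minimal sketch of the two checks described above, change detection and conflict detection, assuming the controller holds a simple mapping from device name to management IP. The data shape, device names, and addresses are hypothetical and chosen only for the example.

```python
def detect_ip_changes(previous, current):
    """Return (device, old_ip, new_ip) for devices whose management IP changed.

    `previous` and `current` map device name -> management IP string.
    Newly discovered devices (absent from `previous`) are not reported.
    """
    changes = []
    for device, new_ip in current.items():
        old_ip = previous.get(device)
        if old_ip is not None and old_ip != new_ip:
            changes.append((device, old_ip, new_ip))
    return changes


def find_conflicts(current):
    """Return {ip: [devices]} for every IP claimed by more than one device."""
    by_ip = {}
    for device, ip in current.items():
        by_ip.setdefault(ip, []).append(device)
    return {ip: sorted(devs) for ip, devs in by_ip.items() if len(devs) > 1}


# Illustrative inventory: leaf2 moves onto an address already used by spine1.
previous = {"leaf1": "10.0.0.11", "leaf2": "10.0.0.12", "spine1": "10.0.0.1"}
current = {"leaf1": "10.0.0.11", "leaf2": "10.0.0.1",
           "spine1": "10.0.0.1", "leaf3": "10.0.0.13"}
print(detect_ip_changes(previous, current))
print(find_conflicts(current))
```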

Rule Engine: Enhanced Alerts

The ONES Rule Engine has emerged as a preferred tool for automating network monitoring, allowing operators to configure custom rules based on their own threshold levels for various parameters. When a defined condition is met, the system automatically generates an alert, enabling real-time, proactive responses to potential issues. The metrics added in ONES 3.1 provide deeper insights and more precise control over network performance, ensuring smoother operations and quicker issue resolution.
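
The threshold-rule idea can be sketched in a few lines: each rule names a metric, a comparison, and a threshold, and an alert is produced whenever the latest value satisfies the condition. The `Rule` type, metric names, and thresholds below are hypothetical and do not reflect the actual ONES rule schema.

```python
import operator
from dataclasses import dataclass

COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}


@dataclass(frozen=True)
class Rule:
    """A single threshold rule (hypothetical shape for this example)."""
    name: str
    metric: str
    comparator: str
    threshold: float


def evaluate(rules, metrics):
    """Return (rule_name, value) for every rule whose condition is met."""
    alerts = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is None:  # metric not reported in this sample
            continue
        if COMPARATORS[rule.comparator](value, rule.threshold):
            alerts.append((rule.name, value))
    return alerts


# Illustrative rules and a single metrics sample.
rules = [
    Rule("High container CPU", "docker_cpu_percent", ">", 90.0),
    Rule("Container down", "containers_down", ">=", 1),
]
metrics = {"docker_cpu_percent": 92.5, "containers_down": 0}
print(evaluate(rules, metrics))
```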

ONES 3.1 takes SONiC network monitoring and troubleshooting to the next level with powerful enhancements like real-time failure detection, automated IP rediscovery, detailed system health insights, and advanced alerting.
Ready to see ONES 3.1 in action? Book a demo today and experience how it can transform your network management with smarter automation and deeper insights.

FAQs

1. How does ONES 3.1 improve SONiC infrastructure monitoring?

ONES 3.1 enhances SONiC observability by offering real-time visibility into system health, including CPU-intensive services, Docker container transitions, and device-level failures. This allows IT teams to proactively detect, investigate, and resolve issues faster than before.

The new Docker Down Status alerts in ONES 3.1 notify operators immediately when SONiC containers fail, ensuring service disruptions are caught and addressed before they escalate—minimizing downtime and improving operational resilience.

Yes. ONES 3.1 introduces automatic IP rediscovery that detects management IP changes and re-registers the switch seamlessly, ensuring uninterrupted telemetry and real-time monitoring without manual intervention.

ONES 3.1 provides granular visibility into the top 10 CPU-consuming services, along with the memory usage of each. This helps pinpoint the root causes behind performance spikes, such as syncd, redis, or dockerd, and allows quick remediation.

The ONES Rule Engine can detect and alert on:

  • CPU/memory overuse by Docker containers
  • Docker container downtime
  • Hardware or service failures in devices
  • Real-time management IP changes

This enables a proactive, rule-based monitoring strategy tailored to each network’s performance needs.

Categories
Open Networking Enterprise Suite SONiC

Unveiling New Capabilities in Aviz ONES: NVIDIA Spectrum™-X, Orchestration for Small Networks and Conversational SONiC Troubleshooting

We are excited to introduce ONES 3.1, a major milestone in our continuous innovation with the Open Networking Enterprise Suite (ONES). This latest release reinforces our vision of building “Networks for AI and AI for Networks,” raising the bar for network management, configuration, and operations. With enhanced visibility and superior support, ONES 3.1 is more than just an update; it’s a transformative leap forward. This version delivers cutting-edge features that elevate the intelligence and efficiency of network operations, reflecting our unwavering commitment to redefining the possibilities in network management, orchestration, and support.

Key Features of ONES 3.1

Spectrum™-X Observability

Building on our existing support for Cumulus NOS, ONES 3.1 now extends compatibility to NVIDIA Spectrum™-X platforms running the latest NOS. This enhancement provides comprehensive visibility into Inventory, Environment, Firmware Versions, CPU/Memory Utilization, Transceivers, Interface Counters, LACP, BGP, RoCE Metrics including PFC, RoCE Traffic, and Queue Counters.

Additionally, ONES 3.1 brings enhanced NVIDIA GPU metrics for GPU-accelerated servers, offering a centralized dashboard that showcases the Top 10 GPU Utilization, allowing for real-time tracking and analysis of the most demanding GPU workloads.

Orchestration for Small Networks

ONES Fabric Manager introduces a simplified, intent-based orchestration experience through an intuitive GUI, enabling seamless fabric orchestration with just a few clicks. New capabilities such as Config Execution and Editor Window, Configuration Comparison, and Backups before upgrades or reboots enhance the efficiency, reliability, and manageability of data center fabric operations.

AI Assistant: Conversational Troubleshooting (BETA)

ONES 3.1 introduces the AI Assistant, an intelligent conversational interface that enables users to interact effortlessly with network health and inventory data. It provides real-time insights, streamlines management, and responds to a wide range of queries, enhancing operational efficiency. Designed for on-premises deployment, the AI Assistant operates efficiently on a CPU, eliminating the need for a GPU or any tokens to process user queries.

IP Tracking & Alerting

ONES 3.1 introduces intelligent tracking of network device IP changes in real-time. A dedicated widget enables operators to monitor per-node IP changes and receive instant alerts in case of unexpected changes. Despite IP changes, telemetry streaming remains uninterrupted, ensuring continuous network monitoring without any impact on live status visibility.

Enhanced Support & Proactive Monitoring

ONES 3.1 brings a comprehensive set of default rule templates for critical metrics, ensuring instant anomaly detection and alerts with a simple one-click activation. This release expands monitoring capabilities with Docker CPU/Memory Utilization, Docker Down Status, and Unhealthy Device Detection. Additionally, users can now download a detailed summary of existing rules, enhancing visibility and control over network health.

Additional Enhancements

ONES 3.1 also introduces powerful new features, further strengthening its position as a leading network management solution:

These enhancements make ONES smarter, more efficient, and even more indispensable for modern networking.

Redefining Network Management with ONES 3.1

ONES 3.1 introduces a cutting-edge suite of features and an enhanced user experience, redefining network management, orchestration, and support. This release empowers users with advanced tools and intelligence, ensuring they stay ahead in an increasingly complex network environment.

To explore the full potential of ONES 3.1 and discover how it can transform your network operations, visit us at Aviz Networks. Embark on your journey toward seamless network monitoring and orchestration today.

FAQs

1. What is Aviz ONES 3.1 and how does it improve network operations?

Aviz ONES 3.1 is the latest version of the Open Networking Enterprise Suite, designed to optimize AI-driven data center networks. It introduces powerful enhancements in orchestration, observability, and support—tailored to modern networking needs such as RoCE fabrics, NVIDIA Spectrum™-X integration, and small-network scalability.

ONES 3.1 extends support to NVIDIA Spectrum™-X NOS by offering deep visibility into:

  • Inventory and transceiver health
  • Interface counters and RoCE traffic
  • LACP, BGP, PFC, and queue-level metrics
  • GPU utilization monitoring on accelerated servers

This makes it easier for operators to monitor, troubleshoot, and optimize NVIDIA-based AI fabrics.

Conversational Troubleshooting is a new AI assistant in ONES 3.1 that lets users interact with their network via natural language. It answers real-time questions about device health, inventory, and metrics—without requiring CLI knowledge or GPU-based LLMs—making diagnostics more intuitive for NetOps teams.

ONES 3.1 introduces an intent-based orchestration GUI that’s purpose-built for small to mid-size networks. It allows users to:

  • Execute and edit configurations visually
  • Compare changes before deployment
  • Create backups before upgrades or reboots

This streamlines network management without the complexity of CLI-heavy operations.

ONES 3.1 enhances proactive monitoring with a library of pre-built rules for key metrics. Administrators can now monitor:

  • Docker CPU and memory usage
  • Transition status of containerized services
  • IP address changes across network nodes
  • Unhealthy devices or service disruptions in real time

Alerts and anomaly detection are now just one click away—ideal for fast-moving AI environments.

Categories
Fabric Test Automation Suite SONiC

Single click SONiC evaluations and POCs

Learn how FTAS can do it for you!

Why Should Organizations Consider SONiC?

In today’s rapidly evolving networking landscape, organizations are seeking greater flexibility, scalability, and cost-effectiveness. SONiC (Software for Open Networking in the Cloud) has emerged as a leading open-source platform for building and managing data center networks.

SONiC empowers network operators to break free from vendor lock-in, reduce operational costs, and accelerate innovation. By providing a vendor-agnostic, open-source framework, SONiC offers unprecedented flexibility and control over network infrastructure.

What Makes Evaluating SONiC So Challenging?

While SONiC offers numerous benefits, evaluating and deploying it can be a daunting task due to several challenges:

How to Accelerate SONiC Evaluations with FTAS

How Does FTAS Keep Your Networks at Par with Quality Standards?

Aviz Networks’ Fabric Test Automation Suite (FTAS) is a powerful tool designed to ensure the quality and reliability of SONiC networks. By automating testing and validation processes, FTAS helps organizations accelerate deployment, reduce operational costs, and minimize risks.

FTAS helps maintain network quality by:

FTAS development is driven by the real-world use cases of Aviz Networks’ customers, ensuring that it meets the needs of modern data center and cloud environments.

Supported Protocols

FTAS supports a wide range of protocols essential for modern data center networks:

What are the new features in FTAS 3.1?

The latest FTAS 3.1 release brings a host of new features and enhancements to further streamline your SONiC evaluation and deployment process:

How to Use FTAS

By leveraging FTAS, you can accelerate your journey to SONiC, reduce risks, and achieve a more agile and efficient network. Start your SONiC evaluation today with FTAS.

To use FTAS, schedule a call with our team. For comprehensive information before the call, visit our FTAS product page.

FAQs

1. What makes SONiC evaluation difficult for enterprises exploring open networking?

SONiC evaluation can be complex due to:

  • Multi-vendor variability in hardware and features
  • Multiple SONiC flavors (community vs vendor-specific)
  • Extensive configuration options and networking feature sets
  • Need for deep SONiC expertise in-house
  • Lack of standardized testing tools across deployments

Aviz’s Fabric Test Automation Suite (FTAS) accelerates SONiC adoption by:

  • Automating Layer 2/3, HA, and QoS testing
  • Speeding up firmware and SONiC image qualification
  • Enabling CI/CD-based continuous validation
  • Supporting day-2 operations, including upgrades and troubleshooting

FTAS offers full lifecycle validation through:

  • Resilience testing (reboots, link failures, container crashes)
  • Stress and scalability testing to evaluate real-world performance
  • QoS and EVPN/VXLAN validation
  • Fast reboot and SNMP visibility tests

FTAS is designed for heterogeneous data centers because it:

  • Supports standardized testing across vendors
  • Provides platform-specific CLI coverage
  • Automates interoperability checks in SONiC ecosystems
  • Evolves based on real customer deployments and feedback

FTAS dramatically speeds up SONiC evaluation by automating key network tests and validations. It reduces manual efforts and delivers rapid results through:

  • Automated Layer 2/3 functionality tests
  • High availability (HA) and security protocol validation
  • Continuous integration with CI/CD pipelines
  • Reduced time-to-market for new network features

Categories
Open Networking Enterprise Suite SONiC

Global Reach, Local Insight: ONES 3.0 Delivers Seamless Data Center Management

Explore the latest in AI network management with our ONES 3.0 series

ONES 3.0 introduces a range of exciting new features, with a focus on scaling data center deployments and support. In this blog post, we’ll dive into two standout features: ONES Multisite, a scalable solution for global data center deployments, and enhanced support for SONiC through tech support, ServiceNow integration, and syslog message filtering. Let’s explore how these innovations can benefit your operations.

ONES Multisite

The ONES rule engine enables incident detection and alert generation, but this data is limited to the specific site managed by each controller. While site data center administrators can use this information to address and resolve issues, enterprise-level administrators or executives seeking an overview of all data centers’ health must access each ONES instance individually, which can be inefficient.

To address this challenge, we introduce ONES Multisite—an application that provides a geospatial overview of anomalies across geographically distributed sites, offering a comprehensive view of the entire network’s health.

ONES instances in different data centers (DCs) around the globe can register with a central multisite application. Upon successful registration, the multisite system periodically polls each site for data related to the number of managed devices (endpoints) and the number of critical alerts. This information is displayed on a map view, showing individual sites, their health status, and last contact times. ONES Multisite also allows users to log in to individual data centers for more detailed information if needed.
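
The polling summary described above can be pictured as a small classification step over per-site reports. This is only a sketch: the `SiteReport` fields follow the data the text says is collected (managed devices, critical alerts, last contact), but the status names, the rule that any critical alert marks a site as critical, and the 300-second staleness timeout are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteReport:
    """Data polled from one registered site (hypothetical shape)."""
    name: str
    managed_devices: int
    critical_alerts: int
    last_contact: float  # seconds since the epoch


def site_status(report, now, stale_after=300.0):
    """Classify a site; the thresholds here are illustrative assumptions."""
    if now - report.last_contact > stale_after:
        return "unreachable"
    if report.critical_alerts > 0:
        return "critical"
    return "healthy"


def summarize(reports, now, stale_after=300.0):
    """Return {site_name: status} for every polled site."""
    return {r.name: site_status(r, now, stale_after) for r in reports}


# Illustrative reports with a fixed "now" so the result is deterministic.
reports = [
    SiteReport("dc-east", 120, 0, 990.0),
    SiteReport("dc-west", 80, 3, 995.0),
    SiteReport("dc-south", 40, 5, 100.0),
]
print(summarize(reports, now=1000.0))
```

Checking staleness first means a site that has stopped responding is never shown with an out-of-date alert count.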

Fig 1 – ONES Multisite showing DCs across the globe
Different colors and blinking patterns give a quick overview of the health conditions at the various sites.

Registering ONES instance with Multisite application

A simple user interface is provided for registering the ONES application with the multisite application. It requires inputs such as the site name, the multisite IP, and the geographical coordinates (latitude and longitude in N and E). By default, the current location coordinates of the site are auto-populated, but they can be overridden if necessary. The License page of the ONES application displays the status of the registration with the multisite application.
Fig 2 – Multisite Registration Window

Once registered, the multisite application will regularly gather data from each site regarding the number of managed devices (endpoints) and the count of critical alerts.

ONES Multisite streamlines the monitoring process across multiple data centers, enabling enterprise-level administrators to easily access vital information and maintain a holistic view of their network’s health. This enhanced visibility not only improves operational efficiency but also empowers teams to respond more effectively to incidents, ensuring optimal performance across all locations.

Enhanced support for SONiC using ONES 3.0

Tech support feature

The SONiC Tech Support feature provides a comprehensive method for collecting system information, logs, configuration data, core dumps, and other information essential for identifying and resolving issues. The ONES 3.0 Tech Support feature offers an easier way to download the tech support dump from any managed switch. Users simply select a switch and click the Tech Support option. The ONES controller connects to the switch, executes the tech support command, and notifies the user when the download file is ready. This lets data center administrators retrieve tech support data without the cumbersome process of logging into each switch, executing the command, and downloading the file.

Fig 3 – ONES Tech Support page

Filtering of syslog messages

The Syslog feature empowers data center operators to easily view and download syslog messages from any of the managed switches through the ONES UI. This functionality is essential for monitoring system performance and diagnosing issues.

To enhance this feature, we’ve added the ability to filter messages by severity level, such as error, warning, or all messages. This capability enables operators to quickly identify and prioritize critical alerts, streamlining the troubleshooting process and improving overall operational efficiency. By focusing on the most relevant messages, data center teams can respond more effectively to potential issues, ensuring a more reliable and robust network environment.

Fig 4 – Syslog messages with filter applied
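Severity filtering of this kind can be sketched in a few lines. The example below uses the standard syslog severity levels from RFC 5424; the message dictionaries and the `filter_by_severity` helper are made up for illustration and do not represent the ONES implementation.

```python
# Syslog severity levels as defined in RFC 5424 (lower number = more severe).
SEVERITY = {
    "emerg": 0, "alert": 1, "crit": 2, "err": 3,
    "warning": 4, "notice": 5, "info": 6, "debug": 7,
}


def filter_by_severity(messages, max_level):
    """Keep messages that are at least as severe as max_level."""
    limit = SEVERITY[max_level]
    return [m for m in messages if SEVERITY[m["severity"]] <= limit]


# Sample messages, invented for this example.
messages = [
    {"severity": "info", "text": "Interface Ethernet0 is up"},
    {"severity": "err", "text": "PSU 2 failure detected"},
    {"severity": "warning", "text": "Fan speed above threshold"},
]

errors_only = filter_by_severity(messages, "err")
warnings_and_up = filter_by_severity(messages, "warning")
print([m["text"] for m in errors_only])
print([m["text"] for m in warnings_and_up])
```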

ServiceNow Integration

ServiceNow is a cloud-based platform widely used for IT Service Management, automating business processes, and Enterprise Service Management. One of its core components is the ServiceNow ticketing system, specifically the Incident Management feature. When a user encounters a disruption in any IT service, it is reported as an incident on the platform and assigned to the responsible user or group for resolution.

The ONES Rule Engine proactively monitors the data center for potential disruptive events by creating alerts for any breaches of user-configured thresholds. It tracks various factors, such as sudden surges in CPU usage, heavy traffic bursts, and component failures (e.g., PSU, FAN).

ONES 3.0 enhances this functionality by integrating ServiceNow ticketing with the ONES Rule Engine and Alerts Engine. This integration allows ONES to automatically log tickets in the ServiceNow platform whenever any ONES rule conditions are met.
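To make the idea concrete, here is a minimal sketch of turning an alert into a ServiceNow incident request through the ServiceNow Table API (`/api/now/table/incident`). The alert fields, the severity-to-urgency mapping, and the instance name `example` are assumptions for illustration; the text does not describe how ONES itself calls ServiceNow. The request is only built, not sent, and authentication is omitted.

```python
import json
import urllib.request


def build_incident_request(instance: str, alert: dict) -> urllib.request.Request:
    """Build (but do not send) a Table API request that creates an incident."""
    body = {
        "short_description": f"[ONES] {alert['rule']} on {alert['device']}",
        "description": alert["details"],
        # Assumed mapping: critical alerts become urgency 1, everything else 2.
        "urgency": "1" if alert["severity"] == "critical" else "2",
    }
    return urllib.request.Request(
        url=f"https://{instance}.service-now.com/api/now/table/incident",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


# Hypothetical alert, shaped for this example only.
alert = {
    "rule": "CPU usage above threshold",
    "device": "leaf-01",
    "severity": "critical",
    "details": "CPU utilization exceeded the configured threshold.",
}
req = build_incident_request("example", alert)
print(req.get_method(), req.full_url)
```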

Fig 5 – Rule creation page with ServiceNow integrated
Fig 6 – ServiceNow platform with ONES tickets
In summary, ONES 3.0 brings significant advancements that cater to the evolving needs of data center management.

To unlock the full potential of ONES 3.0 and see how it can revolutionize your network operations, book your demo today.

FAQs

1. What is ONES Multisite and how does it improve global data center monitoring?

ONES Multisite provides a centralized geospatial view of data center health across global sites, allowing enterprise administrators to monitor critical alerts and device statuses from a single interface, drastically improving visibility and incident response times.

ONES 3.0 connects its built-in Rule and Alerts Engine with ServiceNow to automatically generate tickets for anomalies like CPU surges, component failures, or bandwidth spikes—ensuring streamlined IT service workflows and faster resolution times.

Yes, ONES 3.0 introduces a simplified “Tech Support” feature that lets users download diagnostic logs from any managed SONiC switch with one click, eliminating the need for manual CLI access across devices.

With advanced severity-level filtering (e.g., error, warning, info), ONES 3.0 helps operators quickly pinpoint critical syslog alerts from SONiC switches—accelerating root cause analysis and operational troubleshooting.

ONES 3.0 delivers single-pane visibility, ServiceNow integration, multisite scalability, and simplified support tools—making it the ideal centralized platform for managing complex, AI-powered, multi-vendor data center environments.


AI Fabric Orchestration: Supercharging AI Networks with SONiC NOS

Explore the latest in AI network management with our ONES 3.0 series

As the demand for high-performance parallel processing surges in the AI era, GPU clusters have become the heart of data-intensive workloads. But it’s not just about the GPUs themselves: intercommunication between GPU servers is the backbone of their overall performance. Enter the network switch fabric, which is pivotal in overcoming communication bottlenecks and ensuring seamless data flow between GPU servers. Technologies like RoCE (RDMA over Converged Ethernet) allow massive chunks of data to move efficiently between servers, but ensuring that these critical data streams remain lossless and uncongested requires a powerful solution.

That’s where SONiC’s QoS (Quality of Service) features come into play. SONiC enables you to prioritize critical data traffic, ensuring high-priority packets are always transferred ahead of other traffic and also that your important data is not lost. Using SONiC’s robust QoS capabilities and ONES 3.0’s orchestration, you can turn your switch fabric into a lossless, priority-driven highway for GPU server communications.

Let’s explore how you can achieve this with SONiC via the ONES 3.0 Fabric Manager orchestration tool.

Lossless And Prioritized Data Flow

A packet entering the fabric with any DSCP/DOT1P marking can be mapped to any queue of the interface, and enabling PFC on that queue makes it lossless. With PFC in place, when congestion is detected in the queue, a pause frame is sent back to the sender, signaling it to temporarily halt sending traffic of that priority. This mechanism prevents packet drops, ensuring lossless transmission for traffic of that priority.
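The effect of pausing the sender can be seen in a toy simulation of a single queue. This is a simplified sketch, not how a switch ASIC implements PFC: the capacity, the XOFF/XON thresholds, and the arrival and drain rates are illustrative values chosen for the example.

```python
def simulate(ticks, pfc_enabled, capacity=10, xoff=6, xon=3,
             arrival_rate=2, drain_rate=1):
    """Simulate one queue; return (packets dropped, ticks the sender was paused)."""
    depth = dropped = paused_ticks = 0
    paused = False
    for _tick in range(ticks):
        if paused:
            paused_ticks += 1          # sender honours the pause frame
        else:
            for _pkt in range(arrival_rate):
                if depth < capacity:
                    depth += 1
                else:
                    dropped += 1       # buffer full: tail drop
        depth = max(0, depth - drain_rate)
        if pfc_enabled:
            if depth >= xoff:
                paused = True          # queue above XOFF: pause the sender
            elif depth <= xon:
                paused = False         # queue back below XON: resume
    return dropped, paused_ticks


drops_without_pfc, _ = simulate(50, pfc_enabled=False)
drops_with_pfc, paused_ticks = simulate(50, pfc_enabled=True)
print("drops without PFC:", drops_without_pfc)
print("drops with PFC:", drops_with_pfc, "paused ticks:", paused_ticks)
```

With these numbers the sender offers twice what the queue can drain, so the unprotected queue eventually overflows, while the PFC-enabled queue trades drops for pause time.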

Beyond PFC, there’s another layer of congestion management: Explicit Congestion Notification (ECN). With ECN, we can define buffer thresholds; when queue occupancy exceeds them, packets are marked as having experienced congestion and Congestion Notification (ECN-CNP) packets are returned to the sender, prompting it to reduce its transmission rate and proactively avoid congestion.
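One common way to turn such thresholds into a marking decision is a WRED-style curve: no marking below a minimum threshold, certain marking above a maximum threshold, and a linearly increasing probability in between. The sketch below assumes that model; the parameter names and the numbers in the usage lines are illustrative.

```python
def ecn_mark_probability(queue_depth, min_threshold, max_threshold, max_probability):
    """Probability of ECN-marking a packet for a given queue depth (WRED-style)."""
    if queue_depth <= min_threshold:
        return 0.0                      # queue is short: never mark
    if queue_depth >= max_threshold:
        return 1.0                      # queue is long: always mark
    span = max_threshold - min_threshold
    # Linear ramp from 0 at min_threshold up to max_probability at max_threshold.
    return max_probability * (queue_depth - min_threshold) / span


# Illustrative thresholds in kilobytes and a 5% maximum marking probability.
for depth in (100, 500, 900):
    print(depth, ecn_mark_probability(depth, 200, 800, 0.05))
```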

At this stage, we’ve ensured that our priority traffic is lossless. Moving into the egress phase, we can further enhance performance by prioritizing this traffic over others, even under congestion. SONiC provides scheduling algorithms like Deficit Weighted Round Robin (DWRR), Weighted Round Robin (WRR), and Strict Priority Scheduling (STRICT). By binding priority queues to these schedulers, the system can ensure that higher-priority traffic is transmitted preferentially, either in a weighted manner (for WRR/DWRR) or with absolute priority (for STRICT).
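To show what weighted scheduling means in practice, here is a small textbook-style Deficit Weighted Round Robin simulation. It is a sketch of the general algorithm, not of SONiC's or any ASIC's scheduler; the queue names, packet sizes, and quanta are assumptions chosen for the example.

```python
from collections import deque


def dwrr(queues, quanta, rounds):
    """Run Deficit Weighted Round Robin; return bytes sent per queue.

    queues: dict of queue name -> deque of packet sizes in bytes
    quanta: dict of queue name -> credit in bytes added each round (the weight)
    """
    deficit = {name: 0 for name in queues}
    sent = {name: 0 for name in queues}
    for _round in range(rounds):
        for name, q in queues.items():
            if not q:
                deficit[name] = 0       # an empty queue does not bank credit
                continue
            deficit[name] += quanta[name]
            # Send packets while the head packet fits in the remaining credit.
            while q and q[0] <= deficit[name]:
                size = q.popleft()
                deficit[name] -= size
                sent[name] += size
            if not q:
                deficit[name] = 0
    return sent


queues = {"q3": deque([1000] * 100), "q4": deque([1000] * 100)}
sent = dwrr(queues, quanta={"q3": 6000, "q4": 4000}, rounds=10)
print(sent)
```

With both queues permanently backlogged, the bytes sent follow the 60:40 ratio of the quanta, which is the behavior the weights are meant to express.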

In summary, through PFC, ECN, and advanced scheduling techniques, SONiC ensures that high-priority traffic from GPU servers is not only lossless but also prioritized during both congestion and egress phases.

Simplifying Complex QoS Configurations with ONES Orchestration

Configuring SONiC’s complex QoS features may sound daunting, but with ONES 3.0’s orchestration, it’s a breeze. ONES allows you to set up essential QoS configurations like DSCP to traffic-class mapping, PFC, ECN thresholds, and even scheduler types, all with a few lines in a YAML template. Here’s a snapshot of the YAML template showing how ONES orchestrates SONiC QoS (see the QoS section of the YAML below).

Fig 1 – ONES UI AI Fabric Orchestration YAML Template

The Fabric Manager automates the creation and assignment of QoS profiles, saving administrators from manually configuring multiple aspects. Here’s how it works:

Mapping Traffic Classes and Queues

Orchestration begins by mapping traffic into appropriate classes and queues. ONES 3.0 Orchestration allows you to specify mapping values from DSCP (Layer 3) and dot1p (Layer 2) to traffic classes, traffic classes to queues, and traffic classes to priority groups (PGs). Once these mapping values are specified, profiles with standard names (DOT1P_TC_PROFILE, TC_QUEUE_PROFILE, TC_PG_PROFILE, DSCP_TC_PROFILE) are created from them and bound to the interfaces that are part of the orchestration. This configuration ensures that each type of traffic is routed to its appropriate queue and handled correctly.

For example, we can specify mapping values in the YAML as shown in the image above, and FM will create the corresponding profiles and bind them to the interface.
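As a hedged sketch of what the resulting configuration could look like, the code below builds a fragment modeled on the SONiC CONFIG_DB mapping tables using the profile names mentioned above. The mapping values, the interface name, and the `build_mapping_config` helper are assumptions for illustration; this is not the output ONES Fabric Manager produces, and table and field names should be checked against the SONiC release in use.

```python
import json


def _as_strings(mapping):
    """CONFIG_DB stores keys and values as strings."""
    return {str(k): str(v) for k, v in mapping.items()}


def build_mapping_config(interfaces, dscp_to_tc, dot1p_to_tc, tc_to_queue, tc_to_pg):
    """Create the four mapping profiles and bind them to each interface."""
    return {
        "DSCP_TO_TC_MAP": {"DSCP_TC_PROFILE": _as_strings(dscp_to_tc)},
        "DOT1P_TO_TC_MAP": {"DOT1P_TC_PROFILE": _as_strings(dot1p_to_tc)},
        "TC_TO_QUEUE_MAP": {"TC_QUEUE_PROFILE": _as_strings(tc_to_queue)},
        "TC_TO_PRIORITY_GROUP_MAP": {"TC_PG_PROFILE": _as_strings(tc_to_pg)},
        "PORT_QOS_MAP": {
            port: {
                "dscp_to_tc_map": "DSCP_TC_PROFILE",
                "dot1p_to_tc_map": "DOT1P_TC_PROFILE",
                "tc_to_queue_map": "TC_QUEUE_PROFILE",
                "tc_to_pg_map": "TC_PG_PROFILE",
            }
            for port in interfaces
        },
    }


# Hypothetical mapping values: DSCP 26 and dot1p 3 both land in traffic class 3.
config = build_mapping_config(
    interfaces=["Ethernet0"],
    dscp_to_tc={26: 3, 48: 6},
    dot1p_to_tc={3: 3},
    tc_to_queue={3: 3, 6: 6},
    tc_to_pg={3: 3, 6: 0},
)
print(json.dumps(config, indent=2))
```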

Priority Flow Control (PFC) and Explicit Congestion Notification (ECN)

The next critical part of QoS orchestration involves Priority Flow Control (PFC), where the ONES YAML allows users to define the queues that should be PFC-enabled. In addition, a PFC Watchdog can be configured to verify that PFC is functioning correctly, with detection and restoration times and the action to take in case of a malfunction.

ECN configuration parameters can also be provided in the YAML template. ONES Fabric Manager uses them to create a profile named WRED_PROFILE and attaches it to every PFC-enabled queue on all interfaces that are part of the orchestration.

Here’s an example of how this would be configured on the interface for the YAML input in the above image.
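Since the original screenshot is not reproduced here, the following is only a sketch of how such inputs could translate into a CONFIG_DB-style fragment. The table and field names are modeled on the SONiC CONFIG_DB schema and should be verified against the release in use; the input dictionaries, their key names, and all numeric values are assumptions made for this example.

```python
import json


def build_pfc_ecn_config(interfaces, pfc_queues, ecn, watchdog):
    """Build PFC, WRED/ECN and PFC watchdog entries for the given interfaces."""
    queues_csv = ",".join(str(q) for q in pfc_queues)
    return {
        # One shared ECN profile, named as in the text above.
        "WRED_PROFILE": {
            "WRED_PROFILE": {
                "ecn": "ecn_all",
                "green_min_threshold": str(ecn["min_threshold"]),
                "green_max_threshold": str(ecn["max_threshold"]),
                "green_drop_probability": str(ecn["mark_probability"]),
            }
        },
        # Enable PFC on the selected queues of every interface.
        "PORT_QOS_MAP": {port: {"pfc_enable": queues_csv} for port in interfaces},
        # Attach the ECN profile to every PFC-enabled queue.
        "QUEUE": {
            f"{port}|{queue}": {"wred_profile": "WRED_PROFILE"}
            for port in interfaces
            for queue in pfc_queues
        },
        # PFC watchdog: detection/restoration times in milliseconds plus action.
        "PFC_WD": {
            port: {
                "action": watchdog["action"],
                "detection_time": str(watchdog["detection_time"]),
                "restoration_time": str(watchdog["restoration_time"]),
            }
            for port in interfaces
        },
    }


# Hypothetical inputs standing in for the YAML values.
config = build_pfc_ecn_config(
    interfaces=["Ethernet0", "Ethernet4"],
    pfc_queues=[3, 4],
    ecn={"min_threshold": 1048576, "max_threshold": 2097152, "mark_probability": 5},
    watchdog={"action": "drop", "detection_time": 200, "restoration_time": 400},
)
print(json.dumps(config, indent=2))
```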

This approach ensures that your network proactively manages congestion and minimizes packet drops for high-priority traffic.

Advanced Scheduling for Optimized Egress

Finally, scheduling plays a vital role in controlling how packets are forwarded from queues. Orchestration allows administrators to choose between scheduling mechanisms such as Deficit Weighted Round Robin (DWRR), Weighted Round Robin (WRR), or STRICT priority scheduling, depending on their needs.

In the case of DWRR or WRR, weights can be assigned to each queue, influencing how often a queue is serviced relative to others. Once these parameters are specified in the YAML, ONES-FM creates one scheduler policy (SCHEDULER.<weight>) for each unique weight assigned to the queues and attaches these policies to the queues according to their weights on all interfaces that are part of the orchestration.

For instance, in the YAML input shown in the image, two unique weights, 60 and 40, are assigned to queues 3 and 4 respectively. So two scheduler policies, SCHEDULER.60 and SCHEDULER.40, are created and bound to interface queues 3 and 4 respectively.
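The naming rule described above (one policy per unique weight, then one binding per queue) can be sketched as follows. The `build_scheduler_config` helper and the interface name are hypothetical, and the output layout is modeled on the SONiC CONFIG_DB SCHEDULER and QUEUE tables rather than taken from ONES-FM.

```python
def build_scheduler_config(interfaces, queue_weights, scheduler_type="DWRR"):
    """Create one SCHEDULER.<weight> policy per unique weight and bind queues."""
    schedulers = {
        f"SCHEDULER.{weight}": {"type": scheduler_type, "weight": str(weight)}
        for weight in sorted(set(queue_weights.values()))
    }
    bindings = {
        f"{port}|{queue}": {"scheduler": f"SCHEDULER.{weight}"}
        for port in interfaces
        for queue, weight in queue_weights.items()
    }
    return {"SCHEDULER": schedulers, "QUEUE": bindings}


# Weights from the example in the text: queue 3 -> 60, queue 4 -> 40.
config = build_scheduler_config(["Ethernet0"], {3: 60, 4: 40})
print(config)
```

If two queues shared the same weight, they would share a single policy, which is why the policies are keyed by weight rather than by queue.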

Now, here comes a question: what if all the queues are congested? How do the congestion notification packets even traverse the network to reach the sender and tell it to stop or slow down the incoming traffic?

ONES-FM provides an option to designate a specific queue for ECN_CNP (Explicit Congestion Notification packets) traffic and serve it with STRICT scheduling, ensuring that even when the network is heavily congested there is always room left for the congestion notification packets, preventing further blockages. The cnp_queue parameter under the ECN section in the above image represents this queue.
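Why a strict-priority queue guarantees room for notification traffic can be shown with a toy arbiter: the strict queue is always drained first, and only the remaining transmit budget is shared among the data queues. This is a simplified sketch (the data queues use plain round robin instead of weights), and the queue contents are invented for the example.

```python
from collections import deque


def drain(strict_queue, data_queues, budget):
    """Send up to `budget` packets: strict queue first, then data queues in turn."""
    sent = []
    while budget > 0 and strict_queue:
        sent.append(strict_queue.popleft())     # CNP traffic always goes first
        budget -= 1
    while budget > 0 and any(data_queues):
        for q in data_queues:
            if q and budget > 0:
                sent.append(q.popleft())
                budget -= 1
    return sent


cnp = deque(["cnp-1", "cnp-2"])
data = [deque(f"q3-{i}" for i in range(5)), deque(f"q4-{i}" for i in range(5))]
order = drain(cnp, data, budget=4)
print(order)
```

Even though both data queues hold more packets than the budget allows, the two notification packets leave first.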

Flexible, Day-2 Support for QoS Management

One of the standout features of ONES-FM 3.0 is its support for Day-2 operations. As your network evolves and traffic patterns change, you can modify the QoS configurations through either the YAML template or the NetOps API. This flexibility ensures your network is always tuned to deliver the performance required by your AI workloads.

Future-Proof Your AI Infrastructure with ONES 3.0

With its intuitive YAML-based approach and support for dynamic Day-2 adjustments, ONES Fabric Manager eliminates much of the complexity associated with configuring and managing networks. ONES gives you confidence that your network infrastructure is both reliable and future-proof. In essence, ONES Fabric Manager enables seamless orchestration for AI fabrics, ensuring your network is always ready to meet the growing demands of AI-driven data centers.

FAQs

1. How does SONiC NOS enable lossless data transfer for GPU-based AI workloads?

SONiC NOS supports Priority Flow Control (PFC) and Explicit Congestion Notification (ECN), ensuring lossless, high-priority traffic flows—critical for real-time communication between GPU clusters in AI data centers.

ONES 3.0 provides YAML-based orchestration to simplify complex SONiC QoS settings like DSCP mapping, PFC, ECN, and queue scheduling, reducing configuration time and errors.

Yes, ONES 3.0 is built for vendor-agnostic orchestration, managing QoS policies across different switches and interfaces—ideal for hybrid or evolving AI network environments.

ONES allows you to designate a CNP (Congestion Notification Packet) queue with STRICT priority, ensuring that even during congestion, ECN messages reach the sender to throttle traffic.

SONiC supports DWRR, WRR, and STRICT scheduling, and ONES Fabric Manager lets you assign and orchestrate these policies via YAML—optimizing egress packet forwarding and queue handling in AI fabrics.


Streamlining AI Fabric Management: The Imperative of a Centralized Management Platform

Introduction

Artificial Intelligence (AI), once a mere buzzword, has now firmly established itself as a cornerstone of technological advancement. Its insatiable appetite for data fuels its continuous evolution, and generative AI, a subset capable of creating new content, is a prime driving force behind this growth. As datacenters become increasingly AI-centric and drive businesses worldwide, the networking community must assess their readiness for this transformative shift.

The Rapid Pace of AI Development

The pace of AI development is staggering, with years of progress potentially compressed into mere weeks. This rapid evolution necessitates a proactive approach from the networking community to ensure their solutions remain aligned with the cutting-edge advancements in AI. The challenge is multifold, as the increasing demand for networking switches and GPUs opens up opportunities for innovation in multi-vendor ecosystems and data center environments.

Fig 1 – GPU Market size and Trend

The Demand for Open and Flexible Networking Solutions

The rapidly growing need for networking switches and GPUs has created a demand for multi-vendor ecosystems and data center environments. This desire for freedom from vendor lock-in has led to a surge of interest in open-source network operating systems (NOS) like SONiC for networking switches. The driving force behind this demand is the consolidation of features offered by multi-vendor hardware suitable for AI Fabrics and overall cost optimization.

Evolving Data Center Network Architectures

As data center network designs evolve from server-centric to GPU-centric architectures, the necessity for new networking topology designs such as fat-tree, dragonfly, and butterfly has become paramount. GPU workloads, including training, fine-tuning, and inferencing, have distinct networking needs, with Remote Direct Memory Access (RDMA) being the most suitable technique to handle high-bandwidth data traffic flows. Lossless networking and low entropy are also essential for optimal performance.
Fig 2 – Evolution of Data Centers
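For a sense of scale, the size of the classic three-tier k-ary fat-tree, built entirely from identical k-port switches, follows directly from k. The helper below is an illustrative calculation for that textbook topology only; real AI fabric designs may use different tiering or oversubscription.

```python
def fat_tree_size(k: int) -> dict:
    """Element counts for a three-tier k-ary fat-tree of k-port switches."""
    if k < 2 or k % 2:
        raise ValueError("k must be an even number >= 2")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,          # k/2 edge switches per pod
        "aggregation_switches": k * half,   # k/2 aggregation switches per pod
        "core_switches": half * half,       # (k/2)^2 core switches
        "hosts": k * half * half,           # k^3 / 4 hosts at full bisection
    }


print(fat_tree_size(4))
print(fat_tree_size(64)["hosts"])
```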

The Need for Centralized Management Solutions

A single pane of glass management tool is essential to streamline operations and optimize performance in multi-vendor AI fabric data centers. Such a tool should be capable of:

Addressing the Challenges of Centralized Management with ONES

Implementing a centralized management tool in a multi-vendor AI fabric data center requires careful consideration of several key challenges:
Aviz understands this need and has implemented ONES 3.0, a centralized management platform that provides comprehensive control over networking devices, AI workload servers and data centers.
Fig 3 – Aviz Open Networking Enterprise Suite (ONES) for AI Fabrics

The Future of Networking in the AI Era

As AI continues to evolve and its applications expand, the networking community must adapt to the changing landscape. By embracing open-source solutions, adopting new network topologies, and leveraging centralized management platforms like ONES 3.0, organizations can ensure their networks are well-equipped to support the demands of AI-driven workloads. The future of networking is inextricably linked to the advancement of AI, and those who are proactive in their approach will be well-positioned to capitalize on the opportunities that lie ahead.

All these cutting-edge innovations only mark the initial stride towards Aviz Networks’ vision, and more is yet to come. With our strong team of support engineers, we are well-equipped to empower customers with a seamless SONiC journey using the ONES platform.

As AI-driven networks grow in complexity, a centralized management platform like ONES 3.0 by Aviz Networks is essential. It provides seamless control, real-time monitoring, and multi-vendor compatibility to tackle the unique demands of AI workloads. Future-proof your network with ONES 3.0—because the future of AI fabric management starts here.

Explore more about ONES 3.0 in our latest blogs here

If you wish to get in touch with me, feel free to connect on LinkedIn here

FAQs

1. Why is centralized management essential for AI Fabric networks?

Centralized management platforms like ONES 3.0 simplify multi-vendor orchestration, offer real-time GPU and network telemetry, and streamline configuration and monitoring for evolving AI data center topologies.

ONES 3.0 supports vendor-agnostic infrastructure, enabling seamless control across switches, NICs, and GPUs, while delivering lossless RDMA optimization, topology orchestration (fat-tree, dragonfly), and proactive alerting.

Top features include:

  • Real-time infrastructure visualization
  • Multi-topology orchestration (fat-tree, dragonfly, butterfly)
  • GPU and NIC telemetry
  • Priority Flow Control (PFC)
  • End-to-end anomaly detection

Yes, ONES 3.0 is optimized for AI/ML GPU workloads and RoCE-based RDMA traffic, enabling QoS profile automation, PFC watchdogs, and deep visibility into compute and network fabric.

ONES 3.0 supports fat-tree, dragonfly, and butterfly network topologies, enabling scalable, high-performance designs tailored to the latency and throughput needs of modern AI fabrics.

Categories
SONiC

Why SONiC is Ready Not Just for Hyperscalers

When you think of SONiC (Software for Open Networking in the Cloud), it’s often associated with hyperscalers—the giants in tech like Google and Microsoft that demand unparalleled scalability and customization in their network infrastructure. But what if I told you that SONiC is no longer just for hyperscalers? What if I told you that enterprises—yes, Fortune 500 companies, mid-sized businesses, even finance and telecom industries—are now tapping into the power of SONiC to transform their networks?

At Aviz, we’re witnessing this shift firsthand. We recently met with our partners and customers, who shared how they’ve successfully adopted SONiC to enhance their network operations. The results are clear—SONiC’s open-source architecture is providing flexibility, scalability, and significant cost savings across industries. Discover more about this transformation by watching our panel discussion in a video interview hosted by SDxCentral, featuring our customers Techevolution and 1984, along with our partner EPS Global.

Flexibility Through Choices

One of SONiC’s strongest suits is its ability to give businesses choices—choices in hardware, in vendors, in deployment models. Unlike traditional, proprietary solutions that lock you into a particular vendor or set of hardware, SONiC opens the door to a wide array of options. Whether you prefer Cisco, Arista, or NVIDIA, SONiC supports them all, ensuring that your network infrastructure is as flexible and adaptable as your business needs it to be.

With SONiC, you aren’t tied to a single vendor. This freedom encourages innovation and fosters a competitive ecosystem, where businesses can pick and choose the best components to suit their specific needs.

Control Over Your Network

Another key factor that sets Aviz apart is the level of control it offers. Enterprises can manage SONiC at the source code level, which means they can choose the hardware they want at any time. At Aviz, we normalize metrics from across your fabric, including ASICs and operating systems, to achieve multivendor observability, giving you the control to deploy any switch you want without worrying about how it will impact your NetOps layer. But what truly makes it enterprise-ready is the end-to-end support stack provided by Aviz. Our platform ensures that customers gain full operating control, whether they prefer a traditional Cisco-like CLI, REST APIs, or even their own in-house controllers. And we understand that enterprises need more than just flexibility: they need an experience that’s simple to use and supported 24/7.

That’s why we’ve developed Aviz Easy Deploy, Monitor, and Support, a plug-and-play solution that brings SONiC to enterprise environments with ease and without the hassle of vendor lock-in.

The Cost Savings You’ve Been Looking For

Cost has always been a critical factor in any IT decision, and SONiC shines here. By moving away from expensive, proprietary solutions and embracing the open-source model, companies can dramatically reduce both capital and operational expenditures. In fact, using SONiC as a foundation, we’ve helped businesses cut costs by half compared to traditional solutions.

A Future-Proof Solution

Perhaps the most exciting part of SONiC’s evolution is that it is on the path to becoming the Linux of networking. Much like how Linux started with a small group of adopters and evolved into a mainstream operating system, SONiC is following a similar trajectory. From a select few hyperscalers to widespread adoption across industries, SONiC is proving it’s not only scalable but also future-proof.

At Aviz, we’re proud to be leading this transformation alongside our partners, customers, and the open-source community. With our support stack, we’ve perfected the recipe for deploying, managing, and scaling SONiC in enterprise environments, offering the same robust experience you’d expect from legacy OEMs.

Ready for the Enterprise

In short, SONiC is no longer just for hyperscalers. It’s ready for enterprises of all sizes, offering flexibility, control, and cost savings that simply can’t be matched by traditional solutions. And with Aviz Easy Deploy, Monitor, and Support, we’ve made SONiC accessible to virtually any organization, making it the go-to choice for network infrastructure transformation.

As more Fortune 500 companies embrace SONiC, we’re confident that the rest of the industry will soon follow suit. So if you’re looking for a scalable, cost-effective, and flexible network solution, it’s time to look at SONiC—not just as the future of networking, but as the solution that’s ready for your enterprise today.

FAQs

1. Is SONiC only suitable for hyperscalers like Microsoft and Google?

No. While SONiC was originally developed for hyperscalers, it’s now enterprise-ready. With solutions like Aviz Easy Deploy and multi-vendor support, mid-sized businesses, Fortune 500s, and telecoms are successfully using SONiC to modernize their network infrastructure.

Enterprises adopt SONiC for its vendor-agnostic flexibility, full control of source code, significant CapEx and OpEx savings, and open support ecosystem—allowing them to customize, scale, and future-proof their networks.

By avoiding expensive proprietary software and enabling open-source switching, SONiC helps enterprises cut network costs by up to 50%. With Aviz, deployment and support are simplified, eliminating the need for costly vendor lock-ins.

Yes. SONiC is hardware-agnostic and supports popular vendors like Cisco, Arista, NVIDIA, and more, enabling enterprises to mix-and-match best-of-breed hardware without compatibility concerns.

SONiC is becoming the de-facto open standard for disaggregated networking, much like how Linux revolutionized computing. With community backing, open-source transparency, and growing adoption across industries, it’s paving the way for next-gen network architectures.
