
ONES 3.1 Boosts SONiC Support: Key Enhancements for Smarter Infrastructure Troubleshooting

March 12, 2025

In today’s fast-moving digital world, maintaining a stable and well-monitored infrastructure is crucial. The latest release of ONES 3.1 introduces key updates, including enhanced support for SONiC (Software for Open Networking in the Cloud). These enhancements boost visibility, automate critical processes, and strengthen system health monitoring. The improved SONiC support streamlines issue detection and response, optimizing performance and minimizing downtime. IT teams can now focus on strategic tasks, knowing their infrastructure is continuously and intelligently monitored for peak performance.
Stay ahead of issues and ensure smooth operations with ONES 3.1.

System Health Monitoring

CPU-Intensive Services

Previously, identifying resource-heavy processes was challenging due to the lack of granular insights in system-wide CPU and memory metrics. Often, system-level data shows a spike in CPU usage without providing a quick way to pinpoint the cause. To address this challenge, ONES now provides detailed reports on the top 10 CPU-consuming services running on the host, along with their memory usage. This helps users easily identify high-impact processes like redis-server, agent, syncd, and dockerd. With this level of detail, users can diagnose performance issues more quickly, optimize system resources, and prevent potential bottlenecks, resulting in greater system efficiency.
Top CPU consuming services
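
As a rough illustration of the ranking behind such a report, the sketch below sorts per-service readings by CPU share and keeps the top entries. It is not the ONES implementation: the `ServiceSample` type, the `top_cpu_services` helper, and all numbers are assumptions made up for this example; only the service names come from the text above.

```python
from typing import List, NamedTuple


class ServiceSample(NamedTuple):
    """One hypothetical reading for a service on the host."""
    name: str
    cpu_percent: float
    mem_mb: float


def top_cpu_services(samples: List[ServiceSample], limit: int = 10) -> List[ServiceSample]:
    """Return up to `limit` samples, highest CPU share first."""
    return sorted(samples, key=lambda s: s.cpu_percent, reverse=True)[:limit]


# Invented readings, for illustration only.
samples = [
    ServiceSample("redis-server", 12.5, 180.0),
    ServiceSample("syncd", 41.0, 520.0),
    ServiceSample("dockerd", 7.2, 95.0),
    ServiceSample("agent", 3.1, 60.0),
]

for s in top_cpu_services(samples, limit=3):
    print(f"{s.name:<14}{s.cpu_percent:>6.1f}% CPU {s.mem_mb:>8.1f} MB")
```

Reporting memory next to CPU for the same service is what lets an operator tell a busy process from a leaking one at a glance.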

Unhealthy Devices with Failure Codes

ONES 3.1 introduces a new feature that highlights unhealthy devices, offering real-time failure detection for hardware (e.g., PSU, fan failures, LED alarms), software services, key processes, and containers. When a failure is detected, the device is marked as unhealthy, with detailed information readily available in the UI. This streamlined view helps operators quickly identify and resolve issues, simplifying troubleshooting. Notifications also appear in the topology view and on the health summary page.
Unhealthy Device with details of failure

SONiC Docker Transitions

Docker containers are the backbone of the SONiC operating system, and ensuring their stable operation is crucial for switch performance. Previously, tracking container state changes, such as shifts from “up” to “down,” was difficult and time-consuming. Operators often struggled to detect these changes in real time, so service disruptions were addressed late or went unnoticed. ONES 3.1 introduces a new widget that visually highlights Docker container state transitions, allowing operators to quickly spot changes and respond to disruptions. The widget provides a “Connect” button for direct SSH access to the switch, enabling swift action when needed. It also offers a timeframe selection feature, allowing operators to view container state changes over a specified period.
Docker transitions in the managed network
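
One way to picture a state-transition view with a timeframe filter is the sketch below. It assumes periodic, time-ordered status samples per container and reports every change of state inside an optional time window. The sample format, the `find_transitions` helper, and the timestamps are hypothetical and are not taken from ONES.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple


class Transition(NamedTuple):
    timestamp: int
    container: str
    old_state: str
    new_state: str


def find_transitions(
    samples: Iterable[Tuple[int, str, str]],
    start: Optional[int] = None,
    end: Optional[int] = None,
) -> List[Transition]:
    """Return state changes from time-ordered (timestamp, container, state) samples.

    Only changes whose timestamp lies in [start, end] are returned; a bound
    of None leaves that side of the window open.
    """
    last_state = {}  # container name -> most recently seen state
    changes = []
    for ts, container, state in samples:
        previous = last_state.get(container)
        last_state[container] = state
        if previous is None or previous == state:
            continue  # first sighting, or nothing changed
        if (start is None or ts >= start) and (end is None or ts <= end):
            changes.append(Transition(ts, container, previous, state))
    return changes


# Invented samples; container names are illustrative.
samples = [
    (100, "syncd", "up"), (100, "swss", "up"),
    (160, "syncd", "down"), (160, "swss", "up"),
    (220, "syncd", "up"), (220, "swss", "up"),
]

for t in find_transitions(samples, start=150, end=200):
    print(f"{t.timestamp}: {t.container} {t.old_state} -> {t.new_state}")
```

State is tracked for every sample, including those outside the window, so a change at the window's first timestamp is still compared against the correct previous state.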

Automatic IP Detection, Alerting and Rediscovery

When a monitored device’s management IP changes, it’s crucial for the monitoring software to update the IP promptly to ensure smooth operations. Previously, detecting and updating a device’s management IP was a manual, time-consuming process, often causing communication breakdowns and delayed issue identification. ONES 3.1 introduces an automatic rediscovery mechanism that instantly detects when a device’s management IP changes and re-registers the switch with the controller. This enhancement eliminates manual intervention, ensuring continuous communication, real-time monitoring, and faster issue resolution, even when devices are reconfigured.

Additionally, the IP Transition widget lets operators track every IP change a device has undergone over a specified period and see whether any of those addresses conflicted with another IP in the monitored network. To further enhance visibility, the ONES Rule Engine can generate an alert on any management IP change, so operators stay aware of network modifications and can respond swiftly to maintain seamless operations.

Device IP transitions summary
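
The bookkeeping behind an IP transition history with conflict flags can be sketched as follows. This is a minimal illustration under stated assumptions, not how ONES detects or rediscovers devices: the observation format, the `ip_transitions` helper, and the device names are hypothetical, and the addresses come from the 192.0.2.0/24 documentation range.

```python
import ipaddress
from typing import Dict, Iterable, List, Tuple


def ip_transitions(observations: Iterable[Tuple[int, str, str]]) -> List[dict]:
    """Return one event per management IP change.

    `observations` are time-ordered (timestamp, device, ip) tuples. Each event
    lists the devices that already held the new address when the change was seen.
    """
    current: Dict[str, str] = {}  # device -> last known management IP
    events = []
    for ts, device, raw_ip in observations:
        ip = str(ipaddress.ip_address(raw_ip))  # validates and normalizes the address
        old = current.get(device)
        if old is not None and old != ip:
            conflicts = sorted(
                other for other, other_ip in current.items()
                if other != device and other_ip == ip
            )
            events.append({
                "timestamp": ts,
                "device": device,
                "old_ip": old,
                "new_ip": ip,
                "conflicts_with": conflicts,
            })
        current[device] = ip
    return events


# Invented observations for two hypothetical switches.
observations = [
    (1, "leaf-1", "192.0.2.10"),
    (1, "leaf-2", "192.0.2.11"),
    (2, "leaf-1", "192.0.2.11"),  # moves onto an address leaf-2 still holds
    (3, "leaf-1", "192.0.2.12"),
]

for event in ip_transitions(observations):
    print(event)
```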

Rule Engine: Enhanced Alerts

The ONES Rule Engine has emerged as a preferred tool for automating network monitoring, allowing operators to configure custom rules based on their specific threshold levels for various parameters. When a defined condition is met, the system automatically generates an alert, enabling real-time, proactive responses to potential issues. In ONES 3.1, rules can also alert on Docker container CPU and memory overuse, container downtime, hardware or service failures, and management IP changes. These new metrics provide deeper insights and more precise control over network performance, ensuring smoother operations and quicker issue resolution.
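
Conceptually, a threshold rule pairs a metric with a comparison and a limit, and an alert fires when the latest value satisfies the condition. The sketch below shows that idea only. The `Rule` class, the metric names, and the `evaluate` function are hypothetical and do not reflect the ONES Rule Engine's actual configuration format or API.

```python
import operator
from dataclasses import dataclass
from typing import Dict, List

# Supported comparisons, keyed by the symbol used in a rule.
_COMPARISONS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}


@dataclass(frozen=True)
class Rule:
    name: str
    metric: str
    comparison: str
    threshold: float


def evaluate(rules: List[Rule], metrics: Dict[str, float]) -> List[str]:
    """Return one alert message for every rule whose condition is met."""
    alerts = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is None:
            continue  # metric not reported in this sample; nothing to check
        if _COMPARISONS[rule.comparison](value, rule.threshold):
            alerts.append(
                f"{rule.name}: {rule.metric}={value} {rule.comparison} {rule.threshold}"
            )
    return alerts


# Invented rules and readings, for illustration only.
rules = [
    Rule("High container CPU", "docker_cpu_percent", ">", 80.0),
    Rule("Containers down", "containers_down", ">=", 1),
]
metrics = {"docker_cpu_percent": 91.5, "containers_down": 0}

for alert in evaluate(rules, metrics):
    print(alert)
```

Keeping rules as plain data makes them easy to list, audit, and tune without touching the evaluation logic.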

ONES 3.1 takes SONiC network monitoring and troubleshooting to the next level with powerful enhancements like real-time failure detection, automated IP rediscovery, detailed system health insights, and advanced alerting.
Ready to see ONES 3.1 in action? Book a demo today and experience how it can transform your network management with smarter automation and deeper insights.

FAQs

1. How does ONES 3.1 improve SONiC infrastructure monitoring?

ONES 3.1 enhances SONiC observability by offering real-time visibility into system health, including CPU-intensive services, Docker container transitions, and device-level failures. This allows IT teams to proactively detect, investigate, and resolve issues faster than before.

The new Docker Down Status alerts in ONES 3.1 notify operators immediately when SONiC containers fail, so service disruptions are caught and addressed before they escalate. This minimizes downtime and improves operational resilience.

ONES 3.1 introduces automatic IP rediscovery that detects management IP changes and re-registers the switch seamlessly, ensuring uninterrupted telemetry and real-time monitoring without manual intervention.

ONES 3.1 provides granular visibility into the top 10 CPU-consuming services and shows the memory usage of each. This helps pinpoint the root causes behind performance spikes, such as syncd, redis, or dockerd, and allows quick remediation.

The ONES Rule Engine can detect and alert on:

  • CPU/memory overuse by Docker containers
  • Docker container downtime
  • Hardware or service failures in devices
  • Real-time management IP changes

This enables a proactive, rule-based monitoring strategy tailored to each network’s performance needs.

Advanced observability tools in ONES 3.1 help operators:

  • Spot unhealthy devices and see precise failure codes instantly
  • Visualize Docker container status changes over time
  • Correlate CPU spikes with top resource-heavy processes
  • Respond quickly using direct SSH access from widgets

Automatic IP rediscovery ensures:

  • Continuous real-time telemetry even if IPs change during maintenance
  • Zero manual reconfiguration for IP updates
  • Faster troubleshooting for re-addressed switches
  • Reduced risk of monitoring gaps in dynamic environments

A conversational AI assistant can:

  • Answer plain-language queries about device health and logs
  • Summarize Docker transitions and failure alerts in seconds
  • Suggest root cause hints based on system metrics
  • Minimize CLI reliance, making diagnostics faster for all skill levels

Container-level insights help operators:

  • Identify which SONiC service (like syncd or redis) is overloading resources
  • Set rule-based alerts when usage crosses safe thresholds
  • Optimize system performance proactively
  • Prevent unexpected container crashes due to resource exhaustion

The upgraded Rule Engine enables:

  • Custom alert rules for Docker, hardware failures, and IP changes
  • One-click rule activation for fast deployment
  • Detailed summaries of active rules for audit and tuning
  • Real-time anomaly detection that cuts downtime and improves SONiC resilience