ONES 2.0: Unleashes Rule Engine and Alerting System for Seamless SONiC Operations

ONES Rule Engine is an advanced feature that enhances your network management experience with an integrated alert and notification system. It covers a broad set of monitoring metrics and lets you create device-level and interface-level rules with ease, giving you tailored control over your network management. Upgrade your network management experience with ONES Rule Engine today!

10 Benefits of Using ONES Rule-Engine for Comprehensive Network Monitoring

  • Comprehensive Monitoring
    ONES Rule-Engine takes a holistic approach to network monitoring by tracking diverse metrics such as CPU utilization, memory utilization, PSU status, fan speed, RX/TX, and more. This breadth of coverage ensures that no aspect of your network goes unnoticed, providing a comprehensive view for proactive issue resolution.
  • Device and Interface Level
    It allows the creation of rules at both the device and interface levels. This fine-grained rule management ensures that specific devices or interfaces can be targeted for rule application, allowing for a tailored approach to network optimization and issue handling.
  • Rule Customization
    Rule-Engine understands the unique requirements of different network components. With device-level rules based on Hardware SKU, Role, and OS version, administrators can fine-tune alerts to align with the specific characteristics of their network infrastructure.

Figure 1: Rule Configuration
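
To make rule scoping concrete, here is a minimal sketch of how a device-level rule might select devices by Hardware SKU, Role, and OS version. The `Device` and `DeviceRule` classes, their field names, and the sample values are hypothetical illustrations, not the ONES data model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Device:
    # Hypothetical device record; field names are assumptions for illustration.
    name: str
    hardware_sku: str
    role: str
    os_version: str


@dataclass(frozen=True)
class DeviceRule:
    # A selector left as None matches any value for that attribute.
    metric: str
    hardware_sku: Optional[str] = None
    role: Optional[str] = None
    os_version: Optional[str] = None

    def applies_to(self, device: Device) -> bool:
        """Return True when every selector that is set matches the device."""
        selectors = (
            (self.hardware_sku, device.hardware_sku),
            (self.role, device.role),
            (self.os_version, device.os_version),
        )
        return all(want is None or want == have for want, have in selectors)


leaf = Device("leaf-01", "sku-a", "leaf", "os-1")
spine = Device("spine-01", "sku-b", "spine", "os-1")
rule = DeviceRule(metric="cpu_utilization", role="leaf")
matched = [d.name for d in (leaf, spine) if rule.applies_to(d)]
print(matched)
```

Leaving a selector unset makes the rule broader, so a rule with no selectors at all would apply to every device in this sketch.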
  • Device Inclusion & Exclusion 
    Flexibility is key in network management. The rule engine provides the capability to include or exclude devices from rules, ensuring that the rule engine caters to the specific needs of your network architecture. This feature enables a dynamic response to changes in the network environment.
  • Severity-Based Alerting
    The Rule-Engine facilitates the creation of Critical and Warning severity alerts, allowing administrators to prioritize responses based on the urgency and impact of potential issues. This hierarchical alerting system ensures that critical problems are addressed promptly, minimizing downtime and optimizing network performance.
  • Alert Summary for Collaborative Issue Resolution
    The system enables users to generate a comprehensive report of all alerts, facilitating effortless sharing with the team. This feature simplifies the collaborative resolution process, promoting efficient communication and knowledge transfer among team members.

Figure 2: Alert Summary
  • Integration with Slack for real-time notifications
    ONES’ Slack integration ensures that critical alerts are delivered directly to designated channels, keeping teams informed and in sync. Additionally, weekly Slack digests provide a comprehensive overview of alerts and Zendesk ticket details, streamlining communication and collaboration.
  • Zendesk Integration for Streamlined Ticketing
    The rule engine seamlessly integrates with Zendesk, automating the creation of tickets based on alerts. This integration simplifies the ticketing process, providing a centralized platform for tracking and managing network issues.
  • Preventing redundant alerts leads to efficient alerting
    During the rule creation process, administrators have the capability to specify the maximum number of alerts for a particular metric on a specific device, mitigating the occurrence of redundant notifications. This feature contributes to a streamlined and efficient alerting system, enhancing the overall effectiveness of network management within the ONES 2.0 ecosystem.
  • Strengthening Monitoring and Response Capabilities with detailed alert information
    Each alert is enriched with essential details, including Metric Name, Type (Critical or Warning), Triggered Time, and Associated Rule Information. Alerts also include a URL that redirects users to the associated visual representations for better understanding. In addition, device information such as IP address, role, region, SKU, serial number, and NOS is part of the alert details. Interface-specific alerts carry additional related information such as the interface name, speed, and transceiver details, as shown in Figure 3.

Figure 3: Alerts details on Zendesk

Figure 4: Alerts Summary on Slack
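
The severity levels and the per-device alert cap described in the list above can be sketched as follows. This is an illustrative model only: the `ThresholdRule` class, its fields, the strict "greater than" comparison, and the example threshold values are assumptions, not ONES defaults or its actual implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class ThresholdRule:
    # Hypothetical rule: names and example values are not taken from ONES.
    metric: str
    warning: float
    critical: float
    max_alerts: int  # cap per (device, metric) pair
    _sent: Dict[Tuple[str, str], int] = field(
        default_factory=lambda: defaultdict(int)
    )

    def severity(self, value: float) -> Optional[str]:
        """Classify a reading; Critical takes precedence over Warning."""
        if value > self.critical:
            return "Critical"
        if value > self.warning:
            return "Warning"
        return None

    def evaluate(self, device: str, value: float) -> Optional[str]:
        """Return the severity to alert on, or None if nothing should be sent."""
        level = self.severity(value)
        if level is None:
            return None
        key = (device, self.metric)
        if self._sent[key] >= self.max_alerts:
            return None  # cap reached: suppress the redundant alert
        self._sent[key] += 1
        return level


rule = ThresholdRule("cpu_utilization", warning=70.0, critical=90.0, max_alerts=2)
readings = [50.0, 75.0, 95.0, 96.0]
results = [rule.evaluate("leaf-01", v) for v in readings]
print(results)  # [None, 'Warning', 'Critical', None]
```

The fourth reading is still above the critical threshold, but it is suppressed because the cap of two alerts for this device and metric has already been reached.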

Rule Engine coverage

  • System Health
    Rules can be created to monitor system health metrics such as a device’s CPU utilization, memory utilization, and CPU core temperatures, and to alert if those values exceed the warning or critical thresholds. The ONES UI also provides recommended thresholds for CPU and memory usage.
  • Alert on Component Failures   
    The rule engine can be used to alert if a device fan or a power supply unit (PSU) becomes faulty. The ONES backend continuously tracks component health and triggers an alert in case of failure.
  • Capacity Monitoring 
    Hardware switching is essential for high-speed data transmission in today’s networks. Situations can develop where the switch ASIC hardware limits are reached and forwarding happens in software, causing system instability. The ONES rule engine monitors this as well, and rules can be created to notify you if ASIC IPv4/IPv6 utilization exceeds the warning or critical level.
  • Traffic Monitoring
    Set the utilization levels for traffic links and the acceptable thresholds for errors and discards, and alerts will be generated for links that cross the set levels.
  • Transceiver Health
    Transceiver operational values such as voltage, temperature, and power are critical for error-free, lossless transmission. The rule engine monitors those metrics and raises alerts for transceivers that are on the verge of failing or that require attention.
  • SONiC Services Health
    In addition to all of the above, alerts can be generated when any BGP neighbor goes down, for the health of the syncd service, and for container CPU utilization.
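
For the traffic monitoring case, link utilization is commonly derived from the change in an interface byte counter over a polling interval, relative to the link speed. The sketch below shows that arithmetic. The function name, parameters, and sample numbers are illustrative assumptions and do not describe how ONES computes utilization; counter wraparound is ignored for brevity.

```python
def link_utilization_percent(
    bytes_start: int, bytes_end: int, interval_s: float, speed_bps: float
) -> float:
    """Utilization of a link over one polling interval, as a percentage."""
    if interval_s <= 0 or speed_bps <= 0:
        raise ValueError("interval_s and speed_bps must be positive")
    bits_transferred = (bytes_end - bytes_start) * 8
    return 100.0 * bits_transferred / (interval_s * speed_bps)


# 7.5 GB received in 60 s on a 10 Gb/s link.
utilization = link_utilization_percent(
    bytes_start=0, bytes_end=7_500_000_000, interval_s=60.0, speed_bps=10e9
)
print(f"{utilization:.1f}%")  # 10.0%
```

A value computed this way can then be compared against the warning and critical levels set in a rule.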

Conclusion

Embrace the power of the ONES 2.0 Rule Engine and Alerting system to elevate your network management experience. Real-time monitoring of hardware, network, components, counters, and transceiver health, combined with advanced alert management through Slack and Zendesk integrations, supports your SONiC journey.

The alerts system goes beyond Slack or Zendesk integrations and can be customized to fit any platform based on the requirements.
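
As one way to picture such a customization, the sketch below formats an alert into a JSON body that could be posted to a generic webhook. The alert field names loosely mirror the alert details listed earlier, but the dictionary layout, the `build_webhook_payload` helper, and the sample values are hypothetical. The only external convention relied on is that Slack incoming webhooks accept a JSON object with a `text` field.

```python
import json
from typing import Any, Dict


def build_webhook_payload(alert: Dict[str, Any]) -> str:
    """Render an alert as a JSON string with one human-readable `text` field."""
    text = (
        f"[{alert['type']}] {alert['metric_name']} on {alert['device_ip']} "
        f"(rule: {alert['rule']}, triggered: {alert['triggered_time']})"
    )
    return json.dumps({"text": text})


# Sample values are made up for illustration.
sample_alert = {
    "metric_name": "cpu_utilization",
    "type": "Critical",
    "triggered_time": "2024-01-01T00:00:00Z",
    "rule": "leaf-cpu",
    "device_ip": "192.0.2.10",
}
payload = build_webhook_payload(sample_alert)
print(payload)
```

Keeping the formatting step separate from delivery means the same alert could be rendered differently for each target platform.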

Stay tuned for our upcoming blog series, where we’ll dive deep into these insightful topics:

  • RoCE Traffic Visibility in AI Fabric
  • Detailed security compliance with ONES
  • In-depth analysis regarding the measurement of NWSLA

Take a ‘test drive’ with ONES Center before your SONiC deployment, at your convenience, with our well-known vendors across hardware, platforms, ASICs, and operating systems. Make an informed decision by testing it out across multiple vendors, including Cisco SONiC, NVIDIA SONiC, Celestica SONiC, Marvell SONiC, Wistron SONiC, Edgecore Community SONiC, Arista SONiC, Supermicro SONiC, Enterprise SONiC, and DELL SONiC.
