
6 KPIs for your SONiC NetOps system to scale your networks efficiently – Learn how ONES telemetry module works to achieve it

This investigation explores the experiments conducted to evaluate how effectively the ONES (Open Networking Enterprise Suite) functions when tested at various scales. Scale testing of the telemetry collectors presents several challenges that need to be carefully addressed to ensure the reliability and effectiveness of the telemetry infrastructure. Addressing these challenges involves a combination of careful architecture design, efficient resource management, scalability testing, and ongoing optimization efforts. Successful scale testing of telemetry collectors requires a holistic approach to ensure that the system can handle the demands of large and complex network environments.

The following diagram lists some of the key factors that play a vital role in scale testing.

Figure-1: Multiple facets of scale testing collectors


6 Key Goals for High-Performance ONES Telemetry Data Collection

Aviz's approach to software development strives for excellence in performance and quality on all fronts. Achieving high scale numbers for ONES collectors involves setting specific goals to ensure the efficiency, reliability, and scalability of the telemetry infrastructure. Some of the key goals include:

  1. Performance Optimization: Optimize the performance of ONES to handle high volumes of data with minimal latency, ensuring real-time data processing and analysis.
  2. Resource Efficiency: Efficiently utilize system resources (CPU, memory, storage) to support high-scale telemetry data collection without causing resource bottlenecks or degradation in performance.
  3. Data Integrity and Quality: Maintain high data integrity and quality throughout the telemetry data collection process, ensuring accurate and reliable information for analytics and decision-making.
    1. Maintaining data integrity is like a good cup of coffee – you can’t start your day without it, and if it’s ever compromised, you might just spill the beans! That’s why, at ONES, we take our data integrity as seriously as that first sip in the morning – no compromises allowed!
  4. Fault Tolerance: ONES should include built-in fault tolerance mechanisms to ensure continuous operation, even in the event of component failures or disruptions.
  5. Resilience: ONES should maintain functionality, reliability, and performance even when subjected to challenges such as high data volumes, network congestion, device failures, or other adverse conditions. A resilient telemetry collector can withstand and recover from disruptions, ensuring continuous data collection, processing, and delivery under varying circumstances.
  6. Longevity: ONES should maintain stable, consistent, and reliable performance over extended periods, handling continuous data streams without deterioration in functionality.

How to Achieve Scalability Goals with ONES: A Success Story

Figure-2: ONES Scale topology dashboard

Through meticulous design and rigorous scale testing, ONES demonstrated the ability to handle high data volumes with speed and responsiveness. Advanced algorithms and optimization techniques allowed us to achieve real-time data processing, so that insights are delivered promptly for timely decision-making. Resource efficiency became a hallmark of the collector: it manages CPU, memory, and storage to prevent bottlenecks and sustain performance.

Data Integrity Tests: Pillars of Reliability

  • Seamlessly integrated into every phase, Data Integrity Tests maintain alignment between the Source of Truth (SOT) and delivered data, fortifying ONES’ resilience
  • This commitment bolsters confidence in the accuracy and reliability of ONES’ telemetry data insights
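A data integrity check of this kind can be sketched as a comparison between a source-of-truth snapshot and what the collector delivered. The example below is a minimal illustration, not the ONES test suite: the record layout (device, interface, field) and the names `IntegrityReport` and `compare_to_source_of_truth` are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class IntegrityReport:
    """Result of comparing delivered data against the source of truth (SOT)."""
    missing: list = field(default_factory=list)     # keys in the SOT that were never delivered
    unexpected: list = field(default_factory=list)  # keys delivered but absent from the SOT
    mismatched: dict = field(default_factory=dict)  # key -> (expected, delivered)

    @property
    def ok(self) -> bool:
        return not (self.missing or self.unexpected or self.mismatched)


def compare_to_source_of_truth(sot: dict, delivered: dict) -> IntegrityReport:
    """Compare two {key: value} snapshots and report every discrepancy."""
    report = IntegrityReport()
    for key, expected in sot.items():
        if key not in delivered:
            report.missing.append(key)
        elif delivered[key] != expected:
            report.mismatched[key] = (expected, delivered[key])
    report.unexpected = [key for key in delivered if key not in sot]
    return report


# Hypothetical sample data: (device, interface, field) -> value
sot = {
    ("leaf1", "Ethernet0", "oper_status"): "up",
    ("leaf1", "Ethernet4", "oper_status"): "down",
    ("leaf2", "Ethernet0", "oper_status"): "up",
}
delivered = {
    ("leaf1", "Ethernet0", "oper_status"): "up",
    ("leaf1", "Ethernet4", "oper_status"): "up",
}

report = compare_to_source_of_truth(sot, delivered)
print("integrity ok:", report.ok)
print("missing:", report.missing)
print("mismatched:", report.mismatched)
```

Reporting missing, unexpected, and mismatched records separately makes it easier to tell dropped data apart from corrupted data when a run fails.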

Resource Optimization with Datadog Integration

  • Seamless integration with Datadog for Resource Optimization verification enables meticulous monitoring and analytics of resource utilization
  • Engineers leverage real-time insights to fine-tune configurations, ensuring optimal performance thresholds for ONES

Figure-3: Scale simulator architecture

Fault Tolerance and Resilience Validation

  • ONES deploys an innovative Traffic Control mechanism in the Linux ecosystem, validating its response to varied network disruptions
  • This approach assesses ONES’ ability to navigate and overcome adverse conditions effectively

Milestone in Telemetry Solutions

ONES’ scaling journey marks a pivotal milestone, reflecting commitment towards unparalleled telemetry solutions for evolving modern network needs.
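The fault tolerance validation described above relies on Linux Traffic Control. One common way to inject network disruptions with it is the `netem` queueing discipline of the `tc` command. The sketch below only builds the command lines; it does not run them, since applying them requires root privileges on a Linux host. The interface name and impairment values are placeholders, the helper names are hypothetical, and whether the ONES tests use `netem` specifically is an assumption.

```python
import shlex


def netem_add_command(interface, delay_ms=None, loss_pct=None):
    """Build a `tc` command that adds delay and/or packet loss on an interface."""
    if delay_ms is None and loss_pct is None:
        raise ValueError("specify at least one impairment")
    cmd = ["tc", "qdisc", "add", "dev", interface, "root", "netem"]
    if delay_ms is not None:
        cmd += ["delay", f"{delay_ms}ms"]
    if loss_pct is not None:
        cmd += ["loss", f"{loss_pct}%"]
    return cmd


def netem_clear_command(interface):
    """Build the `tc` command that removes the root qdisc, restoring defaults."""
    return ["tc", "qdisc", "del", "dev", interface, "root"]


# Placeholder values for illustration only.
add_cmd = netem_add_command("eth0", delay_ms=200, loss_pct=5)
clear_cmd = netem_clear_command("eth0")
print(shlex.join(add_cmd))
print(shlex.join(clear_cmd))
```

A test run would apply the first command, observe how the collector behaves and recovers, and then apply the second command to restore the interface.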

ONES's latency was measured under scale conditions. With the collector handling roughly 1,024 devices, the time it took to process received data and push it to the database was monitored continuously, and latency remained consistently low throughout.
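A latency measurement like this can be summarized from pairs of timestamps: when a sample was received and when it was written to the database. The sketch below is an illustration under assumed inputs; the timestamp pairs are made-up sample values, not ONES results, and `latency_summary` is a hypothetical helper name.

```python
import statistics


def latency_summary(samples):
    """Summarize processing latency.

    samples: iterable of (received_at, written_at) timestamps in seconds.
    """
    latencies = [written - received for received, written in samples]
    if len(latencies) < 2:
        raise ValueError("need at least two samples to compute percentiles")
    # quantiles(n=100) returns the 99 cut points between percentiles 1..99.
    cuts = statistics.quantiles(latencies, n=100, method="inclusive")
    return {
        "count": len(latencies),
        "mean": statistics.fmean(latencies),
        "p50": statistics.median(latencies),
        "p99": cuts[98],
        "max": max(latencies),
    }


# Made-up sample values for illustration only.
samples = [(0.0, 0.010), (1.0, 1.020), (2.0, 2.030)]
summary = latency_summary(samples)
print(summary)
```

Tracking a high percentile alongside the mean helps show whether latency stays consistent under load rather than only low on average.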

How does the gNMI simulator support scale testing for gNMI-based systems?

A gNMI simulator developed internally to support scale testing is designed to emulate realistic scenarios for gNMI-based systems at large scales. This simulator allows engineers to generate a high volume of gNMI requests and responses, enabling comprehensive testing of the system’s performance, scalability, and resilience under various conditions. It facilitates the evaluation of how the gNMI-based system handles a significant number of concurrent connections, diverse telemetry data, and complex network scenarios. This tool is instrumental in identifying potential bottlenecks, optimizing resource utilization, and ensuring the reliability of gNMI implementations in large-scale environments.
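The general shape of such a scale test, many simulated devices streaming updates concurrently to one collector, can be sketched in plain Python with `asyncio`. This is not the Aviz simulator and it does not speak gNMI: the in-memory queue stands in for the network, and the update format, the path string, and all function names are assumptions made for illustration.

```python
import asyncio


async def simulated_device(name, updates, queue):
    """Emit a fixed number of telemetry-like updates for one simulated device."""
    for seq in range(updates):
        await queue.put({
            "device": name,
            "seq": seq,
            "path": "/interfaces/interface/state/counters",  # illustrative path
        })
        await asyncio.sleep(0)  # yield control so devices interleave


async def collector(queue, expected, received):
    """Drain the queue until the expected number of updates has arrived."""
    while len(received) < expected:
        received.append(await queue.get())


async def run_simulation(device_count, updates_per_device):
    # A bounded queue applies backpressure if the collector falls behind.
    queue = asyncio.Queue(maxsize=1000)
    received = []
    devices = [
        simulated_device(f"sim-{i}", updates_per_device, queue)
        for i in range(device_count)
    ]
    await asyncio.gather(
        collector(queue, device_count * updates_per_device, received),
        *devices,
    )
    return received


received = asyncio.run(run_simulation(device_count=64, updates_per_device=5))
print(len(received), "updates from", len({u["device"] for u in received}), "devices")
```

Raising `device_count` and `updates_per_device` is the simplest way to probe where a collector of this shape starts to fall behind.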

Conclusion

Scaling ONES into a robust telemetry solution involved both challenges and successes. To explore that journey and see the strategies and ONES in action, book your One Center demo today and engage with our team.
