--- 1/draft-ietf-opsawg-ntf-08.txt 2021-10-13 08:13:18.369888693 -0700 +++ 2/draft-ietf-opsawg-ntf-09.txt 2021-10-13 08:13:18.449890686 -0700 @@ -1,57 +1,57 @@ OPSAWG H. Song Internet-Draft Futurewei Intended status: Informational F. Qin -Expires: 10 April 2022 China Mobile +Expires: 16 April 2022 China Mobile P. Martinez-Julia NICT L. Ciavaglia Nokia A. Wang China Telecom - 7 October 2021 + 13 October 2021 Network Telemetry Framework - draft-ietf-opsawg-ntf-08 + draft-ietf-opsawg-ntf-09 Abstract Network telemetry is a technology for gaining network insight and facilitating efficient and automated network management. It encompasses various techniques for remote data generation, collection, correlation, and consumption. This document describes an architectural framework for network telemetry, motivated by challenges that are encountered as part of the operation of networks and by the requirements that ensue. This document clarifies the terminologies and classifies the modules and components of a network - telemetry system from several different perspectives. The framework - and taxonomy help to set a common ground for the collection of - related work and provide guidance for related technique and standard + telemetry system from different perspectives. The framework and + taxonomy help to set a common ground for the collection of related + work and provide guidance for related technique and standard developments. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. 
It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on 10 April 2022. + This Internet-Draft will expire on 16 April 2022. Copyright Notice Copyright (c) 2021 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/ license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights @@ -131,21 +131,21 @@ techniques and standard works. To fulfill such an undertaking, we first discuss some key characteristics of network telemetry which set a clear distinction from the conventional network OAM and show that some conventional OAM technologies can be considered a subset of the network telemetry technologies. We then provide an architectural framework for network telemetry which includes four modules, each concerned with a different category of telemetry data and corresponding procedures. All the modules are internally structured in the same way, including - components that allow to configure data sources with regards to what + components that allow to configure data sources in regard to what data to generate and how to make that available to client applications, components that instrument the underlying data sources, and components that perform the actual rendering, encoding, and exporting of the generated data. We show how the network telemetry framework can benefit the current and future network operations. Based on the distinction of modules and function components, we can map the existing and emerging techniques and protocols into the framework. The framework can also simplify the tasks for designing, maintaining, and understanding a network telemetry system. 
At last, we outline the evolution stages of the network telemetry system and @@ -180,22 +180,22 @@ DPI: Deep Packet Inspection, referring to the techniques that examines packet beyond packet L3/L4 headers. gNMI: gRPC Network Management Interface, a network management protocol from OpenConfig Operator Working Group, mainly contributed by Google. See [gnmi] for details. GPB: Google Protocol Buffer, an extensible mechanism for serializing structured data. - gRPC: gRPC Remote Procedure Call, a open source high performance RPC - framework that gNMI is based on. See [grpc] for details. + gRPC: gRPC Remote Procedure Call, an open source high performance + RPC framework that gNMI is based on. See [grpc] for details. IPFIX: IP Flow Information Export Protocol, specified in [RFC7011]. IOAM: In-situ OAM, a dataplane on-path telemetry technique. JSON: An open standard file format and data interchange format that uses human-readable text to store and transmit data objects. MIB: Management Information Base, a database used for managing the entities in a network. @@ -318,58 +318,60 @@ While the list is by no means exhaustive, it is enough to highlight the requirements for data velocity, variety, volume, and veracity in networks. * Security: Network intrusion detection and prevention systems need to monitor network traffic and activities and act upon anomalies. Given increasingly sophisticated attack vector coupled with increasingly severe consequences of security breaches, new tools and techniques need to be developed, relying on wider and deeper visibility into networks. The ultimate goal is to achieve the - ideal security with no or minimal human intervention. + ideal security with no, or only minimal, human intervention. * Policy and Intent Compliance: Network policies are the rules that constrain the services for network access, provide service differentiation, or enforce specific treatment on the traffic. 
For example, a service function chain is a policy that requires the selected flows to pass through a set of ordered network functions. Intent, as defined in [I-D.irtf-nmrg-ibn-concepts-definitions], is a set of operational goal that a network should meet and outcomes that a network is supposed to deliver, defined in a declarative manner without specifying how to achieve or implement them. An intent requires a complex translation and mapping process before being applied on - networks. While a policy or an intent is enforced, the compliance - needs to be verified and monitored continuously relying on - visibility that is provided through network telemetry data, any - violation needs to be reported immediately, and updates need to be - applied to ensure the intent remains in force. + networks. While a policy or intent is enforced, the compliance + needs to be verified and monitored continuously by relying on + visibility that is provided through network telemetry data. Any + violation must be notified immediately, potentially resulting in + updates to how the policy or intent is applied in the network to + ensure that it remains in force, or otherwise alerting the network + administrator to the policy or intent violation. * SLA Compliance: A Service-Level Agreement (SLA) defines the level of service a user expects from a network operator, which include the metrics for the service measurement and remedy/penalty procedures when the service level misses the agreement. Users need to check if they get the service as promised and network operators need to evaluate how they can deliver the services that can meet the SLA based on realtime network telemetry data, including data from network measurements. * Root Cause Analysis: Any network failure can be the effect of a sequence of chained events. Troubleshooting and recovery require quick identification of the root cause of any observable issues. 
However, the root cause is not always straightforward to identify, especially when the failure is sporadic and the number of event messages, both related and unrelated to the same cause, is overwhelming. While machine learning technologies can be used for root cause analysis, it up to the network to sense and provide the - relevant diagnostic data which are either actively fed into or - passively retrieved by machine learning applications. + relevant diagnostic data which are either actively fed into, or + passively retrieved by, machine learning applications. * Network Optimization: This covers all short-term and long-term network optimization techniques, including load balancing, Traffic Engineering (TE), and network planning. Network operators are motivated to optimize their network utilization and differentiate services for better Return On Investment (ROI) or lower Capital Expenditures (CAPEX). The first step is to know the real-time network conditions before applying policies for traffic manipulation. In some cases, micro-bursts need to be detected in a very short time-frame so that fine-grained traffic control can @@ -415,22 +417,22 @@ * Many application scenarios need to correlate network-wide data from multiple sources (i.e., from distributed network devices, different components of a network device, or different network planes). A piecemeal solution is often lacking the capability to consolidate the data from multiple sources. The composition of a complete solution, as partly proposed by Autonomic Resource Control Architecture(ARCA) [I-D.pedro-nmrg-anticipated-adaptation], will be empowered and guided by a comprehensive framework. - * Some of the conventional OAM techniques (e.g., CLI and Syslog) - lack a formal data model. The unstructured data hinder the tool + * Some conventional OAM techniques (e.g., CLI and Syslog) lack a + formal data model. The unstructured data hinder the tool automation and application extensibility. 
Standardized data models are essential to support the programmable networks. * Although some conventional OAM techniques support data push (e.g., SNMP Trap [RFC2981][RFC3877], Syslog, and sFlow), the pushed data are limited to only predefined management plane warnings (e.g., SNMP Trap) or sampled user packets (e.g., sFlow). Network operators require the data with arbitrary source, granularity, and precision which are beyond the capability of the existing techniques. @@ -585,21 +587,21 @@ make efficient use of network resources and reduce the impact of processing related to network telemetry on network performance. For example, routine network monitoring should cover the entire network with a low data sampling rate. Only when issues arise or critical trends emerge should telemetry data source be modified and telemetry data rates boosted as needed. * Efficient data fusion is critical for applications to reduce the overall quantity of data and improve the accuracy of analysis. - A telemetry framework collects together all of the telemetry-related + A telemetry framework collects together all the telemetry-related works from different sources and working groups within IETF. This makes it possible to assemble a comprehensive network telemetry system and to avoid repetitious or redundant work. The framework should cover the concepts and components from the standardization perspective. This document describes the modules which make up a network telemetry framework and decomposes the telemetry system into a set of distinct components that existing and future work can easily map to. 4. Network Telemetry Framework @@ -685,65 +687,64 @@ Because the locations that can export data have different capabilities, different choices of data model, encoding, and transport method are made to balance the performance and cost. 
  For example, the forwarding chip has high throughput but limited capacity for processing complex data and maintaining states, while the main control CPU is capable of complex data and state processing, but has limited bandwidth for high throughput data. As a result, the suitable telemetry protocol for each module can be different. Some representative techniques are shown in the corresponding table blocks to highlight the technical diversity of these modules. Note that the
- selected techniques just reflect the de-facto state of the art and
+ selected techniques just reflect the de facto state of the art and
  are not exhaustive. The key point is that one cannot expect to use a universal protocol to cover all the network telemetry requirements.

  +---------+--------------+--------------+---------------+-----------+
- | Module  | Control      | Management   | Forwarding    | External  |
+ | Module  | Management   | Control      | Forwarding    | External  |
  |         | Plane        | Plane        | Plane         | Data      |
  +---------+--------------+--------------+---------------+-----------+
- |Object   | control      | config. &    | flow & packet | terminal, |
- |         | protocol &   | operation    | QoS, traffic  | social &  |
- |         | signaling,   | state        | stat., buffer | environ-  |
- |         | RIB, ACL     |              | & queue stat.,| mental    |
+ |Object   | config. &    | control      | flow & packet | terminal, |
+ |         | operation    | protocol &   | QoS, traffic  | social &  |
+ |         | state        | signaling,   | stat., buffer | environ-  |
+ |         |              | RIB          | & queue stat.,| mental    |
  |         |              |              | ACL, FIB      |           |
  +---------+--------------+--------------+---------------+-----------+
  |Export   | main control | main control | fwding chip   | various   |
- |Location | CPU,         | CPU          | or linecard   |           |
- |         | linecard CPU |              | CPU; main     |           |
- |         | or fwding    |              | control CPU   |           |
- |         | chip         |              | unlikely      |           |
+ |Location | CPU          | CPU,         | or linecard   |           |
+ |         |              | linecard CPU | CPU; main     |           |
+ |         |              | or forwarding| control CPU   |           |
+ |         |              | chip         | unlikely      |           |
  +---------+--------------+--------------+---------------+-----------+
- |Data     | YANG,        | YANG, MIB,   | template,     | YANG,     |
- |Model    | custom       | syslog,      | YANG,         | custom    |
+ |Data     | YANG, MIB,   | YANG,        | template,     | YANG,     |
+ |Model    | syslog       | custom       | YANG,         | custom    |
  |         |              |              | custom        |           |
  +---------+--------------+--------------+---------------+-----------+
  |Data     | GPB, JSON,   | GPB, JSON,   | plain         | GPB, JSON |
- |Encoding | XML, plain   | XML          |               | XML, plain|
+ |Encoding | XML          | XML, plain   |               | XML, plain|
  +---------+--------------+--------------+---------------+-----------+
  |Protocol | gRPC,NETCONF,| gRPC,NETCONF,| IPFIX, mirror,| gRPC      |
- |         | IPFIX,mirror |              | gRPC, NETFLOW |           |
+ |         |              | IPFIX, mirror| gRPC, NETFLOW |           |
  +---------+--------------+--------------+---------------+-----------+
- |Transport| HTTP, TCP,   | HTTP, TCP    | UDP           | HTTP,TCP  |
- |         | UDP          |              |               | UDP       |
+ |Transport| HTTP, TCP    | HTTP, TCP,   | UDP           | HTTP,TCP  |
+ |         |              | UDP          |               | UDP       |
  +---------+--------------+--------------+---------------+-----------+

  Figure 2: Comparison of the Data Object Modules

  Note that the interaction with the applications that consume network telemetry data can be indirect. Some in-device data transfer is possible. For example, in the management plane telemetry, the management plane will need to acquire data from the data plane.
Some - of the operational states can only be derived from data plane data - sources such as the interface status and statistics. As another - example, obtaining control plane telemetry data may require the - ability to access the Forwarding Information Base (FIB) of the data - plane. + operational states can only be derived from data plane data sources + such as the interface status and statistics. As another example, + obtaining control plane telemetry data may require the ability to + access the Forwarding Information Base (FIB) of the data plane. On the other hand, an application may involve more than one plane and interact with multiple planes simultaneously. For example, an SLA compliance application may require both the data plane telemetry and the control plane telemetry. The requirements and challenges for each module are summarized as follows (note that the requirements may pertain across all telemetry modules; however, we emphasize those that are most pronounced for a particular plane). @@ -752,34 +753,34 @@ The management plane of network elements interacts with the Network Management System (NMS), and provides information such as performance data, network logging data, network warning and defects data, and network statistics and state data. The management plane includes many protocols, including some that are considered "legacy", such as SNMP and syslog. Regardless the protocol, management plane telemetry must address the following requirements: * Convenient Data Subscription: An application should have the - freedom to choose the data export means such as the data types (as - described in Figure 4) and the export means and frequency (e.g., - on-change or periodic subscription). + freedom to choose which data is exported (see section 4.3) and the + means and frequency of how that data is exported (e.g., on-change + or periodic subscription). * Structured Data: For automatic network operation, machines will replace human for network data comprehension. 
Data modeling languages, such as YANG, can efficiently describe structured data and normalize data encoding and transformation. * High Speed Data Transport: In order to keep up with the velocity of information, a server needs to be able to send large amounts of data at high frequency. Compact encoding formats or data - compression schemes are needed to compress the data and improve - the data transport efficiency. The subscription mode, by + compression schemes are needed to reduce the quantity of data and + improve the data transport efficiency. The subscription mode, by replacing the query mode, reduces the interactions between clients and servers and helps to improve the server's efficiency. 4.1.2. Control Plane Telemetry The control plane telemetry refers to the health condition monitoring of different network control protocols at all layers of the protocol stack. Keeping track of the operational status of these protocols is beneficial for detecting, localizing, and even predicting various network issues, as well as network optimization, in real-time and @@ -802,38 +803,39 @@ common issue behind these methods is that they only measure the KPIs instead of reflecting the actual running status of these protocols, making them less effective or efficient for control plane troubleshooting and network optimization. * An example of the control plane telemetry is the BGP monitoring protocol (BMP), it is currently used for monitoring the BGP routes and enables rich applications, such as BGP peer analysis, AS analysis, prefix analysis, and security analysis. However, the monitoring of other layers, protocols and the cross-layer, cross- - protocol KPI correlations are still in their infancy (e.g., the - IGP monitoring is not as exensive as BMP), which require further + protocol KPI correlations are still in their infancy (e.g., IGP + monitoring is not as extensive as BMP), which require further research. 4.1.3. 
Forwarding Plane Telemetry An effective forwarding plane telemetry system relies on the data that the network device can expose. The quality, quantity, and timeliness of data must meet some stringent requirements. This raises some challenges to the network data plane devices where the - first hand data originates. + first-hand data originates. * A data plane device's main function is user traffic processing and forwarding. While supporting network visibility is important, the telemetry is just an auxiliary function, and it should strive to not impede normal traffic processing and forwarding (i.e., the - forwarding behavior should not be altered and the tradeoff between - forwarding and telemtry should be well balanced). + forwarding behavior should not be altered and the trade-off + between forwarding performance and telemetry should be well- + balanced). * Network operation applications require end-to-end visibility across various sources, which can result in a huge volume of data. However, the sheer quantity of data must not exhaust the network bandwidth, regardless of the data delivery approach (i.e., whether through in-band or out-of-band channels). * The data plane devices must provide timely data with the minimum possible delay. Long processing, transport, storage, and analysis delay can impact the effectiveness of the control loop and even @@ -873,23 +875,23 @@ [I-D.ietf-ippm-ioam-data], Alternate-Marking (AM) [RFC8321], and Multipoint Alternate Marking [I-D.ietf-ippm-multipoint-alt-mark], provide a well-balanced and more flexible approach. However, these methods are also more complex to implement. * In-Band and Out-of-Band: Telemetry data carried in user packets before being exported to a data collector is considered in-band (e.g., in-situ OAM [I-D.ietf-ippm-ioam-data]). Telemetry data that is directly exported to a data collector without modifying user packets is considered out-of-band (e.g., the postcard-based - approach described in Appendix). 
It is also possible to have - hybrid methods, where only the telemetry instruction or partial - data is carried by user packets (e.g., AM [RFC8321]). + approach described in Appendix A.3.5). It is also possible to + have hybrid methods, where only the telemetry instruction or + partial data is carried by user packets (e.g., AM [RFC8321]). * End-to-End and In-Network: End-to-End methods start from, and end at, the network end hosts (e.g., Ping). In-Network methods work in networks and are transparent to end hosts. However, if needed, In-Network methods can be easily extended into end hosts. * Data Subject: Depending on the telemetry objective, the methods can be flow-based (e.g., in-situ OAM [I-D.ietf-ippm-ioam-data]), path-based (e.g., Traceroute), and node-based (e.g., IPFIX [RFC7011]). The various data objects can be packet, flow record, @@ -904,22 +906,22 @@ [I-D.pedro-nmrg-anticipated-adaptation], provides a strategic and functional advantage to management operations. As with other sources of telemetry information, the data and events must meet strict requirements, especially in terms of timeliness, which is essential to properly incorporate external event information into network management applications. The specific challenges are described as follows: * The role of the external event detector can be played by multiple - elements, including hardware (e.g. physical sensors, such as - seismometers) and software (e.g. Big Data sources that analyze + elements, including hardware (e.g., physical sensors, such as + seismometers) and software (e.g., Big Data sources that analyze streams of information, such as Twitter messages). Thus, the transmitted data must support different shapes but, at the same time, follow a common but extensible schema. * Since the main function of the external event detectors is to perform the notifications, their timeliness is assumed. 
However, once messages have been dispatched, they must be quickly collected and inserted into the control plane with variable priority, which is higher for important sources and events and lower for secondary ones. @@ -929,21 +931,21 @@ be easily mapped to current data models, such as in terms of YANG. Organizing both internal and external telemetry information together will be key for the general exploitation of the management possibilities of current and future network systems, as reflected in the incorporation of cognitive capabilities to new hardware and software (virtual) elements. 4.2. Second Level Function Components - The telemetry module as each plane can be further partitioned into + The telemetry module at each plane can be further partitioned into five distinct conceptual components: * Data Query, Analysis, and Storage: This component works at the application layer. It is normally a part of the network management system at the receiver side. On the one hand, it is responsible for issuing data requirements. The data of interest can be modeled data through configuration or custom data through programming. The data requirements can be queries for one-shot data or subscriptions for events or streaming data. On the other hand, it receives, stores, and processes the returned data from @@ -964,22 +966,22 @@ access control. The data encoding and the transport protocol may vary due to the data export location. * Data Generation and Processing: The requested data needs to be captured, filtered, processed, and formatted in network devices from raw data sources. This may involve in-network computing and processing on either the fast path or the slow path in network devices. * Data Object and Source: This component determines the monitoring - objects and original data sources provisioned in device. A data - source usually just provides raw data which needs further + objects and original data sources provisioned in the device. 
A + data source usually just provides raw data which needs further processing. Each data source can be considered a probe. Some data sources can be dynamically installed, while others will be more static. +----------------------------------------+ +----------------------------------------+ | | | | | Data Query, Analysis, & Storage | | | | + +-------+++ -----------------------------+ @@ -1032,23 +1034,23 @@ * Simple Data: The data that are steadily available from some datastore or static probes in network devices. * Derived Data: The data need to be synthesized or processed in network from raw data from one or more network devices. The data processing function can be statically or dynamically loaded into network devices. * Event-triggered Data: The data are conditionally acquired based on - the occurrence of some events. For example, a network interface - changing its operational state from up to down can be a trigger - event. Such data can be actively pushed through subscription or + the occurrence of some events. An example of event-triggered data + could be an interface changing operational state between up and + down. Such data can be actively pushed through subscription or passively polled through query. There are many ways to model events, including using Finite State Machine (FSM) or Event Condition Action (ECA) [I-D.wwx-netmod-event-yang]. * Streaming Data: The data are continuously generated. It can be time series or the dump of databases. For example, an interface packet counter is exported every second. The streaming data reflect realtime network states and metrics and require large bandwidth and processing power. The streaming data are always actively pushed to the subscribers. 
@@ -1099,38 +1101,38 @@
  +-------------+-----------------+---------------+--------------+
  | data config.| gNMI, NETCONF,  | gNMI, NETCONF,| NETCONF,     |
  | & subscribe | SNMP, YANG-Push | YANG-Push     | YANG-Push    |
  +-------------+-----------------+---------------+--------------+
  | data gen. & | MIB,            | YANG          | IOAM, PSAMP  |
  | process     | YANG            |               | PBT, AM,     |
  +-------------+-----------------+---------------+--------------+
  | data encode.| gRPC, HTTP, TCP | BMP, TCP      | IPFIX, UDP   |
  | & export    |                 |               |              |
  +-------------+-----------------+---------------+--------------+

- Figure 5: Existing Work Mapping II
+ Figure 5: Existing Work Mapping

  5. Evolution of Network Telemetry Applications

  Network telemetry is an evolving technical area. As the network moves towards the automated operation, network telemetry applications undergo several stages of evolution which add new layer of requirements to the underlying network telemetry techniques. Each stage is built upon the techniques adopted by the previous stages plus some new requirements.

  Stage 0 - Static Telemetry: The telemetry data source and type are determined at design time. The network operator can only configure how to use it with limited flexibility.

  Stage 1 - Dynamic Telemetry: The custom telemetry data can be dynamically programmed or configured at runtime without
- interrupting the network operation, allowing a tradeoff among
+ interrupting the network operation, allowing a trade-off among
  resource, performance, flexibility, and coverage.

  Stage 2 - Interactive Telemetry: The network operator can continuously customize and fine tune the telemetry data in real time to reflect the network operation's visibility requirements. Compared with Stage 1, the changes are frequent based on the real-time feedback. At this stage, some tasks can be automated, but human operators still need to sit in the middle to make decisions.
Stage 3 - Closed-loop Telemetry: The telemetry is free from the @@ -1144,29 +1146,29 @@ future autonomic networks may need a comprehensive operation management system which works at stage 2 and stage 3 to cover all the network operation tasks. A well-defined network telemetry framework is the first step towards this direction. 6. Security Considerations The complexity of network telemetry raises significant security implications. For example, telemetry data can be manipulated to exhaust various network resources at each plane as well as the data - consumer; falsified or tampered data can mislead the decision making + consumer; falsified or tampered data can mislead the decision-making and paralyze networks; wrong configuration and programming for telemetry is equally harmful. The telemetry data is highly sensitive, which exposes a lot of information about the network and its configuration. Some of that information can make designing attacks against the network much easier (e.g., exact details of what software and patches have been installed), and allows an attacker to determine whether a device may be subject to unprotected security - vulnerability. + vulnerabilities. Given that this document has proposed a framework for network telemetry and the telemetry mechanisms discussed are more extensive (in both message frequency and traffic amount) than the conventional network OAM concepts, we must also reflect that various new security considerations may also arise. A number of techniques already exist for securing the forwarding plane, the control plane, and the management plane in a network, but it is important to consider if any new threat vectors are now being enabled via the use of network telemetry procedures and mechanisms. @@ -1188,32 +1190,32 @@ identify malicious attacks using telemetry interfaces. * Authentication and signing of telemetry data to make data more trustworthy. 
* Segregating the telemetry data traffic from the data traffic carried over the network (e.g., historically management access and management data may be carried via an independent management network). - Some of the security considerations highlighted above may be - minimized or negated with policy management of network telemetry. In - a network telemetry deployment it would be advantageous to separate - telemetry capabilities into different classes of policies, i.e., Role - Based Access Control and Event-Condition-Action policies. Also, - potential conflicts between network telemetry mechanisms must be - detected accurately and resolved quickly to avoid unnecessary network + Some security considerations highlighted above may be minimized or + negated with policy management of network telemetry. In a network + telemetry deployment it would be advantageous to separate telemetry + capabilities into different classes of policies, i.e., Role Based + Access Control and Event-Condition-Action policies. Also, potential + conflicts between network telemetry mechanisms must be detected + accurately and resolved quickly to avoid unnecessary network telemetry traffic propagation escalating into an unintended or intended denial of service attack. Further study of the security issues will be required, and it is - expected that the secuirty mechanisms and protocols are developed and + expected that the security mechanisms and protocols are developed and deployed along with a network telemetry system. In addition to security, privacy is also an important issue. Network telemetry means to improve the network operation which can ultimately benefit end user's quality of experience. The network operators must be held accountable and strive for a balance between managing the network and maintaining the user privacy of that network. 7. IANA Considerations @@ -1264,23 +1266,23 @@ Evens, T., Bayraktar, S., Bhardwaj, M., and P. 
Lucente, "Support for Local RIB in BGP Monitoring Protocol (BMP)", Work in Progress, Internet-Draft, draft-ietf-grow-bmp- local-rib-13, 31 August 2021, . [I-D.ietf-ippm-ioam-data] Brockners, F., Bhandari, S., and T. Mizrahi, "Data Fields for In-situ OAM", Work in Progress, Internet-Draft, draft- - ietf-ippm-ioam-data-14, 24 June 2021, + ietf-ippm-ioam-data-15, 3 October 2021, . + data-15.txt>. [I-D.ietf-ippm-multipoint-alt-mark] Fioccola, G., Cociglio, M., Sapio, A., and R. Sisto, "Multipoint Alternate-Marking Method for Passive and Hybrid Performance Monitoring", Work in Progress, Internet-Draft, draft-ietf-ippm-multipoint-alt-mark-09, 23 March 2020, . [I-D.ietf-netconf-distributed-notif] @@ -1486,21 +1488,21 @@ channel [I-D.ietf-netconf-udp-notif] provides enhanced efficiency for the NETCONF based telemetry. A.1.2. gRPC Network Management Interface gRPC Network Management Interface (gNMI) [I-D.openconfig-rtgwg-gnmi-spec] is a network management protocol based on the gRPC [I-D.kumar-rtgwg-grpc-protocol] RPC (Remote Procedure Call) framework. With a single gRPC service definition, both configuration and telemetry can be covered. gRPC is an HTTP/2 - [RFC7540] based open source micro service communication framework. + [RFC7540] based open-source micro-service communication framework. It provides a number of capabilities which are well-suited for network telemetry, including: * Full-duplex streaming transport model combined with a binary encoding mechanism provides good telemetry efficiency. * gRPC provides higher-level features consistency across platforms that common HTTP/2 libraries typically do not. This characteristic is especially valuable for the fact that telemetry data collectors normally reside on a large variety of platforms. @@ -1513,21 +1515,21 @@ BGP Monitoring Protocol (BMP) [RFC7854] is used to monitor BGP sessions and is intended to provide a convenient interface for obtaining route views. 
  The BGP routing information is collected from the
  monitored device(s) to the BMP monitoring station by setting up the
  BMP TCP session. The BGP peers are monitored by the BMP Peer Up and
  Peer Down Notifications. The BGP routes (including Adjacency_RIB_In
  [RFC7854], Adjacency_RIB_out [I-D.ietf-grow-bmp-adj-rib-out], and
  Local_Rib
- [I-D.ietf-grow-bmp-local-rib] are encapsulated in the BMP Route
+ [I-D.ietf-grow-bmp-local-rib]) are encapsulated in the BMP Route
  Monitoring Message and the BMP Route Mirroring Message, providing
  both an initial table dump and real-time route updates. In addition,
  BGP statistics are reported through the BMP Stats Report Message,
  which could be either timer triggered or event-driven. Future BMP
  extensions could further enrich BGP monitoring applications.

  A.3. Data Plane Telemetry

  A.3.1. The Alternate Marking (AM) technology
@@ -1543,21 +1545,21 @@
  the packet loss calculation. The same idea can be applied to delay
  measurement by selecting ad hoc packets with a marking bit dedicated
  for delay measurements.

  Alternate Marking method needs two counters each marking period for
  each flow under monitor. For instance, by considering n measurement
  points and m monitored flows, the order of magnitude of the packet
  counters for each time interval is n*m*2 (1 per color).

  Since networks offer rich sets of network performance measurement
- data (e.g packet counters), traditional approaches run into
+ data (e.g., packet counters), traditional approaches run into
  limitations. The bottleneck is the generation and export of the data
  and the amount of data that can be reasonably collected from the
  network. In addition, management tasks related to determining and
  configuring which data to generate lead to significant deployment
  challenges.

  The Multipoint Alternate Marking approach, described in
  [I-D.ietf-ippm-multipoint-alt-mark], aims to resolve this issue and
  make the performance monitoring more flexible in case a detailed
  analysis is not needed.
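
Editorial illustration (not part of the draft or of the diff): the counter arithmetic quoted in A.3.1 above can be sketched in Python as follows. The function names and the example values are hypothetical. The loss computation is an assumption based on the usual Alternate Marking comparison of same-color counters taken at two measurement points for one flow and one marking period; the draft text shown here only gives the n*m*2 estimate.

```python
def counters_per_interval(n_points: int, m_flows: int, colors: int = 2) -> int:
    """Order of magnitude of packet counters per time interval: n * m * 2.

    One counter per color, per monitored flow, per measurement point.
    """
    if n_points < 0 or m_flows < 0 or colors < 1:
        raise ValueError("counts must be non-negative and colors positive")
    return n_points * m_flows * colors


def block_loss(upstream_count: int, downstream_count: int) -> int:
    """Packets lost for one color block between two measurement points.

    Assumption: both counters refer to the same flow, the same color,
    and the same marking period.
    """
    return upstream_count - downstream_count


if __name__ == "__main__":
    # Hypothetical deployment: 10 measurement points, 100 monitored flows.
    print(counters_per_interval(10, 100))
    # Hypothetical block: 1000 packets counted upstream, 997 downstream.
    print(block_loss(1000, 997))
```

The product grows linearly in both n and m, which is the scaling concern the quoted paragraph raises about generating and exporting per-flow counters.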
@@ -1656,21 +1658,21 @@
  management and match it to the connectors and/or interfaces required
  to connect them. Categories of external event sources that may be of
  interest to network management include::

  * Smart objects and sensors. With the consolidation of the Internet
  of Things~(IoT) any network system will have many smart objects
  attached to its physical surroundings and logical operation
  environments. Most of these objects will be essentially based on
- sensors of many kinds (e.g. temperature, humidity, presence) and
+ sensors of many kinds (e.g., temperature, humidity, presence) and
  the information they provide can be very useful for the management
  of the network, even when they are not specifically deployed for
  such purpose. Elements of this source type will usually provide a
  specific protocol for interaction, especially one of those
  protocols related to IoT, such as the Constrained Application
  Protocol (CoAP).

  * Online news reporters. Several online news services have the
  ability to provide enormous quantity of information about
  different events occurring in the world. Some of those events can
@@ -1685,29 +1687,29 @@
  be part of both the ontology and information model of the telemetry
  framework.

  * Global event analyzers. The advance of Big Data analyzers
  provides a huge amount of information and, more interestingly, the
  identification of events detected by analyzing many data streams
  from different origins. In contrast with the other types of
  sources, which are focused on specific events, the detectors of
  this source type will detect generic events. For example, a sports
  event takes place and some unexpected movement makes it
- highly interesting and many people connects to sites that are
- reporting on the event. The underlying networks supporting the
- services that cover the event can be affected by such situation so
- their management solutions should be aware of it.
In contrast
- with the other source types, a new information model, format, and
- reporting protocol is required to integrate the detectors of this
- type with the management solution.
+ fascinating and many people connect to sites that are reporting on
+ the event. The underlying networks supporting the services that
+ cover the event can be affected by such situation so their
+ management solutions should be aware of it. In contrast with the
+ other source types, a new information model, format, and reporting
+ protocol is required to integrate the detectors of this type with
+ the management solution.

- Additional types of detector types can be added to the system but
+ Additional types of detector types can be added to the system, but
  they will be generally the result of composing the properties offered
  by these main classes.

  A.4.2. Connectors and Interfaces

  For allowing external event detectors to be properly integrated with
  other management solutions, both elements must expose interfaces and
  protocols that are subject to their particular objective. Since
  external event detectors will be focused on providing their
  information to their main consumers, which generally will not be