Network Telemetry: Pushing Boundaries

Network Telemetry: Pushing Boundaries Ramki Krishnan, Distinguished Engineer, SP & NFV OSM Plenary, Dell EMC Campus Santa Clara, CA

Page 1: Network Telemetry: Pushing Boundaries

Network Telemetry:

Pushing Boundaries

Ramki Krishnan, Distinguished Engineer, SP & NFV

OSM Plenary, Dell EMC Campus

Santa Clara, CA

Page 2: Network Telemetry: Pushing Boundaries

Network Telemetry: Where are we today?

• Primary focus on end-to-end aspects

• Relies on injected packets

• Virtualization challenges are poorly addressed

• Popular standards – Y.1731 (ITU-T), TWAMP (IETF)

• Not ready for @scale hyper-converged SP Infrastructure

Page 3: Network Telemetry: Pushing Boundaries

User Facing Converged Infrastructure Evolution

Key Takeaway: Low-latency is key for HFT, VR gaming, Connected Cars, AR

Source: https://www.ciscoknowledgenetwork.com/files/584_04-26-16-CKN_Webinar_v2.pdf?PRIORITY_CODE=194542_20

Page 4: Network Telemetry: Pushing Boundaries

User Facing Converged Infrastructure Evolution (2)

Goal: Low Latency Edge cloud app Service Assurance

Gaps: Real-time Per hop & per network function visibility, Hotspot identification

Source: https://www.ciscoknowledgenetwork.com/files/584_04-26-16-CKN_Webinar_v2.pdf?PRIORITY_CODE=194542_20

Note: User Plane – packet data plane; SGi-LAN – service chaining of virtualized network functions such as a video transcoder

Page 5: Network Telemetry: Pushing Boundaries

Real-time Network Monitoring – Emerging Solutions

• Initial focus on data plane latency monitoring – high value and not difficult to implement

• Appends timestamp information at each hop in Layer 2/3/4 header

• Benefits

• Amenable to HW implementation in programmable ASICs

• Can compute per-hop latency

• Issues

• Packet size varies across intermediate hops, leading to non-deterministic performance

• Key infrastructure requirement

• Real-time network timing synchronization – IEEE 1588 PTP
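
The per-hop latency computation mentioned above can be sketched as follows. This is a minimal illustration, not the scheme from any particular ASIC or specification: it assumes every hop's clock is already synchronized (for example via IEEE 1588 PTP), that each hop has appended one nanosecond timestamp, and that the hop names (`nic-a`, `tor-1`, and so on) are hypothetical.

```python
from typing import List, Tuple

def per_hop_latency_ns(stamps: List[Tuple[str, int]]) -> List[Tuple[str, str, int]]:
    """Return (from_hop, to_hop, latency_ns) for each consecutive pair of hops.

    `stamps` is the ordered list of (hop_name, timestamp_ns) entries that the
    hops appended to the packet. Clocks are assumed to be synchronized.
    """
    segments = []
    for (prev_hop, prev_ts), (hop, ts) in zip(stamps, stamps[1:]):
        segments.append((prev_hop, hop, ts - prev_ts))
    return segments

# Hypothetical timestamps collected from one packet (nanoseconds).
stamps = [
    ("nic-a", 1_000_000),
    ("tor-1", 1_004_200),
    ("leaf-1", 1_009_900),
    ("nic-b", 1_012_000),
]

segments = per_hop_latency_ns(stamps)
end_to_end = stamps[-1][1] - stamps[0][1]

for src, dst, latency in segments:
    print(f"{src} -> {dst}: {latency} ns")
print(f"end-to-end: {end_to_end} ns")
```

Without synchronized clocks the differences between timestamps taken at different nodes are meaningless, which is why timing synchronization is listed as the key infrastructure requirement.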

Page 6: Network Telemetry: Pushing Boundaries

Real-time Network Monitoring – New Directions

• Real-time end-to-end data collection for selective flows (e.g. live video) across NICs and routers - @scale with no intermediate node monitoring

• Monitor a programmable number of nodes with a pre-defined header size – deterministic performance

• Hierarchical monitoring framework – service chain, overlay, underlay etc.

[Figure: topology diagram. Labels: Rack Servers (R630, R730 etc.); Virtual Network Function (A); Virtual Network Function (B); ToR S4048; Leaf S6000; Spine Z9500; 40G links; L3 Network Fabric OS9/10]

- Pre-construct programmable timestamp header for all hops

- Use timestamp append model in NICs, switches/routers

- Mirror packet with entire timestamp information in the last hop

- Examine latency deviation over baseline for anomalies
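
The pre-constructed header idea in the steps above can be sketched as follows. The layout here is purely an assumption for illustration (one byte for the slot count, one byte for the next free slot, then fixed 8-byte timestamp slots); it is not taken from the P4 INT specification or any Dell EMC product. The point it shows is that the header is allocated once at the first hop, so the packet size stays constant as later hops fill in their slots.

```python
import struct

MAX_HOPS = 4  # assumed programmable number of monitored nodes

# Hypothetical layout, network byte order, no padding:
#   1 byte  : number of timestamp slots
#   1 byte  : index of the next free slot
#   N x 8 B : timestamp slots (nanoseconds), zero until filled
HEADER_FMT = "!BB" + "Q" * MAX_HOPS
HEADER_SIZE = struct.calcsize(HEADER_FMT)

def new_header() -> bytes:
    """Pre-construct the full-size header with all slots empty."""
    return struct.pack(HEADER_FMT, MAX_HOPS, 0, *([0] * MAX_HOPS))

def stamp(header: bytes, ts_ns: int) -> bytes:
    """Write this hop's timestamp into the next free slot; size is unchanged."""
    fields = list(struct.unpack(HEADER_FMT, header))
    max_hops, next_slot = fields[0], fields[1]
    if next_slot >= max_hops:
        return header  # all slots used; later hops leave the header as-is
    fields[2 + next_slot] = ts_ns
    fields[1] = next_slot + 1
    return struct.pack(HEADER_FMT, *fields)

def read_timestamps(header: bytes) -> list:
    """Return only the slots that were filled, in hop order."""
    fields = struct.unpack(HEADER_FMT, header)
    return list(fields[2:2 + fields[1]])

# Three hops stamp the same header; its length never changes.
hdr = new_header()
for ts in (1_000_000, 1_004_200, 1_009_900):
    hdr = stamp(hdr, ts)

print(len(hdr) == HEADER_SIZE, read_timestamps(hdr))
```

In this sketch the last hop would mirror the packet, and a collector would call something like `read_timestamps` on the mirrored copy before comparing the resulting latencies against the baseline.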

Page 7: Network Telemetry: Pushing Boundaries

Real-time Network Monitoring – New Directions (2)

• Latency typically follows a long-tail distribution

• Average latency not a useful metric for anomaly baseline

• Start with an approximate value of anomaly baseline

• Refine the baseline using simple predictive analytics techniques, e.g. Holt-Winters time series forecasting algorithm

• Advanced predictive analytics techniques, e.g. machine learning – research area

• Automatic dependency/clustering detection between latency and other events such as packet drops, queue depth, poor video QoE, noisy neighbors etc.

Source: https://www.ietf.org/proceedings/96/slides/slides-96-bmwg-8.pdf
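
The baseline refinement described on this slide can be sketched with Holt's double exponential smoothing (level plus trend), which is the non-seasonal core of Holt-Winters; a full Holt-Winters model would add a seasonal term. The smoothing factors `alpha` and `beta`, the `tolerance` multiplier, and the sample values are all assumptions chosen for illustration, not values from the presentation.

```python
def holt_anomalies(samples, alpha=0.5, beta=0.5, tolerance=1.5):
    """Flag samples that exceed the one-step-ahead forecast by `tolerance`x.

    Uses Holt's linear (level + trend) smoothing as the running baseline.
    Returns a list of (index, sample, forecast) for flagged samples.
    """
    if len(samples) < 2:
        return []
    # Approximate starting baseline: first sample and first difference.
    level = float(samples[0])
    trend = float(samples[1] - samples[0])
    flagged = []
    for i, x in enumerate(samples[1:], start=1):
        forecast = level + trend  # prediction made before seeing x
        if x > tolerance * forecast:
            flagged.append((i, x, forecast))
        # Refine the baseline with the new observation.
        new_level = alpha * x + (1 - alpha) * forecast
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return flagged

# Hypothetical per-hop latency samples in microseconds, with one spike.
latencies_us = [100, 102, 101, 103, 102, 250, 103]

for index, value, forecast in holt_anomalies(latencies_us):
    print(f"sample {index}: {value} us vs forecast {forecast:.1f} us")
```

One simplification to be aware of: the spike itself is fed back into the baseline, so the forecast stays inflated for a few samples afterwards. A production detector would likely exclude or down-weight flagged samples when updating the baseline.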

Page 8: Network Telemetry: Pushing Boundaries

Real-time Network Monitoring – New Directions (3)

• Beyond network latency …

• Other key aspects to monitor are queue depth, ingress/egress port bandwidth etc.

• These are not as straightforward to implement in the packet data path as latency

• Last but not least …

• Orchestration is a key piece of the overall solution

• Goal is to align with OSM and other orchestrators

• Dell EMC Industry Leadership …

• P4 In-band Network Telemetry: http://p4.org/wp-content/uploads/fixed/INT/INT-current-spec.pdf

• IETF NFVRG Leadership (https://irtf.org/nfvrg): Real-time properties work item

Page 9: Network Telemetry: Pushing Boundaries

Acknowledgements

• Co-conspirator

• Anoop Ghanwani, Dell EMC

Page 10: Network Telemetry: Pushing Boundaries