Network Telemetry: Pushing Boundaries


  • Date posted: 13-Apr-2017

  • Category: Technology


Transcript of Network Telemetry: Pushing Boundaries

  • Network Telemetry:

    Pushing Boundaries

    Ramki Krishnan, Distinguished Engineer, SP & NFV

    OSM Plenary, Dell EMC Campus

    Santa Clara, CA

  • 2

    Network Telemetry: Where are we today?

    Primary focus on end-to-end aspects

    Relies on injected packets

    Virtualization challenges are poorly addressed

    Popular standards: Y.1731 (ITU-T), TWAMP (IETF)

    Not ready for @scale hyper-converged SP infrastructure

  • 3

    User-Facing Converged Infrastructure Evolution

    Key Takeaway: Low latency is key for HFT, VR gaming, connected cars, AR

    Source: https://www.ciscoknowledgenetwork.com/files/584_04-26-16-CKN_Webinar_v2.pdf?PRIORITY_CODE=194542_20

  • 4

    User-Facing Converged Infrastructure Evolution (2)

    Goal: Low-latency edge cloud app service assurance

    Gaps: Real-time per-hop and per-network-function visibility, hotspot identification

    Source: https://www.ciscoknowledgenetwork.com/files/584_04-26-16-CKN_Webinar_v2.pdf?PRIORITY_CODE=194542_20

    User Plane: Packet data plane, SGi-LAN service chaining of virtualized network functions such as video transcoder

  • 5

    Real-time Network Monitoring: Emerging Solutions

    Initial focus on data plane latency monitoring: high value and not difficult to implement

    Appends timestamp information at each hop in Layer 2/3/4 header

    Benefits

    Amenable to HW implementation in programmable ASICs

    Can compute per-hop latency

    Issues

    Packet size varies across intermediate hops, leading to non-deterministic performance

    Key infrastructure requirement

    Real-time network timing synchronization: IEEE 1588 PTP
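The per-hop latency computation on this slide can be sketched as follows. This is a minimal illustration, not the implementation behind the slides: it assumes each hop has appended one nanosecond timestamp taken from a PTP-synchronized clock, and the function name `per_hop_latency_ns` and the sample values are hypothetical.

```python
from typing import List


def per_hop_latency_ns(timestamps_ns: List[int]) -> List[int]:
    """Return the latency between consecutive hops.

    timestamps_ns holds one timestamp per hop, in path order, all taken
    from clocks assumed to be synchronized (e.g. via IEEE 1588 PTP).
    """
    if len(timestamps_ns) < 2:
        return []
    deltas = [later - earlier
              for earlier, later in zip(timestamps_ns, timestamps_ns[1:])]
    # A negative delta means the clocks disagree, so the result is unusable.
    if any(d < 0 for d in deltas):
        raise ValueError("timestamps go backwards; clocks may not be synchronized")
    return deltas


# Hypothetical timestamps collected from four hops (nanoseconds).
hops = [1_000_000, 1_004_200, 1_009_900, 1_011_000]
print(per_hop_latency_ns(hops))       # [4200, 5700, 1100]
print(sum(per_hop_latency_ns(hops)))  # 11000 (end-to-end latency)
```

The negative-delta check shows why synchronization is listed as a key requirement: without it, the differences between timestamps from different devices carry no meaning.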

  • 6

    Real-time Network Monitoring: New Directions

    Real-time end-to-end data collection for selective flows (e.g. live video) across NICs and routers: @scale, with no intermediate node monitoring

    Monitor a programmable number of nodes with a pre-defined header size: deterministic performance

    Hierarchical monitoring framework: service chain, overlay, underlay, etc.

    [Figure: Virtual Network Function (A) and Virtual Network Function (B); rack servers (R630, R730, etc.); ToR (S4048), leaf (S6000), and spine (Z9500) switches; L3 network fabric (OS9/10); 40G links]

    - Pre-construct programmable timestamp header for all hops

    - Use timestamp append model in NICs, switches/routers

    - Mirror the packet with the entire timestamp information at the last hop

    - Examine latency deviation over baseline for anomalies
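A pre-constructed timestamp header of fixed size can be sketched as below. The layout is an assumption made for illustration only (a 2-byte prefix holding the maximum and filled hop counts, followed by one 64-bit slot per hop); it is not the INT wire format or any Dell EMC format, and the names `build_header`, `stamp`, and `read_timestamps` are hypothetical. The point is that each hop writes into a slot that already exists, so the packet size stays constant along the path.

```python
import struct
from typing import List

PREFIX = struct.Struct("!BB")  # (max_hops, hops_filled), assumed layout
SLOT = struct.Struct("!Q")     # one 64-bit nanosecond timestamp per hop


def build_header(max_hops: int) -> bytearray:
    """Pre-construct a zeroed header with room for max_hops timestamps."""
    hdr = bytearray(PREFIX.size + max_hops * SLOT.size)
    PREFIX.pack_into(hdr, 0, max_hops, 0)
    return hdr


def stamp(hdr: bytearray, now_ns: int) -> None:
    """Write this hop's timestamp into the next free slot, in place."""
    max_hops, filled = PREFIX.unpack_from(hdr, 0)
    if filled >= max_hops:
        return  # hop is outside the monitored set; header is left unchanged
    SLOT.pack_into(hdr, PREFIX.size + filled * SLOT.size, now_ns)
    PREFIX.pack_into(hdr, 0, max_hops, filled + 1)


def read_timestamps(hdr: bytearray) -> List[int]:
    """Collector side: extract the filled timestamps from a mirrored header."""
    _, filled = PREFIX.unpack_from(hdr, 0)
    return [SLOT.unpack_from(hdr, PREFIX.size + i * SLOT.size)[0]
            for i in range(filled)]


hdr = build_header(4)
size_before = len(hdr)
for t in (100, 250, 900):   # hypothetical timestamps from three hops
    stamp(hdr, t)
print(len(hdr) == size_before)  # True: header size never changes
print(read_timestamps(hdr))     # [100, 250, 900]
```

In this sketch the last hop would mirror the packet to a collector, which calls `read_timestamps` and compares the resulting per-hop deltas against a baseline.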

  • 7

    Real-time Network Monitoring: New Directions (2)

    Latency typically follows a long-tail distribution

    Average latency is not a useful metric for the anomaly baseline

    Start with an approximate value for the anomaly baseline

    Refine the baseline using simple predictive analytics techniques, e.g. the Holt-Winters time series forecasting algorithm

    Advanced predictive analytics techniques, e.g. machine learning (research area)

    Automatic dependency/clustering detection between latency and other events such as packet drops, queue depth, poor video QoE, noisy neighbors, etc.

    Source: https://www.ietf.org/proceedings/96/slides/slides-96-bmwg-8.pdf
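Baseline refinement of this kind can be sketched with Holt's linear (double) exponential smoothing, which is the non-seasonal subset of Holt-Winters; the full algorithm adds a seasonal term that is omitted here for brevity. Everything specific in the sketch is an assumption: the input is taken to be a tail statistic per interval (for example p99 latency in microseconds, in line with the long-tail remark), and the function name `flag_anomalies`, the smoothing factors, and the tolerance are hypothetical.

```python
from typing import List


def flag_anomalies(series: List[float], tolerance: float,
                   alpha: float = 0.5, beta: float = 0.3) -> List[int]:
    """Return indices whose value exceeds the forecast baseline by > tolerance.

    The baseline is a one-step-ahead forecast (level + trend). Anomalous
    samples are not used to update the baseline, so a single spike does
    not drag the baseline upwards.
    """
    if len(series) < 2:
        return []
    level = series[0]              # approximate starting baseline
    trend = series[1] - series[0]  # initial trend estimate
    anomalies = []
    for i in range(1, len(series)):
        forecast = level + trend
        x = series[i]
        if x - forecast > tolerance:
            anomalies.append(i)
            level = forecast       # carry the baseline forward unchanged
            continue
        new_level = alpha * x + (1 - alpha) * forecast
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return anomalies


# Hypothetical p99 latency per interval, in microseconds.
p99_us = [100.0, 100.0, 100.0, 100.0, 500.0, 100.0]
print(flag_anomalies(p99_us, tolerance=50.0))  # [4]
```

Only upward deviations are flagged, on the assumption that lower-than-expected latency is not a concern; a two-sided check would compare the absolute deviation instead.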

  • 8

    Real-time Network Monitoring: New Directions (3)

    Beyond network latency

    Other key aspects to monitor are queue depth, ingress/egress port bandwidth etc.

    These are not as straightforward to implement in the packet data path as latency

    Last but not least

    Orchestration is a key piece of the overall solution

    Goal is to align with OSM and other orchestrators

    Dell EMC Industry Leadership

    P4 In-band Network Telemetry: http://p4.org/wp-content/uploads/fixed/INT/INT-current-spec.pdf

    IETF NFVRG Leadership (https://irtf.org/nfvrg): Real-time properties work item

  • 9

    Acknowledgements

    Co-conspirator

    Anoop Ghanwani, Dell EMC