NET1777BE: Troubleshooting Methodology for VMware NSX for vSphere

Hammad Alam, VCDX#248 Shahzad Ali, VCDX#264 NET1777BE #VMworld #NET1777BE Troubleshooting Methodology for VMware NSX for vSphere VMworld 2017 Content: Not for publication or distribution


Disclaimer

• This presentation may contain product features that are currently under development.

• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.

• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

• Technical feasibility and market demand will affect final delivery.

• Pricing and packaging for any new technologies or features discussed or presented have not been determined.

#NET1777BE CONFIDENTIAL


Session Objective: An Interactive Conversation

Participants: NSX Expert, Customer


NSX Operations Landscape

NSX Native Options

• NSX Dashboard

• Endpoint Monitoring

• NSX Central CLI

• NSX Manager GUI

• NSX RESTful API

• Traceflow

Protocols

• IPFIX

• SNMP

• Syslog

• Port Mirroring

VMware – Eco System

• vRealize Log Insight

• vRealize Network Insight

Partner Ecosystem

OpenSource

• PowerNSX

• PyNSXv

• PowerOps


Troubleshooting NSX


Troubleshooting Methodology

Documentation + Environment Knowledge

• NSX Understanding

• Implementation Knowledge

Level-1/Level-2

• vCenter NSX Web Plug-in

• NSX Native Tools

• vRealize Log Insight / Network Insight

Level-3/Level-4

• Central CLI

• Packet Capture

• Support for Automation


Understanding NSX Components

Management Plane: NSX Manager

▪ Single configuration portal

▪ REST API entry-point

Control Plane: Controller Cluster

▪ Manages Logical networks

▪ Control-Plane Protocol

▪ Separation of Control and Data Plane

Data Plane: NSX Edge VM, ESXi Hypervisor Kernel Modules

▪ High-Performance Data Plane

▪ Scale-out Distributed Forwarding Model

Distributed Services: DFW, DLR, Logical Switch

(Figure labels: Logical Network, Physical Network, VPN, DLR Control VM, Reference)


Implementation Knowledge

(Figure: example deployments across Host1 and Host2. Labels: DLR, ESG ECMP, Software L2 Bridge; DLR, ESG HA Mode, Software L2 Bridge.)


UI Based Troubleshooting: Level 1/Level 2

NSX UI

Flow Monitoring

Endpoint Monitoring

Traceflow

Log Insight – Dashboards

vRNI: Object Path

vRNI: Alerts for New Events

CLI Based Troubleshooting: Level 3/Level 4

NSX Routing Data Plane

(Figure: VM-A 10.145.225.2/26 and VM-B 10.145.225.66/26 attach to the Logical Router at 10.145.225.1 and 10.145.225.65. The Logical Router, 192.168.255.5, connects over 192.168.255.x to the NSX Edges in ECMP at .1, .2 and .3, which connect to the External Network. E-W and N-S paths are marked.)

1. DLR instance on each host maintains the Forwarding Table

2. East-West routing happens between VMs on different Logical Switches

3. For North-South traffic, ESG’s next-hop is DLR’s Forwarding IP Address

ESXi (callouts 1 and 2):

Destination      GenMask          Gateway        Interface
-----------      -------          -------        ---------
0.0.0.0          0.0.0.0          192.168.255.3  1b5800000002
0.0.0.0          0.0.0.0          192.168.255.2  1b5800000002
0.0.0.0          0.0.0.0          192.168.255.1  1b5800000002
10.145.225.0     255.255.255.192  0.0.0.0        1b580000000e
10.145.225.64    255.255.255.192  0.0.0.0        1b580000000f
{truncated..}

ESG (callout 3):

O N2 10.145.225.0/26 via 192.168.255.5
O N2 10.145.225.64/26 via 192.168.255.5
{truncated …}
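The forwarding decision implied by the host table above can be sketched as a longest-prefix match. This is only an illustration in Python using the rows shown on the slide; the `lookup` helper is hypothetical and does not model how the DLR kernel module actually selects among ECMP next hops.

```python
import ipaddress

# Rows copied from the slide: (destination, genmask, gateway, interface).
ROWS = [
    ("0.0.0.0", "0.0.0.0", "192.168.255.3", "1b5800000002"),
    ("0.0.0.0", "0.0.0.0", "192.168.255.2", "1b5800000002"),
    ("0.0.0.0", "0.0.0.0", "192.168.255.1", "1b5800000002"),
    ("10.145.225.0", "255.255.255.192", "0.0.0.0", "1b580000000e"),
    ("10.145.225.64", "255.255.255.192", "0.0.0.0", "1b580000000f"),
]


def lookup(dest_ip, rows=ROWS):
    """Hypothetical helper: return all (gateway, interface) pairs that tie
    for the longest matching prefix."""
    addr = ipaddress.ip_address(dest_ip)
    matches = []
    for dest, mask, gateway, iface in rows:
        net = ipaddress.ip_network(f"{dest}/{mask}")
        if addr in net:
            matches.append((net.prefixlen, gateway, iface))
    if not matches:
        return []
    best = max(plen for plen, _, _ in matches)
    return [(gw, iface) for plen, gw, iface in matches if plen == best]


# East-West: VM-B's address falls in a connected /26 (gateway 0.0.0.0).
print(lookup("10.145.225.66"))
# North-South: only the default route matches, with three ECMP next hops.
print(lookup("8.8.8.8"))
```

A gateway of 0.0.0.0 in the result indicates a directly connected logical switch, which is the East-West case; several entries with distinct gateways indicate the ECMP default route toward the ESGs.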


Routing Table – NSX Edge (ESG)

(Diagram strip: ESG, DLR CVM, Cont, Host)

List of all ESGs:

nsx-mgr-east> show edge all
CLI commands for Edge ServiceGateway(ESG) start with 'show edge'
CLI commands for Distributed Logical Router(DLR) Control VM start with 'show edge'
CLI commands for Distributed Logical Router(DLR) start with 'show logical-router'
Edge ID   Name            Size  Version  Status
edge-1    esg-east-prm-1  Q     6.2.3    GREEN
edge-2    east-dlr-1      C     6.2.3    GREEN
{truncated …}

Routing Table at a specific ESG:

nsx-mgr-east> show edge edge-1 ip route
Codes: O - OSPF derived, i - IS-IS derived, B - BGP derived,
C - connected, S - static, L1 - IS-IS level-1, L2 - IS-IS level-2,
Total number of routes: 14
S    0.0.0.0/0         [0/0]    via 10.155.171.126
O N2 10.145.225.0/26   [110/1]  via 192.168.255.5
O N2 10.145.225.64/26  [110/1]  via 192.168.255.5
{truncated …}
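When the route table is long, it can help to turn the text into structured records. The following Python sketch parses only the line format visible on this slide (code, prefix, [distance/metric], via next hop); the regular expression and the `parse_routes` name are assumptions for illustration, and other line shapes that the real CLI may print are simply skipped.

```python
import re

# Route lines copied from the slide.
SAMPLE = """\
S 0.0.0.0/0 [0/0] via 10.155.171.126
O N2 10.145.225.0/26 [110/1] via 192.168.255.5
O N2 10.145.225.64/26 [110/1] via 192.168.255.5
"""

# Assumed line shape: <code> <prefix> [<distance>/<metric>] via <next hop>
ROUTE_RE = re.compile(
    r"^(?P<code>[A-Za-z]\S*(?: [A-Za-z]\S*)?)\s+"
    r"(?P<prefix>\d+\.\d+\.\d+\.\d+/\d+)\s+"
    r"\[(?P<distance>\d+)/(?P<metric>\d+)\]\s+"
    r"via\s+(?P<next_hop>\d+\.\d+\.\d+\.\d+)\s*$"
)


def parse_routes(text):
    """Hypothetical helper: return one dict per recognised route line."""
    routes = []
    for line in text.splitlines():
        match = ROUTE_RE.match(line.strip())
        if match is None:
            continue  # headers, legends, truncation markers
        route = match.groupdict()
        route["distance"] = int(route["distance"])
        route["metric"] = int(route["metric"])
        routes.append(route)
    return routes


for route in parse_routes(SAMPLE):
    print(route["code"], route["prefix"], "->", route["next_hop"])
```

With records like these, a check such as "every OSPF route for the VM subnets points at the DLR forwarding address 192.168.255.5" becomes a one-line filter.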


Routing Table – DLR Control VM & Controllers

Routes at the DLR Control VM:

nsx-mgr-east> show edge edge-2 ip route
Codes: O - OSPF derived, N2 - OSPF NSSA external type 2
O N2 0.0.0.0/0       [110/0]  via 192.168.255.1
O N2 0.0.0.0/0       [110/0]  via 192.168.255.2
O N2 0.0.0.0/0       [110/0]  via 192.168.255.3
O N2 10.140.54.0/26  [110/1]  via 192.168.255.1
O N2 10.140.54.0/26  [110/1]  via 192.168.255.2
{truncated...}

Routes at the Controller to be pushed to hosts:

nsx-mgr-east> show logical-router controller master dlr edge-2 route
Destination       Next-Hop[]     Preference  Source
0.0.0.0/0         192.168.255.2  110         CONTROL_VM
                  192.168.255.3
                  192.168.255.1
10.140.54.192/26  192.168.255.2  110         CONTROL_VM
                  192.168.255.3
                  192.168.255.1
{truncated...}
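Comparing the two views by eye gets tedious once there are many prefixes. As a rough sketch, and assuming the routes have already been collected into dictionaries of prefix to next-hop set, a small Python helper can report where the Control VM and the Controller disagree. Only the default route is taken from the slides, because both outputs are truncated; the mismatch case is made up for demonstration, and `diff_routes` is a hypothetical name.

```python
def diff_routes(control_vm, controller):
    """Hypothetical helper: compare two {prefix: set(next_hops)} maps and
    return the prefixes whose next-hop sets differ."""
    problems = {}
    for prefix in sorted(set(control_vm) | set(controller)):
        a = control_vm.get(prefix, set())
        b = controller.get(prefix, set())
        if a != b:
            problems[prefix] = {"control_vm_only": a - b, "controller_only": b - a}
    return problems


# Default route as shown on the slides (listing order differs, content matches).
control_vm = {"0.0.0.0/0": {"192.168.255.1", "192.168.255.2", "192.168.255.3"}}
controller = {"0.0.0.0/0": {"192.168.255.2", "192.168.255.3", "192.168.255.1"}}
print(diff_routes(control_vm, controller))

# Made-up failure case: the Controller is missing one ECMP next hop.
stale_controller = {"0.0.0.0/0": {"192.168.255.1", "192.168.255.2"}}
print(diff_routes(control_vm, stale_controller))
```

Sets are used on purpose so that the order in which next hops are listed does not produce false differences.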


Route Table on Host DLR Instance

See the Routing Table for a DLR at a Host:

nsx-mgr-east> show logical-router host host-454 dlr edge-2 route
VDR default+edge-2 Route Table
Legend: [U: Up], [G: Gateway], [C: Connected], [I: Interface]
Legend: [H: Host], [F: Soft Flush] [!: Reject] [E: ECMP]

Destination      GenMask          Gateway        Flags  Interface
-----------      -------          -------        -----  ---------
0.0.0.0          0.0.0.0          192.168.255.3  UGE    1b5800000002
0.0.0.0          0.0.0.0          192.168.255.2  UGE    1b5800000002
0.0.0.0          0.0.0.0          192.168.255.1  UGE    1b5800000002
10.145.225.0     255.255.255.192  0.0.0.0        UCI    1b580000000e
10.145.225.64    255.255.255.192  0.0.0.0        UCI    1b580000000f
{truncated..}

Related command: show cluster <cluster-id>
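The Flags column is easier to read once expanded with the legend printed in the output. The Python sketch below decodes the flags and lists the ECMP next hops from the rows shown on the slide; `decode_flags`, `parse_rows` and `ecmp_gateways` are hypothetical helper names, and the row detection is a simple assumption (five columns, dotted-quad destination).

```python
# Legend copied from the CLI output on the slide.
FLAG_LEGEND = {
    "U": "Up", "G": "Gateway", "C": "Connected", "I": "Interface",
    "H": "Host", "F": "Soft Flush", "!": "Reject", "E": "ECMP",
}

SAMPLE = """\
Destination GenMask Gateway Flags Interface
----------- ------- ------- ----- ---------
0.0.0.0 0.0.0.0 192.168.255.3 UGE 1b5800000002
0.0.0.0 0.0.0.0 192.168.255.2 UGE 1b5800000002
0.0.0.0 0.0.0.0 192.168.255.1 UGE 1b5800000002
10.145.225.0 255.255.255.192 0.0.0.0 UCI 1b580000000e
10.145.225.64 255.255.255.192 0.0.0.0 UCI 1b580000000f
"""


def decode_flags(flags):
    """Expand a flag string such as 'UGE' using the legend."""
    return [FLAG_LEGEND.get(ch, f"unknown({ch})") for ch in flags]


def parse_rows(text):
    """Keep lines with five columns whose first column is a dotted quad."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[0].count(".") == 3:
            dest, mask, gateway, flags, iface = fields
            rows.append({"dest": dest, "mask": mask, "gateway": gateway,
                         "flags": flags, "iface": iface})
    return rows


def ecmp_gateways(rows):
    """Next hops of every row carrying the ECMP flag."""
    return [row["gateway"] for row in rows if "E" in row["flags"]]


rows = parse_rows(SAMPLE)
print(decode_flags("UGE"))
print(ecmp_gateways(rows))
```

If fewer ECMP gateways appear on a host than there are ESGs in the design, that host is a candidate for closer inspection.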


Logical Router’s ARP Table – At NSX Edge

Look at the ARP Table at an Edge:

nsx-mgr-east> show edge edge-1 arp
haIndex: 0
-----------------------------------------------------------------------
vShield Edge ARP Cache:
IP Address      Interface  MAC Address        State
192.168.254.5   vNic_2     02:50:56:56:44:52  REACHABLE
192.168.255.2   vNic_1     00:50:56:a0:df:a8  STALE
192.168.254.2   vNic_2     00:50:56:a0:10:6b  STALE
192.168.255.4   vNic_1     00:50:56:a0:0f:12  STALE
10.155.171.125  vNic_0     cc:46:d6:64:94:85  STALE
192.168.255.3   vNic_1     00:50:56:a0:e9:99  STALE
192.168.254.4   vNic_2     00:50:56:a0:b8:54  STALE
192.168.254.3   vNic_2     00:50:56:a0:4c:50  STALE
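Grouping the ARP cache by interface makes it quicker to see which neighbours the Edge has learned on each vNic. The Python sketch below does this for the entries shown on the slide; the `by_interface` helper is hypothetical and assumes each data row has exactly four whitespace-separated columns.

```python
from collections import defaultdict

# ARP cache rows copied from the slide.
ARP_ROWS = """\
192.168.254.5 vNic_2 02:50:56:56:44:52 REACHABLE
192.168.255.2 vNic_1 00:50:56:a0:df:a8 STALE
192.168.254.2 vNic_2 00:50:56:a0:10:6b STALE
192.168.255.4 vNic_1 00:50:56:a0:0f:12 STALE
10.155.171.125 vNic_0 cc:46:d6:64:94:85 STALE
192.168.255.3 vNic_1 00:50:56:a0:e9:99 STALE
192.168.254.4 vNic_2 00:50:56:a0:b8:54 STALE
192.168.254.3 vNic_2 00:50:56:a0:4c:50 STALE
"""


def by_interface(text):
    """Hypothetical helper: map interface name to a list of (ip, mac, state)."""
    table = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip anything that is not a data row
        ip, iface, mac, state = fields
        table[iface].append((ip, mac, state))
    return dict(table)


table = by_interface(ARP_ROWS)
for iface in sorted(table):
    print(iface, [ip for ip, _, _ in table[iface]])
```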


Logical Switch’s MAC, ARP & VTEP Table – At Host

Look at all the VTEPs joined to a VNI:

nsx-mgr-east> show logical-switch host host-454 vni 7006 vtep
VTEP count: 3
Segment ID: 10.155.171.32
VTEP IP: 10.155.171.36
Segment ID: 10.155.171.32
VTEP IP: 10.155.171.33
Segment ID: 10.155.171.32
VTEP IP: 10.155.171.35

Look at the ARP Table for a VNI at a Host:

nsx-mgr-east> show logical-switch host host-454 vni 7006 arp
ARP entry count: 1
IP: 10.145.225.66
MAC: 00:0b:0b:0b:0b:0b

Look at the MAC Table for a VNI at a Host:

nsx-mgr-east> show logical-switch host host-454 vni 7006 mac
MAC entry count: 1
Inner MAC: 00:0b:0b:0b:0b:0b
Outer MAC: 00:50:56:6d:ee:e1
Outer IP: 10.155.171.36
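The three tables can be chained: the ARP table maps a VM IP to its MAC, the MAC table maps that inner MAC to the remote VTEP's outer IP, and the VTEP table confirms that the outer IP has joined the VNI. A minimal Python sketch of that cross-check, using the values from the slide and a hypothetical `resolve_vtep` helper, might look like this:

```python
# Values copied from the three CLI outputs for VNI 7006 on host-454.
VTEPS = {"10.155.171.36", "10.155.171.33", "10.155.171.35"}
ARP_TABLE = {"10.145.225.66": "00:0b:0b:0b:0b:0b"}
MAC_TABLE = [
    {"inner_mac": "00:0b:0b:0b:0b:0b",
     "outer_mac": "00:50:56:6d:ee:e1",
     "outer_ip": "10.155.171.36"},
]


def resolve_vtep(vm_ip, arp_table, mac_table, vteps):
    """Hypothetical helper: return (outer_ip, vtep_is_joined) for a VM IP,
    or None if the IP or its MAC is not known on this host."""
    mac = arp_table.get(vm_ip)
    if mac is None:
        return None
    for entry in mac_table:
        if entry["inner_mac"] == mac:
            return entry["outer_ip"], entry["outer_ip"] in vteps
    return None


print(resolve_vtep("10.145.225.66", ARP_TABLE, MAC_TABLE, VTEPS))
```

A result of `None`, or an outer IP that is not in the VTEP list, suggests the host's view of the logical switch is incomplete for that VM.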

Packet Capturing: Level 3/Level 4

Packet Capture – Follow the Packet

(Figure: hosts host-454 and host-456, each with a VTEP on vmk4, connected through the VDS. VM-A, IP 10.145.225.2, is on VXLAN 7005; VM-B, IP 10.145.225.66, is on VXLAN 7006. vMAC = 02:50:56:56:44:52.)

Capture points along the path:

• @Src VM Switchport

• @DLR (Ingress)

• @DLR (Egress)

• @Src VTEP Uplink

• @Dest VTEP Uplink

• @Dest VM Switchport


Packet Capture – Source VM Switchport

Display Packet Capture (Session ID 133bc131-3563-4222-ac43):

Nsx-mgr-1> debug packet capture display session 133bc131-3563-4222-ac43 parameters -e
reading from file /tmp/pktcap/133bc131-3563-4222-ac43.pcap
20:02:16.806394 0a:0a:0a:0a:0a:0a > 02:50:56:56:44:52, ethertype IPv4 (0x0800), length 98: 10.145.225.2 > 10.145.225.66: ICMP echo request, id 23815, seq 18550, length 64

In the captured frame: VM-A MAC 0a:0a:0a:0a:0a:0a, vDR MAC 02:50:56:56:44:52, VM-A IP 10.145.225.2, VM-B IP 10.145.225.66.
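When checking many captured frames, the fields called out above can be extracted programmatically. The Python sketch below parses the single line shown on the slide and tests whether the frame is addressed to the vDR MAC, which is what you would expect at the source VM switchport for routed traffic. The regular expression is an assumption that fits this one line format, and `parse_frame` and `sent_to_vdr` are hypothetical names.

```python
import re

# Frame line copied from the slide.
LINE = (
    "20:02:16.806394 0a:0a:0a:0a:0a:0a > 02:50:56:56:44:52, "
    "ethertype IPv4 (0x0800), length 98: 10.145.225.2 > 10.145.225.66: "
    "ICMP echo request, id 23815, seq 18550, length 64"
)

VDR_MAC = "02:50:56:56:44:52"  # vMAC shown on the previous slide

FRAME_RE = re.compile(
    r"^(?P<time>\S+) (?P<src_mac>[0-9a-f:]{17}) > (?P<dst_mac>[0-9a-f:]{17}), "
    r"ethertype (?P<ethertype>\S+) \(0x[0-9a-f]+\), length (?P<length>\d+): "
    r"(?P<src_ip>[\d.]+) > (?P<dst_ip>[\d.]+): (?P<summary>.*)$"
)


def parse_frame(line):
    """Hypothetical helper: return the frame fields as a dict, or None."""
    match = FRAME_RE.match(line.strip())
    return match.groupdict() if match else None


def sent_to_vdr(frame, vdr_mac=VDR_MAC):
    """True if the frame's destination MAC is the distributed router's vMAC."""
    return frame is not None and frame["dst_mac"] == vdr_mac


frame = parse_frame(LINE)
print(frame["src_ip"], "->", frame["dst_ip"], "via vDR:", sent_to_vdr(frame))
```

If the destination MAC at the source switchport is not the vDR MAC, the VM is probably not using the logical router as its default gateway, or the destination is on the same segment.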


Support for Automation

• RestAPI (NET1305)

• PowerNSX (NET2119)

• PyNSXv (MTE4863)

• PowerOps (NET2532)


NSX-PowerOps: Under the Hood

Modular Platform

• PowerNSX

• Central CLI

• RestAPI

• PowerCLI

• SSH Module (Posh-SSH)

• Testing Framework (Pester)

• Microsoft Excel & Visio


NSX PowerOps - Documentation


NSX PowerOps – Health Checks


Bits and Contact

• GitLab: https://gitlab.com/NSXPowerOps/NSX-PowerOps

• Contact: for new features, issues, etc., connect via GitLab


Take Away

NSX Technology

• Logical Switching

• Distributed + Centralized Routing

• Load Balancing

• Distributed + Centralized FW

• And more…

NSX Native Tools

• UI Dashboards

• Flow Monitoring

• Endpoint Monitoring

• Traceflow

• Central CLI

• Packet Capturing

• And more…

vSphere Tools

• Netflow

• Esxtop

• Port Mirroring

• Syslog

• Pktcap-uw

• And more…

Integration with VMware Products

• vRealize Log Insight

• vRealize Network Insight

• vRealize Operations Manager

• And more…

Partner Eco-System

• EMC Smarts

• Gigamon

• HyTrust

• Tufin

• Riverbed

• AlgoSec

• And more…


References

• NSX-v Operations Guide, rev 1.5: https://communities.vmware.com/docs/DOC-30079

• NSX Troubleshooting Guide: https://docs.vmware.com/en/VMware-NSX-for-vSphere/6.3/nsx_63_troubleshooting.pdf

• NSX Administration Guide: https://docs.vmware.com/en/VMware-NSX-for-vSphere/6.3/nsx_63_admin.pdf

• NSX Command Line Quick Reference: https://docs.vmware.com/en/VMware-NSX-for-vSphere/6.3/com.vmware.nsx.troubleshooting.doc/GUID-18EDB577-1903-4110-8A0B-FE9647ED82B6.html

• Trending Support Issues in NSX for vSphere: https://kb.vmware.com/kb/2131154
