Networking for Enterprise Private Clouds - Meetup · 2016-03-29 · Integration with existing...
© ZeroStack Inc. | zerostack.com
Networking for Enterprise Private Clouds
Gautam Kulkarni, Ph.D., ZeroStack, March 24, 2016
About Us
● ZeroStack
  ○ SaaS managed private cloud solution for Enterprises
    ■ Hassle-free installation and operations, fully integrated stack (hyper-converged appliance, OS, hypervisor, OpenStack, telemetry, analytics)
  ○ Diverse set of expertise: VMware, AMD, Cisco, Microsoft, Google and many others
  ○ www.zerostack.com
● Gautam Kulkarni
  ○ Working on distributed systems and networking at ZeroStack
  ○ Professional background: Insieme Networks (acquired by Cisco), Cisco, Broadcom, UCLA (Ph.D. in EE), IIT Bombay (B.Tech. in EE)
ZeroStack Cloud Overview
Two-part architecture:
1. Hyper-converged on-premises infrastructure and management software
2. SaaS platform for cloud consumption, operations and analytics
Ease of use of public clouds with control and performance of a private cloud!
Use Cases
● General-purpose private cloud
  ○ Support a broad spectrum of legacy and cloud-native applications
  ○ Hyper-converged appliance with HDDs and SSDs
  ○ 100% OpenStack API compliant
  ○ Support Windows and Linux VMs
● SaaS-based system consumption for cloud admins and users
  ○ Agility of public clouds with on-premises VMs and data
● Cloud admin
  ○ Manage cloud resources
  ○ Integrate with the IT environment (switches, routers, firewalls, etc.)
  ○ Capacity planning, respond to alerts
● Cloud user
  ○ Create private networks, volumes, VMs, etc.
  ○ Deploy applications
Integration with existing networking infrastructure
● Start small, grow incrementally
● Impose minimal requirements on the physical network
● Deal with legacy Layer 2 networks
  ○ Port speeds of 10 Gbps and 1 Gbps
  ○ Bonded or non-bonded links to the Top-of-Rack (ToR) switch
  ○ Work with trunk or access ports
  ○ Connectivity to a single ToR switch or two ToR switches serving as Virtual Port Channel (VPC) or Multichassis Link Aggregation (MLAG) peers
  ○ Different VLANs for each service, e.g., management, tunnel, storage, etc.
  ○ Support a variety of switches, legacy and modern
High Availability (HA) of the Cloud
● Starts with HA of the host and the physical network
● Two-tier reference topology using VPC/MLAG
● Suitable for medium sized clusters
HA Considerations for OpenStack Components
● MySQL database - source of all state
● Message broker (RabbitMQ) - central nervous system
● All OpenStack services
OpenStack Architecture diagram from the official documentation
Distributed Control Plane
● Distributed control plane for managing OpenStack services
  ○ Schedule OpenStack services across available nodes
  ○ Leader election based on consensus
  ○ Consensus relies on a robust underlying network
● Ensure services are always alive and healthy
  ○ Restart upon service crash
  ○ Migrate services on node failures
  ○ Done via RPCs, which require a robust underlying network
● Distributed storage
  ○ Requires a reliable underlying network
● Management of the above networks
  ○ Done using the distributed control plane
Networks and distributed systems - two sides of the same coin
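The "migrate services on node failures" step can be illustrated with a small sketch. It assumes a leader has already been elected (the consensus step is not shown) and that the leader holds a simple service-to-node map. The function name `reschedule`, the data model, and the least-loaded placement rule are hypothetical choices for illustration, not the ZeroStack scheduler.

```python
from collections import defaultdict


def reschedule(assignments, healthy_nodes):
    """Move services off failed nodes onto the least-loaded healthy node.

    assignments: dict of service name -> node name (hypothetical model).
    healthy_nodes: iterable of node names currently passing health checks.
    Returns a new dict; services already on healthy nodes stay where they are.
    """
    healthy = sorted(set(healthy_nodes))
    if not healthy:
        raise RuntimeError("no healthy nodes available")

    # Count how many services each healthy node already runs.
    load = defaultdict(int)
    for node in assignments.values():
        if node in healthy:
            load[node] += 1

    new_assignments = {}
    for service, node in sorted(assignments.items()):
        if node in healthy:
            new_assignments[service] = node
            continue
        # Least-loaded healthy node wins; ties break by node name.
        target = min(healthy, key=lambda n: (load[n], n))
        load[target] += 1
        new_assignments[service] = target
    return new_assignments


if __name__ == "__main__":
    current = {"mysql": "node1", "rabbitmq": "node2", "nova-api": "node1"}
    print(reschedule(current, ["node2", "node3"]))
```

In a real deployment the resulting moves would be carried out over RPCs, which is why the slide stresses a robust underlying network.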
HA of the Service Networks
● Decouple RPC endpoints from the physical network
  ○ Virtual IP for each service
  ○ Helps with service migration
● Deterministic MAC address for each virtual IP
  ○ No inconsistent ARP caches when services migrate
● Use static ARPs whenever possible
  ○ Assuming all endpoints are known everywhere
● Increase ARP cache timeout whenever static ARP is not feasible
● Influence Linux to be interface centric (default behavior is host centric)
  ○ Use ARP filter
  ○ Use per interface source routing tables
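A minimal sketch of how these points could fit together on a Linux host. The MAC derivation scheme (locally administered prefix `02:00` followed by the four IPv4 octets) is an assumption chosen for illustration; the slides do not say how ZeroStack derives its MAC addresses. The helper names, interface name, and routing table number are likewise hypothetical. The functions only build command strings for `ip` and `sysctl`; they do not execute anything.

```python
import ipaddress


def mac_for_virtual_ip(vip):
    """Derive a stable, locally administered unicast MAC from an IPv4 address.

    Assumed scheme: 02:00 + the four address octets. The same virtual IP
    always maps to the same MAC, whichever node currently hosts the service.
    """
    packed = ipaddress.IPv4Address(vip).packed
    # 0x02 sets the locally administered bit and leaves the multicast bit clear.
    octets = bytes([0x02, 0x00]) + packed
    return ":".join(f"{b:02x}" for b in octets)


def static_arp_command(vip, iface):
    """Build a permanent neighbor entry for a virtual IP (static ARP)."""
    mac = mac_for_virtual_ip(vip)
    return f"ip neigh replace {vip} lladdr {mac} dev {iface} nud permanent"


def interface_centric_commands(iface, address, subnet, gateway, table_id):
    """Build commands for ARP filtering and a per-interface source routing table."""
    return [
        # Answer ARP only on the interface the kernel would route the reply through.
        f"sysctl -w net.ipv4.conf.{iface}.arp_filter=1",
        # Dedicated routing table for traffic sourced from this interface's address.
        f"ip route add {subnet} dev {iface} src {address} table {table_id}",
        f"ip route add default via {gateway} dev {iface} table {table_id}",
        f"ip rule add from {address} table {table_id}",
    ]


if __name__ == "__main__":
    print(static_arp_command("10.0.0.5", "eth1"))
    for cmd in interface_centric_commands(
        "eth1", "10.0.1.10", "10.0.1.0/24", "10.0.1.1", 101
    ):
        print(cmd)
```

Because the MAC is a pure function of the virtual IP, a static neighbor entry stays valid after the service moves to another node, which is the property the slide relies on.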
Network Visibility and Metrics
● Never log in to a server and run commands
● Gather the “right” metrics periodically
  ○ Packet loss rates for pings from each node to each service IP
  ○ TCP stats (/proc/net/snmp) from each node
  ○ Per host TX/RX/drop counters from each NIC
  ○ Per VM TX/RX/drop counters from libvirt
  ○ Metrics from the key-value store
  ○ RabbitMQ statistics
● Asynchronous notifications for human intervention
  ○ Link failures/flakiness
  ○ Host failures
● Automation of user VMs’ network health checks
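As one illustration of the per-host NIC counters, the sketch below parses the text of `/proc/net/dev`, which on Linux lists receive and transmit counters per interface. Reading that file is an assumption; the slides do not say which source ZeroStack uses for NIC counters. The function name and the returned field names are hypothetical.

```python
def parse_proc_net_dev(text):
    """Parse /proc/net/dev text into {interface: {counter_name: value}}.

    Each data line has 8 receive fields followed by 8 transmit fields:
    bytes, packets, errs, drop, ... for RX, then the same leading four for TX.
    """
    counters = {}
    # The first two lines are column headers.
    for line in text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = [int(value) for value in data.split()]
        if len(fields) < 12:
            continue  # Unexpected layout; skip rather than guess.
        counters[name.strip()] = {
            "rx_bytes": fields[0],
            "rx_packets": fields[1],
            "rx_drop": fields[3],
            "tx_bytes": fields[8],
            "tx_packets": fields[9],
            "tx_drop": fields[11],
        }
    return counters


if __name__ == "__main__":
    with open("/proc/net/dev") as handle:
        for iface, stats in parse_proc_net_dev(handle.read()).items():
            print(iface, stats)
```

Counters are cumulative, so a collector would typically sample them periodically and report the difference between consecutive samples.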
Ping Automation
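The original slide here is a figure. As a rough sketch of the metric listed earlier (packet loss rates for pings from each node to each service IP), the code below runs the system `ping` and extracts the loss percentage from its summary line. It assumes Linux iputils `ping` and its `-c` and `-W` flags; the function names and the sample address are hypothetical.

```python
import re
import subprocess

# Matches the summary line, e.g. "5 packets transmitted, 4 received, 20% packet loss".
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def parse_packet_loss(ping_output):
    """Return the packet loss percentage from ping output, or None if absent."""
    match = LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None


def measure_loss(host, count=5, timeout_s=1):
    """Ping a host and return its loss percentage (None if ping printed no summary)."""
    # Assumes iputils ping: -c sets the packet count, -W the reply timeout in seconds.
    result = subprocess.run(
        ["ping", "-c", str(count), "-W", str(timeout_s), host],
        capture_output=True,
        text=True,
        check=False,  # A non-zero exit code just means some or all packets were lost.
    )
    return parse_packet_loss(result.stdout)


if __name__ == "__main__":
    # Hypothetical service IP; a collector would loop over every service IP.
    print(measure_loss("10.0.0.5"))
```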
TCP Metrics
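The original slide here is a figure. For the TCP stats mentioned earlier, `/proc/net/snmp` stores each protocol as a pair of lines: one with counter names and one with values. The sketch below parses the `Tcp:` pair and derives a retransmission ratio. The ratio is one plausible health signal chosen for illustration; the slides do not say which TCP counters ZeroStack tracks, and the function names are hypothetical.

```python
def parse_proc_net_snmp(text, protocol="Tcp"):
    """Return {counter_name: value} for one protocol section of /proc/net/snmp."""
    prefix = protocol + ":"
    # The first matching line holds the names, the second holds the values.
    rows = [line.split()[1:] for line in text.splitlines() if line.startswith(prefix)]
    if len(rows) < 2:
        return {}
    names, values = rows[0], rows[1]
    return {name: int(value) for name, value in zip(names, values)}


def retransmit_ratio(stats):
    """Fraction of sent TCP segments that were retransmissions."""
    out_segs = stats.get("OutSegs", 0)
    return stats.get("RetransSegs", 0) / out_segs if out_segs else 0.0


if __name__ == "__main__":
    with open("/proc/net/snmp") as handle:
        tcp = parse_proc_net_snmp(handle.read())
    print("retransmit ratio:", retransmit_ratio(tcp))
```

The counters are cumulative since boot, so trends are easier to read from the difference between two periodic samples than from the raw totals.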
OpenStack Workload Network Health
● Do not debug this manually
● There are better ways to spend your nights and weekends
Diagram from Packet Pushers
Summary
● Networking is more than switches and routers
● Have a holistic view of networks and distributed systems
● Collect data and metrics - they will be useful in unexpected ways
● Automate all common tasks
● Hopefully never log in to a server again