OpenContrail Cloudwatt Feedback


Page 1: OpenContrail Cloudwatt Feedback

OpenContrail deployment experience at Cloudwatt

Page 2: OpenContrail Cloudwatt Feedback

About me

● Network engineer since 2006
● Working on OpenStack since its beginning in 2010
● Working on OpenContrail for a year as a developer and integrator

Page 3: OpenContrail Cloudwatt Feedback

Cloudwatt IaaS

● French public cloud provider
● 3 years of experience with OpenStack
● 1 year of experience with OpenContrail
  ○ 1 data center
    ■ 200 compute nodes
    ■ 3 PB of raw Swift storage
  ○ OpenStack Icehouse release

Page 4: OpenContrail Cloudwatt Feedback

Contrail in Cloudwatt

● Started with Contrail release 1.06 in June 2014
● Runs on top of a Cisco Nexus FabricPath fabric
● L2VPN tunnels terminate on two Juniper MX routers

Page 5: OpenContrail Cloudwatt Feedback

Contrail in Cloudwatt

Page 6: OpenContrail Cloudwatt Feedback

Contrail logical view

[Diagram. Labels: Neutron API, Config, Analytics, Control, IF-MAP, vrouter ×3]

Page 7: OpenContrail Cloudwatt Feedback

Contrail in Cloudwatt

● 2 Neutron API nodes: neutron-server with the Contrail plugin
● 2 config nodes: discovery, API, SVC monitor, schema, IF-MAP server
● 2 control nodes
● 2 analytics nodes
● 2 WebUI nodes

Page 8: OpenContrail Cloudwatt Feedback

Contrail in Cloudwatt

[Diagram. Labels: Neutron API ×2, Config ×2, Analytics ×2, Control ×2, WebUI ×2, IF-MAP ×2, XMPP, vrouter ×3]

Page 9: OpenContrail Cloudwatt Feedback

Contrail in Cloudwatt

● Load balancing in front of the APIs and WebUI
● 2 Cassandra clusters of 3 nodes each
● RabbitMQ cluster of 2 nodes
● ZooKeeper cluster of 3 nodes
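As a rough illustration of why a 3-node ZooKeeper cluster is a common choice: a majority-quorum ensemble of n members keeps working as long as floor(n/2) + 1 of them are reachable. The helper names below are illustrative only and are not part of any ZooKeeper or Contrail tooling.

```python
def quorum_size(nodes: int) -> int:
    """Smallest majority of a cluster with `nodes` members."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return nodes - quorum_size(nodes)
```

With these definitions, `tolerated_failures(3)` is 1, while a 2-member majority-quorum cluster tolerates no failure at all.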

Page 10: OpenContrail Cloudwatt Feedback

Contrail in Cloudwatt

[Diagram. Labels: Neutron API ×2, Config ×2, Analytics ×2, Control ×2, WebUI ×2, IF-MAP ×2, XMPP, Cassandra ×2, AMQP + ZK, vrouter ×3]

Page 11: OpenContrail Cloudwatt Feedback

Issues on 1.06

● Difficult to operate, upgrade, and maintain without downtime
● Stability and compatibility of the Neutron-to-Contrail API translator
● Analytics does not work
● Some memory leaks on the compute nodes

Page 12: OpenContrail Cloudwatt Feedback

Upgrade to 1.10

● After nine months with 1.06
● New version to fix issues and bring new features (SNAT/LBaaS)
● Keep following upstream

Page 13: OpenContrail Cloudwatt Feedback

Upgrade to 1.10

Create a tool to monitor the Contrail cluster status
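The slide does not describe the tool itself. A minimal sketch of the idea, assuming the state of each service has already been collected into a dictionary. The node names, service names, and the "active" state string are hypothetical placeholders, not actual Contrail output.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot: node name -> {service name -> reported state}.
ClusterStatus = Dict[str, Dict[str, str]]


def find_unhealthy(status: ClusterStatus,
                   ok_state: str = "active") -> List[Tuple[str, str, str]]:
    """Return (node, service, state) for every service not in `ok_state`."""
    problems = []
    for node in sorted(status):
        for service in sorted(status[node]):
            state = status[node][service]
            if state != ok_state:
                problems.append((node, service, state))
    return problems


def is_healthy(status: ClusterStatus) -> bool:
    """True when no service reports a problem."""
    return not find_unhealthy(status)
```

During an upgrade, such a check could be run after each node is restarted, so that the next node is touched only once `is_healthy` returns True.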

Page 14: OpenContrail Cloudwatt Feedback

Upgrade to 1.10

We decided to do it in 2 steps:

1. Control plane (in one night)
  ○ Config (slave schema first)
  ○ Control
  ○ Analytics
  ○ WebUI
  ○ Neutron API

Page 15: OpenContrail Cloudwatt Feedback

Upgrade to 1.10

2. Data plane (over a few days)
  ○ upgrade/bootstrap spare compute nodes to 1.10 and add them to the available compute pool
  ○ remove all running 1.06 compute nodes from the available pool
  ○ give clients on those 1.06 nodes a time slot to move their VMs before upgrading each node to 1.10 (no live migration)
  ○ then open champagne bottles!
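The data-plane steps above can be sketched as a small simulation. This is only an illustration of the pool logic, not the tooling actually used: the `ComputeNode` class and both functions are hypothetical, and the client time slot is simplified to "the node no longer hosts any VM".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ComputeNode:
    name: str
    version: str
    vms: int = 0            # number of VMs still running on the node
    in_pool: bool = True    # whether new VMs may be scheduled here


def start_data_plane_upgrade(nodes: List[ComputeNode],
                             spares: List[ComputeNode],
                             old: str = "1.06", new: str = "1.10") -> None:
    """Bring upgraded spares into the pool, take old nodes out of it."""
    for spare in spares:
        spare.version = new
        spare.in_pool = True
        nodes.append(spare)
    for node in nodes:
        if node.version == old:
            node.in_pool = False   # running VMs stay up, no new VM lands here


def upgrade_drained_nodes(nodes: List[ComputeNode],
                          old: str = "1.06", new: str = "1.10") -> List[str]:
    """Upgrade old nodes that are empty and put them back in the pool."""
    upgraded = []
    for node in nodes:
        if node.version == old and node.vms == 0:
            node.version = new
            node.in_pool = True
            upgraded.append(node.name)
    return upgraded
```

Calling `upgrade_drained_nodes` repeatedly over the few days of the migration upgrades each 1.06 node as soon as its clients have moved their VMs away.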

Page 16: OpenContrail Cloudwatt Feedback

Bugs met during the upgrade

● vrouter 1.06 cannot coexist with 1.10 when using MPLSoUDP encapsulation => switch to MPLSoGRE during the cohabitation
● SNAT/LBaaS does not take the vrouter version into account
● The whole Contrail API slowed down because the Neutron Contrail plugin code moved from neutron-server to the Contrail API
● ZooKeeper timeouts
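The MPLSoUDP workaround from the first bullet can be written down as a simple rule. This is a sketch only: the function name is hypothetical, and the order returned for a single-version cluster is an assumption, not something stated on the slide.

```python
from typing import Iterable, List


def encapsulation_priority(vrouter_versions: Iterable[str]) -> List[str]:
    """Pick the encapsulation order for the versions present in the cluster."""
    versions = set(vrouter_versions)
    mixed = "1.06" in versions and "1.10" in versions
    if mixed:
        # From the slide: 1.06 and 1.10 vrouters cannot interoperate over
        # MPLSoUDP, so only MPLSoGRE is used during the cohabitation.
        return ["MPLSoGRE"]
    # Assumption: once a single version remains, MPLSoUDP can be preferred again.
    return ["MPLSoUDP", "MPLSoGRE"]
```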

Page 17: OpenContrail Cloudwatt Feedback

Bugs met after the upgrade

● Memory leak in the kernel module data path
● Hold flow count leak in the kernel module data path (workaround: restart the vrouter agent)
● 13 Cloudwatt patches added to the 1.10 upstream release: https://review.opencontrail.org/#/q/status:open+branch:R1.10,n,z

Page 18: OpenContrail Cloudwatt Feedback

Bugs that still persist on 1.10

● Schema slave->master switchover: ~20 mins
● Logging configuration
● Some 5xx errors still appear on the Contrail API
● Live upgrade of a compute node without downtime (do we need it?)

Page 19: OpenContrail Cloudwatt Feedback

My wishlist to Santa SDN

● That people make more use of https://blueprints.launchpad.net/opencontrail
● A stable master before a new branch is cut
● Use http://semver.org to number releases
● A more community-oriented Contrail team
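To illustrate the semver wish: under Semantic Versioning a release is numbered MAJOR.MINOR.PATCH, numeric parts have no leading zeros, and a MAJOR bump signals an incompatible change. A minimal sketch covering only the core MAJOR.MINOR.PATCH form (no pre-release or build metadata); the function names are illustrative.

```python
import re
from typing import Tuple

# Core semver form only: three dot-separated integers without leading zeros.
_SEMVER = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def parse_semver(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple that compares correctly."""
    match = _SEMVER.fullmatch(version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def is_breaking_upgrade(old: str, new: str) -> bool:
    """True when the MAJOR number increases between two versions."""
    return parse_semver(new)[0] > parse_semver(old)[0]
```

A number such as "1.06" is rejected by this parser, because it has only two parts and a leading zero.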

Page 20: OpenContrail Cloudwatt Feedback

2015 S2 to-do

● Improve the Neutron Contrail plugin code: https://review.opencontrail.org/10123
● Upgrade to the 2.x branch
● Build a CI/CD pipeline on master
  ○ build and deploy daily
  ○ run the OpenContrail sanity tests
  ○ run functional non-regression tests
  ○ run performance non-regression tests
● OpenStack L3VPN integration
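The daily CI/CD run listed above can be pictured as an ordered list of stages that stops at the first failure. A minimal sketch, assuming each stage is a callable returning True on success; the stage names and the `run_pipeline` helper are hypothetical, not an existing tool.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[Tuple[str, str]]:
    """Run stages in order; after the first failure the rest are skipped."""
    results = []
    failed = False
    for name, step in stages:
        if failed:
            results.append((name, "skipped"))
            continue
        ok = step()
        results.append((name, "passed" if ok else "failed"))
        failed = not ok
    return results
```

A daily job would pass stages such as build, deploy, sanity, functional non-regression, and performance non-regression, in that order, so that a broken build never reaches the longer test runs.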

Page 21: OpenContrail Cloudwatt Feedback

Questions?