
Networking Approaches in a Container World

Flavio Castelli, Engineering Manager, fcastelli@suse.com

Disclaimer

● There are many container engines; I’m going to focus on Docker

● Multiple networking solutions are available:

– Introduce the core concepts

– Many projects → cover only some of them

● Container orchestration engines:

– Tightly coupled with networking

– I’m going to focus on Docker Swarm and Kubernetes

● Remember: the container ecosystem moves at a fast pace, things can change suddenly

2

The problem

● Given:

– Containers are lightweight

– Containers are great for microservices

– Microservices: multiple distributed processes communicating

→ Lots of containers that need to be connected together

3

Single host

4

Reuse the host network

5

[Diagram: container-01 runs directly on the host’s network stack, sharing its interfaces (lo, eth0, ...)]

Container has full access to host’s interfaces!

Reuse the host network

6

$ docker run --rm --name container-01 --net=host -ti busybox /bin/sh
/ # ifconfig
docker0   Link encap:Ethernet  HWaddr xx:xx:xx:xx:xx:xx
          inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::x/64 Scope:Link
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:19888 errors:0 dropped:0 overruns:0 frame:0
          TX packets:19314 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:3063342 (2.9 MiB)  TX bytes:29045336 (27.6 MiB)

eth0      Link encap:Ethernet  HWaddr xx:xx:xx:xx:xx:xx
          inet addr:192.168.1.121  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::x/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:135513 errors:0 dropped:0 overruns:0 frame:0
          TX packets:109723 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:102680118 (97.9 MiB)  TX bytes:22766730 (21.7 MiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:230 errors:0 dropped:0 overruns:0 frame:0
          TX packets:230 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:37871 (36.9 KiB)  TX bytes:37871 (36.9 KiB)

Warning: the container can see and control all the host interfaces

Network bridge

7

[Diagram: container-01 and container-02 attached to the docker0 bridge (172.17.0.0/16), which is connected to the host’s eth0]

● An internal, virtual switch

● Containers are plugged into that switch

● Containers on the same bridge can talk to each other

● Users can create multiple bridges
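A minimal sketch of a user-defined bridge with two containers attached to it (the network and container names are made up):

$ docker network create --driver bridge my-bridge
$ docker run -d --name box-a --net=my-bridge busybox sleep 3600
$ docker run -d --name box-b --net=my-bridge busybox sleep 3600
# containers on the same bridge can reach each other
$ docker exec box-a ping -c 1 box-b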

Network bridge: as seen by the host

8

$ ifconfig docker0
docker0   Link encap:Ethernet  HWaddr xx:xx:xx:xx:xx:xx
          inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::42:a2ff:fe10:ccf7/64 Scope:Link
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:7 errors:0 dropped:0 overruns:0 frame:0
          TX packets:30 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:480 (480.0 B)  TX bytes:5025 (5.0 KB)

$ ip route
default via 192.168.1.1 dev wlan0 proto static metric 600
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
...

The 172.17.0.0/16 route handles traffic going from the host to the containers.

By default, docker0 is assigned the address 172.17.0.1.

How to expose a service

9

[Diagram: container-01 and container-02 attached to docker0 (172.17.0.0/16); port 80 of container-02 is mapped to port 8080 on the host’s eth0]

● Port 80 of container-02 is mapped to port 8080 of the host

● Risk: port exhaustion on the host
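A minimal sketch of such a port mapping, using nginx as a stand-in for container-02:

$ docker run -d --name container-02 -p 8080:80 nginx
# reaches port 80 inside the container through port 8080 on the host
$ curl http://localhost:8080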

Multi-host networking

10

Multi-host networking scenarios

11

[Diagram: containers 01–06 spread across host-A, host-B and host-C; a frontend network, an application network and a database network span the hosts through their eth0 interfaces]

Multi-host networking scenarios

12

[Diagram: the same six containers and the frontend, application and database networks consolidated on a single big host-A]

Multi-host networking scenarios

13

[Diagram: the same setup on a single big host-A, with the containers split across VM-1, VM-2 and VM-3]

Routing solutions

14

Routing approach

● Create a common IP space at the container level

● Assign a /24 subnet to each host

● Set up IP routes between the hosts

● Main projects:

– Calico

– Flannel

– Romana

15

Routing approach

16

[Diagram: host-A (192.168.1.2) runs container-01 (10.0.9.4) and container-02 (10.0.9.5) behind docker0 (10.0.9.1, subnet 10.0.9.0/24); host-B (192.168.1.3) runs container-03 (10.0.10.8) and container-04 (10.0.10.9) behind docker0 (10.0.10.1, subnet 10.0.10.0/24). Routing rule on host-A: 10.0.10.* goes through eth0. Routing rule on host-B: 10.0.10.* goes through docker0.]
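As a rough sketch, the routes from the diagram above could be created by hand like this (the addresses are taken from the diagram; projects such as Calico or flannel program equivalent routes automatically):

# host-A (192.168.1.2): reach host-B's container subnet via host-B
$ sudo ip route add 10.0.10.0/24 via 192.168.1.3 dev eth0

# host-B (192.168.1.3): reach host-A's container subnet via host-A
$ sudo ip route add 10.0.9.0/24 via 192.168.1.2 dev eth0

# both hosts must forward packets between their interfaces
$ sudo sysctl -w net.ipv4.ip_forward=1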

Calico's approach

17

[Diagram: the same two-host routing layout (host-A at 192.168.1.2 with 10.0.9.0/24, host-B at 192.168.1.3 with 10.0.10.0/24), this time managed by Calico]

● BGP:

– One of the protocols used to build the Internet

– Used to advertise routes

● Felix agent:

– Uses the kernel’s L3 forwarding capabilities

– Handles ACLs

Flannel's approach

18

[Diagram: the same two-host routing layout (host-A at 192.168.1.2 with 10.0.9.0/24, host-B at 192.168.1.3 with 10.0.10.0/24), this time managed by flannel]

● etcd:

– Stores the network configuration

– Stores the network topology

● flanneld process:

– Keeps routes up-to-date
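For illustration, flannel reads its configuration from etcd. A minimal sketch, assuming etcd v2 tooling and flannel's default /coreos.com/network/config key (the subnet values are made up; "host-gw" is the pure-routing backend):

$ etcdctl set /coreos.com/network/config \
    '{ "Network": "10.0.0.0/16", "SubnetLen": 24, "Backend": { "Type": "host-gw" } }'

flanneld on each host then leases a /24 out of that range and keeps the corresponding host routes up-to-date.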

Calico + flannel = Canal

● Collaboration announced on May 9th 2016

● Use Calico and flannel together

● Project still in its early days

19

Overlay solutions

20

Overlay network approach

● Create a parallel network for cross-host communication

● Connect hosts with encapsulation tunnels

● Connect containers to the virtual networks

● Main projects:

– Docker (native)

– Flannel

– Weave

21

Overlay network

22

[Diagram: host-A (192.168.1.2) and host-B (192.168.1.3) each run two containers behind docker0; both hosts share the same overlay subnet, 10.0.9.0/24]

Traffic leaving the host for a container in 10.0.9.x on another host is captured and encapsulated.

Overlay traffic on the wire, encapsulated (e.g. VXLAN):
outer Ethernet header (src/dst) | outer IP header (src/dst) | outer UDP header | VXLAN header | inner Ethernet frame

Overlay network and k/v store

Network state and configuration can be saved into a k/v store:

● Docker < 1.12: supports etcd, consul and zookeeper via libkv

● Docker >= 1.12: no external dependency, built-in component

● Flannel: etcd

● Weave: no external dependency, doesn’t use a k/v store at all
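As a quick sketch of the built-in case: on a Docker 1.12+ swarm, an overlay network can be created and consumed without any external store (the network and service names are illustrative):

$ docker network create --driver overlay my-overlay
$ docker service create --name redis --network my-overlay redis:3.2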

23

Overlay backends

● VXLAN:

– faster than UDP, traffic doesn't go to userspace

– Some hardware acceleration available

● UDP: can add encryption easily

24

            VXLAN   UDP
Docker      X       -
Weave       X       X*
Flannel     X       X*

* backed by custom protocol
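For context, VXLAN is a kernel feature the overlay backends build on. A hand-rolled sketch outside of any container project (the interface name, VNI and multicast group are arbitrary):

# create a VXLAN device carried over eth0, then give it an address on the overlay subnet
$ sudo ip link add vxlan0 type vxlan id 42 group 239.1.1.1 dev eth0 dstport 4789
$ sudo ip addr add 10.0.9.1/24 dev vxlan0
$ sudo ip link set vxlan0 up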

Routing vs Overlay

25

Routing

● Good:

– Native performance

– Easy debugging

● Bad:

– Requires control over the infrastructure

– Hybrid cloud is more complicated (requires a VPN)

– Can run out of addresses

Overlay

● Good:

– Easier inter-cloud

– Doesn’t require control over the infrastructure

● Bad:

– Inferior performance

– Debugging is more complicated

– No IP multicast (except for Weave)

How to use these projects

● Container Network Model (CNM):

– Specification used by Docker

– Plugins for: calico, weave, …

– Note well: Docker 1.12+ Swarm mode works only with the native overlay network driver

● Container Network Interface (CNI):

– Derived from rkt networking proposal

– Supported by rkt, kubernetes, Cloud Foundry, Mesos,…

– Support for: calico, flannel, weave,...
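For example, with Calico's libnetwork (CNM) plugin running, a Docker network backed by Calico could be created like this; the driver names calico and calico-ipam are taken from Calico's documentation of that era, so treat them as an assumption:

$ docker network create --driver calico --ipam-driver calico-ipam my-calico-net
$ docker run --rm -ti --net=my-calico-net busybox /bin/sh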

26

More troubles...

27

Are we done?

● Now we can:

– Connect containers running on different hosts

– React to network changes

● Is that enough? Unfortunately not...

28

Service discovery

● A container runs a service: producer

● A container accesses this service: consumer

● The consumer needs to find where the producer is located (IP address, in some cases even port #)

29

Challenge #1: find the producer

30

[Diagram: web-01 on host-A needs to reach redis-01 on host-B and asks: “Where is redis?”]

Challenge #2: react to changes

31

[Diagram: web-01 on host-A is connected to redis-01 on host-B; host-C sits idle]

“web” is already connected to “redis”

Challenge #2: react to changes

32

[Diagram: redis now runs as redis-02 on host-C; web-01 on host-A still points at host-B]

“redis” is moved to another host → different IP

“web” points to the old location → it’s broken

Challenge #2: react to changes

33

The link has to be reestablished

[Diagram: web-01 on host-A must now connect to redis-02 on host-C; host-B no longer runs redis]

Containers can be moved at any time:

● The producer can be moved to a different host

● The consumer should keep working

Challenge #3: multiple choices

34

Multiple instances of the “redis” image

[Diagram: web-01 on host-A, redis-01 on host-B, redis-02 on host-C]

Which redis?

Workloads can be scaled:

● More instances of the same producer

● How to choose between all of them?

Addressing service discovery

35

Use DNS

● Not a good solution:

– Containers can die or be moved around much more often than traditional servers

– Return DNS responses with a short TTL → more load on the server

– Some clients ignore TTL → old entries are cached

Note well:

● Docker < 1.11: updates /etc/hosts dynamically

● Docker >= 1.11: integrates a DNS server

36

Key-value store

● Rely on a k/v store (etcd, consul, zookeeper)

● The producer registers itself: IP, port #

● The orchestration engine hands this data to the consumer

● At run time, either:

– Change your application to read the data straight from the k/v store

– Rely on some helper that exposes the values via an environment file or a configuration file
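As an illustration of the registration step, a producer could write its endpoint into etcd under a made-up key (the path and payload below are hypothetical):

$ etcdctl set /services/redis/redis-01 '{"ip": "10.0.10.8", "port": 6379}'
# consumers, or a helper process, read the registered endpoints back
$ etcdctl ls /services/redis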

37

Handling changes & multiple choices

38

DIY solution

● Use a load balancer

● Point all the consumers to a load balancer

● Expose the producer(s) using the load balancer

● Configure the load balancer to react to changes

→ More moving pieces

39

Rely on the orchestration engine

● The service has a unique and stable IP address

● Consumers are pointed to the service

● Service redirects the request to one of the containers running the producer

● Traditional DNS can be added on top of it → no changes to legacy applications

● Feature offered by Kubernetes and Docker >= 1.12

40

Kubernetes and Swarm services

41

[Diagram: web-01 on host-A talks to the “redis” service through its VIP; the service forwards traffic to redis-01 on host-B or redis-02 on host-C]

● User declares a service

● Orchestration engine allocates a virtual IP address for it

● On each container node:

– iptables rules to handle VIP ↔ container IP translation

– A process keeps the iptables rules up-to-date
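A very simplified sketch of what such rules can look like; the VIP 10.254.0.10 and the backend 10.0.10.8 are made up, and real implementations (e.g. kube-proxy) generate more elaborate chains that load-balance across all backends:

# translate traffic addressed to the service VIP into one backend container's address
$ sudo iptables -t nat -A PREROUTING -d 10.254.0.10 -p tcp --dport 6379 \
      -j DNAT --to-destination 10.0.10.8:6379
$ sudo iptables -t nat -A OUTPUT -d 10.254.0.10 -p tcp --dport 6379 \
      -j DNAT --to-destination 10.0.10.8:6379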

Are we really done?

42

Ingress traffic

● Your production application is running inside a container cluster

● How to route customers’ requests to these containers?

● How to react to changes (containers moved, scaling,…)?

43

Kubernetes’ approach

Services can be of three different types:

● ClusterIP: virtual IP reachable only by containers inside of the cluster

● NodePort: ClusterIP + the service is exposed on all the nodes of the cluster on a specific port → <NodeIP>:<NodePort>

● LoadBalancer: NodePort + k8s allocates a load balancer using the underlying cloud provider, then configures it and keeps it up-to-date
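A minimal sketch of exposing a workload as a NodePort service (the deployment name "redis" is illustrative):

$ kubectl expose deployment redis --port=6379 --type=NodePort
# shows the ClusterIP and the allocated node port
$ kubectl get service redis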

44

Docker 1.12 approach

● Define a service using the `--publish` flag

● The service is exposed on all the nodes of the cluster on a specific port → <NodeIP>:<ServicePort>
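For example, publishing a service on every node of a 1.12 swarm (image and ports are illustrative):

$ docker service create --name blog --publish 8080:80 nginx

Any node of the cluster then accepts connections on port 8080 and routes them to a container running the service.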

45

Ingress traffic flow

46

[Diagram: an external load balancer in front of http://guestbook.com picks one of host-A, host-B or host-C; the guestbook service is published on port 8081 and the blog service on port 8080 on every node, even nodes that don’t run the corresponding container]

● Load balancer picks a container host

● Traffic is handled by the internal service

● Works even when the node chosen by the load balancer is not running the container

Recap

47

                Calico     Docker built-in   Flannel            Weave
Approach        routing    overlay           routing, overlay   overlay
Specification   CNI, CNM   CNM               CNI, CNM           CNI, CNM

It’s not just a matter of connecting containers:

● Service discovery

● Handling changes & multiple choices

● Handling ingress traffic

Questions?

48