[2C4]Clustered computing with CoreOS, fleet and etcd
-
Upload
naver-d2 -
Category
Technology
-
view
995 -
download
1
description
Transcript of [2C4]Clustered computing with CoreOS, fleet and etcd
![Page 1: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/1.jpg)
Clustered computing with CoreOS, fleet and etcd
DEVIEW 2014
Jonathan BoulleCoreOS, Inc.
![Page 2: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/2.jpg)
@baronboulle
jonboulle
Who am I?
Jonathan Boulle
![Page 3: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/3.jpg)
South Africa -> Australia -> London -> San Francisco
@baronboulle
jonboulle
Who am I?
Jonathan Boulle
![Page 4: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/4.jpg)
South Africa -> Australia -> London -> San Francisco
Red Hat -> Twitter -> CoreOS
@baronboulle
jonboulle
Who am I?
Jonathan Boulle
![Page 5: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/5.jpg)
South Africa -> Australia -> London -> San Francisco
Red Hat -> Twitter -> CoreOS
Linux, Python, Go, FOSS
@baronboulle
jonboulle
Who am I?
Jonathan Boulle
![Page 6: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/6.jpg)
Agenda
![Page 7: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/7.jpg)
Agenda
● CoreOS Linux
![Page 8: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/8.jpg)
Agenda
● CoreOS Linux– Securing the internet
![Page 9: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/9.jpg)
Agenda
● CoreOS Linux– Securing the internet– Application containers
![Page 10: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/10.jpg)
Agenda
● CoreOS Linux– Securing the internet– Application containers– Automatic updates
![Page 11: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/11.jpg)
Agenda
● CoreOS Linux– Securing the internet– Application containers– Automatic updates
● fleet
![Page 12: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/12.jpg)
Agenda
● CoreOS Linux– Securing the internet– Application containers– Automatic updates
● fleet– cluster-level init system
![Page 13: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/13.jpg)
Agenda
● CoreOS Linux– Securing the internet– Application containers– Automatic updates
● fleet– cluster-level init system– etcd + systemd
![Page 14: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/14.jpg)
Agenda
● CoreOS Linux– Securing the internet– Application containers– Automatic updates
● fleet– cluster-level init system– etcd + systemd
● fleet and...
![Page 15: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/15.jpg)
Agenda
● CoreOS Linux– Securing the internet– Application containers– Automatic updates
● fleet– cluster-level init system– etcd + systemd
● fleet and...– systemd: the good, the bad
![Page 16: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/16.jpg)
Agenda
● CoreOS Linux– Securing the internet– Application containers– Automatic updates
● fleet– cluster-level init system– etcd + systemd
● fleet and...– systemd: the good, the bad
– etcd: the good, the bad
![Page 17: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/17.jpg)
Agenda
● CoreOS Linux– Securing the internet– Application containers– Automatic updates
● fleet– cluster-level init system– etcd + systemd
● fleet and...– systemd: the good, the bad
– etcd: the good, the bad
– Golang: the good, the bad
![Page 18: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/18.jpg)
Agenda
● CoreOS Linux– Securing the internet– Application containers– Automatic updates
● fleet– cluster-level init system– etcd + systemd
● fleet and...– systemd: the good, the bad
– etcd: the good, the bad
– Golang: the good, the bad
● Q&A
![Page 19: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/19.jpg)
CoreOS Linux
![Page 20: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/20.jpg)
CoreOS Linux
![Page 21: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/21.jpg)
CoreOS Linux
A minimal, automatically-updated Linux distribution,
designed for distributed systems.
![Page 22: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/22.jpg)
Why?
![Page 23: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/23.jpg)
Why?
● CoreOS mission: “Secure the internet”
![Page 24: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/24.jpg)
Why?
● CoreOS mission: “Secure the internet”● Status quo: set up a server and never touch it
![Page 25: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/25.jpg)
Why?
● CoreOS mission: “Secure the internet”● Status quo: set up a server and never touch it● Internet is full of servers running years-old
software with dozens of vulnerabilities
![Page 26: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/26.jpg)
Why?
● CoreOS mission: “Secure the internet”● Status quo: set up a server and never touch it● Internet is full of servers running years-old
software with dozens of vulnerabilities● CoreOS: make updating the default, seamless
option
![Page 27: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/27.jpg)
Why?
● CoreOS mission: “Secure the internet”● Status quo: set up a server and never touch it● Internet is full of servers running years-old
software with dozens of vulnerabilities● CoreOS: make updating the default, seamless
option● Regular
![Page 28: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/28.jpg)
Why?
● CoreOS mission: “Secure the internet”● Status quo: set up a server and never touch it● Internet is full of servers running years-old software
with dozens of vulnerabilities● CoreOS: make updating the default, seamless option
● Regular● Reliable
![Page 29: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/29.jpg)
Why?
● CoreOS mission: “Secure the internet”● Status quo: set up a server and never touch it● Internet is full of servers running years-old software
with dozens of vulnerabilities● CoreOS: make updating the default, seamless option
● Regular● Reliable● Automatic
![Page 30: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/30.jpg)
How do we achieve this?
![Page 31: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/31.jpg)
How do we achieve this?
● Containerization of applications
![Page 32: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/32.jpg)
How do we achieve this?
● Containerization of applications● Self-updating operating system
![Page 33: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/33.jpg)
How do we achieve this?
● Containerization of applications● Self-updating operating system● Distributed systems tooling to make applications
resilient to updates
![Page 34: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/34.jpg)
KERNELSYSTEMDSSHDOCKER
PYTHONJAVANGINXMYSQLOPENSSL
dist
ro d
istr
o di
stro
dis
tro
dist
ro d
istr
o di
stro
dis
tro
dist
ro d
istr
o di
stro
APP
![Page 35: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/35.jpg)
KERNELSYSTEMDSSHDOCKER
PYTHONJAVANGINXMYSQLOPENSSL
dist
ro d
istr
o di
stro
dis
tro
dist
ro d
istr
o di
stro
dis
tro
dist
ro d
istr
o di
stro
APP
![Page 36: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/36.jpg)
KERNELSYSTEMDSSHDOCKER
PYTHONJAVANGINXMYSQLOPENSSL
dist
ro d
istr
o di
stro
dis
tro
dist
ro d
istr
o di
stro
dis
tro
dist
ro d
istr
o di
stro
APP
Application Containers (e.g. Docker)
![Page 37: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/37.jpg)
KERNELSYSTEMDSSHDOCKER
- Minimal base OS (~100MB)- Vanilla upstream components wherever possible- Decoupled from applications- Automatic, atomic updates
![Page 38: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/38.jpg)
Automatic updates
![Page 39: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/39.jpg)
Automatic updates
How do updates work?
![Page 40: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/40.jpg)
Automatic updates
How do updates work?● Omaha protocol (check-in/retrieval)
![Page 41: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/41.jpg)
Automatic updates
How do updates work?● Omaha protocol (check-in/retrieval)
– Simple XML-over-HTTP protocol developed by Google to facilitate polling and pulling updates from a server
![Page 42: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/42.jpg)
Client sends application id andcurrent version to the update server
Omaha protocol
<request protocol="3.0" version="CoreOSUpdateEngine-0.1.0.0"> <app appid="{e96281a6-d1af-4bde-9a0a-97b76e56dc57}" version="410.0.0" track="alpha" from_track="alpha"> <event eventtype="3"></event> </app></request>
![Page 43: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/43.jpg)
Client sends application id andcurrent version to the update server
Update server responds with theURL of an update to be applied
Omaha protocol
<url codebase="https://commondatastorage.googleapis.com/update-storage.core-os.net/amd64-usr/452.0.0/"></url> <package hash="D0lBAMD1Fwv8YqQuDYEAjXw6YZY=" name="update.gz" size="103401155" required="false">
![Page 44: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/44.jpg)
Client sends application id andcurrent version to the update server
Update server responds with theURL of an update to be applied
Client downloads data, verifieshash & cryptographic signature,and applies the update
Omaha protocol
![Page 45: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/45.jpg)
Client sends application id andcurrent version to the update server
Update server responds with theURL of an update to be applied
Client downloads data, verifieshash & cryptographic signature,and applies the update
Updater exits with response codethen reports the update to theupdate server
Omaha protocol
![Page 46: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/46.jpg)
Automatic updates
![Page 47: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/47.jpg)
Automatic updates
How do updates work?● Omaha protocol (check-in/retrieval)
– Simple XML-over-HTTP protocol developed by Google to facilitate polling and pulling updates from a server
● Active/passive read-only root partitions
![Page 48: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/48.jpg)
Automatic updates
How do updates work?● Omaha protocol (check-in/retrieval)
– Simple XML-over-HTTP protocol developed by Google to facilitate polling and pulling updates from a server
● Active/passive read-only root partitions– One for running the live system, one for updates
![Page 49: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/49.jpg)
Booted off partition A.Download update and committo partition B. Change GPT.
Active/passive root partitions
![Page 50: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/50.jpg)
Reboot (into Partition B).If tests succeed, continue normal operation and mark success in GPT.
Active/passive root partitions
![Page 51: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/51.jpg)
But what if partition B fails update tests...
Active/passive root partitions
![Page 52: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/52.jpg)
Change GPT to point to previous partition, reboot.Try update again later.
Active/passive root partitions
![Page 53: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/53.jpg)
Active/passive root partitions
core-01 ~ # cgpt show /dev/sda3 start size contents 264192 2097152 Label: "USR-A" Type: Alias for coreos-rootfs UUID: 7130C94A-213A-4E5A-8E26-6CCE9662 Attr: priority=1 tries=0 successful=1
core-01 ~ # cgpt show /dev/sda4 start size contents2492416 2097152 Label: "USR-B" Type: Alias for coreos-rootfs UUID: E03DD35C-7C2D-4A47-B3FE-27F15780A Attr: priority=2 tries=1 successful=0
![Page 54: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/54.jpg)
Active/passive root partitions
core-01 ~ # cgpt show /dev/sda3 start size contents 264192 2097152 Label: "USR-A" Type: Alias for coreos-rootfs UUID: 7130C94A-213A-4E5A-8E26-6CCE9662 Attr: priority=1 tries=0 successful=1
core-01 ~ # cgpt show /dev/sda4 start size contents2492416 2097152 Label: "USR-B" Type: Alias for coreos-rootfs UUID: E03DD35C-7C2D-4A47-B3FE-27F15780A Attr: priority=2 tries=1 successful=0
![Page 55: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/55.jpg)
Active/passive root partitions
core-01 ~ # cgpt show /dev/sda3 start size contents 264192 2097152 Label: "USR-A" Type: Alias for coreos-rootfs UUID: 7130C94A-213A-4E5A-8E26-6CCE9662 Attr: priority=1 tries=0 successful=1
core-01 ~ # cgpt show /dev/sda4 start size contents2492416 2097152 Label: "USR-B" Type: Alias for coreos-rootfs UUID: E03DD35C-7C2D-4A47-B3FE-27F15780A Attr: priority=2 tries=1 successful=0
![Page 56: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/56.jpg)
Active/passive /usr partitions
core-01 ~ # cgpt show /dev/sda3 start size contents 264192 2097152 Label: "USR-A" Type: Alias for coreos-rootfs UUID: 7130C94A-213A-4E5A-8E26-6CCE9662 Attr: priority=1 tries=0 successful=1
core-01 ~ # cgpt show /dev/sda4 start size contents2492416 2097152 Label: "USR-B" Type: Alias for coreos-rootfs UUID: E03DD35C-7C2D-4A47-B3FE-27F15780A Attr: priority=2 tries=1 successful=0
![Page 57: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/57.jpg)
Active/passive /usr partitions
![Page 58: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/58.jpg)
Active/passive /usr partitions
● Single image containing most of the OS
![Page 59: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/59.jpg)
Active/passive /usr partitions
● Single image containing most of the OS– Mounted read-only onto /usr
![Page 60: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/60.jpg)
Active/passive /usr partitions
● Single image containing most of the OS– Mounted read-only onto /usr
– / is mounted read-write on top (persistent data)
![Page 61: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/61.jpg)
Active/passive /usr partitions
● Single image containing most of the OS– Mounted read-only onto /usr
– / is mounted read-write on top (persistent data)– Parts of /etc generated dynamically at boot
![Page 62: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/62.jpg)
Active/passive /usr partitions
● Single image containing most of the OS– Mounted read-only onto /usr
– / is mounted read-write on top (persistent data)– Parts of /etc generated dynamically at boot– A lot of work moving default configs from /etc to /usr
![Page 63: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/63.jpg)
Active/passive /usr partitions
● Single image containing most of the OS– Mounted read-only onto /usr
– / is mounted read-write on top (persistent data)– Parts of /etc generated dynamically at boot– A lot of work moving default configs from /etc to /usr
# /etc/nsswitch.conf:
passwd: files usrfilesshadow: files usrfilesgroup: files usrfiles
![Page 64: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/64.jpg)
Active/passive /usr partitions
● Single image containing most of the OS– Mounted read-only onto /usr
– / is mounted read-write on top (persistent data)– Parts of /etc generated dynamically at boot– A lot of work moving default configs from /etc to /usr
# /etc/nsswitch.conf:
passwd: files usrfilesshadow: files usrfilesgroup: files usrfiles
/usr/share/baselayout/passwd/usr/share/baselayout/shadow/usr/share/baselayout/group
![Page 65: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/65.jpg)
Atomic updates
![Page 66: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/66.jpg)
Atomic updates
● Entire OS is a single read-only imagecore-01 ~ # touch /usr/bin/foo touch: cannot touch '/usr/bin/foo': Read-only file system
![Page 67: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/67.jpg)
Atomic updates
● Entire OS is a single read-only imagecore-01 ~ # touch /usr/bin/foo touch: cannot touch '/usr/bin/foo': Read-only file system
– Easy to verify cryptographically● sha1sum on AWS or bare metal gives the same result
![Page 68: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/68.jpg)
Atomic updates
● Entire OS is a single read-only imagecore-01 ~ # touch /usr/bin/foo touch: cannot touch '/usr/bin/foo': Read-only file system
– Easy to verify cryptographically● sha1sum on AWS or bare metal gives the same result
– No chance of inconsistencies due to partial updates● e.g. pull a plug on a CentOS system during a yum update● At large scale, such events are inevitable
![Page 69: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/69.jpg)
Automatic, atomic updates are great! But...
![Page 70: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/70.jpg)
● Problem:
Automatic, atomic updates are great! But...
![Page 71: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/71.jpg)
● Problem: – updates still require a reboot (to use new kernel and
mount new filesystem)
Automatic, atomic updates are great! But...
![Page 72: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/72.jpg)
● Problem: – updates still require a reboot (to use new kernel and
mount new filesystem)– Reboots cause application downtime...
Automatic, atomic updates are great! But...
![Page 73: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/73.jpg)
● Problem: – updates still require a reboot (to use new kernel and
mount new filesystem)– Reboots cause application downtime...
● Solution: fleet
Automatic, atomic updates are great! But...
![Page 74: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/74.jpg)
● Problem: – updates still require a reboot (to use new kernel and
mount new filesystem)– Reboots cause application downtime...
● Solution: fleet– Highly available, fault tolerant, distributed process
scheduler
Automatic, atomic updates are great! But...
![Page 75: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/75.jpg)
● Problem: – updates still require a reboot (to use new kernel and
mount new filesystem)– Reboots cause application downtime...
● Solution: fleet– Highly available, fault tolerant, distributed process
scheduler ● ... The way to run applications on a CoreOS cluster
Automatic, atomic updates are great! But...
![Page 76: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/76.jpg)
● Problem: – updates still require a reboot (to use new kernel and mount
new filesystem)– Reboots cause application downtime...
● Solution: fleet– Highly available, fault tolerant, distributed process scheduler
● ... The way to run applications on a CoreOS cluster– fleet keeps applications running during server downtime
Automatic, atomic updates are great! But...
![Page 77: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/77.jpg)
fleet – the “cluster-level init system”
![Page 78: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/78.jpg)
fleet – the “cluster-level init system”
fleet is the abstraction between machine and application:
![Page 79: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/79.jpg)
fleet – the “cluster-level init system”
fleet is the abstraction between machine and application:– init system manages process on a machine
– fleet manages applications on a cluster of machines
![Page 80: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/80.jpg)
fleet – the “cluster-level init system”
fleet is the abstraction between machine and application:– init system manages process on a machine
– fleet manages applications on a cluster of machines Similar to Mesos, but very different architecture
(e.g. based on etcd/Raft, not Zookeeper/Paxos)
![Page 81: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/81.jpg)
fleet – the “cluster-level init system”
fleet is the abstraction between machine and application:– init system manages process on a machine– fleet manages applications on a cluster of machines
Similar to Mesos, but very different architecture (e.g. based on etcd/Raft, not Zookeeper/Paxos)
Uses systemd for machine-level process management, etcd for cluster-level co-ordination
![Page 82: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/82.jpg)
fleet – low level view
![Page 83: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/83.jpg)
fleet – low level view
fleetd binary (running on all CoreOS nodes)
![Page 84: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/84.jpg)
fleet – low level view
fleetd binary (running on all CoreOS nodes)
– encapsulates two roles:
![Page 85: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/85.jpg)
fleet – low level view
fleetd binary (running on all CoreOS nodes)
– encapsulates two roles:
• engine (cluster-level unit scheduling – talks to etcd)
![Page 86: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/86.jpg)
fleet – low level view
fleetd binary (running on all CoreOS nodes)
– encapsulates two roles:
• engine (cluster-level unit scheduling – talks to etcd)
• agent (local unit management – talks to etcd and systemd)
![Page 87: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/87.jpg)
fleet – low level view
fleetd binary (running on all CoreOS nodes)
– encapsulates two roles:
• engine (cluster-level unit scheduling – talks to etcd)
• agent (local unit management – talks to etcd and systemd) fleetctl command-line administration tool
![Page 88: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/88.jpg)
fleet – low level view
fleetd binary (running on all CoreOS nodes)
– encapsulates two roles:
• engine (cluster-level unit scheduling – talks to etcd)
• agent (local unit management – talks to etcd and systemd) fleetctl command-line administration tool
– create, destroy, start, stop units
![Page 89: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/89.jpg)
fleet – low level view
fleetd binary (running on all CoreOS nodes)
– encapsulates two roles:
• engine (cluster-level unit scheduling – talks to etcd)
• agent (local unit management – talks to etcd and systemd) fleetctl command-line administration tool
– create, destroy, start, stop units
– Retrieve current status of units/machines in the cluster
![Page 90: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/90.jpg)
fleet – low level view
fleetd binary (running on all CoreOS nodes)
– encapsulates two roles:
• engine (cluster-level unit scheduling – talks to etcd)• agent (local unit management – talks to etcd and systemd)
fleetctl command-line administration tool
– create, destroy, start, stop units
– Retrieve current status of units/machines in the cluster HTTP API
![Page 91: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/91.jpg)
fleet – high level view
Single machine
Cluster
![Page 92: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/92.jpg)
systemd
![Page 93: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/93.jpg)
systemd
Linux init system (PID 1) – manages processes
![Page 94: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/94.jpg)
systemd
Linux init system (PID 1) – manages processes– Relatively new, replaces SysVinit, upstart, OpenRC, ...
![Page 95: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/95.jpg)
systemd
Linux init system (PID 1) – manages processes– Relatively new, replaces SysVinit, upstart, OpenRC, ...
– Being adopted by all major Linux distributions
![Page 96: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/96.jpg)
systemd
Linux init system (PID 1) – manages processes– Relatively new, replaces SysVinit, upstart, OpenRC, ...
– Being adopted by all major Linux distributions Fundamental concept is the unit
![Page 97: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/97.jpg)
systemd
Linux init system (PID 1) – manages processes– Relatively new, replaces SysVinit, upstart, OpenRC, ...
– Being adopted by all major Linux distributions Fundamental concept is the unit
– Units include services (e.g. applications), mount points, sockets, timers, etc.
![Page 98: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/98.jpg)
systemd
Linux init system (PID 1) – manages processes– Relatively new, replaces SysVinit, upstart, OpenRC, ...– Being adopted by all major Linux distributions
Fundamental concept is the unit– Units include services (e.g. applications), mount points,
sockets, timers, etc.– Each unit configured with a simple unit file
![Page 99: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/99.jpg)
Quick comparison
![Page 100: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/100.jpg)
fleet + systemd
![Page 101: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/101.jpg)
fleet + systemd
systemd exposes a D-Bus interface
![Page 102: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/102.jpg)
fleet + systemd
systemd exposes a D-Bus interface– D-Bus: message bus system for IPC on Linux
![Page 103: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/103.jpg)
fleet + systemd
systemd exposes a D-Bus interface– D-Bus: message bus system for IPC on Linux
– One-to-one messaging (methods), plus pub/sub abilities
![Page 104: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/104.jpg)
fleet + systemd
systemd exposes a D-Bus interface– D-Bus: message bus system for IPC on Linux
– One-to-one messaging (methods), plus pub/sub abilities fleet uses godbus to communicate with systemd
![Page 105: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/105.jpg)
fleet + systemd
systemd exposes a D-Bus interface– D-Bus: message bus system for IPC on Linux
– One-to-one messaging (methods), plus pub/sub abilities fleet uses godbus to communicate with systemd
– Sending commands: StartUnit, StopUnit
![Page 106: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/106.jpg)
fleet + systemd
systemd exposes a D-Bus interface– D-Bus: message bus system for IPC on Linux– One-to-one messaging (methods), plus pub/sub abilities
fleet uses godbus to communicate with systemd– Sending commands: StartUnit, StopUnit
– Retrieving current state of units (to publish to the cluster)
![Page 107: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/107.jpg)
systemd is great!
![Page 108: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/108.jpg)
systemd is great!
Automatically handles:
![Page 109: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/109.jpg)
systemd is great!
Automatically handles:– Process daemonization
![Page 110: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/110.jpg)
systemd is great!
Automatically handles:– Process daemonization
– Resource isolation/containment (cgroups)
![Page 111: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/111.jpg)
systemd is great!
Automatically handles:– Process daemonization
– Resource isolation/containment (cgroups)• e.g. MemoryLimit=512M
![Page 112: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/112.jpg)
systemd is great!
Automatically handles:– Process daemonization
– Resource isolation/containment (cgroups)• e.g. MemoryLimit=512M
– Health-checking, restarting failed services
![Page 113: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/113.jpg)
systemd is great!
Automatically handles:– Process daemonization
– Resource isolation/containment (cgroups)• e.g. MemoryLimit=512M
– Health-checking, restarting failed services
– Logging (journal)
![Page 114: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/114.jpg)
systemd is great!
Automatically handles:– Process daemonization
– Resource isolation/containment (cgroups)• e.g. MemoryLimit=512M
– Health-checking, restarting failed services
– Logging (journal) • applications can just write to stdout, systemd adds metadata
![Page 115: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/115.jpg)
systemd is great!
Automatically handles:– Process daemonization– Resource isolation/containment (cgroups)
• e.g. MemoryLimit=512M
– Health-checking, restarting failed services– Logging (journal)
• applications can just write to stdout, systemd adds metadata
– Timers, inter-unit dependencies, socket activation, ...
![Page 116: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/116.jpg)
fleet + systemd
![Page 117: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/117.jpg)
fleet + systemd
systemd takes care of things so we don't have to
![Page 118: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/118.jpg)
fleet + systemd
systemd takes care of things so we don't have to fleet configuration is just systemd unit files
![Page 119: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/119.jpg)
fleet + systemd
systemd takes care of things so we don't have to fleet configuration is just systemd unit files fleet extends systemd to the cluster-level, and adds
some features of its own (using [X-Fleet])
![Page 120: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/120.jpg)
fleet + systemd
systemd takes care of things so we don't have to fleet configuration is just systemd unit files fleet extends systemd to the cluster-level, and adds
some features of its own (using [X-Fleet])– Template units (run n identical copies of a unit)
![Page 121: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/121.jpg)
fleet + systemd
systemd takes care of things so we don't have to fleet configuration is just systemd unit files fleet extends systemd to the cluster-level, and adds
some features of its own (using [X-Fleet])– Template units (run n identical copies of a unit)
– Global units (run a unit everywhere in the cluster)
![Page 122: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/122.jpg)
fleet + systemd
systemd takes care of things so we don't have to fleet configuration is just systemd unit files fleet extends systemd to the cluster-level, and adds
some features of its own (using [X-Fleet])– Template units (run n identical copies of a unit)– Global units (run a unit everywhere in the cluster)– Machine metadata (run only on certain machines)
![Page 123: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/123.jpg)
systemd is... not so great
![Page 124: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/124.jpg)
systemd is... not so great
Problem: unreliable pub/sub
![Page 125: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/125.jpg)
systemd is... not so great
Problem: unreliable pub/sub– fleet agent initially used a systemd D-Bus subscription
to track unit status
![Page 126: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/126.jpg)
systemd is... not so great
Problem: unreliable pub/sub– fleet agent initially used a systemd D-Bus subscription
to track unit status
– Every change in unit state in systemd triggers an event in fleet (e.g. “publish this new state to the cluster”)
![Page 127: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/127.jpg)
systemd is... not so great
Problem: unreliable pub/sub– fleet agent initially used a systemd D-Bus subscription
to track unit status
– Every change in unit state in systemd triggers an event in fleet (e.g. “publish this new state to the cluster”)
– Under heavy load, or byzantine conditions, unit state changes would be dropped
![Page 128: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/128.jpg)
systemd is... not so great
Problem: unreliable pub/sub– fleet agent initially used a systemd D-Bus subscription
to track unit status– Every change in unit state in systemd triggers an event
in fleet (e.g. “publish this new state to the cluster”)– Under heavy load, or byzantine conditions, unit state
changes would be dropped– As a result, unit state in the cluster became stale
![Page 129: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/129.jpg)
systemd is... not so great
![Page 130: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/130.jpg)
systemd is... not so great
Problem: unreliable pub/sub
![Page 131: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/131.jpg)
systemd is... not so great
Problem: unreliable pub/sub Solution: polling for unit states
![Page 132: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/132.jpg)
systemd is... not so great
Problem: unreliable pub/sub Solution: polling for unit states
– Every n seconds, retrieve state of units from systemd, and synchronize with cluster
![Page 133: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/133.jpg)
systemd is... not so great
Problem: unreliable pub/sub Solution: polling for unit states
– Every n seconds, retrieve state of units from systemd, and synchronize with cluster
– Less efficient, but much more reliable
![Page 134: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/134.jpg)
systemd is... not so great
Problem: unreliable pub/sub Solution: polling for unit states
– Every n seconds, retrieve state of units from systemd, and synchronize with cluster
– Less efficient, but much more reliable
– Optimize by caching state and only publishing changes
![Page 135: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/135.jpg)
systemd is... not so great
Problem: unreliable pub/sub Solution: polling for unit states
– Every n seconds, retrieve state of units from systemd, and synchronize with cluster
– Less efficient, but much more reliable– Optimize by caching state and only publishing changes– Any state inconsistencies are quickly fixed
![Page 136: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/136.jpg)
systemd (and docker) are... not so great
![Page 137: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/137.jpg)
systemd (and docker) are... not so great
Problem: poor integration with Docker
![Page 138: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/138.jpg)
systemd (and docker) are... not so great
Problem: poor integration with Docker– Docker is de facto application container manager
![Page 139: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/139.jpg)
systemd (and docker) are... not so great
Problem: poor integration with Docker– Docker is de facto application container manager
– Docker and systemd do not always play nicely together..
![Page 140: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/140.jpg)
systemd (and docker) are... not so great
Problem: poor integration with Docker– Docker is de facto application container manager– Docker and systemd do not always play nicely together..– Both Docker and systemd manage cgroups and
processes, and when the two are trying to manage the same thing, the results are mixed
![Page 141: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/141.jpg)
systemd (and docker) are... not so great
![Page 142: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/142.jpg)
systemd (and docker) are... not so great
Example: sending signals to a container
![Page 143: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/143.jpg)
systemd (and docker) are... not so great
Example: sending signals to a container– Given a simple container:
[Service]ExecStart=/usr/bin/docker run busybox /bin/bash -c \ "while true; do echo Hello World; sleep 1; done"
![Page 144: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/144.jpg)
systemd (and docker) are... not so great
Example: sending signals to a container– Given a simple container:
[Service]ExecStart=/usr/bin/docker run busybox /bin/bash -c \ "while true; do echo Hello World; sleep 1; done"
– Try to kill it with systemctl kill hello.service
![Page 145: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/145.jpg)
systemd (and docker) are... not so great
Example: sending signals to a container– Given a simple container:
[Service]ExecStart=/usr/bin/docker run busybox /bin/bash -c \ "while true; do echo Hello World; sleep 1; done"
– Try to kill it with systemctl kill hello.service– ... Nothing happens
![Page 146: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/146.jpg)
systemd (and docker) are... not so great
Example: sending signals to a container– Given a simple container:
[Service]ExecStart=/usr/bin/docker run busybox /bin/bash -c \ "while true; do echo Hello World; sleep 1; done"
– Try to kill it with systemctl kill hello.service– ... Nothing happens– Kill command sends SIGTERM, but bash in a Docker
container has PID1, which happily ignores the signal...
![Page 147: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/147.jpg)
systemd (and docker) are... not so great
![Page 148: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/148.jpg)
systemd (and docker) are... not so great
Example: sending signals to a container
![Page 149: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/149.jpg)
systemd (and docker) are... not so great
Example: sending signals to a container OK, SIGTERM didn't work, so escalate to SIGKILL:
systemctl kill -s SIGKILL hello.service
![Page 150: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/150.jpg)
systemd (and docker) are... not so great
Example: sending signals to a container OK, SIGTERM didn't work, so escalate to SIGKILL:
systemctl kill -s SIGKILL hello.service
● Now the systemd service is gone: hello.service: main process exited, code=killed, status=9/KILL
![Page 151: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/151.jpg)
systemd (and docker) are... not so great
Example: sending signals to a container OK, SIGTERM didn't work, so escalate to SIGKILL:
systemctl kill -s SIGKILL hello.service
● Now the systemd service is gone: hello.service: main process exited, code=killed, status=9/KILL
● But... the Docker container still exists?# docker ps CONTAINER ID COMMAND STATUS NAMES7c7cf8ffabb6 /bin/sh -c 'while tr Up 31 seconds hello # ps -ef|grep '[d]ocker run'root 24231 1 0 03:49 ? 00:00:00 /usr/bin/docker run -name hello ...
![Page 152: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/152.jpg)
systemd (and docker) are... not so great
![Page 153: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/153.jpg)
systemd (and docker) are... not so great
Why?
![Page 154: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/154.jpg)
systemd (and docker) are... not so great
Why? – Docker client does not run containers itself; it just sends
a command to the Docker daemon which actually forks and starts running the container
![Page 155: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/155.jpg)
systemd (and docker) are... not so great
Why? – Docker client does not run containers itself; it just sends
a command to the Docker daemon which actually forks and starts running the container
– systemd expects processes to fork directly so they will be contained under the same cgroup tree
![Page 156: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/156.jpg)
systemd (and docker) are... not so great
Why? – Docker client does not run containers itself; it just sends
a command to the Docker daemon which actually forks and starts running the container
– systemd expects processes to fork directly so they will be contained under the same cgroup tree
– Since the Docker daemon's cgroup is entirely separate, systemd cannot keep track of the forked container
![Page 157: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/157.jpg)
systemd (and docker) are... not so great
# systemctl cat hello.service[Service]ExecStart=/bin/bash -c 'while true; do echo Hello World; sleep 1; done'
# systemd-cgls... ├─hello.service │ ├─23201 /bin/bash -c while true; do echo Hello World; sleep 1; done │ └─24023 sleep 1
![Page 158: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/158.jpg)
systemd (and docker) are... not so great
# systemctl cat hello.service[Service]ExecStart=/usr/bin/docker run -name hello busybox /bin/sh -c \"while true; do echo Hello World; sleep 1; done"
# systemd-cgls...│ ├─hello.service│ │ └─24231 /usr/bin/docker run -name hello busybox /bin/sh -c while true; do echo Hello World; sleep 1; done...│ ├─docker-51a57463047b65487ec80a1dc8b8c9ea14a396c7a49c1e23919d50bdafd4fefb.scope│ │ ├─24240 /bin/sh -c while true; do echo Hello World; sleep 1; done│ │ └─24553 sleep 1
![Page 159: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/159.jpg)
systemd (and docker) are... not so great
![Page 160: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/160.jpg)
systemd (and docker) are... not so great
Problem: poor integration with Docker Solution: ... work in progress
![Page 161: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/161.jpg)
systemd (and docker) are... not so great
Problem: poor integration with Docker Solution: ... work in progress
– systemd-docker – small application that moves cgroups of Docker containers back under systemd's cgroup
![Page 162: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/162.jpg)
systemd (and docker) are... not so great
Problem: poor integration with Docker Solution: ... work in progress
– systemd-docker – small application that moves cgroups of Docker containers back under systemd's cgroup
– Use Docker for image management, but systemd-nspawn for runtime (e.g. CoreOS's toolbox)
![Page 163: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/163.jpg)
systemd (and docker) are... not so great
Problem: poor integration with Docker Solution: ... work in progress
– systemd-docker – small application that moves cgroups of Docker containers back under systemd's cgroup
– Use Docker for image management, but systemd-nspawn for runtime (e.g. CoreOS's toolbox)
– (proposed) Docker standalone mode: client starts container directly rather than through daemon
![Page 164: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/164.jpg)
fleet – high level view
Single machine
Cluster
![Page 165: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/165.jpg)
etcd
![Page 166: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/166.jpg)
etcd
A consistent, highly available key/value store
![Page 167: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/167.jpg)
etcd
A consistent, highly available key/value store– Shared configuration, distributed locking, ...
![Page 168: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/168.jpg)
etcd
A consistent, highly available key/value store– Shared configuration, distributed locking, ...
Driven by Raft
![Page 169: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/169.jpg)
etcd
A consistent, highly available key/value store– Shared configuration, distributed locking, ...
Driven by Raft Consensus algorithm similar to Paxos
![Page 170: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/170.jpg)
etcd
A consistent, highly available key/value store– Shared configuration, distributed locking, ...
Driven by Raft Consensus algorithm similar to Paxos Designed for understandability and simplicity
![Page 171: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/171.jpg)
etcd
A consistent, highly available key/value store– Shared configuration, distributed locking, ...
Driven by Raft Consensus algorithm similar to Paxos Designed for understandability and simplicity
Popular and widely used
![Page 172: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/172.jpg)
etcd
A consistent, highly available key/value store– Shared configuration, distributed locking, ...
Driven by Raft Consensus algorithm similar to Paxos Designed for understandability and simplicity
Popular and widely used– Simple HTTP API + libraries in Go, Java, Python, Ruby, ...
![Page 173: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/173.jpg)
etcd connects CoreOS hosts
![Page 174: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/174.jpg)
fleet + etcd
![Page 175: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/175.jpg)
● fleet needs a consistent view of the cluster to make scheduling decisions: etcd provides this view
fleet + etcd
![Page 176: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/176.jpg)
● fleet needs a consistent view of the cluster to make scheduling decisions: etcd provides this view– What units exist in the cluster?
fleet + etcd
![Page 177: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/177.jpg)
● fleet needs a consistent view of the cluster to make scheduling decisions: etcd provides this view– What units exist in the cluster?– What machines exist in the cluster?
fleet + etcd
![Page 178: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/178.jpg)
● fleet needs a consistent view of the cluster to make scheduling decisions: etcd provides this view– What units exist in the cluster?– What machines exist in the cluster?– What are their current states?
fleet + etcd
![Page 179: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/179.jpg)
● fleet needs a consistent view of the cluster to make scheduling decisions: etcd provides this view– What units exist in the cluster?– What machines exist in the cluster?– What are their current states?
● All unit files, unit state, machine state and scheduling information is stored in etcd
fleet + etcd
![Page 180: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/180.jpg)
etcd is great!
![Page 181: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/181.jpg)
● Fast and simple API
etcd is great!
![Page 182: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/182.jpg)
● Fast and simple API● Handles all cluster-level/inter-machine
communication so we don't have to
etcd is great!
![Page 183: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/183.jpg)
● Fast and simple API● Handles all cluster-level/inter-machine
communication so we don't have to● Powerful primitives:
etcd is great!
![Page 184: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/184.jpg)
● Fast and simple API● Handles all cluster-level/inter-machine
communication so we don't have to● Powerful primitives:
– Compare-and-Swap allows for atomic operations and implementing locking behaviour
etcd is great!
![Page 185: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/185.jpg)
● Fast and simple API● Handles all cluster-level/inter-machine
communication so we don't have to● Powerful primitives:
– Compare-and-Swap allows for atomic operations and implementing locking behaviour
– Watches (like pub-sub) provide event-driven behaviour
etcd is great!
![Page 186: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/186.jpg)
etcd is... not so great
![Page 187: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/187.jpg)
● Problem: unreliable watches
etcd is... not so great
![Page 188: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/188.jpg)
● Problem: unreliable watches– fleet initially used a purely event-driven architecture
etcd is... not so great
![Page 189: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/189.jpg)
● Problem: unreliable watches– fleet initially used a purely event-driven architecture– watches in etcd used to trigger events
etcd is... not so great
![Page 190: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/190.jpg)
● Problem: unreliable watches– fleet initially used a purely event-driven architecture– watches in etcd used to trigger events
● e.g. “machine down” event triggers unit rescheduling
etcd is... not so great
![Page 191: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/191.jpg)
● Problem: unreliable watches– fleet initially used a purely event-driven architecture– watches in etcd used to trigger events
● e.g. “machine down” event triggers unit rescheduling– Highly efficient: only take action when necessary
etcd is... not so great
![Page 192: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/192.jpg)
● Problem: unreliable watches– fleet initially used a purely event-driven architecture– watches in etcd used to trigger events
● e.g. “machine down” event triggers unit rescheduling– Highly efficient: only take action when necessary– Highly responsive: as soon as a user submits a new
unit, an event is triggered to schedule that unit
etcd is... not so great
![Page 193: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/193.jpg)
● Problem: unreliable watches– fleet initially used a purely event-driven architecture– watches in etcd used to trigger events
● e.g. “machine down” event triggers unit rescheduling– Highly efficient: only take action when necessary– Highly responsive: as soon as a user submits a new
unit, an event is triggered to schedule that unit– Unfortunately, many places for things to go wrong...
etcd is... not so great
![Page 194: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/194.jpg)
Unreliable watches
![Page 195: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/195.jpg)
● Example:
Unreliable watches
![Page 196: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/196.jpg)
● Example:– etcd is undergoing a leader election or otherwise
unavailable, watches do not work during this period
Unreliable watches
![Page 197: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/197.jpg)
● Example:– etcd is undergoing a leader election or otherwise
unavailable, watches do not work during this period– Change occurs (e.g. a machine leaves the cluster)
Unreliable watches
![Page 198: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/198.jpg)
● Example:– etcd is undergoing a leader election or otherwise
unavailable, watches do not work during this period– Change occurs (e.g. a machine leaves the cluster)– Event is missed
Unreliable watches
![Page 199: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/199.jpg)
● Example:– etcd is undergoing a leader election or otherwise
unavailable, watches do not work during this period– Change occurs (e.g. a machine leaves the cluster)– Event is missed – fleet doesn't know machine is lost!
Unreliable watches
![Page 200: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/200.jpg)
● Example:– etcd is undergoing a leader election or otherwise
unavailable, watches do not work during this period– Change occurs (e.g. a machine leaves the cluster)– Event is missed – fleet doesn't know machine is lost!– Now fleet doesn't know to reschedule units that were
running on that machine
Unreliable watches
![Page 201: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/201.jpg)
etcd is... not so great
![Page 202: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/202.jpg)
● Problem: limited event history
etcd is... not so great
![Page 203: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/203.jpg)
● Problem: limited event history– etcd retains a history of all events that occur
etcd is... not so great
![Page 204: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/204.jpg)
● Problem: limited event history– etcd retains a history of all events that occur– Can “watch” from an arbitrary point in the past, but..
etcd is... not so great
![Page 205: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/205.jpg)
● Problem: limited event history– etcd retains a history of all events that occur– Can “watch” from an arbitrary point in the past, but..– History is a limited window!
etcd is... not so great
![Page 206: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/206.jpg)
● Problem: limited event history– etcd retains a history of all events that occur– Can “watch” from an arbitrary point in the past, but..– History is a limited window!– With a busy cluster, watches can fall out of this window
etcd is... not so great
![Page 207: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/207.jpg)
● Problem: limited event history– etcd retains a history of all events that occur– Can “watch” from an arbitrary point in the past, but..– History is a limited window!– With a busy cluster, watches can fall out of this window– Can't always replay event stream from the point we
want to
etcd is... not so great
![Page 208: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/208.jpg)
Limited event history
![Page 209: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/209.jpg)
● Example:– etcd holds history of last 1000 events
Limited event history
![Page 210: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/210.jpg)
● Example:– etcd holds history of last 1000 events– fleet sets watch at i=100 to watch for machine loss
Limited event history
![Page 211: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/211.jpg)
● Example:– etcd holds history of last 1000 events– fleet sets watch at i=100 to watch for machine loss– Meanwhile, many changes occur in other parts of the
keyspace, advancing index to i=1500
Limited event history
![Page 212: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/212.jpg)
● Example:– etcd holds history of last 1000 events– fleet sets watch at i=100 to watch for machine loss– Meanwhile, many changes occur in other parts of the
keyspace, advancing index to i=1500– Leader election/network hiccup occurs and severs
watch
Limited event history
![Page 213: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/213.jpg)
● Example:– etcd holds history of last 1000 events– fleet sets watch at i=100 to watch for machine loss– Meanwhile, many changes occur in other parts of the
keyspace, advancing index to i=1500– Leader election/network hiccup occurs and severs
watch– fleet tries to recreate watch at i=100 and fails:
Limited event history
![Page 214: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/214.jpg)
● Example:– etcd holds history of last 1000 events– fleet sets watch at i=100 to watch for machine loss– Meanwhile, many changes occur in other parts of the
keyspace, advancing index to i=1500– Leader election/network hiccup occurs and severs watch– fleet tries to recreate watch at i=100 and fails:
err="401: The event in requested index is outdated and cleared (the requested history has been cleared [1500/100])
Limited event history
![Page 215: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/215.jpg)
etcd is... not so great
![Page 216: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/216.jpg)
● Problem: unreliable watches– Missed events lead to unrecoverable situations
● Problem: limited event history– Can't always replay entire event stream
etcd is... not so great
![Page 217: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/217.jpg)
● Problem: unreliable watches– Missed events lead to unrecoverable situations
● Problem: limited event history– Can't always replay entire event stream
● Solution: move to “reconciler” model
etcd is... not so great
![Page 218: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/218.jpg)
Reconciler model
![Page 219: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/219.jpg)
Reconciler model
In a loop, run periodically until stopped:
![Page 220: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/220.jpg)
Reconciler model
In a loop, run periodically until stopped:
1. Retrieve current state (how the world is) and desired state (how the world should be) from datastore (etcd)
![Page 221: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/221.jpg)
Reconciler model
In a loop, run periodically until stopped:
1. Retrieve current state (how the world is) and desired state (how the world should be) from datastore (etcd)
2. Determine necessary actions to transform current state --> desired state
![Page 222: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/222.jpg)
Reconciler model
In a loop, run periodically until stopped:
1. Retrieve current state (how the world is) and desired state (how the world should be) from datastore (etcd)
2. Determine necessary actions to transform current state --> desired state
3. Perform actions and save results as new current state
![Page 223: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/223.jpg)
Reconciler model
Example: fleet's engine (scheduler) looks something like:for { // loop forever
select {
case <- stopChan: // if stopped, exit return
case <- time.After(5 * time.Minute): units = fetchUnits() machines = fetchMachines() schedule(units, machines)
}
}
![Page 224: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/224.jpg)
etcd is... not so great
![Page 225: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/225.jpg)
● Problem: unreliable watches– Missed events lead to unrecoverable situations
● Problem: limited event history– Can't always replay entire event stream
● Solution: move to “reconciler” model
etcd is... not so great
![Page 226: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/226.jpg)
● Problem: unreliable watches– Missed events lead to unrecoverable situations
● Problem: limited event history– Can't always replay entire event stream
● Solution: move to “reconciler” model– Less efficient, but extremely robust
etcd is... not so great
![Page 227: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/227.jpg)
● Problem: unreliable watches– Missed events lead to unrecoverable situations
● Problem: limited event history– Can't always replay entire event stream
● Solution: move to “reconciler” model– Less efficient, but extremely robust– Still many paths for optimisation (e.g. using watches to
trigger reconciliations)
etcd is... not so great
![Page 228: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/228.jpg)
fleet – high level view
Single machine
Cluster
![Page 229: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/229.jpg)
golang
![Page 230: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/230.jpg)
● Standard language for all CoreOS projects (above OS)
golang
![Page 231: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/231.jpg)
● Standard language for all CoreOS projects (above OS)
– etcd
golang
![Page 232: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/232.jpg)
● Standard language for all CoreOS projects (above OS)
– etcd– fleet
golang
![Page 233: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/233.jpg)
● Standard language for all CoreOS projects (above OS)
– etcd– fleet– locksmith (semaphore for reboots during updates)
golang
![Page 234: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/234.jpg)
● Standard language for all CoreOS projects (above OS)
– etcd– fleet– locksmith (semaphore for reboots during updates)– etcdctl, updatectl, coreos-cloudinit, ...
golang
![Page 235: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/235.jpg)
● Standard language for all CoreOS projects (above OS)
– etcd– fleet– locksmith (semaphore for reboots during updates)– etcdctl, updatectl, coreos-cloudinit, ...
● fleet is ~10k LOC (and another ~10k LOC tests)
golang
![Page 236: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/236.jpg)
Go is great!
![Page 237: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/237.jpg)
● Fast!
Go is great!
![Page 238: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/238.jpg)
● Fast!– to write (concise syntax)
Go is great!
![Page 239: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/239.jpg)
● Fast!– to write (concise syntax)– to compile (builds typically <1s)
Go is great!
![Page 240: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/240.jpg)
● Fast!– to write (concise syntax)– to compile (builds typically <1s)– to run tests (O(seconds), including with race detection)
Go is great!
![Page 241: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/241.jpg)
● Fast!– to write (concise syntax)– to compile (builds typically <1s)– to run tests (O(seconds), including with race detection)– Never underestimate power of rapid iteration
Go is great!
![Page 242: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/242.jpg)
● Fast!– to write (concise syntax)– to compile (builds typically <1s)– to run tests (O(seconds), including with race detection)– Never underestimate power of rapid iteration
● Simple, powerful tooling
Go is great!
![Page 243: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/243.jpg)
● Fast!– to write (concise syntax)– to compile (builds typically <1s)– to run tests (O(seconds), including with race detection)– Never underestimate power of rapid iteration
● Simple, powerful tooling– Built in package management, code coverage, etc.
Go is great!
![Page 244: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/244.jpg)
Go is great!
![Page 245: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/245.jpg)
● Rich standard library
Go is great!
![Page 246: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/246.jpg)
● Rich standard library– “Batteries are included”
Go is great!
![Page 247: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/247.jpg)
● Rich standard library– “Batteries are included”– e.g.: completely self-hosted HTTP server, no need for
reverse proxies or worker systems to serve many concurrent HTTP requests
Go is great!
![Page 248: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/248.jpg)
● Rich standard library– “Batteries are included”– e.g.: completely self-hosted HTTP server, no need for
reverse proxies or worker systems to serve many concurrent HTTP requests
● Static compilation into a single binary
Go is great!
![Page 249: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/249.jpg)
● Rich standard library– “Batteries are included”– e.g.: completely self-hosted HTTP server, no need for
reverse proxies or worker systems to serve many concurrent HTTP requests
● Static compilation into a single binary– Ideal for a minimal OS with no libraries
Go is great!
![Page 250: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/250.jpg)
Go is... not so great
![Page 251: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/251.jpg)
● Problem: managing third-party dependencies
Go is... not so great
![Page 252: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/252.jpg)
● Problem: managing third-party dependencies– modular package management but: no versioning
Go is... not so great
![Page 253: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/253.jpg)
● Problem: managing third-party dependencies– modular package management but: no versioning– import “github.com/coreos/fleet” - which SHA?
Go is... not so great
![Page 254: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/254.jpg)
● Problem: managing third-party dependencies– modular package management but: no versioning– import “github.com/coreos/fleet” - which SHA?
● Solution: vendoring :-/
Go is... not so great
![Page 255: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/255.jpg)
● Problem: managing third-party dependencies– modular package management but: no versioning– import “github.com/coreos/fleet” - which SHA?
● Solution: vendoring :-/– Copy entire source tree of dependencies into repository
Go is... not so great
![Page 256: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/256.jpg)
● Problem: managing third-party dependencies– modular package management but: no versioning– import “github.com/coreos/fleet” - which SHA?
● Solution: vendoring :-/– Copy entire source tree of dependencies into repository– Slowly maturing tooling: goven, third_party.go, Godep
Go is... not so great
![Page 257: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/257.jpg)
Go is... not so great
![Page 258: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/258.jpg)
● Problem: (relatively) large binary sizes
Go is... not so great
![Page 259: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/259.jpg)
● Problem: (relatively) large binary sizes– “relatively”, but... CoreOS is the minimal OS
Go is... not so great
![Page 260: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/260.jpg)
● Problem: (relatively) large binary sizes– “relatively”, but... CoreOS is the minimal OS– ~10MB per binary, many tools, quickly adds up
Go is... not so great
![Page 261: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/261.jpg)
● Problem: (relatively) large binary sizes– “relatively”, but... CoreOS is the minimal OS– ~10MB per binary, many tools, quickly adds up
● Solutions:
Go is... not so great
![Page 262: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/262.jpg)
● Problem: (relatively) large binary sizes– “relatively”, but... CoreOS is the minimal OS– ~10MB per binary, many tools, quickly adds up
● Solutions: – upgrading golang!
Go is... not so great
![Page 263: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/263.jpg)
● Problem: (relatively) large binary sizes– “relatively”, but... CoreOS is the minimal OS– ~10MB per binary, many tools, quickly adds up
● Solutions: – upgrading golang!
● go1.2 to go1.3 = ~25% reduction
Go is... not so great
![Page 264: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/264.jpg)
● Problem: (relatively) large binary sizes– “relatively”, but... CoreOS is the minimal OS– ~10MB per binary, many tools, quickly adds up
● Solutions: – upgrading golang!
● go1.2 to go1.3 = ~25% reduction
– sharing the binary between tools
Go is... not so great
![Page 265: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/265.jpg)
Sharing a binary
![Page 266: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/266.jpg)
● client/daemon often share much of the same code
Sharing a binary
![Page 267: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/267.jpg)
● client/daemon often share much of the same code– Encapsulate multiple tools in one binary, symlink the
different command names, switch off command name
Sharing a binary
![Page 268: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/268.jpg)
● client/daemon often share much of the same code– Encapsulate multiple tools in one binary, symlink the
different command names, switch off command name– Example: fleetd/fleetctl
Sharing a binary
![Page 269: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/269.jpg)
● client/daemon often share much of the same code– Encapsulate multiple tools in one binary, symlink the
different command names, switch off command name– Example: fleetd/fleetctl
Sharing a binary
func main() { switch os.Args[0] { case “fleetctl”: Fleetctl() case “fleetd”: Fleetd() }}
![Page 270: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/270.jpg)
● client/daemon often share much of the same code– Encapsulate multiple tools in one binary, symlink the
different command names, switch off command name– Example: fleetd/fleetctl
Sharing a binary
func main() { switch os.Args[0] { case “fleetctl”: Fleetctl() case “fleetd”: Fleetd() }}
Before:
9150032 fleetctl 8567416 fleetd
After:
11052256 fleetctl 8 fleetd -> fleetctl
![Page 271: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/271.jpg)
Go is... not so great
![Page 272: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/272.jpg)
● Problem: young language => immature libraries
Go is... not so great
![Page 273: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/273.jpg)
● Problem: young language => immature libraries– CLI frameworks
Go is... not so great
![Page 274: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/274.jpg)
● Problem: young language => immature libraries– CLI frameworks– godbus, go-systemd, go-etcd :-(
Go is... not so great
![Page 275: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/275.jpg)
● Problem: young language => immature libraries– CLI frameworks– godbus, go-systemd, go-etcd :-(
● Solutions:
Go is... not so great
![Page 276: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/276.jpg)
● Problem: young language => immature libraries– CLI frameworks– godbus, go-systemd, go-etcd :-(
● Solutions:– Keep it simple
Go is... not so great
![Page 277: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/277.jpg)
● Problem: young language => immature libraries– CLI frameworks– godbus, go-systemd, go-etcd :-(
● Solutions:– Keep it simple– Roll your own (e.g. fleetctl's command line)
Go is... not so great
![Page 278: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/278.jpg)
type Command struct {
Name string
Summary string
Usage string
Description string
Flags flag.FlagSet
Run func(args []string) int
}
fleetctl CLI
![Page 279: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/279.jpg)
Wrap up/recap
![Page 280: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/280.jpg)
Wrap up/recap
CoreOS Linux
![Page 281: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/281.jpg)
Wrap up/recap
CoreOS Linux– Minimal OS with cluster capabilities built-in
![Page 282: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/282.jpg)
Wrap up/recap
CoreOS Linux– Minimal OS with cluster capabilities built-in
– Containerized applications --> a[u]tom[at]ic updates
![Page 283: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/283.jpg)
Wrap up/recap
CoreOS Linux– Minimal OS with cluster capabilities built-in
– Containerized applications --> a[u]tom[at]ic updates fleet
![Page 284: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/284.jpg)
Wrap up/recap
CoreOS Linux– Minimal OS with cluster capabilities built-in
– Containerized applications --> a[u]tom[at]ic updates fleet
– Simple, powerful cluster-level application manager
![Page 285: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/285.jpg)
Wrap up/recap
CoreOS Linux– Minimal OS with cluster capabilities built-in
– Containerized applications --> a[u]tom[at]ic updates fleet
– Simple, powerful cluster-level application manager
– Glue between local init system (systemd) and cluster-level awareness (etcd)
![Page 286: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/286.jpg)
Wrap up/recap
CoreOS Linux– Minimal OS with cluster capabilities built-in– Containerized applications --> a[u]tom[at]ic updates
fleet– Simple, powerful cluster-level application manager– Glue between local init system (systemd) and cluster-level
awareness (etcd)– golang++
![Page 287: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/287.jpg)
Questions?
![Page 288: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/288.jpg)
Thank you :-)
● Everything is open source – join us!– https://github.com/coreos
● Any more questions, feel free to email– [email protected]
● CoreOS stickers!
![Page 289: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/289.jpg)
References
● CoreOS updates https://coreos.com/using-coreos/updates/
● Omaha protocol https://code.google.com/p/omaha/wiki/ServerProtocol
● Raft algorithmhttp://raftconsensus.github.io/
![Page 290: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/290.jpg)
References
● fleet https://github.com/coreos/fleet
● etcd https://github.com/coreos/etcd
● toolboxhttps://github.com/coreos/toolbox
● systemd-dockerhttps://github.com/ibuildthecloud/systemd-docker
● systemd-nspawn
http://0pointer.de/public/systemd-man/systemd-nspawn.html
![Page 291: [2C4]Clustered computing with CoreOS, fleet and etcd](https://reader034.fdocuments.us/reader034/viewer/2022051109/547e89055806b5f45e8b46ce/html5/thumbnails/291.jpg)
Brief side-note: locksmith
● Reboot manager for CoreOS● Uses a semaphore in etcd to co-ordinate reboots● Each machine in the cluster:
1. Downloads and applies update
2. Takes lock in etcd (using Compare-And-Swap)
3. Reboots and releases lock