Storing VMs with Cinder and Ceph RBD.pdf
openstack-foundation -
Transcript of Storing VMs with Cinder and Ceph RBD.pdf
![Page 1: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/1.jpg)
Storing VMs with Cinder and Ceph RBD
![Page 2: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/2.jpg)
Growing With Hardware Appliances
First PB
• Proprietary storage hardware
• Well-known storage vendor
$14 b’zillion
Second PB
• Proprietary storage hardware
• Same storage vendor
Another $14 b’zillion
[Figure: grid of 24 nodes, each labeled DC]
![Page 3: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/3.jpg)
[Figure: grid of nodes labeled DC, with the label "C++"]
![Page 4: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/4.jpg)
[Figure: grid of nodes labeled DC, with the label "C++ X"]
![Page 5: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/5.jpg)
[Figure: grid of nodes labeled DC, with the label "HUMAN [DEVELOPER] !!"]
![Page 6: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/6.jpg)
Hard Drives Are Tiny Record Players and They Fail Often
Photo: jon_a_ross, Flickr / CC BY 2.0
![Page 7: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/7.jpg)
[Figure: disks (D) x 1 MILLION = 55 times / day]
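The arithmetic behind this slide can be checked with a short sketch. It assumes the slide means that a fleet of one million disks sees about 55 failures per day; the function names are illustrative, and the resulting annual rate is only what those two numbers imply, not a measured figure.

```python
DISKS = 1_000_000        # fleet size taken from the slide
FAILURES_PER_DAY = 55    # failures per day taken from the slide


def implied_annual_failure_rate(disks: int, failures_per_day: float) -> float:
    """Fraction of the fleet that fails over a 365-day year."""
    return failures_per_day * 365 / disks


def expected_daily_failures(disks: int, annual_failure_rate: float) -> float:
    """Inverse view: failures per day for a fleet of a given size."""
    return disks * annual_failure_rate / 365


afr = implied_annual_failure_rate(DISKS, FAILURES_PER_DAY)
print(f"implied annual failure rate: {afr:.2%}")
print(f"failures per day in a 1,000-disk cluster: {expected_daily_failures(1000, afr):.3f}")
```

Under these assumptions the implied annual failure rate is roughly 2%, which is why failure handling has to be routine rather than exceptional at this scale.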
![Page 8: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/8.jpg)
![Page 9: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/9.jpg)
Philosophy:
• OPEN SOURCE
• COMMUNITY-FOCUSED
Design:
• SCALABLE
• NO SINGLE POINT OF FAILURE
• SOFTWARE BASED
• SELF-MANAGING
![Page 10: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/10.jpg)
• LIBRADOS (used by APP): A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP
• RADOSGW (used by APP): A bucket-based REST gateway, compatible with S3 and Swift
• RBD (used by HOST/VM): A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver
• CEPH FS (used by CLIENT): A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE
• RADOS: A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes
![Page 11: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/11.jpg)
[Figure: each OSD sits on a file system (FS: btrfs, xfs, or ext4) on top of its own DISK; three monitors (M) are shown alongside the OSDs]
![Page 12: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/12.jpg)
[Figure: three monitors (M) and a HUMAN]
![Page 13: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/13.jpg)
Monitors:
• Maintain cluster map
• Provide consensus for distributed decision-making
• Must have an odd number
• Do not serve stored objects to clients
OSDs:
• One per disk (recommended)
• At least three in a cluster
• Serve stored objects to clients
• Intelligently peer to perform replication tasks
• Support object classes
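The odd-number guidance for monitors follows from majority-based consensus. The sketch below assumes a simple strict-majority quorum; the function names are illustrative and not part of any Ceph API.

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority of `monitors`."""
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can be lost while a majority still remains."""
    return monitors - quorum_size(monitors)


# Columns: monitor count, quorum size, tolerated failures.
for n in range(1, 8):
    print(n, quorum_size(n), tolerated_failures(n))
```

Under this model, going from three monitors to four still tolerates only one failure, while five tolerate two, so an even count adds a machine without adding resilience.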
![Page 14: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/14.jpg)
[Figure: APP?? facing a grid of nodes labeled DC]
![Page 15: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/15.jpg)
[Figure: APP and a grid of nodes labeled DC]
![Page 16: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/16.jpg)
[Figure: APP and a grid of nodes labeled DC, divided into name ranges A-G, H-N, O-T, U-Z, with the key F*]
![Page 17: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/17.jpg)
[Figure: objects, drawn as bit strings, are assigned to placement groups (pg) and then to storage nodes]
hash(object name) % num pg
CRUSH(pg, cluster state, rule set)
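The two expressions on this slide describe a two-step lookup: an object name is hashed to a placement group, and the placement group is then mapped to storage nodes. The sketch below illustrates that shape only. The second step uses rendezvous hashing as a stand-in and is not the real CRUSH algorithm; the OSD names, object name, and function names are hypothetical.

```python
import hashlib


def stable_hash(text: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")


def object_to_pg(object_name: str, num_pg: int) -> int:
    """Step 1 from the slide: hash(object name) % num pg."""
    return stable_hash(object_name) % num_pg


def pg_to_osds(pg: int, osds: list[str], replicas: int) -> list[str]:
    """Step 2 stand-in: rank OSDs by a per-(pg, osd) hash and keep the top ones.

    This is rendezvous hashing, used here only because it is also
    deterministic and needs no lookup table. It is NOT CRUSH.
    """
    ranked = sorted(osds, key=lambda osd: stable_hash(f"{pg}:{osd}"), reverse=True)
    return ranked[:replicas]


osds = [f"osd.{i}" for i in range(6)]          # hypothetical cluster state
pg = object_to_pg("vm-disk-0001", num_pg=64)   # hypothetical object name
print(pg, pg_to_osds(pg, osds, replicas=3))
```

Because both steps are pure functions of their inputs, any client holding the same cluster description computes the same placement without asking a central directory.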
![Page 18: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/18.jpg)
![Page 19: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/19.jpg)
CRUSH
• Pseudo-random placement algorithm
• Ensures even distribution
• Repeatable, deterministic
• Rule-based configuration
  • Replica count
  • Infrastructure topology
  • Weighting
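The three rule inputs listed above (replica count, topology, weighting) can be illustrated with a small placement function. This is a sketch under stated assumptions: it uses weighted rendezvous hashing with a one-replica-per-host constraint as a stand-in, which is not how CRUSH itself is implemented, and the cluster layout and names are hypothetical.

```python
import hashlib
import math
from collections import Counter


def unit_hash(key: str) -> float:
    """Deterministic pseudo-random number strictly between 0 and 1."""
    h = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big") >> 12
    return (h + 0.5) / 2**52


def place(pg: int, osds: dict[str, tuple[str, float]], replicas: int) -> list[str]:
    """Pick `replicas` OSDs for `pg`, at most one per host, favouring heavier OSDs.

    `osds` maps an OSD name to (host, weight). Stand-in technique, NOT CRUSH.
    """
    def score(name: str) -> float:
        # Weighted rendezvous hashing: a larger weight yields a larger score on average.
        return -osds[name][1] / math.log(unit_hash(f"{pg}:{name}"))

    chosen, used_hosts = [], set()
    for name in sorted(osds, key=score, reverse=True):
        host = osds[name][0]
        if host not in used_hosts:       # topology rule: spread replicas across hosts
            chosen.append(name)
            used_hosts.add(host)
        if len(chosen) == replicas:      # replica-count rule
            break
    return chosen


cluster = {
    "osd.0": ("host-a", 1.0), "osd.1": ("host-a", 1.0),
    "osd.2": ("host-b", 1.0), "osd.3": ("host-b", 1.0),
    "osd.4": ("host-c", 1.0), "osd.5": ("host-c", 2.0),  # a disk twice as large
}
primaries = Counter(place(pg, cluster, replicas=3)[0] for pg in range(2000))
print(place(7, cluster, replicas=3))
print(primaries.most_common())
```

In this toy model the double-weight OSD ends up as the first choice about twice as often as its neighbours, and the same inputs always give the same answer.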
![Page 20: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/20.jpg)
[Figure: CLIENT ??]
![Page 21: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/21.jpg)
![Page 22: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/22.jpg)
[Figure: CLIENT ??]
![Page 23: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/23.jpg)
![Page 24: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/24.jpg)
![Page 25: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/25.jpg)
[Figure: APP linked with LIBRADOS talks to the cluster, including its monitors (M), over the native protocol]
![Page 26: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/26.jpg)
LIBRADOS
• Provides direct access to RADOS for applications
• C, C++, Python, PHP, Java
• No HTTP overhead
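As an illustration of direct access from Python, the sketch below writes one object and reads it back. It assumes the `rados` Python binding is installed, that a cluster is reachable through the given configuration file, and that the pool already exists; the pool name in the usage comment and the helper name `roundtrip` are hypothetical.

```python
def roundtrip(pool: str, name: str, data: bytes,
              conffile: str = "/etc/ceph/ceph.conf") -> bytes:
    """Write one object through librados and read it back."""
    import rados  # imported here so this module still loads where the binding is absent

    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)     # I/O context bound to a single pool
        try:
            ioctx.write_full(name, data)     # replace the whole object
            return ioctx.read(name)          # reads up to a default length
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()


# Usage (needs a reachable cluster; "data" is a hypothetical pool name):
#   print(roundtrip("data", "greeting", b"hello rados"))
```

No HTTP gateway sits in this path: the library talks to the cluster directly.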
![Page 27: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/27.jpg)
![Page 28: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/28.jpg)
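[Figure: one APP reaches the cluster, including its monitors (M), through LIBRADOS over the native protocol; another APP speaks REST to RADOSGW, which itself uses LIBRADOS]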
![Page 29: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/29.jpg)
RADOS Gateway:
• REST-based interface to RADOS
• Supports buckets, accounting
• Compatible with S3 and Swift applications
![Page 30: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/30.jpg)
![Page 31: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/31.jpg)
[Figure: a VM runs inside a VIRTUALIZATION CONTAINER that uses LIBRBD and LIBRADOS to reach the cluster, including its monitors (M)]
![Page 32: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/32.jpg)
[Figure: two CONTAINERs, each with LIBRBD and LIBRADOS, attached to the same cluster, including its monitors (M); the VM sits in one of them]
![Page 33: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/33.jpg)
[Figure: a HOST with KRBD (KERNEL MODULE) and LIBRADOS, attached to the cluster, including its monitors (M)]
![Page 34: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/34.jpg)
RADOS Block Device:
• Storage of virtual disks in RADOS
• Allows decoupling of VMs and containers
• Live migration!
• Images are striped across the cluster
• Thin provisioning
• Snapshots and cloning
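Striping and thin provisioning can be pictured with a toy block device that cuts an image into fixed-size objects and creates an object only when it is first written. This is a sketch, not RBD's implementation: the class and method names are illustrative, and the 4 MiB object size matches RBD's default but is configurable in practice.

```python
OBJECT_SIZE = 4 * 1024 * 1024   # 4 MiB


class ThinImage:
    """Toy block device: fixed-size objects, created on first write."""

    def __init__(self, size: int, object_size: int = OBJECT_SIZE):
        self.size = size
        self.object_size = object_size
        self.objects: dict[int, bytearray] = {}   # object index -> object data

    def locate(self, offset: int) -> tuple[int, int]:
        """Map a byte offset to (object index, offset inside that object)."""
        return divmod(offset, self.object_size)

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside the image")
        pos = 0
        while pos < len(data):
            index, inner = self.locate(offset + pos)
            chunk = data[pos:pos + self.object_size - inner]
            obj = self.objects.setdefault(index, bytearray(self.object_size))
            obj[inner:inner + len(chunk)] = chunk
            pos += len(chunk)

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        while length > 0:
            index, inner = self.locate(offset)
            n = min(length, self.object_size - inner)
            obj = self.objects.get(index)
            # Objects that were never written read back as zeros.
            out += obj[inner:inner + n] if obj is not None else bytes(n)
            offset += n
            length -= n
        return bytes(out)


image = ThinImage(size=10 * 1024**3)        # a 10 GiB virtual disk
image.write(OBJECT_SIZE - 2, b"abcd")       # straddles objects 0 and 1
print(sorted(image.objects), image.read(OBJECT_SIZE - 2, 4))
```

Only the two touched objects exist after the write, even though the image advertises 10 GiB. In a real cluster each object would additionally be placed on different storage nodes, which is what spreads one image's I/O across the cluster.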
![Page 35: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/35.jpg)
[Figure: a VM runs inside a VIRTUALIZATION CONTAINER that uses LIBRBD and LIBRADOS to reach the cluster, including its monitors (M)]
![Page 36: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/36.jpg)
HOW DO YOU SPIN UP THOUSANDS OF VMs INSTANTLY AND EFFICIENTLY?
![Page 37: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/37.jpg)
[Figure: instant copy. The original image holds 144 and each of four copies holds 0, so the total stored = 144]
![Page 38: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/38.jpg)
[Figure: CLIENT writes go to a copy, which now holds 4 next to the original's 144, so the total stored = 148]
![Page 39: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/39.jpg)
[Figure: CLIENT reads from a copy holding 4 and the original holding 144; the total stored = 148]
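The 144 and 148 figures on these slides can be reproduced with a toy copy-on-write model. It assumes the numbers count stored blocks, which the slides do not state, and it skips details of real RBD cloning such as snapshotting the parent first; the class and function names are illustrative.

```python
from __future__ import annotations


class Image:
    """Toy layered image: a clone stores only the blocks written to it."""

    def __init__(self, blocks: dict[int, bytes] | None = None,
                 parent: Image | None = None):
        self.blocks = dict(blocks or {})
        self.parent = parent

    def clone(self) -> Image:
        """Instant copy: no data moves, the child just points at its parent."""
        return Image(parent=self)

    def write(self, index: int, data: bytes) -> None:
        self.blocks[index] = data            # lands in this image only

    def read(self, index: int) -> bytes | None:
        if index in self.blocks:             # written locally
            return self.blocks[index]
        return self.parent.read(index) if self.parent else None


def stored_blocks(*images: Image) -> int:
    """Total number of blocks physically held by the given images."""
    return sum(len(image.blocks) for image in images)


parent = Image({i: b"base" for i in range(144)})
clones = [parent.clone() for _ in range(4)]
print(stored_blocks(parent, *clones))        # 144: the copies cost nothing yet

for i in range(4):
    clones[0].write(i, b"new")
print(stored_blocks(parent, *clones))        # 148: only the 4 written blocks are added
```

Reads of blocks a clone never wrote fall through to the parent, so every clone sees a full image while the cluster stores the shared data once.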
![Page 40: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/40.jpg)
Old-style VM image creation
[Figure: Nova compute reads template X from Glance (templates) and writes a copy X' to local disk (VM images)]
• Ephemeral
• Expensive to create
![Page 41: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/41.jpg)
Why use block storage?
• Persistent
  • More familiar to users
• Not tied to a single host
  • Decouples compute and storage
  • Enables live migration
• Extra capabilities of storage system
  • Efficient snapshots
  • Different types of storage available
  • Cloning for fast restore or scaling
![Page 42: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/42.jpg)
Cinder volume creation
[Figure: Cinder API asks Cinder volume to "create image from X". The volume driver asks Glance (templates) to "locate X", receives the "location of X", issues "read X", writes X', and returns a "reference to X'".]
• Flexibility in where VM images are stored
![Page 43: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/43.jpg)
Efficient volume creation
[Figure: Cinder API asks Cinder volume to "create image from X". The volume driver asks Glance (templates) to "locate X", receives the "location of X", then issues "clone X to X'"; once "X' complete" comes back, it returns a "reference to X'".]
• Fast CoW clone
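The choice between the two creation paths (read and copy X, or clone X to X') can be sketched as a decision on the image location. The sketch assumes a location of the form `rbd://<fsid>/<pool>/<image>/<snapshot>` and treats a matching cluster id as the condition for cloning; the function names and example ids are hypothetical, and this is not the Cinder driver's actual code.

```python
from __future__ import annotations


def parse_rbd_location(url: str) -> tuple[str, str, str, str] | None:
    """Split rbd://<fsid>/<pool>/<image>/<snapshot>; return None for anything else."""
    prefix = "rbd://"
    if not url.startswith(prefix):
        return None
    parts = url[len(prefix):].split("/")
    if len(parts) != 4 or not all(parts):
        return None
    fsid, pool, image, snapshot = parts
    return fsid, pool, image, snapshot


def plan_volume_creation(image_location: str, backend_fsid: str) -> str:
    """Return "clone" when X already lives in this backend's cluster, else "copy"."""
    location = parse_rbd_location(image_location)
    if location is not None and location[0] == backend_fsid:
        return "clone"   # fast CoW clone: X' refers to X, no data is moved
    return "copy"        # fall back to reading X and writing a full copy X'


print(plan_volume_creation("rbd://cluster-a/images/X/snap", "cluster-a"))
print(plan_volume_creation("http://glance.example/images/X", "cluster-a"))
```

The fallback keeps volume creation working for images stored elsewhere, while images that share the backend avoid the full data copy.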
![Page 44: Storing VMs with Cinder and Ceph RBD.pdf](https://reader034.fdocuments.us/reader034/viewer/2022051610/54809acfb479595e578b46fa/html5/thumbnails/44.jpg)
Questions?
Josh Durgin
jdurgin on freenode
inktank.com | ceph.com