A Storage Story #ChefConf2013
kyle-bader
Transcript of A Storage Story #ChefConf2013
![Page 2: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/2.jpg)
About Me
Dad, husband, technologist.
Sr. Systems Engineer
@DreamHost
free software linux internals storage
networking security monitoring
distributed systems automation
DreamHost
![Page 3: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/3.jpg)
Outline
DreamHost Storage History
Anatomy of Ceph
Automating Storage
![Page 4: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/4.jpg)
Destro
- DreamHost's first web server
- Pentium 100
- SCSI storage
- Shared T1 line
![Page 5: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/5.jpg)
DH2
![Page 6: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/6.jpg)
NetApp
NetApp Fabric Attached Storage
- 15k Fibre Channel drives
- Filer heads serve NFS
- Fast failover
- Large failure domains
- Expensive
- Low density
![Page 7: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/7.jpg)
Coraid
Coraid SAN
- Shelves carry SATA devices, provide AoE volumes
- Head units mount AoE volumes, XFS, NFS shares
- Linux!
- Fast failover
- Large failure domains (single L2 segment)
![Page 8: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/8.jpg)
Thumper
Sun Fire X4500
- 45 Drives in a 4U chassis
- Legendary hardware
- Fast failover
- High density
- Large failure domains
- SATA
- Heavy
![Page 9: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/9.jpg)
DreamHost Solution: Hybrid
![Page 10: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/10.jpg)
BlueArc "Titanic"
- Switched Fibre Channel
- Head units serve NFS
- Tiered storage, FC/SATA
- Fast failover
- Larger failure domain (than NetApp)
- Software bugs :(
- Tiering: find -atime
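The `find -atime` bullet suggests tiering candidates were found by scanning for files that had not been read recently. A minimal Python sketch of that kind of scan, roughly what `find -atime +N` selects. The helper name `cold_files` and the idea of passing a day threshold are assumptions for illustration, not details from the talk:

```python
import os
import time

def cold_files(root, min_age_days, now=None):
    """Yield paths under root whose last access is at least min_age_days old."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                atime = os.stat(path).st_atime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if atime <= cutoff:
                yield path
```

A scan like this walks every inode, and its results are only as good as the access times the filesystem records (mount options such as `noatime` or `relatime` change that), which may be why the bullet sits next to the drawbacks.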
![Page 11: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/11.jpg)
UNLIMITED
![Page 12: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/12.jpg)
Thoughting..
![Page 13: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/13.jpg)
Mixed Strategy
- Separate email and web storage
- Email IO is heavy random, lots of small files
- Web storage needs to be dense
- FC NAS for email
- SATA RAID for web storage
- SATA RAID-Z for backups
![Page 14: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/14.jpg)
Local RAID
- RAID6, RAID10, RAID6
- SATA, SAS disks
- ext3, XFS
- Shrink failure domain
- Great density
- Slower failover
- RAID Controllers..
![Page 15: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/15.jpg)
Sigh
![Page 16: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/16.jpg)
Ceph
- Open source
- Built with COTS hardware
- Distributed and replicated
- No single point of failure
- Consistent
- Self healing and self managing
![Page 17: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/17.jpg)
Building Blocks
Monitors:
- Maintain cluster map
- Provide consensus for distributed decision making
- Must have an odd number
- These do not serve stored objects to clients
OSDs:
- One per disk (recommended)
- Serve stored objects to clients
- Intelligently peer to perform replication tasks
- Supports object classes
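The "odd number" advice for monitors follows from majority-based consensus: the monitors need a strict majority to agree, so an even-sized set tolerates no more failures than the odd-sized set one smaller. A small arithmetic sketch (function names are illustrative):

```python
def quorum_size(monitors):
    """Smallest strict majority of a monitor set."""
    return monitors // 2 + 1

def tolerated_failures(monitors):
    """Monitors that can be lost while a majority still survives."""
    return monitors - quorum_size(monitors)

for n in range(1, 8):
    print(f"monitors={n} quorum={quorum_size(n)} tolerated={tolerated_failures(n)}")
```

Going from 3 to 4 monitors raises the quorum from 2 to 3 but still tolerates only one failure; a fifth monitor is what buys the second.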
![Page 18: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/18.jpg)
Building Blocks
OSD States
- Up: available and ready
- Down: not available
- In: current member of cluster
- Out: not member of cluster
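Up/down and in/out are two independent axes, so an OSD is always in one of four combinations. A sketch of one reading of those combinations; the class and the descriptive strings are illustrative, not Ceph's own wording or API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OsdState:
    up: bool          # daemon is running and reachable
    in_cluster: bool  # cluster map still assigns data to this OSD

    def describe(self):
        if self.up and self.in_cluster:
            return "serving data"
        if not self.up and self.in_cluster:
            return "degraded: data still mapped here, waiting for the OSD to return"
        if self.up and not self.in_cluster:
            return "running but not assigned data (new or being drained)"
        return "not participating: its data is remapped to other OSDs"

for up in (True, False):
    for in_cluster in (True, False):
        print(up, in_cluster, OsdState(up, in_cluster).describe())
```

The down-but-in combination is the interesting one: the cluster waits before marking the OSD out, so a brief restart does not trigger data movement.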
![Page 19: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/19.jpg)
Cephstore
XFS, BTRFS, EXT4
![Page 20: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/20.jpg)
Cluster
![Page 21: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/21.jpg)
Ceph Consumers
![Page 22: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/22.jpg)
Creating a Map
![Page 23: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/23.jpg)
CRUSH
- Pseudo-random placement algorithm
- Ensures statistically even distribution
- Repeatable, deterministic
- Rule-based configuration
  - Replica count
  - Infrastructure topology
  - Weighting
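The properties listed for CRUSH (pseudo-random, repeatable, weighted, topology-aware) can be illustrated with a much simpler stand-in. The sketch below is not the CRUSH algorithm; it uses weighted rendezvous hashing plus a "one replica per host" rule to show the same ideas. The topology, weights, and function names are all hypothetical:

```python
import hashlib
import math

# Hypothetical topology: OSD id -> (host, weight). Not from the talk.
OSDS = {
    0: ("host-a", 1.0), 1: ("host-a", 1.0),
    2: ("host-b", 1.0), 3: ("host-b", 2.0),
    4: ("host-c", 1.0), 5: ("host-c", 1.0),
}

def _score(obj, osd_id, weight):
    """Weighted rendezvous score: the same (obj, osd) pair always scores the same."""
    digest = hashlib.sha256(f"{obj}:{osd_id}".encode()).digest()
    x = int.from_bytes(digest[:8], "big") >> 11   # keep 53 bits
    u = (x + 1) / (2**53 + 2)                     # float strictly inside (0, 1)
    return -weight / math.log(u)                  # heavier OSDs tend to score higher

def place(obj, osds, replicas):
    """Pick up to `replicas` OSDs for obj, at most one per host."""
    ranked = sorted(osds, key=lambda o: _score(obj, o, osds[o][1]), reverse=True)
    chosen, used_hosts = [], set()
    for osd_id in ranked:
        host = osds[osd_id][0]
        if host not in used_hosts:
            chosen.append(osd_id)
            used_hosts.add(host)
        if len(chosen) == replicas:
            break
    return chosen

print(place("bucket/object-1", OSDS, 3))
```

Because placement is computed from the object name and the map alone, any client holding the same map gets the same answer without asking a lookup service. Dropping an OSD from the map and calling `place` again only changes the result for objects that had a replica on that OSD.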
![Page 24: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/24.jpg)
CRUSH
![Page 25: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/25.jpg)
OSD DOWN!
![Page 26: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/26.jpg)
Remap and Backfill
![Page 27: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/27.jpg)
Ceph Anatomy
![Page 28: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/28.jpg)
RESTful Storage Service
![Page 29: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/29.jpg)
DreamObjects
- Ceph Storage Cluster
- Ubuntu Linux (12.04)
- Managed by Opscode Chef
- S3 and Swift RESTful interfaces
- Highly durable (8 nines)
- 2+ PB raw capacity
![Page 30: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/30.jpg)
DreamCompute
- Ceph Storage Cluster (RBD)
- Ubuntu Linux (12.04)
- Managed by Opscode Chef
- OpenStack
- Virtualized L2 and L3 networking
- Highly durable (8 nines)
- 3+ PB raw capacity
![Page 31: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/31.jpg)
Automate
- Bootstrap cluster
- Packages and configuration
- Creates, destroys, and encrypts OSDs
- Roles map to pdsh genders
- User and SSH key management
- Push monitoring configurations
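For the "roles map to pdsh genders" bullet: a genders file has one line per node, the hostname followed by a comma-separated attribute list. A sketch of rendering such lines from a node-to-roles mapping; the hostnames, role names, and the helper itself are hypothetical, and the talk does not show how the mapping was actually generated:

```python
def genders_lines(node_roles):
    """Render one genders line per node: '<hostname> <attr>,<attr>,...'."""
    lines = []
    for node in sorted(node_roles):
        attrs = ",".join(sorted(node_roles[node]))
        lines.append(f"{node} {attrs}".rstrip())
    return lines

# Hypothetical node -> roles mapping, e.g. what a Chef search could return.
NODE_ROLES = {
    "osd01": ["ceph-osd"],
    "mon01": ["ceph-mon"],
    "rgw01": ["ceph-radosgw", "haproxy"],
}

print("\n".join(genders_lines(NODE_ROLES)))
```

With a file like that in place, a command such as `pdsh -g ceph-osd uptime` targets every node carrying the `ceph-osd` attribute.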
![Page 32: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/32.jpg)
Hard Stuff
- Key management
- Leader election
![Page 33: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/33.jpg)
What we use
- Attributes
- Environments
- Search
- No databags
![Page 34: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/34.jpg)
Environments
- Ceph package versions
- VIPs for API endpoints
- Package repository URI
- Ceph configuration data driven by attributes
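"Configuration data driven by attributes" can be pictured as: cookbook defaults, overridden per environment, rendered into `ceph.conf`. The Python sketch below only models that idea. It is not Chef's API, it collapses Chef's attribute precedence levels into a single "environment wins" merge, and every attribute name and value in it is illustrative:

```python
import configparser
import io

def deep_merge(base, override):
    """Return a new dict where override wins; nested dicts merge recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

def render_ceph_conf(sections):
    """Render {section: {option: value}} as INI text."""
    parser = configparser.ConfigParser()
    for section, options in sections.items():
        parser[section] = {k: str(v) for k, v in options.items()}
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()

# Hypothetical attribute data; names and values are illustrative only.
COOKBOOK_DEFAULTS = {
    "ceph": {"config": {"global": {"osd pool default size": 3}}},
}
STAGING_ENV = {
    "ceph": {
        "version": "0.0.0-example",
        "config": {"global": {"osd pool default size": 2}},
    },
}

attrs = deep_merge(COOKBOOK_DEFAULTS, STAGING_ENV)
print(render_ceph_conf(attrs["ceph"]["config"]))
```

Keeping version pins and endpoints in the environment means the same recipe code runs everywhere and only the data differs between environments.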
![Page 35: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/35.jpg)
Gated Environments
Development
Staging
Production
![Page 36: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/36.jpg)
Operational Feedback
- Continuous functional testing
- Metrics, metrics, metrics
- Dashboards
![Page 37: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/37.jpg)
Chef Infrastructure
- Chef cluster per datacenter
- Private Chef from Opscode
- Erchef is awesome
- Migrating legacy automation
![Page 38: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/38.jpg)
Chef for Ceph
- Prototyped Ceph cluster automation with Chef
- Adapted Ceph to ease configuration
- Pushed some automation down into Ceph
- Move towards being CM agnostic
- Simplify Chef recipes
![Page 39: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/39.jpg)
Chef the network
- DreamCompute utilizes ODM switches
- Cumulus Networks provides Linux based OS
- Custom Chef omnibus builds for PPC
- Ohai networking!
![Page 40: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/40.jpg)
Resiliency Engineering
- Amazon (GameDay)
- Etsy (GameDay)
- Google (DiRT)
![Page 41: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/41.jpg)
Infra. as Code
- Bare metal servers configured by code
- Network devices configured by code
- Block storage configured by code
- Virtual networking configured by code
![Page 42: A Storage Story #ChefConf2013](https://reader033.fdocuments.us/reader033/viewer/2022052905/5585678cd8b42a970b8b4fd4/html5/thumbnails/42.jpg)
Thanks
DreamHost
Sage Weil and Inktank
OpenStack Developers
Opscode