Scaling Your Database in the Cloud
July 21, 2011
# 2
Your Panel Today
Presenting:
• Uri Budnik: Director, ISV Partner Program, RightScale @uribudnik
• Cory Isaacson: CEO & Founder, CodeFutures @dbShards
• David Blinder: CTO, Familybuilder
Q&A:
• Jason Altobelli: Inside Sales Representative, RightScale
Please use the chat box window to ask questions anytime!
Webinar Recordings: www.rightscale.com/webinars
# 3
Agenda
• Introduction to RightScale
• Introduction to CodeFutures
• Live Demo
• Live Q&A
# 4
RightScale Real Customers, Real Deployments, Real Benefits
• Managed Cloud Deployments for 4 Years
• More than 30,000 users; launched over 2.7MM servers
• Behind the largest production deployments on that cloud to date
# 5
Complete Systems Management
# 6
RightScale: Core Focus
Improved Agility
• Reduce complexity with ServerTemplates™
• Manage Systems, not Servers
• Orchestrate and Automate
Maintain Choice
• Multi-cloud
• Configuration Asset Marketplace
• ISV Partner Solutions
Control & Security
• User Access and Roles
• Cost Control and Allocation
• Complete Transparency
# 7
ServerTemplates: Built-to-Order Servers
VS.
Image bundling and maintenance
# 8
RightScripts in Multi-Cloud Marketplace
• Two RightScripts you can use to analyze your application to determine if it's “shard-safe”
  1. Logging Driver for Native MySQL®
  2. dbShards/Analyze Driver for JDBC
• Installed in your app server to gather SQL statistics
• It's an in-depth analysis of what is needed to shard your database
• Report lists each unique SQL statement and how it will function once sharded
• Run once and generate a report that CodeFutures will review with you at no charge
# 9
Introduction
• Who I am
  • Cory Isaacson, CEO of CodeFutures
  • Providers of dbShards
  • Author of Software Pipelines
• Partnerships:
  • RightScale
    • The leading Cloud Management Platform
• Leaders in database scalability, performance, and high-availability for the cloud
  • based on real-world experience with dozens of cloud-based applications
  • social networking, gaming, data collection, mobile, analytics
• Objective is to provide useful experience you can apply to scaling (and managing) your database tier…
  • especially for high volume applications
  • and an overview of dbShards technology
# 10
Challenges of cloud computing
• Cloud provides highly attractive service environment
  • Flexible, scales with need (up or down)
  • No need for dedicated IT staff, fixed facility costs
  • Pay-as-you-go model
• Cloud services occasionally fail
  • Partial network outages
  • Server failures
    • by their nature cloud servers are “transient”
  • Disk volume issues
• Cloud-based resources are constrained
  • CPU
  • I/O Rates
    • the “Cloud I/O Barrier”
# 11
Typical Application Architecture
# 12
Scaling in the Cloud
• Scaling Load Balancers is easy
  • Stateless routing to app server
  • Can add redundant Load Balancers if needed
  • If one goes down
    • failover to another
• Scaling Application Servers is easy
  • Stateless
  • Sessions can easily transition to another server
  • Add or remove servers as need dictates
  • If one goes down
    • failover to another
# 13
Scaling in the Cloud
• Scaling the Database tier is hard
  • “Stateful” by definition (and necessity)
  • Large, integrated data sets
    • 10s of GBs to TBs (or more)
    • Difficult to move, reload
  • I/O dependent
    • adversely affected by cloud service failures
    • and slow cloud I/O
  • If one goes down
    • ouch!
# 14
Scaling in the Cloud
• Databases form the “last mile” of true application scalability
  • Start with simple optimizations
  • implement a follow-on scalability strategy for long-term performance goals
  • and a high-availability strategy is a must
• Ensure your databases can failover
  • unplanned outages
  • and planned maintenance
• The best time to plan your database scalability strategy is now
  • don’t wait until it’s a “3-alarm fire”
# 15
Familybuilder
• Innovator in Facebook applications
• Among first 500 apps worldwide
David Blinder, CTO
# 16
All CPUs wait at the same speed…
The Cloud I/O Barrier
# 17
Database slowdown is not linear…
[Chart: Database Load Curve. X-axis: Data File (GB), 0 to 40. Y-axis: Load Time, 0 to 10,000. Series: Time, with an exponential trendline.]

GB      Load Time (Min)
0.9     1
1.3     2.5
3.5     11.7
39.0    10 days…
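To make the non-linearity concrete, the short sketch below divides each load time in the table by its data file size. It assumes the last row really means 10 full days (converted to minutes here); the function name is illustrative only.

```python
# (data file size in GB, load time in minutes), taken from the slide's table.
# Assumption: the "10 days…" entry is treated as exactly 10 * 24 * 60 minutes.
points = [(0.9, 1.0), (1.3, 2.5), (3.5, 11.7), (39.0, 10 * 24 * 60)]


def minutes_per_gb(samples):
    """Return the load cost per GB for each sample, rounded to 2 decimals."""
    return [round(minutes / gb, 2) for gb, minutes in samples]


rates = minutes_per_gb(points)
for (gb, _), rate in zip(points, rates):
    print(f"{gb:>5} GB -> {rate} min per GB")
```

If load time grew linearly with size, the per-GB cost would stay roughly constant. Here it rises with every row, which is the point of the slide.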
# 18
Challenges apply to all types of databases
• Traditional RDBMS (MySQL, PostgreSQL, Oracle…)
  • I/O bound
  • Multi-user, lock contention
  • High-availability
  • Lifecycle management…
    • backup/restore
    • schema changes
    • index maintenance
• NoSQL Databases (In-memory, Caching, Document)
  • Reliability, High-availability
  • Limits of a single server
    • and a single thread
  • Data dumps to disk
  • Replication
  • Lifecycle Management
# 19
Challenges apply to all types of databases
• No matter what the technology, big databases are hard to manage
  • elastic scaling is a real challenge
  • degradation from growth in size and volume is a certainty
  • application-specific database requirements add to the challenge
• Sound database design is key…
  • balance performance vs. convenience vs. data size
# 20
The Laws of Databases
• Law #1: Small Databases are fast
• Law #2: Big Databases are slow
• Law #3: Keep databases small
# 21
What is the answer?
• Database sharding is the only effective method for achieving scale, elasticity, reliability and easy management
  • regardless of your database technology
# 22
What is Database Sharding?
• “Horizontal partitioning is a database design principle whereby rows of a database table are held separately... Each partition forms part of a shard, which may in turn be located on a separate database server or physical location.” Wikipedia
# 23
What is Database Sharding?
• Start with a big monolithic database
  • break it into smaller databases
  • across many servers
  • using a key value
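The idea of splitting one large data set by a key value can be sketched in a few lines. This is a minimal illustration, not how dbShards itself assigns rows: the shard count, the hash-and-modulo scheme, and the `shard_for` helper are all assumptions made for the example.

```python
import zlib

NUM_SHARDS = 4  # hypothetical number of shard servers


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a shard key to a shard index using a stable hash (CRC32)."""
    digest = zlib.crc32(str(key).encode("utf-8"))
    return digest % num_shards


# One "monolithic" data set of 100 customer rows...
rows = [{"customer_id": cid, "name": f"customer-{cid}"} for cid in range(1, 101)]

# ...broken into smaller per-shard data sets using the key value.
shards = {i: [] for i in range(NUM_SHARDS)}
for row in rows:
    shards[shard_for(row["customer_id"])].append(row)

for shard_id, shard_rows in shards.items():
    print(f"shard {shard_id}: {len(shard_rows)} rows")
```

A stable hash is used rather than Python's built-in `hash()` so that the same key maps to the same shard across processes and restarts.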
# 24
The key to Database Sharding…
# 25
dbShards Architecture
# 26
Database Sharding… the results
# 27
Why does Database Sharding work?
• Maximize CPU/Memory per database instance
  • as compared to database size
• Reduce the size of index trees
  • speeds writes dramatically
  • reads are faster too
  • aggregate, list queries are generally much faster
• No contention between servers
  • locking, disk, memory, CPU
• Allows for intelligent parallel processing
  • Go Fish queries across shards
• Keep CPUs busy and productive
# 28
Breaking the Cloud I/O Barrier
# 29
Familybuilder
• Top 50 Facebook Application
• 100,000 New Users Daily
• Doubled Users in 12 months to over 40MM
David Blinder, CTO
# 30
Relational Sharding
Global Tables
Shard-Tree Root Table
Shard-Tree Child Tables
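The three table roles named on this slide can be illustrated with a small in-memory model. The sketch rests on assumptions the slide does not spell out: that global tables are copied to every shard, and that child rows are stored on the same shard as their root row by carrying the root's shard key. The table names `country` and `customer_order` and the modulo placement are hypothetical.

```python
NUM_SHARDS = 2
GLOBAL_TABLES = {"country"}  # hypothetical global (replicated) table
# Hypothetical shard tree: each table maps to its parent (None marks the root).
SHARD_TREE = {"customer": None, "customer_order": "customer"}


def new_cluster():
    """Create one dict of empty tables per shard."""
    tables = GLOBAL_TABLES | set(SHARD_TREE)
    return [{t: [] for t in tables} for _ in range(NUM_SHARDS)]


def insert(cluster, table, row):
    """Insert a row and return the list of shard ids that received it."""
    if table in GLOBAL_TABLES:
        # Assumption: global tables are written to every shard.
        for shard in cluster:
            shard[table].append(dict(row))
        return list(range(len(cluster)))
    # Root and child tables are both placed by the root's shard key.
    shard_id = row["customer_id"] % len(cluster)
    cluster[shard_id][table].append(dict(row))
    return [shard_id]


cluster = new_cluster()
insert(cluster, "country", {"country_id": "US"})
insert(cluster, "customer", {"customer_id": 7, "country_id": "US"})
insert(cluster, "customer_order", {"order_id": 1, "customer_id": 7})
```

Under these assumptions, a join between `customer`, `customer_order`, and `country` for one customer can be answered by a single shard.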
# 31
How Relational Sharding works
# 32
How Relational Sharding works
• Shard key recognition in SQL
  • SELECT * FROM customer
    WHERE customer_id = 1234
  • INSERT INTO customer
    (customer_id, first_name, last_name, addr_line1,…)
    VALUES (2345, 'John', 'Jones', '123 B Street',…)
  • UPDATE customer
    SET addr_line1 = '456 C Avenue'
    WHERE customer_id = 4567
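Shard key recognition for statements like the three above could be approximated as follows. This is a deliberately naive, regex-based sketch, not the dbShards driver: a real driver would parse SQL properly. The `customer_id` key column comes from the slide; `find_shard_key`, `route`, the shard count, and the modulo placement are assumptions.

```python
import re

SHARD_KEY = "customer_id"  # shard key column used in the slide's examples
NUM_SHARDS = 4             # hypothetical shard count

_WHERE_KEY = re.compile(rf"\b{SHARD_KEY}\s*=\s*(\d+)", re.IGNORECASE)
_INSERT = re.compile(
    r"INSERT\s+INTO\s+\w+\s*\(([^)]*)\)\s*VALUES\s*\(([^)]*)\)",
    re.IGNORECASE | re.DOTALL,
)


def find_shard_key(sql):
    """Return the integer shard key value found in the statement, or None."""
    match = _INSERT.search(sql)
    if match:
        columns = [c.strip().lower() for c in match.group(1).split(",")]
        # Naive split: breaks on string literals that contain commas.
        values = [v.strip() for v in match.group(2).split(",")]
        if SHARD_KEY in columns:
            index = columns.index(SHARD_KEY)
            if index < len(values) and values[index].isdigit():
                return int(values[index])
        return None
    match = _WHERE_KEY.search(sql)
    return int(match.group(1)) if match else None


def route(sql):
    """Return the target shard index, or None when no shard key is present."""
    key = find_shard_key(sql)
    return None if key is None else key % NUM_SHARDS


print(route("SELECT * FROM customer WHERE customer_id = 1234"))
print(route("INSERT INTO customer (customer_id, first_name) VALUES (2345, 'John')"))
print(route("UPDATE customer SET addr_line1 = '456 C Avenue' WHERE customer_id = 4567"))
```

A statement for which `route` returns `None` carries no shard key and would have to be sent to every shard.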
# 33
What about Cross-Shard result sets?
# 34
Cross-shard result set example
• Go Fish (no shard key)
  • SELECT country_id, count(*) FROM customer
    GROUP BY country_id
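A "Go Fish" query has no shard key, so it must run on every shard and the partial results must be merged. The sketch below shows that pattern with in-memory SQLite databases standing in for shard servers. The sample rows, the `make_shard` and `go_fish` helpers, and the modulo placement are assumptions for illustration; this is not the dbShards implementation.

```python
import sqlite3
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 3
QUERY = "SELECT country_id, count(*) FROM customer GROUP BY country_id"


def make_shard(rows):
    """Create an in-memory SQLite database acting as one shard."""
    # check_same_thread=False lets a worker thread use this connection.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute(
        "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, country_id TEXT)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?)", rows)
    return conn


# Hypothetical sample data, placed on shards by customer_id modulo NUM_SHARDS.
customers = [(1, "US"), (2, "CA"), (3, "US"), (4, "MX"), (5, "CA"), (6, "US")]
shards = [
    make_shard([row for row in customers if row[0] % NUM_SHARDS == i])
    for i in range(NUM_SHARDS)
]


def go_fish(shard_conns, query=QUERY):
    """Run the query on every shard in parallel and sum the per-group counts."""
    totals = Counter()
    with ThreadPoolExecutor(max_workers=len(shard_conns)) as pool:
        partials = pool.map(lambda conn: conn.execute(query).fetchall(), shard_conns)
        for rows in partials:
            for country_id, count in rows:
                totals[country_id] += count
    return dict(totals)


result = go_fish(shards)
print(result)
```

Counts merge by simple addition. Other aggregates need more care: an average, for example, has to be rebuilt from per-shard sums and counts rather than by averaging the per-shard averages.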
# 35
Moving to Database Sharding with dbShards
# 36
dbShards/Analyze
• Review Database Schema
• Define your initial shard strategy
• Run dbShards/Analyze Driver
  • on your app in a test environment
  • generate logs of all application SQL
• Generate dbShards/Analyze reports
  • with your data model
  • your shard strategy
  • your SQL logs as input
• Ensure your application is shard-safe
  • before you shard your database
  • and identify optimization opportunities
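The kind of report described above, one entry per unique SQL statement with a note on how it relates to the shard key, can be approximated with a short script. This is only a rough sketch of the idea and not the dbShards/Analyze tool: the log format (one statement per line), the `normalize` and `analyze` helpers, and the two labels are all assumptions.

```python
import re
from collections import Counter

SHARD_KEY = "customer_id"  # assumed shard key from the initial shard strategy


def normalize(sql):
    """Replace literals with '?' so repeated statements collapse into one."""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # standalone numeric literals
    return re.sub(r"\s+", " ", sql).strip()


def analyze(log_lines):
    """Return (statement, count, label) tuples, most frequent first."""
    counts = Counter(normalize(line) for line in log_lines if line.strip())
    report = []
    for statement, count in counts.most_common():
        mentions_key = re.search(rf"\b{SHARD_KEY}\b", statement, re.IGNORECASE)
        # Crude check: it only tests whether the key column appears at all.
        label = "shard key present" if mentions_key else "no shard key (Go Fish)"
        report.append((statement, count, label))
    return report


log = [
    "SELECT * FROM customer WHERE customer_id = 1234",
    "SELECT * FROM customer WHERE customer_id = 99",
    "SELECT country_id, count(*) FROM customer GROUP BY country_id",
]
report = analyze(log)
for statement, count, label in report:
    print(f"{count:>3}  {label:<24}  {statement}")
```

Statements labeled as having no shard key are the ones worth reviewing first, since they would fan out to every shard.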
Demo
# 38
No-charge Shard Analysis
• Drop-in dbShards/Analyze Drivers
  • Native MySQL
  • JDBC
  • ODBC
• Available as RightScale templates
  • search Multi-Cloud Marketplace for CodeFutures
    • Logging Driver for Native MySQL®
    • dbShards/Analyze Driver for JDBC
• Run driver in your environment, with your app
  • ship us the logs, schema
  • a dbShards consultant takes you through the analysis
• Find out exactly what it takes to shard your database
  • regardless of the technology you select
# 39
Wrap-up
• Database Sharding is the tool for scaling your database
• dbShards is a complete, drop-in sharding solution
  • Plug-compatible database drivers
    • nothing between you and your database
  • Intelligent agents for shard management, processing
  • Database agnostic, pick the DBMS you prefer
• Use dbShards for existing applications
  • new ones too
• dbShards supports the entire Database Sharding infrastructure
  • Analyze, Shard, Manage
  • 24X7 Monitoring and Support for all customers
# 41
We Appreciate Your Time
Cory Isaacson: CodeFutures [email protected] http://www.dbshards.com
Contacts
RIGHTSCALE: (866) [email protected] http://www.rightscale.com
More Info:
Webinar archive: RightScale.com/webinars
Whitepapers: RightScale.com/whitepapers
Free Edition: RightScale.com/free