Preparing your data for the cloud
-
Upload
inphina-technologies -
Category
Technology
-
view
1.592 -
download
3
description
Transcript of Preparing your data for the cloud
![Page 1: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/1.jpg)
1
Preparing Your Data For Cloud
Narinder KumarInphina Technologies
![Page 2: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/2.jpg)
2
Agenda
Relational DBMS's : Pros & Cons Non-Relational DBMS's : Pros & Cons Types of Non-Relational DBMS's Current Market State Applicability of Different Data-Bases in
different environments
![Page 3: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/3.jpg)
3
Relational DBMS - Pros
Data Integrity ACID Capabilities High Level Query Model Data Normalization Data Independence
![Page 4: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/4.jpg)
4
Relational DBMS - Cons
Scaling Issues By Duplication (Master-Slave)
By Sharding/Division (Not transparent)
Fixed Schema Mostly disk-oriented (Performance) May fair poorly with large data
![Page 5: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/5.jpg)
5
Non-Relational DBMS - Pros
Scalability
Replication / Availability
Performance
Deployment Flexibility
Modelling Flexibility
Faster Development (?)
![Page 6: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/6.jpg)
6
Non-Relational DBMS - Cons
Lack of Transactional Support Data Integrity is Application's responsibility Data Duplication / Application Dependent Eventually Consistent (mostly) No Standardization New Technology
![Page 7: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/7.jpg)
7
RDBMS's and Cloud
![Page 8: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/8.jpg)
8
Cloud Capable RDBMS
s
Almost every RDBMS can run in a IAAS Cloud Platform
![Page 9: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/9.jpg)
9
Cloud Native RDBMS
![Page 10: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/10.jpg)
10
Types of Non-Relational DBMS
Key Value Stores
Document Stores
Column Stores
Graph Stores
![Page 11: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/11.jpg)
11
Key Value Data-Bases
Object is completely Opaque to DB Mostly GET, PUT & DELETE operations are
supported There may be limits on size of Objects
Inspired by Amazon Dynamo Paper
![Page 12: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/12.jpg)
12
Key Value DataBases & Cloud
Project Voldemort
MemCachedDB Tokyo Tyrant
![Page 13: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/13.jpg)
13
Document Data-Bases Object is not completely opaque to DB Every Object has it's own schema
FirstName="Bob", Address="5 Oak St.", Hobby="sailing".
FirstName="Jonathan", Children=("Michael,10", "Jennifer,8")
Can perform queries based on Object's attributes
Possible to describe relationships between Objects
Joins and Transactions are not supported Good for XML or JSON objects
![Page 14: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/14.jpg)
14
Document Data-Bases & Cloud
![Page 15: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/15.jpg)
15
Column-Store Data-Bases
Richer than Document Stores Multi-Dimensional Map
Tables Row Column Time-Stamp
Supports Multiple Data Types Usually use an Underlying DFS
Inspired by Google Big Table Paper
![Page 16: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/16.jpg)
16
Column-Store Data-Bases & Cloud
![Page 17: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/17.jpg)
17
Key Factors while Making a Choice
Application Architecture Requirements Platform choices Non-Functional Requirements
Consistency
Availability
Partition
Security
Data Redemption
![Page 18: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/18.jpg)
18
Different Requirements = Different Solutions
![Page 19: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/19.jpg)
19
Scenario 1
Feature First Corporate Data
Consistency Requirements
Business Intelligence
Legacy Application
RDBMS on Amazon Cloud, RackSpace (IaaS) or
Microsoft Azure/Amazon RDS (PaaS)
![Page 20: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/20.jpg)
20
Scenario 2
Consumer Facing Application Big Files (Images, BLOB's, Files)
Geographically Distributed
Mostly writes
Not heavy requirement on Rich Queries
Key-Value Data Stores (Amazon S3, Project Voldemort, Redis)
![Page 21: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/21.jpg)
21
Scenario 3
Hundreds Of Government Documents with different schemas Need to serve on Web
Data Mining
Document Data-Stores (Amazon SimpleDB, Apache Couche DB, MongoDB)
![Page 22: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/22.jpg)
22
Scenario 4
Scale First Huge Data-Set
Analytical Requirements
Consumer Facing
High Availability over Consistency
Column Data-Stores (Google App Engine, Hbase, Cassandra)
![Page 23: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/23.jpg)
23
Mix & Match of Earlier Scenarios
Polyglot Persistence RDBMS for low-volume
and high value
Key-Value DB for large files with little queries
Memcached DB for short-lived Data
Column DB for Analytics
![Page 24: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/24.jpg)
24
CAP Theorem
![Page 25: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/25.jpg)
25
Conclusions
One Size does not Fit all
Many choices
No-SQL DB's providing Alternatives
RDBMS serve useful purpose
![Page 26: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/26.jpg)
26
www.inphina.comhttp://thoughts.inphina.com
![Page 27: Preparing your data for the cloud](https://reader033.fdocuments.us/reader033/viewer/2022060110/5560b79bd8b42a033c8b4bcb/html5/thumbnails/27.jpg)
27
References http://nosql-database.org/
http://www.drdobbs.com/database/218900502
http://perspectives.mvdirona.com/2009/11/03/OneSizeDoesNotFitAll.aspx
http://blog.nahurst.com/visual-guide-to-nosql-systems
http://blog.heroku.com/archives/2010/7/20/nosql/
http://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html
http://project-voldemort.com/
http://code.google.com/p/redis/
http://memcachedb.org/
http://aws.amazon.com/simpledb/
http://couchdb.apache.org/
http://www.mongodb.org
http://riak.basho.com/