NoSQL overview #phptostart turin 11.07.2011
![Page 1: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/1.jpg)
NoSQL
David Funaro
Turin, 11 July 2011
PHP.TO.START
![Page 2: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/2.jpg)
What about me?
• Software engineer
• PHP developer (2002)
• Symfony Framework developer (2009)
• Mobile developer (iOS / Symbian)
• Senior developer @ dnsee
• Founder of the PHP user group Rome
• Open Source contributor
![Page 3: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/3.jpg)
RDBMS
NOSQL
Other
Database - logical model
![Page 4: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/4.jpg)
Relational DB
• Born in the ’70s
• SQL, relational algebra & set theory
• Excellent for management applications (accounting, reservations, staff management)
![Page 5: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/5.jpg)
ACID
Transactions work correctly if the database satisfies these four properties:
• Atomicity
• Consistency
• Isolation
• Durability
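To make atomicity concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table, the names and the amounts are hypothetical and only serve the illustration: a transfer credits one account and debits another, and if the debit violates a constraint the credit is rolled back too.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was undone as well.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # valid transfer
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer the balances are the ones left by the first, valid transfer: the partial credit to `bob` does not survive.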
![Page 8: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/8.jpg)
RDBMS
Database - logical model
Key Value
Document Oriented
Column Oriented
Graph DB
NOSql
![Page 10: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/10.jpg)
NoSQL !=
Not Only SQL
One Size fits all
![Page 11: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/11.jpg)
Historical Intro
The concept of a “non-relational database” is older than the “relational model”, but it has been revived and improved.
Technology comes back
![Page 15: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/15.jpg)
New Requirements
Mid ’90s
With the new internet-based systems, consistency and security of data are no longer enough.
The new need is high availability.
![Page 16: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/16.jpg)
• distributed storage system
• scales data size up to petabytes
Wide applicability
Scalability
High performance
High availability
![Page 17: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/17.jpg)
Google BigTable
• Web indexing
• Google Earth
• Google Finance
• Orkut
• Custom Search
• Google Docs
Column-oriented DB
![Page 18: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/18.jpg)
Amazon
• Relational model doesn’t fit requirements
• Tens of thousands of servers around the world
• 10 million customers
High reliability
High scale
![Page 19: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/19.jpg)
Amazon Dynamo
• High Reliability
• High Scale
Key-Value Store Database
![Page 20: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/20.jpg)
New Trends
![Page 21: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/21.jpg)
Web Company
• Startup with explosive growth:
• Open source DBMS
• v1.0: 1 node, which soon becomes inadequate
• Next version:
• Horizontal partitioning (sharding)
• Node routing implemented inside the application logic
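Routing inside the application logic can be sketched as follows. This is an assumption-laden illustration, not how any specific product does it: the node names are hypothetical and the routing rule is plain hash-modulo. The second function hints at why growing the cluster is painful with this approach.

```python
import hashlib

# Hypothetical node names; a real application would keep connection settings here.
SHARDS = ["db-node-0", "db-node-1", "db-node-2"]

def shard_for(key, shards=SHARDS):
    """Pick the node that owns `key` by hashing it (stable across processes)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]

def moved_keys(keys, old_shards, new_shards):
    """Keys whose owning node changes when the list of shards changes."""
    return [k for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)]

keys = [f"user_{i}" for i in range(1000)]
relocated = moved_keys(keys, SHARDS, SHARDS + ["db-node-3"])
print(f"{len(relocated)} of {len(keys)} keys change node after adding one shard")
```

With hash-modulo routing, adding a node typically relocates a large share of the keys, which is one reason data redistribution becomes hard once the application owns the routing.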
![Page 22: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/22.jpg)
Web Company
• Re-implement inter-node query
• Handle inter-node transaction
• Node failure becomes increasingly likely: less reliability, less availability
• “Hot” data restructuring and data redistribution become hard
![Page 23: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/23.jpg)
Solution
• Scalability: very simple operations, but on many nodes
• Performance: low latency
• Productivity
• Flexibility (data structure)
• Ability to distribute data across many nodes
These are the needs of a web application.
![Page 24: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/24.jpg)
Compromise
• Giving up SQL
• less strict transactions
![Page 25: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/25.jpg)
Query Language
• SQL like
• map-reduce
• SPARQL
• ...
Give up a standard query language like SQL and adopt a query language that depends on the selected product.
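As an illustration of the map-reduce style of querying, here is a minimal in-process sketch. The documents and the `tags` field are hypothetical; real products run the map and reduce steps across many nodes, which this sketch does not attempt.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical documents, stored without a fixed schema.
docs = [
    {"name": "david", "tags": ["php", "nosql"]},
    {"name": "alessandro", "tags": ["php"]},
    {"name": "marco", "tags": ["nosql", "graph"]},
]

def map_doc(doc):
    """Map step: emit one (key, value) pair per tag."""
    for tag in doc.get("tags", []):
        yield tag, 1

def reduce_values(values):
    """Reduce step: combine all values emitted for one key."""
    return reduce(lambda a, b: a + b, values)

def map_reduce(documents, mapper, reducer):
    groups = defaultdict(list)
    for doc in documents:
        for key, value in mapper(doc):
            groups[key].append(value)  # shuffle: group emitted values by key
    return {key: reducer(values) for key, values in groups.items()}

tag_counts = map_reduce(docs, map_doc, reduce_values)
```

The query is expressed as two small functions instead of a declarative statement, which is the trade-off the slide describes.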
![Page 26: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/26.jpg)
CAP Theorem (2000)
• Consistency
• Availability
• Partition Tolerance
It’s impossible to have all three at the same time in a distributed system. You can guarantee at most two.
Eric Brewer
![Page 27: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/27.jpg)
Consistency
[Figure: nodes N1, N2, N4, N5 and N6, labelled with time tk]
• Strong: after the update completes, any subsequent access will return the updated value.
• Weak: the system does not guarantee that subsequent accesses will return the updated value.
• Eventual: the storage system guarantees that, if no new updates are made to the object, eventually (after the inconsistency window closes) all accesses will return the last updated value.
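The eventual consistency model can be sketched with a toy in-memory store. Everything here is an assumption made for illustration: writes are acknowledged by one replica and queued for the others, and an explicit `sync()` call stands in for background replication.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}

class EventuallyConsistentStore:
    """Writes land on one replica and are propagated to the others later."""

    def __init__(self, replica_names):
        self.replicas = [Replica(n) for n in replica_names]
        self.pending = []  # (replica index, key, value) updates not yet applied

    def write(self, key, value):
        self.replicas[0].data[key] = value        # acknowledged immediately
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, value))  # replicated asynchronously

    def read(self, key, replica_index):
        return self.replicas[replica_index].data.get(key)

    def sync(self):
        """Apply all pending updates, closing the inconsistency window."""
        for i, key, value in self.pending:
            self.replicas[i].data[key] = value
        self.pending.clear()

store = EventuallyConsistentStore(["N1", "N2", "N4"])
store.write("x", 1)
stale = store.read("x", replica_index=1)  # inside the window: not yet replicated
store.sync()
fresh = [store.read("x", i) for i in range(3)]
```

Before `sync()` a read from the second replica still misses the update; afterwards every replica returns the last written value.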
![Page 29: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/29.jpg)
Facebook Cassandra
• Key-Value store
• data model: BigTable
• infrastructure: Amazon-Dynamo
• Eventual Consistency
• High Availability
![Page 30: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/30.jpg)
Just find the right way to manage your data-set
Search Best Solution
![Page 31: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/31.jpg)
Technology Focus
• context
• purpose
• cost of implementation
![Page 32: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/32.jpg)
choose bike => (climb the mountain)
Know available tools
![Page 37: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/37.jpg)
NOSql Families
![Page 38: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/38.jpg)
Key Value Store
One Key -> One Value
It’s like a hash
The DB knows the type of the “key” (integer, float, ...) but nothing about the value
Very fast
‘name’ => ‘david’
(key => value)
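A minimal sketch of the idea, assuming an in-memory store: the class name and its methods are hypothetical, and `pickle` is used only to make the point that the store treats the value as an opaque blob.

```python
import pickle

class KeyValueStore:
    """Minimal in-memory key-value store: the value is an opaque blob."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = pickle.dumps(value)  # stored as bytes, never inspected

    def get(self, key, default=None):
        blob = self._data.get(key)
        return default if blob is None else pickle.loads(blob)

    def delete(self, key):
        """Return True if the key existed and was removed."""
        return self._data.pop(key, None) is not None

store = KeyValueStore()
store.set("name", "david")
name = store.get("name")
missing = store.get("surname")
```

Lookups go by key only; there is no way to ask the store for "all values containing david", which is the price paid for speed and simplicity.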
![Page 39: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/39.jpg)
Key Value Store
• redis
• memcached
• dynamo
• voldemort
Performance: high
Scalability: high
Flexibility: high
Complexity: none
Functionality: variable (none)
![Page 40: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/40.jpg)
Document Oriented
• key -> document
• structured document
• schema-less
user_13 => { name: ‘david’, surname: ‘funaro’, age: ‘18’, mail: { home: ‘[email protected]’, office: ‘[email protected]’ } }
(key => document)
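Unlike a key-value store, a document store can look inside the value. The sketch below is a toy illustration under stated assumptions: the `DocumentStore` class, the dotted-path query and the `example.com` addresses are placeholders, not the API of any real product.

```python
import copy

def lookup(document, path):
    """Follow a dotted path such as 'mail.office' through nested dicts."""
    node = document
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node

class DocumentStore:
    """Toy document store: key -> nested dict, queryable by field path."""

    def __init__(self):
        self._docs = {}

    def put(self, key, document):
        self._docs[key] = copy.deepcopy(document)

    def get(self, key):
        return self._docs.get(key)

    def find(self, path, value):
        """Return the keys of documents whose field at `path` equals `value`."""
        return sorted(k for k, d in self._docs.items() if lookup(d, path) == value)

store = DocumentStore()
store.put("user_13", {
    "name": "david",
    "surname": "funaro",
    "mail": {"home": "home@example.com", "office": "office@example.com"},
})
store.put("user_14", {"name": "marco"})  # hypothetical document with fewer fields
matches = store.find("mail.office", "office@example.com")
```

Both documents live in the same store even though they do not share the same fields.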
![Page 41: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/41.jpg)
Document Oriented
Performance: high
Scalability: variable (high)
Flexibility: high
Complexity: low
Functionality: variable (low)
![Page 42: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/42.jpg)
Graph DB
• Composed of Vertices and Edges
• Vertices are connected by Edges
• Each Edge has a Label and a Direction
• Edges and Vertices have Properties
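The four points above map directly onto a small data structure. This is only a sketch: the class names, the `since` property and the sample vertices are illustrative assumptions.

```python
class Vertex:
    def __init__(self, vertex_id, **properties):
        self.id = vertex_id
        self.properties = properties
        self.out_edges = []  # edges leaving this vertex

class Edge:
    def __init__(self, label, source, target, **properties):
        self.label = label    # e.g. "friend", "work"
        self.source = source  # direction: source -> target
        self.target = target
        self.properties = properties

def connect(source, label, target, **properties):
    """Create a directed, labelled edge and attach it to its source vertex."""
    edge = Edge(label, source, target, **properties)
    source.out_edges.append(edge)
    return edge

def neighbours(vertex, label):
    """Vertices reachable from `vertex` through edges with the given label."""
    return [e.target for e in vertex.out_edges if e.label == label]

user_1 = Vertex("User_1", name="David")
user_2 = Vertex("User_2")
company = Vertex("dnsee")
connect(user_1, "friend", user_2, since=2011)  # `since` is a made-up edge property
connect(user_1, "work", company)
friends = [v.id for v in neighbours(user_1, "friend")]
```

Traversal follows object references stored on the vertex itself, with no separate lookup table per hop.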
![Page 43: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/43.jpg)
Graph DB
[Figure: example graph with vertices User_1, User_2, User_3, David, Funaro and dnsee, connected by edges labelled friend, name, surname and work]
![Page 44: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/44.jpg)
Graph DB
• neo4J
• OrientDB
• infogrid
• VertexDB
Performance: variable
Scalability: variable
Flexibility: high
Complexity: high
Functionality: graph theory
![Page 45: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/45.jpg)
Why NOSql
Some example cases
![Page 46: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/46.jpg)
A Graph RDBMS
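Users

| id | name | salary |
| --- | --- | --- |
| 1 | ale | 200 |
| 2 | marco | 230 |
| 3 | david | 340 |
| 4 | sergio | 349 |
| 5 | andre | 200 |

Followee

| id_1 | id_2 |
| --- | --- |
| 2 | 4 |
| 3 | 1 |
| 3 | 4 |
| 3 | 2 |
| 1 | 5 |
| 5 | 3 |
| 5 | 2 |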
handled as BTree101
![Page 48: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/48.jpg)
A Graph RDBMS
N = number of users
Look up david’s id: O(log N)
Look up his K followees: O(log N)
Get their names: O(K · log N)
![Page 49: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/49.jpg)
Graph DB
[Figure: graph with vertices Marco, Sergio, Andrea, Ale and David]
Look up David: O(log N)
Look up his followees: O(K)
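The difference between the two access patterns can be sketched in a few lines. The data is the Users/Followee example as read from the flattened slide table, so treat it as an assumption; sorted lists searched with `bisect` stand in for B-tree indexes, and a plain dict stands in for the graph database's start-vertex index.

```python
import bisect

users = [(1, "ale"), (2, "marco"), (3, "david"), (4, "sergio"), (5, "andre")]
followee = sorted([(2, 4), (3, 1), (3, 4), (3, 2), (1, 5), (5, 3), (5, 2)])

# Relational side: sorted lists play the role of B-tree indexes.
name_index = sorted((name, uid) for uid, name in users)
id_index = sorted(users)

def followees_relational(name):
    """Assumes `name` exists in the Users table."""
    i = bisect.bisect_left(name_index, (name,))          # index search: user id
    uid = name_index[i][1]
    row = bisect.bisect_left(followee, (uid,))           # index search: first row
    names = []
    while row < len(followee) and followee[row][0] == uid:
        target = followee[row][1]
        j = bisect.bisect_left(id_index, (target,))      # one index search per followee
        names.append(id_index[j][1])
        row += 1
    return names

# Graph side: each vertex holds direct references to its followees.
class Node:
    def __init__(self, name):
        self.name = name
        self.followees = []

nodes = {uid: Node(name) for uid, name in users}
for src, dst in followee:
    nodes[src].followees.append(nodes[dst])
by_name = {node.name: node for node in nodes.values()}   # used once, to find the start

def followees_graph(name):
    start = by_name[name]                                # single lookup of the start vertex
    return [node.name for node in start.followees]       # then K pointer hops

relational_result = followees_relational("david")
graph_result = followees_graph("david")
```

Both functions return the same names; what changes is that the relational version pays an index search per followee, while the graph version only follows references once the start vertex is found.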
![Page 50: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/50.jpg)
Benchmark
• 1 million vertices
• 4 million edges
• Scale-free topology
• Postgres vs Neo4J
• Both Hash and BTree

| Depth | RDBMS | Graph |
| --- | --- | --- |
| 1 | 100 ms | 30 ms |
| 2 | 1000 ms | 500 ms |
| 3 | 10000 ms | 3000 ms |
| 4 | 100000 ms | 50000 ms |
| 5 | N/A | 100000 ms |
http://markorodriguez.com/2011/02/18/mysql-vs-neo4j-on-a-large-scale-graph-traversal/
![Page 51: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/51.jpg)
Schema
RDBMS
CREATE TABLE `pma_bookmark` (
  `id` int(11) NOT NULL auto_increment,
  `name` varchar(255) NOT NULL default '',
  `surname` varchar(255) NOT NULL default '',
  `mobile` varchar(255) NOT NULL default '',
  `url` text NOT NULL,
  ...
  `name` varchar(255) NOT NULL default '',
  ...
  `telex` varchar(255) NOT NULL default '',
  `fax` varchar(255) NOT NULL default '',
  `office` text NOT NULL,
  PRIMARY KEY (`id`)
);
NoSQL - Document-oriented
Schema-less
![Page 52: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/52.jpg)
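Schema 2

| id | name | surname | mobile | url | ... | telex | office | telex | ... |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | david | funaro | 3548 | davidfunaro.com | null | null | 3548631 | null | null |
| 2 | alessandro | nadalin | 3257 | null | null | null | 32458 | 5456 | null |
| 3 | marco | rossi | 3548 | null | null | null | null | 515648 | null |

Too many values set to NULL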
user :{ name: david, surname: funaro, mobile : 3454, url: davidfunaro.com, office: 3423423,}
user :{ name: alessandro, surname: nadalin, mobile : 6262, office: 342343, telex: 3434}
user :{ name: marco, surname: rossi, telex: 3434}
Each Document has only the required fields
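The cost of forcing these documents into a fixed schema can be counted directly. In this sketch the three documents are the ones shown above, while the list of columns is an assumption (the slide elides part of the real table).

```python
# Assumed column list; the original table has more (elided) columns.
COLUMNS = ["name", "surname", "mobile", "url", "office", "telex"]

documents = [
    {"name": "david", "surname": "funaro", "mobile": 3454,
     "url": "davidfunaro.com", "office": 3423423},
    {"name": "alessandro", "surname": "nadalin", "mobile": 6262,
     "office": 342343, "telex": 3434},
    {"name": "marco", "surname": "rossi", "telex": 3434},
]

def to_row(document, columns=COLUMNS):
    """Fixed-schema view: every column is present, missing ones become None (NULL)."""
    return tuple(document.get(col) for col in columns)

rows = [to_row(d) for d in documents]
null_cells = sum(value is None for row in rows for value in row)
stored_fields = sum(len(d) for d in documents)
print(f"{stored_fields} fields stored as documents, {null_cells} NULL cells as rows")
```

The documents store only the fields they have; the row view has to pad every missing column, and the padding grows with each column added to the schema.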
![Page 53: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/53.jpg)
Schema less
• flexibility to handle the data model fields
• the model can grow easily
![Page 54: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/54.jpg)
Performance
====== SET ======
100007 requests completed in 0.88 seconds
50 parallel clients
3 bytes payload
keep alive: 1
====== GET ======
100000 requests completed in 1.23 seconds
50 parallel clients
3 bytes payload
keep alive: 1
http://redis.io/topics/benchmarks
http://research.yahoo.com/files/ycsb-v4.pdf
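The benchmark figures above are easier to compare as throughput. A small worked calculation, using only the request counts and durations quoted in the output:

```python
def requests_per_second(requests, seconds):
    """Average throughput over the whole benchmark run."""
    return requests / seconds

set_rps = requests_per_second(100007, 0.88)
get_rps = requests_per_second(100000, 1.23)
print(f"SET: {set_rps:,.0f} req/s, GET: {get_rps:,.0f} req/s")
```

That is roughly 113,600 SET and 81,300 GET requests per second with 50 parallel clients and a 3-byte payload; figures like these depend heavily on hardware and payload size.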
![Page 55: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/55.jpg)
NOSql for PHP
✓Redis
✓MongoDB
✓CouchDB
✓Cassandra
✓Memcached
✴OrientDB
![Page 56: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/56.jpg)
OrientDB library for PHP
https://github.com/congow/Orient
A Set of tools to use and manage any OrientDB instance from PHP.
Orient includes:
• the HTTP protocol binding
• the query builder
• the data mapper (Object Graph Mapper)
![Page 58: NoSQL overview #phptostart turin 11.07.2011](https://reader035.fdocuments.us/reader035/viewer/2022081519/5555d55ad8b42aaf158b4f02/html5/thumbnails/58.jpg)
credits
http://www.slideshare.net/ClaudioMartella/presentation-7398682?from=ss_embed
http://www.slideshare.net/harrikauhanen/nosql-3376398
http://www.slideshare.net/ingdavidino/cmf-a-pain-in-the-f-phpday-05142011
http://it.wikipedia.org/wiki/Modello_relazionale
http://www.slideshare.net/gabriele.lana/nosql-7405964
http://blog.indigenidigitali.com/l-ecosistema-nosql/
http://www.dia.uniroma3.it/~torlone/bd2/noSQL-1.pdf
http://nosql-database.org/