SQL or NoSQL - how to choose
Lars Thorup
● Software developer/architect
  ● JavaScript, C#
  ● Test Driven Development
● Coach
  ● Agile engineering practices
● Founder
  ● BestBrains
  ● Zealake
  ● Triggerz
● @larsthorup
Agenda
● My history with databases
● Databases - what are they good for?
● SQL and NoSQL - what is the difference?
● Redis - a NoSQL database
● Matching use cases to database systems
● Redis - data structures and algorithms
My history with databases
● Databases
  ● Pre 1980 - many competing database models
  ● 1980-2010 - SQL dominates
  ● 2010-now - many competing NoSQL database models
● Myself
  ● 1990-2016 - SQL for administrative systems, documents, e-commerce, music, collaboration tools, data analytics
  ● 2015 - Redis and Neo4J for social media
Databases - what are they good for?
● Make data available
  ● Across the globe
  ● From multiple computers
  ● Across long time spans
● Prevent data loss
● Quickly search for, fetch, and update data
● Ensure consistency in data
Example database systems
● SQL
  ● Relational: SQL Server, PostgreSQL, Oracle
● NoSQL
  ● Key-value: DynamoDB, Berkeley DB, S3
  ● Document: MongoDB, RethinkDB
  ● Data structure: Redis
  ● Graph: Neo4J
  ● Columns: Cassandra, HBase
Example database use cases
● Banks: accounts, owners, transactions
● Social media: posts, comments, ratings
● Caching: user sessions, generated pages
● Sales analytics: counts, sums, locations, averages, hierarchies
SQL and NoSQL - what is the difference?
● What kind of data do we store?
● How many machines do we use?
● Will there be type checking?
● Will we have to code the lookup algorithms?
● How do we prevent inconsistent data?
Typical SQL database
● Many small tables with lots of columns
● Single instance on a large server
● Explicit column types, referential constraints
● Advanced and efficient standard query language
● Transactions over complex updates
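The points above can be sketched with SQLite from the Python standard library. This is only an illustration under stated assumptions: the `owner` and `account` tables, the `transfer` helper and the amounts are made up for the example and are not from the talk. It shows explicit column types, a referential constraint, and a transaction spanning two updates.

```python
import sqlite3

# In-memory database; table, column and helper names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE owner (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " owner_id INTEGER NOT NULL REFERENCES owner(id),"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO owner (id, name) VALUES (1, 'lars')")
conn.execute("INSERT INTO account (id, owner_id, balance) VALUES (10, 1, 100)")
conn.execute("INSERT INTO account (id, owner_id, balance) VALUES (11, 1, 0)")
conn.commit()


def transfer(conn, source, target, amount):
    """Move money between two accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was rolled back too.
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

A transfer that would overdraw the source account fails as a whole: the credit to the target account, although executed first, is rolled back along with it.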
Typical NoSQL database
● Collections of JSON documents
● Cluster of servers with shards and replicas
● Application may handle evolving document structures
● Specific low-level query language
● Single-update transactions
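"Application may handle evolving document structures" could look roughly like the sketch below. The `schema` field, the two document shapes and the `upgrade` helper are assumptions made for illustration; no particular document database is implied.

```python
import json

# Two stored documents written by different versions of the application.
# Field names and version numbers are illustrative assumptions.
stored = [
    '{"name": "lars thorup", "level": 5}',
    '{"schema": 2, "name": {"first": "lars", "last": "thorup"}, "level": 5}',
]


def upgrade(doc):
    """Bring a document of any known version up to the version-2 shape."""
    if doc.get("schema", 1) == 1:
        # Version 1 stored the name as a single string.
        first, _, last = doc["name"].partition(" ")
        doc = {**doc, "schema": 2, "name": {"first": first, "last": last}}
    return doc


def load_all(raw_documents):
    """Parse the stored JSON and upgrade every document on read."""
    return [upgrade(json.loads(raw)) for raw in raw_documents]
```

The database accepts both shapes without complaint; it is the application code that knows about versions and pays the migration cost on every read.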
Categorizing a database system
                     SQL                  NoSQL
Impedance mismatch   tables and columns   documents
Distribution         server               cluster
Schema               explicit             implicit
Query engine         optimizing           manual
Redis - one NoSQL database
[Chart: the SQL / NoSQL categorization table from the previous slide, with Redis marked on it]
Redis
● REmote DIctionary Server, started in 2009
● Popular, fast, robust
● In-memory
● Single-threaded
● Many data types
  ● dictionaries, lists, sets, sorted sets
● Other features
  ● key expiry
  ● publish - subscribe
Redis demo
● string values (session count)
● dictionary values (session)
● list values (Lucene index queue)
● sorted set values (front page posts)
● expiry (session)
● http://redis.io/topics/data-types-intro
Demo: string values
● Example: Global objects
incr 'session:id'
set 'session:42' '{"name":"lars", "level": 5}'
get 'session:42'
Demo: dictionary values
● Example: Session object
hset 'session:42' info '{"name":"lars"}'
hset 'session:42' level "5"
hgetall 'session:42'
hincrby 'session:42' level 1
hget 'session:42' level
Demo: list values
● Example: Lucene indexing queue
rpush 'index:lucene' "42"
rpush 'index:lucene' "105"
rpush 'index:lucene' "7"
lrange 'index:lucene' 0 -1
lpop 'index:lucene'
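Pushing on the right and popping on the left makes the list a first-in, first-out queue. A minimal in-process sketch of that behaviour, using Python's `deque` as a stand-in for the Redis list (the helper functions only mirror the command names; this is not a Redis client):

```python
from collections import deque

# Stand-in for the Redis list 'index:lucene'; lives in this process only.
index_queue = deque()


def rpush(queue, item):
    """Append at the right end and return the new length, as RPUSH does."""
    queue.append(item)
    return len(queue)


def lpop(queue):
    """Remove and return the leftmost item, or None when the queue is empty."""
    return queue.popleft() if queue else None


for post_id in ("42", "105", "7"):
    rpush(index_queue, post_id)
```

The first `lpop` hands the indexer "42", the item that has waited longest.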
Demo: sorted set values
● Example: front page posts
zadd 'post:score' 17 "42"
zadd 'post:score' 39 "43"
zadd 'post:score' 22 "44"
zincrby 'post:score' 1 "42"
zrevrange 'post:score' 0 -1
zrevrank 'post:score' "42"
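What these commands compute can be sketched with a toy class. It is a simplification under stated assumptions: it sorts on every query, whereas Redis keeps the members ordered as they are written, and the only negative index it understands is a stop of -1. The class is hypothetical and mirrors the command names only for readability.

```python
class SortedSet:
    """Toy stand-in for a Redis sorted set: members ranked by a numeric score."""

    def __init__(self):
        self._scores = {}

    def zadd(self, score, member):
        """Set the score of a member, adding it if needed."""
        self._scores[member] = score

    def zincrby(self, amount, member):
        """Add amount to a member's score and return the new score."""
        self._scores[member] = self._scores.get(member, 0) + amount
        return self._scores[member]

    def _descending(self):
        # Highest score first; ties are broken by member name, also descending.
        return sorted(
            self._scores, key=lambda m: (self._scores[m], m), reverse=True
        )

    def zrevrange(self, start, stop):
        """Members from rank start to rank stop inclusive, best first."""
        members = self._descending()
        if stop == -1:  # only -1 ("to the end") is supported in this sketch
            stop = len(members) - 1
        return members[start:stop + 1]

    def zrevrank(self, member):
        """Zero-based rank of a member, counting from the highest score."""
        return self._descending().index(member)


scores = SortedSet()
scores.zadd(17, "42")
scores.zadd(39, "43")
scores.zadd(22, "44")
scores.zincrby(1, "42")
front_page = scores.zrevrange(0, -1)
```

With the scores from the demo, the front page lists post 43 first, then 44, then 42, and post 42 has rank 2.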
Demo: expiry
● Example: session
set 'session:42' '{"name":"lars", "level": 5}'
expire 'session:42' 20
ttl 'session:42'
get 'session:42'
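Key expiry can be pictured as a deadline stored next to each value. The sketch below is a hypothetical in-process model, not how Redis is implemented; the injectable `clock` parameter is an assumption added so the behaviour can be shown without waiting 20 seconds.

```python
import time


class ExpiringStore:
    """Toy key-value store with per-key expiry, checked lazily on access."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be simulated
        self._values = {}
        self._deadlines = {}

    def _purge(self, key):
        # Drop the key if its deadline has passed.
        deadline = self._deadlines.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._values.pop(key, None)
            self._deadlines.pop(key, None)

    def set(self, key, value):
        self._values[key] = value
        self._deadlines.pop(key, None)  # a plain set clears any earlier expiry

    def expire(self, key, seconds):
        """Schedule removal of an existing key; return False if it is missing."""
        self._purge(key)
        if key not in self._values:
            return False
        self._deadlines[key] = self._clock() + seconds
        return True

    def get(self, key):
        self._purge(key)
        return self._values.get(key)

    def ttl(self, key):
        """Seconds left; -1 if the key never expires, -2 if it does not exist."""
        self._purge(key)
        if key not in self._values:
            return -2
        if key not in self._deadlines:
            return -1
        return self._deadlines[key] - self._clock()
```

The -1 and -2 return values follow the convention of the Redis `ttl` command, so an expired session and a session that never existed look the same to the caller.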
Demo: simple transactions
● Example: maintaining indexes
multi
set 'session:42' '{"name":"lars", "level": 5}'
hset 'session:by:name' "lars" "42"
expire 'session:42' 20
exec
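The transaction above exists because the application maintains its own lookup index: `session:by:name` has to change together with the session it points to. A rough in-process sketch of that bookkeeping, with plain dictionaries standing in for the Redis keys and hypothetical helper names:

```python
# Plain dictionaries standing in for the 'session:<id>' keys and the
# 'session:by:name' hash; helper names are illustrative only.
sessions = {}
session_by_name = {}


def save_session(session_id, session):
    """Store a session and keep the name index pointing at it."""
    previous = sessions.get(session_id)
    if previous is not None:
        # Remove the stale index entry if the name changed.
        session_by_name.pop(previous["name"], None)
    sessions[session_id] = session
    session_by_name[session["name"]] = session_id


def find_by_name(name):
    """Look a session up through the hand-maintained index."""
    session_id = session_by_name.get(name)
    return None if session_id is None else sessions[session_id]
```

In a SQL database a declared index is kept up to date by the engine; here every write path in the application has to remember to do it.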
Demo: not so simple transactions
● Example: doing updates
watch 'session:42'
get 'session:42'
multi
set 'session:42' '{"name":"lars", "level": 7}'
exec
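The watch / multi / exec pattern is optimistic locking: read, compute, and write only if nobody else wrote in between, otherwise retry. The sketch below models that idea with a per-key write counter. It is a hypothetical single-process toy, not a Redis client, and `set_if_unchanged` and `update_with_retry` are made-up names.

```python
class VersionedStore:
    """Toy store that counts writes per key so stale reads can be detected."""

    def __init__(self):
        self._data = {}
        self._versions = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value
        self._versions[key] = self._versions.get(key, 0) + 1

    def watch(self, key):
        """Return a token identifying the key's current version."""
        return self._versions.get(key, 0)

    def set_if_unchanged(self, key, token, value):
        """Write only if the key has not been written since watch()."""
        if self._versions.get(key, 0) != token:
            return False  # someone else got there first
        self.set(key, value)
        return True


def update_with_retry(store, key, change, max_attempts=5):
    """Apply change(old) -> new to a key, retrying when a conflict is detected."""
    for _ in range(max_attempts):
        token = store.watch(key)
        current = store.get(key)
        if store.set_if_unchanged(key, token, change(current)):
            return True
    return False
```

The retry loop is application code, which is part of what makes these transactions "not so simple": the caller has to decide how often to retry and what to do when it gives up.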
Trade-off: coding effort
● SQL
  ● Distribution: sharding and clustering
  ● Impedance mismatch: Object-relational mapping
  ● Explicit schema: fixed, declared up-front, requires migrations
● NoSQL
  ● Manual query optimization
  ● Difficult transactional safety
  ● Implicit and dynamic schema migrations