SQL or NoSQL - how to choose

SQL or NoSQL? Lars Thorup, Zealake, September 2016

Lars Thorup

● Software developer/architect
  ● JavaScript, C#
  ● Test Driven Development

● Coach
  ● Agile engineering practices

● Founder
  ● BestBrains
  ● Zealake
  ● Triggerz

● @larsthorup

Agenda

● My history with databases

● Databases - what are they good for?

● SQL and NoSQL - what is the difference?

● Redis - a NoSQL database

● Matching use cases to database systems

● Redis - data structures and algorithms

My history with databases

● Databases
  ● Pre-1980 - many competing database models
  ● 1980-2010 - SQL dominates
  ● 2010-now - many competing NoSQL database models

● Myself
  ● 1990-2016 - SQL for administrative systems, documents, e-commerce, music, collaboration tools, data analytics
  ● 2015 - Redis and Neo4j for social media

Databases - what are they good for?

● Make data available
  ● Across the globe
  ● From multiple computers
  ● Across long time spans

● Prevent data loss

● Quickly search for, fetch, and update data

● Ensure consistency in data

Example database systems

● SQL
  ● Relational: SQL Server, PostgreSQL, Oracle

● NoSQL
  ● Key-value: DynamoDB, Berkeley DB, S3
  ● Document: MongoDB, RethinkDB
  ● Data structure: Redis
  ● Graph: Neo4j
  ● Columns: Cassandra, HBase

Example database use cases

● Banks: accounts, owners, transactions

● Social media: posts, comments, ratings

● Caching: user sessions, generated pages

● Sales analytics: counts, sums, locations, averages, hierarchies

SQL and NoSQL - what is the difference?

● What kind of data do we store?

● How many machines do we use?

● Will there be type checking?

● Will we have to code the lookup algorithms?

● How do we prevent inconsistent data?

Typical SQL database

● Many small tables with lots of columns

● Single instance on a large server

● Explicit column types, referential constraints

● Advanced and efficient standard query language

● Transactions over complex updates
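
The points above (explicit column types, referential constraints, transactions over several updates) can be illustrated with the bank use case mentioned earlier. This is only a minimal sketch using Python's built-in sqlite3 module; the table names, column names, and the transfer helper are made up for illustration and are not taken from the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical schema: explicit column types plus referential and CHECK constraints.
conn.executescript("""
    CREATE TABLE owner (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE account (
        id       INTEGER PRIMARY KEY,
        owner_id INTEGER NOT NULL REFERENCES owner(id),
        balance  INTEGER NOT NULL CHECK (balance >= 0)
    );
    INSERT INTO owner (id, name) VALUES (1, 'lars');
    INSERT INTO account (id, owner_id, balance) VALUES (10, 1, 100), (11, 1, 0);
""")


def transfer(conn, source, target, amount):
    """Move money between two accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was rolled back too.
        return False


print(transfer(conn, 10, 11, 30))   # True
print(transfer(conn, 10, 11, 500))  # False
print(conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall())
# [(10, 70), (11, 30)]
```

The failed transfer leaves both balances untouched, which is the "transactions over complex updates" guarantee in miniature: the database, not the application, undoes the half-finished work.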

Typical NoSQL database

● Collections of JSON documents

● Cluster of servers with shards and replicas

● Application may handle evolving document structures

● Specific low-level query language

● Single-update transactions
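
To make "cluster of servers with shards" a little more concrete, here is a rough sketch of how a client could route each key to one shard by hashing it. The shard names are hypothetical, and plain hash-modulo routing is a simplification chosen for brevity; it is not the scheme of any particular database (Redis Cluster, for example, maps keys to hash slots instead).

```python
import zlib

# Hypothetical shard names; a real cluster would list server addresses here.
SHARDS = ["shard-a", "shard-b", "shard-c"]


def shard_for(key, shards=SHARDS):
    """Pick the shard responsible for a key.

    zlib.crc32 is used instead of the built-in hash() because it gives
    the same result in every process, so all clients agree on the routing.
    """
    digest = zlib.crc32(key.encode("utf-8"))
    return shards[digest % len(shards)]


for key in ["session:42", "session:43", "post:score"]:
    print(key, "->", shard_for(key))
```

One consequence of this design is that a query touching many keys may have to visit several servers, which is part of why NoSQL systems tend to offer single-update transactions only.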

Categorizing a database system

                      SQL                   NoSQL
Impedance mismatch    tables and columns    documents
Distribution          server                cluster
Schema                explicit              implicit
Query engine          optimizing            manual

Redis - one NoSQL database

[Figure: the SQL / NoSQL categorization chart (impedance mismatch, distribution, schema, query engine) with Redis placed on it]

Redis

● REmote DIctionary Server, started in 2009

● Popular, fast, robust

● In-memory

● Single-threaded

● Many data types
  ● dictionaries, lists, sets, sorted sets

● Other features
  ● key expiry
  ● publish/subscribe

Redis demo

● string values (session count)

● dictionary values (session)

● list values (Lucene index queue)

● sorted set values (front page posts)

● expiry (session)

● http://redis.io/topics/data-types-intro

Demo: string values

● Example: Global objects

incr 'session:id'

set 'session:42' '{"name":"lars", "level": 5}'
get 'session:42'

Demo: dictionary values

● Example: Session object

hset 'session:42' info '{"name":"lars"}'
hset 'session:42' level "5"

hgetall 'session:42'

hincrby 'session:42' level 1
hget 'session:42' level

Demo: list values

● Example: Lucene indexing queue

rpush 'index:lucene' "42"
rpush 'index:lucene' "105"
rpush 'index:lucene' "7"

lrange 'index:lucene' 0 -1

lpop 'index:lucene'

Demo: sorted set values

● Example: front page posts

zadd 'post:score' 17 "42"
zadd 'post:score' 39 "43"
zadd 'post:score' 22 "44"
zincrby 'post:score' 1 "42"

zrevrange 'post:score' 0 -1
zrevrank 'post:score' "42"
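
To show what the sorted-set commands in this demo compute, here is a small pure-Python imitation. It is only a sketch of the semantics: the function names mirror the Redis commands, but this toy version re-sorts on every read, whereas Redis keeps the set ordered as it is updated.

```python
# Scores keyed by member, standing in for the 'post:score' sorted set.
scores = {}


def zadd(member, score):
    scores[member] = score


def zincrby(member, amount):
    scores[member] = scores.get(member, 0) + amount
    return scores[member]


def zrevrange():
    """All members, highest score first (ties: reverse lexicographic order)."""
    return sorted(scores, key=lambda m: (scores[m], m), reverse=True)


def zrevrank(member):
    """Zero-based position of a member when ordered from highest score."""
    return zrevrange().index(member)


zadd("42", 17)
zadd("43", 39)
zadd("44", 22)
zincrby("42", 1)

print(zrevrange())     # ['43', '44', '42']
print(zrevrank("42"))  # 2
```

Post 42 ends with a score of 18, so it is still last on the front page, at rank 2 counting from zero.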

Demo: expiry

● Example: session

set 'session:42' '{"name":"lars", "level": 5}'
expire 'session:42' 20
ttl 'session:42'
get 'session:42'
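
The expiry behaviour in this demo can be imitated with a toy in-memory store. This is a sketch under simplifying assumptions, not how Redis is implemented: the class name and the injectable clock are invented for illustration, and keys are only removed lazily when they are next touched.

```python
import time


class ExpiringStore:
    """Toy key-value store with per-key time-to-live (not Redis itself)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the example need not sleep
        self._data = {}       # key -> value
        self._deadline = {}   # key -> absolute expiry time, only for keys with a TTL

    def _purge(self, key):
        # Drop the key if its deadline has passed.
        deadline = self._deadline.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._data.pop(key, None)
            self._deadline.pop(key, None)

    def set(self, key, value):
        self._data[key] = value
        self._deadline.pop(key, None)  # a plain set discards any earlier TTL

    def get(self, key):
        self._purge(key)
        return self._data.get(key)

    def expire(self, key, seconds):
        self._purge(key)
        if key not in self._data:
            return False
        self._deadline[key] = self._clock() + seconds
        return True

    def ttl(self, key):
        self._purge(key)
        if key not in self._data:
            return -2  # same convention as Redis: key does not exist
        if key not in self._deadline:
            return -1  # same convention as Redis: key has no expiry
        return round(self._deadline[key] - self._clock())


now = [0.0]  # fake clock so the example runs instantly
store = ExpiringStore(clock=lambda: now[0])
store.set("session:42", '{"name":"lars", "level": 5}')
store.expire("session:42", 20)
print(store.ttl("session:42"))  # 20
now[0] += 21                    # pretend 21 seconds pass
print(store.get("session:42"))  # None
print(store.ttl("session:42"))  # -2
```

For session data this is convenient: the application never has to schedule its own cleanup of stale sessions.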

Demo: simple transactions

● Example: maintaining indexes

multi
set 'session:42' '{"name":"lars", "level": 5}'
hset 'session:by:name' "lars" "42"
expire 'session:42' 20
exec

Demo: not so simple transactions

● Example: doing updates

watch 'session:42'
get 'session:42'
multi
set 'session:42' '{"name":"lars", "level": 7}'
exec
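
The watch / multi / exec sequence above is a form of optimistic locking: the queued write is applied only if the watched key was not changed in the meantime, and otherwise the client retries. The sketch below imitates that idea with a per-key version counter. The class, its methods, and the retry helper are hypothetical names for illustration, not a Redis client API.

```python
class VersionedStore:
    """Toy store that tracks a version per key to imitate WATCH / MULTI / EXEC."""

    def __init__(self):
        self._data = {}
        self._version = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value
        self._version[key] = self._version.get(key, 0) + 1

    def watch(self, key):
        # Remember which version of the key we have seen.
        return self._version.get(key, 0)

    def exec_set(self, key, watched_version, value):
        # Apply the queued write only if nobody changed the key since watch().
        if self._version.get(key, 0) != watched_version:
            return False  # aborted, like EXEC returning nil; the caller should retry
        self.set(key, value)
        return True


def raise_level(store, key, new_level, max_retries=5):
    """Read-modify-write with retry, the usual pattern around WATCH."""
    for _ in range(max_retries):
        seen = store.watch(key)
        session = dict(store.get(key) or {})
        session["level"] = new_level
        if store.exec_set(key, seen, session):
            return True
    return False


store = VersionedStore()
store.set("session:42", {"name": "lars", "level": 5})

seen = store.watch("session:42")
store.set("session:42", {"name": "lars", "level": 6})  # a concurrent writer sneaks in
print(store.exec_set("session:42", seen, {"name": "lars", "level": 7}))  # False

print(raise_level(store, "session:42", 7))  # True
print(store.get("session:42"))              # {'name': 'lars', 'level': 7}
```

The retry loop is the part the application has to write itself, which is one reason this slide calls such transactions "not so simple".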

Redis demo questions

Redis - one NoSQL database

[Figure repeated: the SQL / NoSQL categorization chart (impedance mismatch, distribution, schema, query engine) with Redis placed on it]

Trade-off: coding effort

● SQL
  ● Distribution: sharding and clustering
  ● Impedance mismatch: Object-relational mapping
  ● Explicit schema: fixed, declared up-front, requires migrations

● NoSQL
  ● Manual query optimization
  ● Difficult transactional safety
  ● Implicit and dynamic schema migrations
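
As an illustration of what "implicit and dynamic schema migrations" can mean in application code, the sketch below upgrades stored JSON documents to the current shape whenever they are read. Everything specific here is an assumption made for the example: the schema_version field, the change from a flat name to a nested user object, and the function names.

```python
import json

CURRENT_VERSION = 2


def upgrade(doc):
    """Bring a stored document up to the current shape, one version step at a time."""
    doc = dict(doc)  # work on a copy so the caller's object is untouched
    version = doc.get("schema_version", 1)  # documents without a version count as v1
    if version == 1:
        # Hypothetical v1 -> v2 change: "name" moves into a nested "user" object.
        doc["user"] = {"name": doc.pop("name", None)}
        version = 2
    doc["schema_version"] = version
    return doc


def load_session(raw):
    """Parse a stored JSON document and migrate it on read."""
    return upgrade(json.loads(raw))


old = '{"name": "lars", "level": 5}'
print(load_session(old))
# {'level': 5, 'user': {'name': 'lars'}, 'schema_version': 2}
```

With an explicit SQL schema the same change would be a one-off migration script; here the upgrade logic lives in the application for as long as old documents may still exist.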

Questions!