Post on 21-Aug-2015
April 18, 2023
SQL vs. NoSQL: Making the right choice
© Copyright Dimension Data 18 April 2023
The contenders
SQL NoSQL
SQL Databases
• RDBMS
• Standardized
• Mature
• Reliable
• Well understood
• Queryable
• ACID
NoSQL scalability argument
• Scale-Up vs Scale-Out
• Use of commodity hardware
• Locking / Latching
• Consistency over partitions
• Availability of partitions
• Referential integrity
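The scale-out idea above can be sketched as hash partitioning: each key is assigned to one of several commodity nodes, so no single machine has to hold all the data. This is a minimal illustration only. The names `shard_for` and `partition` are hypothetical, and plain modulo hashing is an assumption made for brevity; production systems typically use consistent hashing so that adding a node does not remap most keys.

```python
import hashlib


def shard_for(key: str, num_nodes: int) -> int:
    """Map a key to a node index using a stable (process-independent) hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def partition(keys, num_nodes: int) -> dict:
    """Group keys by the node that would store them."""
    shards = {node: [] for node in range(num_nodes)}
    for key in keys:
        shards[shard_for(key, num_nodes)].append(key)
    return shards


if __name__ == "__main__":
    keys = [f"user:{i}" for i in range(10)]
    for node, owned in partition(keys, 3).items():
        print(node, owned)
```

Because every key lives on exactly one node, cross-node concerns such as referential integrity and consistency over partitions are no longer enforced for free, which is the trade-off the bullets above point at.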
[Chart: Cost of scaling, SQL vs. NoSQL]
Other RDBMS / SQL Database drawbacks
• One-solution-fits-all
• Slow for certain tasks
• ACID is not always needed
• ORM required
• Lack of flexibility
• Rigid schema
• Management complexity
• Add-on solutions
• XML fields, Filestreams
• Full-text indexes
CAP theorem (Brewer's theorem)
NoSQL Use Cases
• Bigness / Avoid hitting the wall
• Massive write performance
• Write availability
• Fast key-value access
• Flexible schema and flexible datatypes
• Schema migration
• No single point of failure
• Generally available parallel computing
• Easier maintainability, administration and operations
• Programmer ease of use
• Use the right data model for the right problem
• Tunable CAP tradeoffs
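"Tunable CAP tradeoffs" is often expressed, in Dynamo-style stores such as Riak and Cassandra, through three numbers: N replicas per key, R replicas that must answer a read, and W replicas that must acknowledge a write. The sketch below only checks the well-known overlap rule R + W > N. The function name and the N/R/W parameters are illustrative assumptions, not taken from the slides or from any product's API.

```python
def overlapping_quorums(n: int, r: int, w: int) -> bool:
    """Return True when every read quorum must intersect every write quorum.

    n: replicas per key, r: replicas consulted per read,
    w: replicas that must acknowledge a write.
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


if __name__ == "__main__":
    # Favor consistency: any read sees at least one replica with the latest write.
    print(overlapping_quorums(n=3, r=2, w=2))
    # Favor availability and latency: a read may miss the latest write.
    print(overlapping_quorums(n=3, r=1, w=1))
```

Lowering R or W keeps reads and writes available when replicas are down, at the cost of possibly stale reads.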
ACID Transactions
Atomicity
Consistency
Isolation
Durability
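As a small illustration of atomicity and consistency, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `accounts` table, the `transfer` function, and the sample balances are hypothetical. The credit is applied first on purpose, so that a failing debit shows the earlier statement being rolled back with it.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> None:
    """Move amount from src to dst; both updates apply or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        # The CHECK constraint rejects a negative balance (consistency).
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
    conn.commit()
    return conn


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT name, balance FROM accounts"))


if __name__ == "__main__":
    db = make_db()
    try:
        transfer(db, "alice", "bob", 500)  # debit would make alice negative
    except sqlite3.IntegrityError:
        pass
    print(balances(db))  # the credit to bob was rolled back as well
```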
NoSQL ACID Trade-offs
• Dropping Atomicity shortens the time that tables (sets of data) stay locked. Examples: MongoDB, CouchDB.
• Dropping Consistency lets you scale writes out across cluster nodes. Examples: Riak, Cassandra.
• Dropping Durability lets you respond to write commands without flushing to disk. Examples: Memcached, Redis.
NoSQL Database Main Types
• Key-Value Store
• A basic dictionary design storing values under unique keys
• The database does not care about the structure of the value
• Examples:
• Memcached
• Riak
• Azure Blob Storage
• Good at:
• Handles size well
• Processing a constant stream of small reads and writes
• Fast
• Programmer friendly
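The "basic dictionary design" can be sketched in a few lines. This toy in-memory class is an assumption made for illustration and is not how Memcached, Riak, or Azure Blob Storage are implemented. It only shows the contract: unique keys, opaque values, and no queries on value content.

```python
import json


class KeyValueStore:
    """Toy in-memory key-value store: unique keys mapped to opaque bytes."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        # The store never inspects the value; callers serialize it themselves.
        if not isinstance(value, bytes):
            raise TypeError("values are opaque bytes; serialize before storing")
        self._data[key] = value

    def get(self, key: str, default=None):
        return self._data.get(key, default)

    def delete(self, key: str) -> bool:
        """Remove a key; return True if it existed."""
        return self._data.pop(key, None) is not None


if __name__ == "__main__":
    store = KeyValueStore()
    store.put("session:42", json.dumps({"user": "ann"}).encode("utf-8"))
    store.put("thumb:7", b"\x89PNG\r\n")  # binary data is stored the same way
    print(json.loads(store.get("session:42")))
```

Because the store cannot look inside values, anything beyond lookup by key (filtering, joins, aggregation) has to happen in application code.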
NoSQL Database Main Types
• Column Store
• A column is a tuple of 3 elements: a unique name, a typed value, and a timestamp
• Columns may be part of column families
• Columns need not appear in every record
• Examples:
• HBase
• Hypertable
• Cassandra
• Azure Table Storage
• Good at:
• Handles size well
• Stream massive write loads
• High availability
• Multiple-data centers
• MapReduce.
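The column model described above (name, value, timestamp, grouped into column families, with sparse rows) can be sketched as nested dictionaries. The class and method names are hypothetical, and the "newest timestamp wins" rule is an assumption borrowed from how Cassandra-style stores commonly resolve conflicting writes.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    value: object
    timestamp: float


class ColumnFamilyStore:
    """Toy wide-column store: row key -> column family -> column name -> Column."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, family, name, value, timestamp=None):
        ts = time.time() if timestamp is None else timestamp
        columns = self._rows.setdefault(row_key, {}).setdefault(family, {})
        current = columns.get(name)
        # Assumed conflict rule: keep the write with the newest timestamp.
        if current is None or ts >= current.timestamp:
            columns[name] = Column(name, value, ts)

    def get_row(self, row_key, family):
        """Return {column name: value} for the columns this row actually has."""
        columns = self._rows.get(row_key, {}).get(family, {})
        return {name: col.value for name, col in columns.items()}


if __name__ == "__main__":
    store = ColumnFamilyStore()
    store.put("u1", "profile", "name", "Ann", timestamp=1)
    store.put("u1", "profile", "email", "ann@example.com", timestamp=1)
    store.put("u2", "profile", "name", "Bob", timestamp=1)  # no email column
    store.put("u1", "profile", "name", "Anna", timestamp=2)
    print(store.get_row("u1", "profile"))
    print(store.get_row("u2", "profile"))
```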
NoSQL Database Main Types
• Document Store
• Use a unique key to store and retrieve a JSON document
• Documents are schemaless
• Metadata is added to the document to aid querying
• Indexing of documents and metadata speeds up retrieval
• Examples:
• CouchDB
• MongoDB
• RavenDB
• Azure DocumentDB service (Preview)
• Good at:
• Natural data modeling
• Programmer friendly
• Rapid development
• Web friendly
• CRUD
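A document store can be pictured as a key-value store that understands its values well enough to index them. The sketch below is an illustrative toy, not the API of CouchDB, MongoDB, or RavenDB. The class name, the `indexed_fields` parameter, and the sample documents are assumptions, and indexed values are assumed to be hashable (strings, numbers).

```python
import json
import uuid


class DocumentStore:
    """Toy schemaless document store with optional per-field indexes."""

    def __init__(self, indexed_fields=()):
        self._docs = {}  # key -> JSON text
        self._indexes = {f: {} for f in indexed_fields}  # field -> value -> keys

    def put(self, doc: dict, key=None) -> str:
        key = key or uuid.uuid4().hex
        if key in self._docs:
            self._unindex(key)
        self._docs[key] = json.dumps(doc)
        for field, index in self._indexes.items():
            if field in doc:
                index.setdefault(doc[field], set()).add(key)
        return key

    def get(self, key: str) -> dict:
        return json.loads(self._docs[key])

    def find(self, field, value) -> list:
        if field in self._indexes:  # fast path: index lookup
            keys = self._indexes[field].get(value, set())
            return [self.get(k) for k in sorted(keys)]
        # Slow path: scan every document.
        docs = (self.get(k) for k in sorted(self._docs))
        return [d for d in docs if d.get(field) == value]

    def _unindex(self, key: str) -> None:
        old = self.get(key)
        for field, index in self._indexes.items():
            if field in old:
                index[old[field]].discard(key)


if __name__ == "__main__":
    db = DocumentStore(indexed_fields=["city"])
    db.put({"name": "Ann", "city": "Ghent"}, key="p1")
    db.put({"name": "Bob", "city": "Ghent", "tags": ["admin"]}, key="p2")
    print(db.find("city", "Ghent"))
```

The two documents have different fields and no schema had to be declared or migrated first, which is the "flexible schema" point from the use-case list.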
NoSQL Database Main Types
• Graph Database
• Uses graph structures with nodes, edges, and properties to represent and store data
• Every element contains a direct pointer to its adjacent elements
• Examples:
• AllegroGraph
• InfoGrid
• Neo4j
• Good at:
• Complicated graph problems
• Topographical data
• Fast
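"Every element contains a direct pointer to its adjacent elements" means a traversal follows references instead of performing joins or index lookups. The sketch below illustrates that idea with plain Python objects and a breadth-first search. The `Node` class, the `shortest_path` function, and the sample people are hypothetical and unrelated to the Neo4j, AllegroGraph, or InfoGrid APIs.

```python
from collections import deque


class Node:
    """A graph node holding properties and direct references to neighbors."""

    def __init__(self, name: str, **properties):
        self.name = name
        self.properties = properties
        self.edges = []  # list of (label, Node) pairs; edges are directed

    def connect(self, label: str, other: "Node") -> None:
        self.edges.append((label, other))


def shortest_path(start: Node, goal: Node):
    """Breadth-first search; returns node names along a shortest path, or None."""
    queue = deque([[start]])
    seen = {id(start)}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node is goal:
            return [n.name for n in path]
        for _label, neighbor in node.edges:  # follow pointers, no lookups
            if id(neighbor) not in seen:
                seen.add(id(neighbor))
                queue.append(path + [neighbor])
    return None


if __name__ == "__main__":
    alice, bob, carol = Node("alice", age=30), Node("bob"), Node("carol")
    alice.connect("KNOWS", bob)
    bob.connect("KNOWS", carol)
    print(shortest_path(alice, carol))
```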
NoSQL Database Type Comparison
Data Model              | Performance | Scalability     | Flexibility | Complexity | Functionality
Key–Value Store         | high        | high            | high        | none       | variable (none)
Column-Oriented Store   | high        | high            | moderate    | low        | minimal
Document-Oriented Store | high        | variable (high) | high        | low        | variable (low)
Graph Database          | variable    | variable        | high        | high       | graph theory
Relational Database     | variable    | variable        | low         | moderate   | relational algebra
Things to consider when choosing
• Where are you starting from?
• What are you trying to accomplish?
• Things to Consider...
• Your Problem
• Access pattern, scalability, consistency, durability
• Money
• Scaling, admins, license, operating cost
• Programming
• Flexible schema, JSON, REST, language, graphs
• Performance
• Reads, writes, consistency, workload, eventual consistency
• Features
• Cross datacenter, upgrades, indexes, persistence, tunability
• The vendor
• Viability, future direction, responsiveness, partnerships
Big Data – Petabyte range
Microsoft HDInsight = Hadoop as a service on Azure (+ .NET)
Hadoop components
Using Hadoop
Hadoop cluster size
Yahoo! wins with a massive 42,000-node cluster
Questions
USE [Euricom];
SELECT [Question]
FROM [dbo].[FAQ]
WHERE [Answer] IS NULL;
(0 row(s) affected)