What can we learn from NoSQL technologies?

Post on 04-Jul-2015

I presented these slides for the first time at the Percona Live Conference 2013 in Santa Clara

Transcript of What can we learn from NoSQL technologies?

Ivan Zoratti

What Can We Learn From NoSQL Technologies?

Percona Live Santa Clara

V1304.01 Friday, 3 May 13

Who is Ivan?

SkySQL

•Leading provider of open source databases, services and solutions

•Home for the founders and the original developers of the core of MySQL

•The creators of MariaDB, the innovative drop-in replacement for MySQL

NoSQL Technologies

%SQL?

•SQL

•NoSQL

•NewSQL

[Alleged] Reasons to adopt NoSQL

•Not all database needs fit the relational model
  •Key/value stores?
  •Who needs ACID?
  •Who needs schemas?

•Relational databases cannot handle many modern workloads
  •Scalability is an issue in general
  •RDBMSs are pretty inflexible
  •There is no elasticity

•Schemas and administration are too complicated, especially during the development phase

•SQL is unnecessarily complicated

•NoSQL = Not Only SQL

NoSQL vs SQL

NoSQL

•Schema-less (or dynamic schema)
•Dynamic horizontal scaling
•Good to store and retrieve a great quantity of data
•Great flexibility
•Full ACID not required - “BASE is better”
  •Basically Available, Soft state, Eventually consistent
•Objects: Collections, Documents, Fields
•NoSQL DBs:
  •Key/Value
  •BigTable
  •Document
  •Graph

(My)SQL

•Rigid schema design
•Static or no horizontal scaling
•Good to store and retrieve data that has relationships between the elements
•Pretty inflexible
•ACID as a given
  •Atomic, Consistent, Isolated, Durable
•Objects: Tables, Rows, Columns
•SQL DBs:
  •Row-based
  •Columnar
  •Object Relational

Understanding the CAP Theorem

•CA
  •Synchronous Replication
  •Two Phase Commit
  •MySQL, ACID/RDBMSs

•CP
  •MongoDB, HBase, Redis, MemcacheD

•AP
  •Cassandra, Riak, CouchDB

[Figure: the CAP triangle, with vertices C, A and P]

The NoSQL Ecosystem

When is MySQL a good fit?

•Complex (but well defined) schema

•ACID and Consistency as a must

•Interaction/Integration with tools and applications that speak MySQL

•Typically “simple” data

•Data “limited” in size

•Application-based scalability

•Applications require “complex” queries (read: joins)

•Many developers / Few DBs

•In-house expertise

When is NoSQL a good fit?

•Schema-less for startup applications

•Performance is more important than consistency and ACID features

•Documents, binary data and more

•Lots of data, unstructured

•Scalability and elasticity out of the box will solve lots of problems

•Applications mainly have “simple” queries (read: access to single tables, by key or simple conditions)

•One man job (for each module)

NewSQL

My[No]SQL Cookbook

HandlerSocket

[Diagram: client applications connect to mysqld either through libmysql (listener for libmysql, then the SQL layer) or through libhsclient (HandlerSocket plugin); both paths reach the Handler Interface and the storage engines below it: InnoDB, MyISAM and others]

InnoDB and Memcached

http://dev.mysql.com/doc/refman/5.6/en/innodb-memcached-intro.html

Virtual Columns

•For InnoDB, MyISAM and Aria

•Column content is dynamically generated or materialised (but only from the row)

•PERSISTENT (stored) or VIRTUAL (generated)

CREATE TABLE t3 (
  c1 int(11) NOT NULL AUTO_INCREMENT,
  c2 text,
  char_count int(11) AS ( LENGTH( c2 ) ) PERSISTENT,
  word_count int(11) AS ( LENGTH( c2 ) - LENGTH( REPLACE( c2, ' ', '' ) ) + 1 ) PERSISTENT,
  PRIMARY KEY (`c1`)
) ENGINE=InnoDB;
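
The word_count expression above counts the spaces in c2 and adds one, so it is only exact when words are separated by single spaces. A minimal Python sketch of the same arithmetic may make that easier to see. The function names are hypothetical and only mirror the SQL expressions; they are not part of MariaDB.

```python
def char_count(text: str) -> int:
    """Mirror of LENGTH(c2): LENGTH returns the size in bytes."""
    return len(text.encode("utf-8"))


def word_count(text: str) -> int:
    """Mirror of LENGTH(c2) - LENGTH(REPLACE(c2, ' ', '')) + 1."""
    # Removing every space shortens the string by the number of spaces.
    spaces = len(text) - len(text.replace(" ", ""))
    return spaces + 1


if __name__ == "__main__":
    for sample in ("what can we learn", "two  spaces", ""):
        print(repr(sample), char_count(sample), word_count(sample))
```

Because the value is derived from the row alone, it can either be stored with the row (PERSISTENT) or recomputed on each read (VIRTUAL). Note that double spaces and empty strings are over-counted by this formula.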

Dynamic Columns

•Implement a schema-less, document store

•Functions: COLUMN_CREATE, COLUMN_ADD, COLUMN_GET, COLUMN_LIST, COLUMN_JSON, COLUMN_EXISTS, COLUMN_CHECK, COLUMN_DELETE

•Nested columns are allowed

•Main datatypes are allowed

•Documents are <=1GB

CREATE TABLE assets (
  item_name VARCHAR(32) PRIMARY KEY,
  dynamic_cols BLOB
);

INSERT INTO assets VALUES ( 'MariaDB T-shirt', COLUMN_CREATE( 'color', 'blue', 'size', 'XL' ) );
INSERT INTO assets VALUES ( 'Thinkpad Laptop', COLUMN_CREATE( 'color', 'black', 'price', 500 ) );

-- Replace the laptop's price with a warranty, giving the output shown below
UPDATE assets SET dynamic_cols = COLUMN_DELETE( dynamic_cols, 'price' ) WHERE item_name = 'Thinkpad Laptop';
UPDATE assets SET dynamic_cols = COLUMN_ADD( dynamic_cols, 'warranty', '3 years' ) WHERE item_name = 'Thinkpad Laptop';

SELECT item_name, COLUMN_JSON( dynamic_cols ) FROM assets;
+-----------------+----------------------------------------+
| item_name       | COLUMN_JSON(dynamic_cols)              |
+-----------------+----------------------------------------+
| MariaDB T-shirt | {"size":"XL","color":"blue"}           |
| Thinkpad Laptop | {"color":"black","warranty":"3 years"} |
+-----------------+----------------------------------------+
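
As a rough mental model, each row carries its own set of named values, so two rows in the same table need not share the same columns. The Python sketch below imitates that idea with plain dictionaries. It is an illustration only: the helper names are borrowed from the SQL functions for readability, the real server packs the values into a blob, and the key order of the JSON printed here (alphabetical) is an assumption of this sketch that may differ from the server's.

```python
import json


def column_create(*pairs):
    """Build a per-row column set from name, value, name, value, ..."""
    if len(pairs) % 2:
        raise ValueError("expected name/value pairs")
    return dict(zip(pairs[0::2], pairs[1::2]))


def column_add(cols, name, value):
    """Return a copy of the column set with one column added or replaced."""
    return {**cols, name: value}


def column_delete(cols, name):
    """Return a copy of the column set without the named column."""
    return {k: v for k, v in cols.items() if k != name}


def column_json(cols):
    """Render the column set as compact JSON (keys sorted alphabetically)."""
    return json.dumps(cols, sort_keys=True, separators=(",", ":"))


# Two rows of the same "table" with different sets of columns.
assets = {
    "MariaDB T-shirt": column_create("color", "blue", "size", "XL"),
    "Thinkpad Laptop": column_create("color", "black", "price", 500),
}

if __name__ == "__main__":
    for item_name, cols in assets.items():
        print(item_name, column_json(cols))
```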

Sphinx

•Available as storage engine SphinxSE or external search server

•Write operations through SphinxQL

•Joins with non-Sphinx tables are allowed

CREATE TABLE t1 (
  id INTEGER UNSIGNED NOT NULL,
  weight INTEGER NOT NULL,
  query VARCHAR(3072) NOT NULL,
  group_id INTEGER,
  INDEX( query )
) ENGINE=SPHINX CONNECTION = "sphinx://localhost:9312/test";

SELECT * FROM t1 WHERE query = 'test it;mode=any';

SELECT content, date_added
  FROM test.documents docs
  JOIN t1 ON ( docs.id = t1.id )
  WHERE query = 'one document;mode=any';

Map/Reduce approach

•Available with InfiniDB and ScaleDB

•Experimental with MySQL Proxy and Gearman

Additions to the core MySQL

•MySQL Cluster/NDB

•Galera

•ScaleDB

•Continuent

•ScaleBase

•ScaleArc

•CodeFutures

Things to Improve in MySQL

Sharding - Sharding - Sharding!

SELECT ... FROM T4 WHERE ID BETWEEN X AND Y

The query is sent to all the shards
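
The cost of that range query can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular product routes queries: rows are spread over shards by `id % shard_count` (a hypothetical placement rule), and the class and method names are invented for the illustration. With this placement a lookup by key touches one shard, while a BETWEEN range has to be sent to every shard and the partial results merged.

```python
from typing import Dict, List, Optional, Tuple


class ShardedTable:
    """Toy table whose rows are distributed over shards by id modulo."""

    def __init__(self, shard_count: int) -> None:
        self.shards: List[Dict[int, str]] = [{} for _ in range(shard_count)]

    def _shard_for(self, row_id: int) -> int:
        # Hypothetical placement rule: hash (modulo) sharding on the id.
        return row_id % len(self.shards)

    def insert(self, row_id: int, value: str) -> None:
        self.shards[self._shard_for(row_id)][row_id] = value

    def get(self, row_id: int) -> Optional[str]:
        # A lookup by key is routed to exactly one shard.
        return self.shards[self._shard_for(row_id)].get(row_id)

    def select_between(self, low: int, high: int) -> Tuple[List[Tuple[int, str]], int]:
        # Consecutive ids live on different shards, so every shard is asked.
        rows: List[Tuple[int, str]] = []
        for shard in self.shards:
            rows.extend((i, v) for i, v in shard.items() if low <= i <= high)
        return sorted(rows), len(self.shards)


if __name__ == "__main__":
    table = ShardedTable(shard_count=4)
    for row_id in range(1, 11):
        table.insert(row_id, f"row-{row_id}")
    rows, shards_queried = table.select_between(3, 6)
    print(rows, "shards queried:", shards_queried)
```

Range-based placement would let the router prune shards for this query, at the price of uneven load when ids are inserted in increasing order.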

Eventual Consistency

[Diagram: client applications talk to five database nodes; each node has its own communication protocol, outbound protocol and binlog]
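
The effect of eventual consistency on readers can be illustrated with a small Python model. It is a sketch under simplifying assumptions that go beyond the slide: a single node accepts writes and records them in a binlog-like list, and the other nodes apply that log later, whenever `replicate` is called. All class and method names are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple


class Replica:
    """A node that applies the writer's log at its own pace."""

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}
        self.applied = 0  # number of log events applied so far


class Cluster:
    def __init__(self, replica_count: int) -> None:
        self.primary: Dict[str, Any] = {}
        self.binlog: List[Tuple[str, Any]] = []
        self.replicas = [Replica() for _ in range(replica_count)]

    def write(self, key: str, value: Any) -> None:
        # The write is acknowledged before any replica has seen it.
        self.primary[key] = value
        self.binlog.append((key, value))

    def replicate(self, index: int, max_events: Optional[int] = None) -> None:
        # Apply pending log events, in order, to one replica.
        node = self.replicas[index]
        pending = self.binlog[node.applied:]
        if max_events is not None:
            pending = pending[:max_events]
        for key, value in pending:
            node.data[key] = value
        node.applied += len(pending)

    def read(self, index: int, key: str) -> Any:
        # Reads from a replica may be stale until it catches up.
        return self.replicas[index].data.get(key)


if __name__ == "__main__":
    cluster = Cluster(replica_count=2)
    cluster.write("x", 1)
    print("before replication:", cluster.read(0, "x"))
    cluster.replicate(0)
    print("after replication:", cluster.read(0, "x"))
```

Once every replica has applied the whole log, all nodes return the same value, which is the "eventually" in eventual consistency.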

Multiple Communication Protocols

[Diagram: a MySQL client on port 3306, a JSON client on port 80 and an ODATA client on port 8080; hosts shown: 192.168.0.1, 192.168.0.10, 192.168.0.20, 192.168.0.30]

Cassandra Storage Engine

•Column Family == Table

•Rowkey, static and dynamic columns allowed

•Batch key access support

SET GLOBAL cassandra_default_thrift_host = '192.168.0.10';

CREATE TABLE cassandra_tbl (
  rowkey INT PRIMARY KEY,
  col1 VARCHAR(25),
  col2 BIGINT,
  dyn_cols BLOB DYNAMIC_COLUMN_STORAGE = yes
) ENGINE = cassandra
  KEYSPACE = 'cassandra_key_space'
  COLUMN_FAMILY = 'column_family_name';

Connect Storage Engine

•Any file format as MySQL TABLE:
  •ODBC
  •Text, XML, *ML
  •Excel, Access etc.

•MariaDB CREATE TABLE options
  •Multi-file table
  •Table Autocreation

•Condition push down

•Read/Write and Multi Storage Engine Join

•CREATE INDEX

CREATE TABLE handout
  ENGINE = CONNECT
  TABLE_TYPE = XML
  FILE_NAME = 'handout.htm'
  HEADER = yes
  OPTION_LIST = 'name = TABLE, coltype = HTML, attribute = (border=1;cellpadding=5)';

Join us at the Solutions Day

•Cassandra and Connect Storage Engine

•Map/Reduce approach - Proxy optimisation

•Multiple protocols and more

Thank You!

ivan@skysql.com

izoratti.blogspot.com

www.slideshare.net/izoratti

www.skysql.com
