commodity hardware). Created in April 2005 by Damien Katz ...tjm/seminars/nosql2012/CouchDb.pdf ·...


Introduction

CouchDB (Couch is an acronym for cluster of unreliable commodity hardware).

Created in April 2005 by Damien Katz, former Lotus Notes developer at IBM.

First released as an open source project under the GNU General Public License.

In February 2008, it became an Apache Incubator project and the license was changed to the Apache License.

Written in Erlang.


Independent Facebook developers
Mobile developers (iPhone, Android)
Bloggers


NoSQL

Describes a broad range of database technologies which are non-relational.

Focused on solving specific problems which are not suited to relational storage.


NoSQL

A database designed to run on the internet of today, for today’s desktop-like applications and the connected devices through which we access the internet.

If your data is truly relational, stick with an RDBMS.


Relational Database

SQL

Tabular storage of data

Replacement for relational databases


JSON document-oriented DB (NoSQL)

A database is made up of collections.

You can think of collections as tables from relational databases.

Collections are made up of zero or more documents.

You can think of a document as a row from a relational database.

Schema free: no need to design your tables, you can simply start storing new values.


No wasting storage on empty, or null fields.

Web Server / Application Server: write a client-side application that talks directly to the Couch without the need for a server-side middle layer.

With the database stored locally, your client-side application can run with almost no latency.

Data replication model: devices (like phones) can go offline, and CouchDB handles data sync for you when the device is back online.

Add attachments to documents

Scalable and fault tolerant


Uses a RESTful interface to store JSON documents:

Data creation/replication/insertion, every management and data task can be done via HTTP.

REST = Representational State Transfer

Uses map/reduce queries written in JavaScript

Can be faster than SQL for some lookups because it uses pointers instead of joins


ACID Semantics

CouchDB provides ACID semantics. It does this by implementing a form of Multi-Version Concurrency Control (MVCC), meaning that CouchDB can handle a high volume of concurrent readers and writers without conflict.

Distributed Architecture with Replication: CouchDB was designed with bi-directional replication (or synchronization) and off-line operation in mind. That means multiple replicas can have their own copies of the same data, modify it, and then sync those changes at a later time.
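The bi-directional replication idea can be pictured with a toy model. The sketch below is only an illustration under loud assumptions: replicas are plain dictionaries, revisions are bare integers in a hypothetical `rev` field, and the higher revision simply wins. Real CouchDB tracks revision history and records conflicts when both sides changed the same document, which this sketch ignores.

```python
def replicate(source, target):
    """Copy every document that is missing or older on the target."""
    for doc_id, doc in source.items():
        existing = target.get(doc_id)
        # Assumption: a larger integer "rev" means a newer version.
        if existing is None or existing["rev"] < doc["rev"]:
            target[doc_id] = dict(doc)


def sync(replica_a, replica_b):
    """Bi-directional replication: push changes both ways."""
    replicate(replica_a, replica_b)
    replicate(replica_b, replica_a)


# A phone edited note-1 while offline; the server gained note-2 meanwhile.
phone = {"note-1": {"rev": 2, "text": "edited offline"}}
server = {
    "note-1": {"rev": 1, "text": "original"},
    "note-2": {"rev": 1, "text": "created on server"},
}

sync(phone, server)
print(phone == server)
```

After `sync`, both replicas hold the newer `note-1` and the server-created `note-2`.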


Documents
JSON, or derivatives
XML

Schema free
Documents are independent
Non-relational
Run on a large number of machines
Data is partitioned and replicated among these machines


A document can contain any number of fields, and fields of any length can be added to a document.

Fields can also contain multiple pieces of data.


Example

1. FirstName="Abo", Address="Insinoorinkatu 60", Hobby="swimming"

2. FirstName="another Abo", Address="Orivedenkatu 8", Children=("unbornChild1", "-5", "unbornChild2", "-10", "unbornChild3", "-15")
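The two example documents can be written down as schema-free records. This is a minimal Python sketch, not CouchDB code: the field names come from the slide, while the list shape chosen for `Children` and the helper `fields_used` are assumptions made for illustration.

```python
import json

# Two documents in the same database; neither follows a predefined schema.
# Field names follow the slide; the shape of "Children" is an assumption.
doc1 = {"FirstName": "Abo", "Address": "Insinoorinkatu 60", "Hobby": "swimming"}
doc2 = {
    "FirstName": "another Abo",
    "Address": "Orivedenkatu 8",
    "Children": ["unbornChild1", "unbornChild2", "unbornChild3"],
}

documents = [doc1, doc2]


def fields_used(docs):
    """Return the sorted set of field names that appear in any document."""
    return sorted({name for doc in docs for name in doc})


# A missing field is simply absent from the document; .get() reports None
# on lookup, but nothing is stored for it.
hobbies = [doc.get("Hobby") for doc in documents]

print(json.dumps(documents, indent=2))
print(fields_used(documents))
```

Each document carries only the fields it actually uses, which is the point of the "no wasted storage on empty fields" claim.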


Some examples…

Large Data Sets

Web Related Data

Customizable Dynamic Entities

Persisted View Models


Each document has a unique field named "_id"

Each document has a revision field named "_rev" (used for change tracking)
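One way to see why a revision field is useful: an update can be rejected when it is based on an outdated revision. The sketch below is a hypothetical in-memory store (`TinyStore` and `RevisionConflict` are invented names), and it uses a bare counter as the revision, whereas CouchDB revisions are strings of the form "N-hash". It illustrates the idea only, not CouchDB's implementation.

```python
class RevisionConflict(Exception):
    """Raised when an update is based on a stale revision."""


class TinyStore:
    """Hypothetical in-memory store; not CouchDB's real implementation."""

    def __init__(self):
        self._docs = {}

    def put(self, doc):
        doc_id = doc["_id"]
        current = self._docs.get(doc_id)
        # An update must name the revision it was based on.
        if current is not None and doc.get("_rev") != current["_rev"]:
            raise RevisionConflict(doc_id)
        # Simplified revision: a bare counter instead of an "N-hash" string.
        next_rev = 1 if current is None else current["_rev"] + 1
        stored = {**doc, "_rev": next_rev}
        self._docs[doc_id] = stored
        return dict(stored)

    def get(self, doc_id):
        return dict(self._docs[doc_id])


store = TinyStore()
first = store.put({"_id": "album-1", "artist": "The Decembrists"})
stale = dict(first)              # a second client still holds revision 1

first["note"] = "updated"
second = store.put(first)        # accepted: based on the current revision

try:
    store.put(stale)             # rejected: revision 1 is no longer current
    conflict_detected = False
except RevisionConflict:
    conflict_detected = True

print(second["_rev"], conflict_detected)
```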


JavaScript Object Notation

Lightweight data storage format based on a subset of JavaScript syntax

e.g.:

{
  "Subject": "ASF turns 10",
  "Author": "ajith",
  "PostedDate": "2012-10-20",
  "Tags": [
    "Apache Software Foundation",
    "Open source"
  ],
  "Body": "Recently Apache Software Foundation became 12 years old."
}
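Because the example is plain JSON, any JSON library can read it. As a small sketch, Python's standard `json` module parses the document from the slide into dictionaries and lists and serializes it back; the variable names are only illustrative.

```python
import json

raw = """
{
  "Subject": "ASF turns 10",
  "Author": "ajith",
  "PostedDate": "2012-10-20",
  "Tags": ["Apache Software Foundation", "Open source"],
  "Body": "Recently Apache Software Foundation became 12 years old."
}
"""

post = json.loads(raw)           # JSON object -> Python dict
tags = post["Tags"]              # JSON array  -> Python list

# Serializing and parsing again yields an equal document.
round_trip = json.loads(json.dumps(post))

print(post["Author"], len(tags), round_trip == post)
```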


REST
• REST stands for REpresentational State Transfer
• Uses existing HTTP verbs (GET, POST, PUT, DELETE)
• URL contains identifiers.


1. REST API
– curl (Unix-like OS)
– cURL (Windows)
– GET/PUT/POST/DELETE


cURL is an open source command-line utility for transferring data to and from a server.

cURL supports all common Internet protocols, including SMTP, POP3, FTP, IMAP, GOPHER, HTTP and HTTPS.

Example:

curl -X GET "http://www.bing.com/search?q=couchdb"


Check server version
curl http://localhost:5984

Create database
curl -X PUT http://localhost:5984/albums

Delete database
curl -X DELETE http://localhost:5984/cds


Get a UUID
curl http://localhost:5984/_uuids

Create document
curl -X POST http://localhost:5984/albums -d "{ \"artist\" : \"The Decembrists\" }" -H "Content-Type: application/json"

Get document by ID
curl http://localhost:5984/artists/a10a5006d96c9e174d28944994042946
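The same requests can be prepared from a program instead of the command line. The sketch below uses only Python's standard library and merely builds the requests without sending them, so it runs without a server. `BASE_URL` and the helper `build_request` are names assumed for this illustration; the paths and the document body are taken from the curl commands above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5984"  # same host and port as the curl examples


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for a CouchDB endpoint."""
    data = None
    headers = {}
    if body is not None:
        # Documents travel as JSON, so encode the body and label it.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


create_db = build_request("PUT", "/albums")
create_doc = build_request("POST", "/albums", {"artist": "The Decembrists"})
delete_db = build_request("DELETE", "/cds")

for request in (create_db, create_doc, delete_db):
    print(request.get_method(), request.full_url)

# To send a request (assumption: a server is listening on localhost:5984):
#   with urllib.request.urlopen(create_db) as response:
#       print(response.status, response.read().decode("utf-8"))
```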


What is a view?

View in CouchDB context
  A "show" that directly renders a document using JavaScript
  MapReduce

Two types
  Permanent view
    Indexed
    JSON for the view is stored as a design document
  Temporary view
    Sent via an HTTP POST
    Computed on the fly

Creating a view using Futon


Takes data in and transforms it into something else.

Output is a key/value pair

Keys can be complex types

Like a .Select() in LINQ


Takes in a set of intermediate values and combines them into a single value.

Reduce needs to be able to accept results from the map function AND the reduce function itself.

Like an .Aggregate() in LINQ
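The requirement that reduce accepts its own output matters for aggregates such as counting. CouchDB reduce functions receive a third `rereduce` argument that signals this case; the Python sketch below imitates that signature. The chunking helper `reduce_in_chunks` and the sample rows are assumptions made for illustration, not how CouchDB schedules its reductions.

```python
def count_reduce(keys, values, rereduce):
    """Count emitted rows; on rereduce, add up partial counts instead."""
    if rereduce:
        # values are earlier results of count_reduce, so sum them.
        return sum(values)
    # values are raw map outputs, so count them.
    return len(values)


def reduce_in_chunks(rows, reduce_fn, chunk_size):
    """Reduce rows chunk by chunk, then reduce the partial results again."""
    partials = []
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        keys = [key for key, _ in chunk]
        values = [value for _, value in chunk]
        partials.append(reduce_fn(keys, values, False))
    # Second pass: the reduce function now sees its own earlier output.
    return reduce_fn(None, partials, True)


rows = [("a", 1), ("b", 1), ("c", 1), ("d", 1), ("e", 1)]
total = reduce_in_chunks(rows, count_reduce, chunk_size=2)
print(total)
```

Without the `rereduce` branch, the second pass would count the three partial results instead of adding them up.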


var test = new[] { "1", "2", "3" };
var mapped = test.Select(s => Int32.Parse(s));
var reduced = mapped.Aggregate((sum, i) => sum + i);


"map": "function(doc) { emit(doc._id, parseInt(doc.value)); }",
"reduce": "function(keys, values) { return sum(values); }"
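To make the flow of this view concrete, here is a rough Python simulation of what the map and reduce pair computes. It is a stand-in, not CouchDB's view engine: `map_doc`, `build_view`, `reduce_sum`, and the three sample documents are all assumptions chosen to mirror the JavaScript functions above.

```python
def map_doc(doc, emit):
    """Python stand-in for the map function: emit (_id, integer value)."""
    emit(doc["_id"], int(doc["value"]))


def build_view(docs, map_fn):
    """Run map_fn over every document and return rows sorted by key."""
    rows = []
    for doc in docs:
        map_fn(doc, lambda key, value: rows.append((key, value)))
    return sorted(rows, key=lambda row: row[0])


def reduce_sum(keys, values):
    """Python stand-in for the reduce function: sum the emitted values."""
    return sum(values)


# Sample documents invented for this sketch.
docs = [
    {"_id": "b", "value": "2"},
    {"_id": "a", "value": "1"},
    {"_id": "c", "value": "3"},
]

rows = build_view(docs, map_doc)
total = reduce_sum([key for key, _ in rows], [value for _, value in rows])
print(rows, total)
```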


Map function (extracting data) is executed on every document in the database.

Emits key/value pairs (can emit 0, 1, or more key/value pairs for each document in the database).

Key/value pairs are then ordered and indexed by key.

Query types:

Exact: key = x
Range: key is between x and y
Multiple: key is in list (x, y, z)

Reduce functions (data aggregation), e.g. count, sum, group
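Because the emitted rows are kept sorted by key, the three query types reduce to lookups in a sorted list. The following Python sketch shows that idea with the standard `bisect` module; the index contents and function names are invented for illustration. In CouchDB itself these lookups correspond to the `key`, `startkey`/`endkey`, and `keys` view query parameters.

```python
import bisect

# A toy view index: (key, value) rows kept sorted by key.
index = sorted([("apple", 3), ("banana", 5), ("cherry", 7), ("date", 9)])
keys = [key for key, _ in index]


def query_exact(key):
    """Exact: rows whose key equals `key`."""
    lo = bisect.bisect_left(keys, key)
    hi = bisect.bisect_right(keys, key)
    return index[lo:hi]


def query_range(start, end):
    """Range: rows with start <= key <= end."""
    lo = bisect.bisect_left(keys, start)
    hi = bisect.bisect_right(keys, end)
    return index[lo:hi]


def query_multiple(wanted):
    """Multiple: rows whose key is any of the keys in `wanted`."""
    return [row for key in wanted for row in query_exact(key)]


print(query_exact("banana"))
print(query_range("b", "d"))
print(query_multiple(["apple", "date", "zzz"]))
```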


[Diagram: the map function runs over all documents and emits Key, Value pairs; a query by key returns the matching Key, Value pairs.]


Futon is a simple web admin for managing CouchDB instances and is accessible at http://127.0.0.1:5984/_utils/

Used for setting server configuration
Allows for database administration (create/delete, compact/cleanup, security)
Allows for CRUD operations on documents
Creating and testing views
Creating design documents


• Clients available for many languages
– C, C#, Erlang, Java, JavaScript, Perl, PHP, Python, Ruby & many more


Use cases & production deployments

Replication and synchronization capabilities of CouchDB make it ideal for use on mobile devices, where a network connection is not guaranteed but the application must keep working offline.

CouchDB is well suited for applications with accumulating, occasionally changing data, on which pre-defined queries are to be run and where versioning is important (CRM and CMS systems, for example).

Master-master replication is an especially interesting feature, allowing easy multi-site deployments.


Thank You!