Rolling With Riak
john-lynch
Web App Developers
ORM
Focus on the App
Speed of Development
DB Agnostic(ish)
Fixed Schema
Limits design choices
Migration Hell
Scaling at DB layer
Rails!
Shiny New Toys
Decades of research and best practices
Awesome ad-hoc query capability
Zillions of vendors/tools/libraries/code
Flexibility of schema-less design
Ability to scale…. Web-scale
Web Scale Changing App Types
Social
Games
Marketing / Advertising
Freemium Business Models
1M free => 10K paying customers
NoSQL Landscape
Pure Key/Value (Redis/Tokyo Cabinet/etc)
Key/Value+ (CouchDB/MongoDB/Riak)
BigTable Type (HBase, Hypertable)
Choose wisely! No standard API.
(Good general overview can be found here: http://cattell.net/datastores/Datastores.pdf)
MongoDB
Popular with Ruby community
Combines Key/Value with ability to do Indexed Queries
Scaling MongoDB
Master, Slave, Replica Set, Replica Pair, Shard Server, Connection Pool, ack!
If all you want is NoSQL…
NoSQL on MySQL
Leverages all MySQL skills, tools, techniques, stability, dependability
If you want NoSQL + Scalability…
…not so much.
Riak
Developed by Basho.com
Used on several large production sites
Written in Erlang
Distributed – Fault Tolerant
Buckets – Keys – Values
Values can be anything (json,binary,etc)
Ruby & Rails Client (Ripple project @ Github)
Riak speaks HTTP
> curl -i http://host:8098/riak/bucket1/key1
HTTP/1.1 200 OK
X-Riak-Vclock: awpcFAA==
Content-Type: text/plain
Content-Length: 9
Last-Modified: Wed, 01 Se…
Etag: 45364657

I am a value
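The same request can be made from Ruby with nothing but the standard library. This is a minimal sketch, not the Ripple client: the helper names `riak_object_uri` and `riak_get` are made up for illustration, and it assumes the `/riak/<bucket>/<key>` URL layout and port 8098 shown in the curl request above.

```ruby
require 'net/http'
require 'uri'

# Build the URL for one key, following the /riak/<bucket>/<key> layout
# from the curl example. Helper name and default port are assumptions.
def riak_object_uri(host, bucket, key, port = 8098)
  URI::HTTP.build(host: host, port: port, path: "/riak/#{bucket}/#{key}")
end

# Fetch a value over plain HTTP.
# Returns [status code, X-Riak-Vclock header, body].
def riak_get(host, bucket, key)
  response = Net::HTTP.get_response(riak_object_uri(host, bucket, key))
  [response.code.to_i, response['X-Riak-Vclock'], response.body]
end

# Usage (needs a reachable Riak node):
#   status, vclock, body = riak_get("host", "bucket1", "key1")
```

Bucket and key names are interpolated as-is here; names containing special characters would need escaping first.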
Leverage existing HTTP infrastructure, tools, etc
Scaling Riak
[Diagram: five Rails servers and five Riak nodes connected over Standard HTTP Protocol through an HTTP Load Balancer / Varnish (cache)]
Scaling Riak (alt)
[Diagram: five machines, each running Riak, Rails and Nginx, behind an HTTP Load Balancer / Varnish (cache), using Standard HTTP Protocol]
Key Differentiator - Distributed
Inspired by Amazon’s Dynamo
Uses consistent hashing algorithm
No “Master Node”
No single point of failure
Any node can service any request
Automatically rebalances as nodes join
Tunable CAP Properties
Consistency, Availability, Partition Tolerance
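To make the consistent hashing idea concrete, here is a generic hash ring in Ruby. It is a textbook-style sketch and not Riak's actual partitioning code: the class name `HashRing`, the node names, and the choice of 64 ring points per node are all assumptions for illustration.

```ruby
require 'digest/sha1'

# Generic consistent-hash ring (illustrative only, not Riak's implementation).
class HashRing
  def initialize(nodes, points_per_node = 64)
    @points_per_node = points_per_node
    @ring = {} # ring position => node name
    nodes.each { |node| add(node) }
  end

  # Place several points for the node on the ring, then re-sort positions.
  def add(node)
    @points_per_node.times { |i| @ring[position("#{node}:#{i}")] = node }
    @sorted = @ring.keys.sort
  end

  # A key belongs to the first ring point at or after its own position,
  # wrapping around to the start of the ring if there is none.
  def node_for(key)
    pos = position(key)
    idx = @sorted.bsearch_index { |p| p >= pos } || 0
    @ring[@sorted[idx]]
  end

  private

  # Map any string onto the ring using SHA-1.
  def position(str)
    Digest::SHA1.hexdigest(str).to_i(16)
  end
end

ring = HashRing.new(%w[riak1 riak2 riak3 riak4])
owner = ring.node_for("artists/TheBeatles")
```

When a fifth node is added, only the keys that now fall on the new node's ring points change owner; everything else stays put, which is what makes rebalancing as nodes join cheap.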
N R W
N = # of copies of the data
R = # of nodes necessary to read
W = # of nodes necessary to write
Tunable by the application, on a per-bucket and per-query basis
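On the HTTP interface, per-query tuning is expressed as query-string parameters on the object URL (`r` on reads, `w` and `dw` on writes). The sketch below only builds the paths; the helper name `riak_path` and the bucket and key names are invented for illustration.

```ruby
require 'uri'

# Append per-request quorum settings to an object path as a query string.
# Helper name is hypothetical; r / w / dw are passed through unchanged.
def riak_path(bucket, key, quorum = {})
  path = "/riak/#{bucket}/#{key}"
  quorum.empty? ? path : "#{path}?#{URI.encode_www_form(quorum)}"
end

read_path  = riak_path("logs", "entry1", r: 1)           # GET with R=1
write_path = riak_path("ledger", "tx42", w: 4, dw: 4)    # PUT with W=4, DW=4
```

Leaving the quorum hash empty falls back to whatever defaults the bucket carries.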
Riak cluster of 4 Physical Computers
Low Value Data (N=2 R=1 W=1)
Logging
Web Content (N=4 R=1 W=4)
Maximum availability and consistency
Financial Data (N=4 R=1 W=4 DW=4)
DW is “Durable Write”
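The three profiles above could be stored as per-bucket defaults by sending a JSON `props` document to the bucket URL. This sketch only assembles the requests and sends nothing; the bucket names (`logs`, `content`, `ledger`), the `PROFILES` constant, and the helper `bucket_props_request` are assumptions, while the values come straight from the slide.

```ruby
require 'json'

# N/R/W/DW values from the slide, keyed by hypothetical bucket names.
PROFILES = {
  "logs"    => { n_val: 2, r: 1, w: 1 },         # low value data
  "content" => { n_val: 4, r: 1, w: 4 },         # web content
  "ledger"  => { n_val: 4, r: 1, w: 4, dw: 4 }   # financial data
}

# Describe the HTTP request that would set a bucket's default properties.
def bucket_props_request(bucket)
  {
    method:  "PUT",
    path:    "/riak/#{bucket}",
    headers: { "Content-Type" => "application/json" },
    body:    JSON.generate("props" => PROFILES.fetch(bucket))
  }
end

request = bucket_props_request("ledger")
```

Individual requests can still override these defaults with their own `r`, `w`, or `dw` values.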
Network Split
Map/Reduce
Map steps run on each node
Final reduce runs on single node
# client is an existing Riak::Client (riak-client gem)
results = Riak::MapReduce.new(client).
  add("albums").
  map("function(v){ return [JSON.parse(v.values[0].data).title]; }", :keep => true).
  run
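Because Riak speaks HTTP, the same job can also be written as a JSON document and POSTed to the `/mapred` endpoint with `Content-Type: application/json`. The sketch below only builds that document; the helper name `mapred_job` is made up, and the map function is the one from the Ruby example above.

```ruby
require 'json'

# Build the JSON body for a single-phase JavaScript map job over a bucket.
# Helper name is hypothetical.
def mapred_job(bucket, map_source)
  JSON.generate(
    "inputs" => bucket,
    "query"  => [
      { "map" => { "language" => "javascript",
                   "source"   => map_source,
                   "keep"     => true } }
    ]
  )
end

job = mapred_job("albums",
                 "function(v){ return [JSON.parse(v.values[0].data).title]; }")
```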
Links
Riak documents can have links to other documents; each link can be “tagged”
Link data is separate from doc data
Easy URL access to walk these links
GET /riak/artists/TheBeatles/albums,_,_/tracks,_,1
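Each step in a link-walk URL is a `bucket,tag,keep` triple, with `_` as the wildcard. A small helper makes the structure easier to see; `link_walk_path` and its hash keys are invented for this sketch.

```ruby
# Build a link-walking path from a starting object and a list of steps.
# Each step may name a :bucket, a :tag, and a :keep flag; anything left
# out becomes the "_" wildcard. Helper name and keys are hypothetical.
def link_walk_path(bucket, key, steps)
  specs = steps.map do |step|
    keep = step[:keep].nil? ? "_" : (step[:keep] ? "1" : "0")
    [step[:bucket] || "_", step[:tag] || "_", keep].join(",")
  end
  (["/riak/#{bucket}/#{key}"] + specs).join("/")
end

path = link_walk_path("artists", "TheBeatles",
                      [{ bucket: "albums" }, { bucket: "tracks", keep: true }])
# => "/riak/artists/TheBeatles/albums,_,_/tracks,_,1"
```

Read left to right: start at TheBeatles, follow any link into `albums`, then follow any link into `tracks` and keep those results.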
When NOT to use Riak
Single machine
Small scale or bog-standard apps
Need rich ad-hoc indexed queries
Need mature tools and libraries
Any questions?
(First round at Rock Bottom generously sponsored by Basho.com)