Scaling Web Applications With Cassandra Presentation

introduction to cassandra
eben hewitt

september 29, 2010
web 2.0 expo
new york city

director, application architecture at a global corp

focus on SOA, SaaS, Events

i wrote this

@ebenhewitt

agenda
  context
  features
  data model
  api

nosql big data
  mongodb
  couchdb
  tokyo cabinet
  redis
  riak
what about?
  Poet, Lotus, Xindice
  they've been around forever
  rdbms was once the new kid

innovation at scale
google bigtable (2006)
  consistency model: strong
  data model: sparse map
  clones: hbase, hypertable
amazon dynamo (2007)
  O(1) dht
  consistency model: client tune-able
  clones: riak, voldemort

cassandra ~= bigtable + dynamo

proven
The Facebook stores 150TB of data on 150 nodes

web 2.0

used at Twitter, Rackspace, Mahalo, Reddit, Cloudkick, Cisco, Digg, SimpleGeo, Ooyala, OpenX, others

cap theorem
consistency
  all clients have same view of data
availability
  writeable in the face of node failure
partition tolerance
  processing can continue in the face of network failure (crashed router, broken network)

daniel abadi: pacelc

write consistency

Level                        Description
ZERO                         Good luck with that
ANY                          1 replica (hints count)
ONE                          1 replica. read repair in bkgnd
QUORUM (DCQ for RackAware)   (N / 2) + 1
ALL                          N = replication factor

read consistency

Level                        Description
ZERO                         Ummm
ANY                          Try ONE instead
ONE                          1 replica
QUORUM (DCQ for RackAware)   Return most recent TS after (N / 2) + 1 report
ALL                          N = replication factor
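
The quorum arithmetic in the two tables can be checked with a few lines of code. This is a minimal sketch, assuming integer division for N / 2; the class and method names are hypothetical and not part of any Cassandra API. It also shows the usual overlap rule: when read replicas plus write replicas exceed N, every read touches at least one replica that took the latest write.

```java
final class QuorumMath {
    // Quorum size from the tables: (N / 2) + 1, using integer division.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when any read set must overlap any write set: R + W > N.
    static boolean readsOverlapWrites(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int n = 3; // replication factor, chosen only for illustration
        int q = quorum(n);
        System.out.println("QUORUM for N=" + n + ": " + q);
        System.out.println("QUORUM reads + QUORUM writes overlap: " + readsOverlapWrites(q, q, n));
        System.out.println("ONE reads + ONE writes overlap: " + readsOverlapWrites(1, 1, n));
    }
}
```

With N = 3, QUORUM is 2, so QUORUM reads and QUORUM writes always overlap, while ONE/ONE does not.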

cassandra properties
  tuneably consistent
  very fast writes
  highly available
  fault tolerant
  linear, elastic scalability
  decentralized/symmetric
  ~12 client languages
  Thrift RPC API
  ~automatic provisioning of new nodes
  O(1) dht
  big data

write op

Staged Event-Driven Architecture
  A general-purpose framework for high concurrency & load conditioning
  Decomposes applications into stages separated by queues
  Adopt a structured approach to event-driven concurrency
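
The idea of stages separated by queues can be sketched in plain Java. This is an illustration only, not Cassandra's internal implementation: the two stage names (clean, apply), the queue capacity of 16, and the stop sentinel are all assumptions made for the example. Each stage owns one thread and hands work to the next stage only through a bounded queue, which is what gives load conditioning (a full queue blocks the producer).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class SedaSketch {
    private static final String STOP = "__stop__"; // sentinel that shuts a stage down

    // Pushes events through two stages (clean -> apply). Each stage runs on
    // its own thread and talks to the next stage only through a bounded queue.
    static List<String> run(List<String> events) {
        final BlockingQueue<String> cleanQueue = new LinkedBlockingQueue<>(16);
        final BlockingQueue<String> applyQueue = new LinkedBlockingQueue<>(16);
        final List<String> applied = new ArrayList<>();

        Thread cleanStage = new Thread(() -> {
            try {
                while (true) {
                    String event = cleanQueue.take();
                    if (STOP.equals(event)) {
                        applyQueue.put(STOP); // pass the shutdown signal downstream
                        return;
                    }
                    applyQueue.put(event.trim());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread applyStage = new Thread(() -> {
            try {
                while (true) {
                    String event = applyQueue.take();
                    if (STOP.equals(event)) {
                        return;
                    }
                    applied.add("applied:" + event);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        cleanStage.start();
        applyStage.start();
        try {
            for (String event : events) {
                cleanQueue.put(event); // blocks while the queue is full (back-pressure)
            }
            cleanQueue.put(STOP);
            cleanStage.join();
            applyStage.join(); // join makes the stage's writes visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running stages", e);
        }
        return applied;
    }
}
```

Because each stage is single-threaded and the queues are FIFO, events leave the pipeline in the order they entered it.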

instrumentation

data replication

partitioner smack-down
Random
  system will use MD5(key) to distribute data across nodes
  even distribution of keys from one CF across ranges/nodes

Order Preserving
  key distribution determined by token
  lexicographical ordering
  required for range queries: scan over rows like a cursor in an index
  can specify the token for this node to use
  scrabble distribution
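
The difference between the two partitioners can be seen by ordering the same keys both ways. This is a rough sketch: the class and method names are hypothetical, and reading the MD5 digest as an unsigned integer is an assumption of the example, not necessarily how Cassandra derives its tokens. The random style places keys by hash, so neighbouring keys scatter; the order-preserving style keeps keys in lexicographical order, which is what makes range scans possible.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

final class PartitionerSketch {
    // Random-style token: the MD5 digest of the key, read as a non-negative integer.
    static BigInteger md5Token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(1, md5.digest(key.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Keys in the order they would sit on a ring sorted by MD5 token.
    static List<String> randomOrder(List<String> keys) {
        TreeMap<BigInteger, String> ring = new TreeMap<>();
        for (String key : keys) {
            ring.put(md5Token(key), key);
        }
        return new ArrayList<>(ring.values());
    }

    // Keys in the order they would sit on a ring sorted by the key itself.
    static List<String> orderPreservingOrder(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        Collections.sort(sorted);
        return sorted;
    }
}
```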

structure

keyspace
  ~= database
  typically one per application
  some settings are configurable only per keyspace

column family
  group records of similar kind
  not same kind, because CFs are sparse tables
  ex: User, Address, Tweet, PointOfInterest, HotelRoom

think of cassandra as row-oriented
  each row is uniquely identifiable by key
  rows group columns and super columns

column family (diagram)
  row keys: key123, key456
  columns shown: user=eben, nickname=The Situation, user=alison, icon=, n=42

json-like notation
User {
  123 : { email: [email protected], icon: },

  456 : { email: [email protected], location: The Danger Zone }
}
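
One way to picture the notation above is as a sorted map of sorted maps, where each row carries only the columns it actually has. The sketch below is an in-memory model, not a Cassandra client; the class name is hypothetical and the email values are placeholders, since the originals are not legible in the transcript.

```java
import java.util.SortedMap;
import java.util.TreeMap;

final class SparseColumnFamily {
    // row key -> (column name -> value); both levels stay sorted by name.
    private final SortedMap<String, SortedMap<String, String>> rows = new TreeMap<>();

    void set(String rowKey, String column, String value) {
        SortedMap<String, String> row = rows.get(rowKey);
        if (row == null) {
            row = new TreeMap<>();
            rows.put(rowKey, row);
        }
        row.put(column, value);
    }

    // Returns null when the row or the column is absent: rows are sparse.
    String get(String rowKey, String column) {
        SortedMap<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(column);
    }

    // Builds the two User rows from the slide (placeholder email values).
    static SparseColumnFamily userExample() {
        SparseColumnFamily user = new SparseColumnFamily();
        user.set("123", "email", "user123@example.com");
        user.set("456", "email", "user456@example.com");
        user.set("456", "location", "The Danger Zone");
        return user;
    }
}
```

Row 123 has no location column at all; nothing is stored for it, which is the sense in which a column family is a sparse table.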

0.6 example
$ cassandra -f
$ bin/cassandra-cli
cassandra> connect localhost/9160

cassandra> set Keyspace1.Standard1['eben']['age']='29'
cassandra> set Keyspace1.Standard1['eben']['email']='[email protected]'
cassandra> get Keyspace1.Standard1['eben']['age']
=> (column=6e616d65, value=39, timestamp=1282170655390000)

a column has 3 parts
name
  byte[]
  determines sort order
  used in queries
  indexed
value
  byte[]
  you don't query on column values
timestamp
  long (clock)
  last write wins conflict resolution
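
Last-write-wins conflict resolution comes down to comparing timestamps. The sketch below is a simplified model with hypothetical names, using String instead of byte[] for readability; how ties are broken here (keep the first argument) is an arbitrary choice for the example, not a statement about Cassandra's behaviour.

```java
final class LastWriteWins {
    // A column as described on the slide: name, value, timestamp.
    static final class Column {
        final String name;
        final String value;
        final long timestamp;

        Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Picks the version with the higher timestamp; on a tie, keeps `a`.
    static Column reconcile(Column a, Column b) {
        return b.timestamp > a.timestamp ? b : a;
    }
}
```

The order in which the two versions arrive does not matter; only the timestamps decide, which is why client clocks need to be reasonably in sync.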

column comparators
  byte
  utf8
  long
  timeuuid
  lexicaluuid
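
The comparator decides the order in which columns come back, so the same column names can sort differently. A small sketch with hypothetical names, comparing text ordering with numeric ordering; Java's String ordering is used as a stand-in for the utf8 comparator, which matches byte order for the ASCII names used here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class ComparatorSketch {
    // Text ordering: names compared character by character.
    static List<String> sortedAsUtf8(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy);
        return copy;
    }

    // Numeric ordering: names parsed as longs before comparing.
    static List<String> sortedAsLong(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy, Comparator.comparingLong((String s) -> Long.parseLong(s)));
        return copy;
    }
}
```

For the names 9, 10 and 100, text ordering yields 10, 100, 9, while numeric ordering yields 9, 10, 100. Choosing the comparator is therefore part of designing the query.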

ex: lat/long

super column
super columns group columns under a common name

PointOfInterest (super column family, diagram)
10017
  Central Park: desc=Fun to walk in.
  Empire State Bldg: phone=212.555.11212, desc=Great view from 102nd floor!
85255
  Phoenix Zoo

PointOfInterest {
  key: 85255 {
    Phoenix Zoo { phone: 480-555-5555, desc: They have animals here. },
    Spring Training { phone: 623-333-3333, desc: Fun for baseball fans. },
  }, //end phx

  key: 10019 {
    Central Park { desc: Walk around. It's pretty. },
    Empire State Building { phone: 212-777-7777, desc: Great view from 102nd floor. }
  } //end nyc
}
callouts: key, super column, column, super column family, flexible schema

about super column families
  sub-column names in a SCF are not indexed
  top level columns (SCF Name) are always indexed
  often used for denormalizing data from standard CFs
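
A super column family adds one more level of nesting: row key, then super column name, then sub-column. The sketch below models the PointOfInterest example as nested sorted maps; it is an in-memory illustration with hypothetical class and method names, not a client for a real cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

final class SuperColumnFamilySketch {
    // row key -> super column name -> sub-column name -> value
    private final SortedMap<String, SortedMap<String, SortedMap<String, String>>> rows = new TreeMap<>();

    void set(String rowKey, String superColumn, String column, String value) {
        SortedMap<String, SortedMap<String, String>> row = rows.get(rowKey);
        if (row == null) {
            row = new TreeMap<>();
            rows.put(rowKey, row);
        }
        SortedMap<String, String> columns = row.get(superColumn);
        if (columns == null) {
            columns = new TreeMap<>();
            row.put(superColumn, columns);
        }
        columns.put(column, value);
    }

    String get(String rowKey, String superColumn, String column) {
        SortedMap<String, SortedMap<String, String>> row = rows.get(rowKey);
        if (row == null || !row.containsKey(superColumn)) {
            return null;
        }
        return row.get(superColumn).get(column);
    }

    // Super column names in a row, in sorted order.
    List<String> superColumnNames(String rowKey) {
        SortedMap<String, SortedMap<String, String>> row = rows.get(rowKey);
        return row == null ? new ArrayList<String>() : new ArrayList<String>(row.keySet());
    }
}
```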

slice predicate
  data structure describing columns to return
SliceRange
  start column name
  finish column name (can be empty to stop on count)
  reverse
  count (like LIMIT)
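
The four SliceRange fields can be mimicked on a sorted map of column names. This is a sketch under stated assumptions: the class and method names are hypothetical, an empty string stands for "unbounded", both bounds are treated as inclusive, and for a reversed slice the start is expected to be the higher name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

final class SliceSketch {
    // Builds a row whose columns have the given names and empty values.
    static NavigableMap<String, String> rowOf(String... columnNames) {
        NavigableMap<String, String> row = new TreeMap<>();
        for (String name : columnNames) {
            row.put(name, "");
        }
        return row;
    }

    // Returns at most `count` column names from start to finish, in the
    // requested direction. Empty start or finish means "no bound".
    static List<String> slice(NavigableMap<String, String> row, String start,
                              String finish, boolean reversed, int count) {
        NavigableMap<String, String> view = reversed ? row.descendingMap() : row;
        if (!start.isEmpty()) {
            view = view.tailMap(start, true);
        }
        if (!finish.isEmpty()) {
            view = view.headMap(finish, true);
        }
        List<String> names = new ArrayList<>();
        for (String name : view.keySet()) {
            if (names.size() >= count) {
                break; // count works like LIMIT
            }
            names.add(name);
        }
        return names;
    }
}
```

Leaving finish empty and setting count gives the "stop on count" behaviour mentioned on the slide.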

read api
get() : Column
  get the Col or SC at given ColPath
  ColumnOrSuperColumn cosc = client.get(key, path, CL);

get_slice() : List
  get Cols in one row, specified by SlicePredicate:
  List<ColumnOrSuperColumn> results = client.get_slice(key, parent, predicate, CL);

multiget_slice() : Map
  get slices for list of keys, based on SlicePredicate
  Map<byte[], List<ColumnOrSuperColumn>> results = client.multiget_slice(rowKeys, parent, predicate, CL);

get_range_slices() : List
  returns multiple Cols according to a range
  range is startkey, endkey, starttoken, endtoken:
  List<KeySlice> slices = client.get_range_slices(parent, predicate, keyRange, CL);

write api
  client.insert(userKeyBytes, parent,
      new Column("band".getBytes(UTF8), "Funkadelic".getBytes(), clock), CL);

batch_mutate
  void batch_mutate(map<byte[], map<String, List<Mutation>>>, CL)
remove
  void remove(byte[], ColumnPath column_path, Clock, CL)

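batch_mutate
  //create param
  Map<byte[], Map<String, List<Mutation>>> mutationMap =
      new HashMap<byte[], Map<String, List<Mutation>>>();

  //create Cols for Muts
  Column nameCol = new Column("name".getBytes(UTF8),
      "Funkadelic".getBytes("UTF-8"), new Clock(System.nanoTime()));
  //wrap the Col so a Mutation can carry it
  ColumnOrSuperColumn nameCosc = new ColumnOrSuperColumn();
  nameCosc.column = nameCol;
  Mutation nameMut = new Mutation();
  nameMut.column_or_supercolumn = nameCosc; //also phone, etc

  Map<String, List<Mutation>> muts = new HashMap<String, List<Mutation>>();
  List<Mutation> cols = new ArrayList<Mutation>();
  cols.add(nameMut);
  cols.add(phoneMut);
  muts.put(CF, cols);
  //outer map key is a row key; inner map key is the CF name
  mutationMap.put(rowKey.getBytes(), muts);
  //send to server
  client.batch_mutate(mutationMap, CL);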

raw thrift: for masochists only

pycassa (python)
fauna (ruby)
hector (java)
pelops (java)
kundera (JPA)
hectorSharp (C#)

what about
  SELECT WHERE
  ORDER BY
  JOIN ON
  GROUP

?

rdbms: domain-based model what answers do I have?

cassandra: query-based model what questions do I have?

SELECT WHERE
cassandra is an index factory

USER
  Key: UserID
  Cols: username, email, birth date, city, state
How to support this query?

SELECT * FROM User WHERE city = 'Scottsdale'

Create a new CF called UserCity:
USERCITY
  Key: city
  Cols: IDs of the users in that city.
Also uses the Valueless Column pattern
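
The UserCity idea can be sketched as a second map maintained alongside the User data: the row key is the city, each column name is a user ID, and the column value is left empty (the Valueless Column pattern). The class below is an in-memory illustration with hypothetical names; in a real application the index row would be written in the same batch as the user row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class UserCityIndex {
    private static final byte[] EMPTY = new byte[0]; // the "valueless" column value

    // row key (city) -> column name (user ID) -> empty value
    private final Map<String, SortedMap<String, byte[]>> rows = new TreeMap<>();

    // Records that the user lives in the city; the column name is the data.
    void add(String userId, String city) {
        SortedMap<String, byte[]> row = rows.get(city);
        if (row == null) {
            row = new TreeMap<>();
            rows.put(city, row);
        }
        row.put(userId, EMPTY);
    }

    // Answers "WHERE city = ?" by reading the column names of one row.
    List<String> usersIn(String city) {
        SortedMap<String, byte[]> row = rows.get(city);
        return row == null ? new ArrayList<String>() : new ArrayList<String>(row.keySet());
    }
}
```

The query becomes a single-row read, at the cost of writing every user twice.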

SELECT WHERE pt 2
Use an aggregate key
  state:city: { user1, user2 }
Get rows between AZ: & AZ; for all Arizona users
Get rows between AZ:Scottsdale & AZ:Scottsdale1 for all Scottsdale users
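
The range bounds work because rows are compared lexicographically and ';' is the character right after ':' in ASCII, so every key starting with "AZ:" falls between "AZ:" and "AZ;". A sketch of that scan over a sorted map follows; the class name is hypothetical, both bounds are treated as inclusive, and the approach assumes keys are stored in key order, as with an order-preserving partitioner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

final class AggregateKeySketch {
    // row key "state:city" -> user IDs
    private final TreeMap<String, List<String>> rows = new TreeMap<>();

    void put(String state, String city, String... userIds) {
        rows.put(state + ":" + city, Arrays.asList(userIds));
    }

    // Row keys between start and end, both inclusive, in key order.
    List<String> keysBetween(String start, String end) {
        return new ArrayList<>(rows.subMap(start, true, end, true).keySet());
    }
}
```

With rows for AZ:Phoenix, AZ:Scottsdale and CA:Fresno, the range AZ: to AZ; returns the two Arizona keys, and AZ:Scottsdale to AZ:Scottsdale1 returns only the Scottsdale key.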

ORDER BY
Rows are placed according to their Partitioner:
  Random: MD5 of key
  Order-Preserving: actual key
Rows are sorted by key, regardless of partitioner
Columns are sorted according to CompareWith or CompareSubcolumnsWith

is cassandra a good fit?
you need really fast writes
you need durability
you have lots of data
  > GBs
  >= three servers
your app is evolving
  startup mode, fluid data structure
loose domain data
  points of interest

your programmers can deal
  documentation
  complexity
  consistency model
  change
  visibility tools
your operations can deal
  hardware considerations
  can move data
  JMX monitoring

thank you!
@ebenhewitt
